Rosetta Stone 300

The Rosetta Project has a new project out. Here are the details, copied verbatim from the Linguist List email.

If you have 10-30 minutes and a keyboard or a microphone, please
consider making a submission to The Rosetta Project’s latest
volunteer-based linguistic documentation project:

The 300 Languages Project is a special effort by The Rosetta Project,
part of The Long Now Foundation, to begin the construction of a
universal corpus of
human language by collecting parallel text and audio in the world’s 300
most widely-spoken languages. The resulting collection will contain
thousands of volunteer-contributed public domain text documents and
audio recordings which will be made available to researchers and the
public alike via The Internet Archive, a free online digital library.

The 300 Languages Project seeks to develop an extensible protocol
and a set of scalable, low-cost (i.e., volunteer-based) methods and
standards for language documentation via the building of a “seed
corpus” – a corpus which starts small but is designed to grow.

The 300 Languages Project is collecting translations and recordings of
three important texts: the Swadesh List, the Universal Declaration of
Human Rights, and Genesis chapters 1-3. These texts were chosen
primarily for usefulness in research (e.g., the Swadesh list) and
breadth of existing translation (Genesis and the UDHR).

The 300 Languages Project is made possible through the support and
sponsorship of Distinguished Career Professor and speech technology
expert Dr. James K. Baker and is conducted in partnership with the
ALLOW initiative of the Center for Innovations in Speech and
Language at the Language Technologies Institute.

Looks quite cool, if I do say so myself.

