
Bilingual prosody: Meter, rhythm, and intonation in multilingual contact situations




About the project

The project has two main objectives. On the one hand, we provide acoustic speech data of American languages in a way that will enable linguists worldwide to integrate these languages into their research perspectives. In this way, we aim to contribute to broadening the typological basis of theory formation. At the same time, by giving them a more prominent position in the scientific linguistic discourse, we intend to make those languages and the unique forms of human cultural and social experience they express more visible in times when they are under extreme threat (http://gbs.uni-koeln.de/wordpress/).

The second objective concerns the focus of our own descriptive and theoretical work. We want to achieve a better understanding of prosody under the conditions of multilingual competence. The phonological form of an utterance is to be explained from the interplay of metrical, rhythmical and intonational rules or constraints, which are determined in different ways by the individual and universal competencies of multilingual speakers. We achieve a high degree of comparability of the data by using the same recording procedures in all communities, which aim to elicit typical conversational moves for processing the common ground in a controlled fashion. We give priority to communicative games, in which the type of discourse, the incremental development of the context and lexical material are controlled, over procedures that exert a very tight control at the sentence level, such as elicitation from completely scripted utterances or Discourse Completion Tasks, since such experiments constitute a highly unfamiliar situation for the speakers and we want to create speech data that is as natural as possible.

In the project's first phase (2015-2017), we recorded spoken Conchucos Quechua in the Ancash province in Peru, explored its prosody empirically, and described it theoretically.

For the second, currently running phase (2018-2021), we are expanding the project through our network of French, Chilean, and Mexican partners, who work in regions where other American indigenous languages as well as Spanish or Portuguese are spoken.

The resulting data from all of these projects is morphologically transcribed and annotated with metadata in English, Spanish, and Portuguese, and it will be made freely available in an online repository.

The project is led by Prof. Dr. Uli Reich and supported by the Deutsche Forschungsgemeinschaft (DFG project number 274614727).



Language | Country | Project headed by | Collaborators
Coordination and repository | | Uli Reich | Raúl Italo Bendezú Araujo, Timo Buchholz, Elizabeth Pankratz
 | | Nadiezdha Torres |
 | | Alonso Guerrero Galván | Ewald Hekking, Aurelio Nuñez López, Lorena Gamper
Maya Yucateco | | to be determined | to be determined
 | | Uli Reich | Antônio Lessa
Quechua de Conchucos | | Uli Reich | Raúl Italo Bendezú Araujo, Timo Buchholz
 | | Élodie Blestel | Hedy Penner, Uli Reich
 | | Magaly Ruiz | Aldo Olate, Jaqueline Caniguan

Type and structure of the recordings

The language data gathered in this project are recordings of elicitation experiments. In each experiment, the speakers are recorded while solving different communicative tasks in the form of a game. All bilingual speakers carry out each experiment twice, once in their local non-Romance variety and once in their local Romance variety. In none of these elicitation experiments are the utterances productions of a scripted text. Through the selection of the materials (see "Metrical control for experiment materials" in Spanish or English), which are carefully adapted for each language and region with the indispensable help of local experts, and through the rules of the different games, we retain some control over the content of the conversation as a whole, but the content and form of each individual utterance are chosen spontaneously by the speakers. Together with our cooperation partners in the respective countries and regions, we have agreed on a core of common experiments. Some of these are known from the literature, and all of them have been tested in our own fieldwork in Conchucos. They are to be carried out in all of the cooperating projects in order to achieve a very high degree of comparability. You will also find recordings of particular experiments here that are carried out in only one or a few regions. These reflect the specific research interests of the cooperation partners as well as local conditions.

General information and guidelines for a smooth execution of the recordings can be found here, in Spanish or English.

The individual experiments are briefly presented below. More detailed experiment descriptions (in Spanish and English) are given in the respective links.

Common Experiments

Imagenes (spa / eng): The speakers name objects that are shown to them on picture cards.

Memoria (spa / eng): The speakers play a version of the well-known game Memory, in which they have to identify and remember the positions of cards with certain images.

Maptask (spa / eng): The speakers conduct a conversation that simulates the giving and receiving of directions, but the maps they use do not match. The Maptask experiment was originally developed by Anderson et al. (1991), whose English-language corpus is located at the University of Edinburgh: http://groups.inf.ed.ac.uk/maptask/

Cuento (spa / eng): In an adapted version of the game Chinese Whispers, the speakers tell and retell a story invented by the researchers.

Quién (spa / eng): The speakers play a version of the game "Who am I?", in which one of them has to guess the identity of a person only the other knows about.

Particular Experiments

Cajas (spa / eng): The speakers try to guess the contents of several boxes without opening them. They discuss and negotiate until a consensus is reached on what is inside. (Conchucos)

Condir (spa / eng): An adapted version of a sociolinguistic interview with a monolingual speaker in which the interviewer contradicts what the speaker says. (Conchucos Quechua)

Available Corpora

The speech recordings produced as part of this research project are stored centrally in the online repository of the Freie Universität Berlin (Refubium), together with transcriptions, translations, and metadata. They are freely accessible to the public and free for non-commercial use (Creative Commons License CC BY-NC-SA 4.0). You are very welcome to conduct your own linguistic research with our data, and we would be pleased to hear from you with any feedback.

Each individual corpus is represented by 4 files in the repository:

1. The speech recording itself, usually the recording of a single experiment, in 16-bit PCM .wav format.

2. A file in .eaf format containing a transcription and morphological glosses aligned with the audio recording at the utterance level as well as a translation into Spanish and English (if the recording itself is in Spanish or Portuguese, no glosses are provided). The .eaf format belongs to the ELAN annotation programme, which was developed by the Max Planck Institute for Psycholinguistics and is freely available here: https://tla.mpi.nl/tools/tla-tools/elan/ There you can also find tutorials and guides on how to use the programme.

3. A file with the same information in .TextGrid format. The TextGrid format belongs to the Praat programme, which was developed by Paul Boersma and David Weenink at the University of Amsterdam and which is the most popular software for analyzing speech data in phonetics and phonology. Praat is also freely available at http://www.fon.hum.uva.nl/praat/ There you can also find tutorials and guides on how to use the programme.

4. A file in .pdf format with metadata on the recording, containing information on the experiment and the speaker.
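As a minimal sketch of how the first two file types might be inspected after download, the following Python example uses only the standard library. It checks the basic properties of a WAV file and reads the time-aligned annotations from the XML structure of an .eaf file. The function names and the tier name "transcription" are hypothetical and chosen for illustration only; the tier names actually used in the repository files are not stated here. The sample data is synthetic, so that the sketch runs without a corpus file.

```python
import io
import wave
import xml.etree.ElementTree as ET


def describe_wav(source):
    """Return basic properties of a WAV file (path or file-like object)."""
    with wave.open(source, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "bits_per_sample": wav.getsampwidth() * 8,
            "sample_rate": wav.getframerate(),
            "duration_s": wav.getnframes() / wav.getframerate(),
        }


def read_eaf_tiers(eaf_xml):
    """Map each tier ID to a list of (start_ms, end_ms, text) tuples.

    Only time-aligned annotations are read; reference annotations,
    which depend on a parent tier, are skipped in this sketch.
    """
    root = ET.fromstring(eaf_xml)
    # ELAN stores times once, in TIME_SLOT elements, and refers to them by ID.
    slots = {
        slot.get("TIME_SLOT_ID"): int(slot.get("TIME_VALUE"))
        for slot in root.iter("TIME_SLOT")
        if slot.get("TIME_VALUE") is not None
    }
    tiers = {}
    for tier in root.iter("TIER"):
        rows = []
        for ann in tier.iter("ALIGNABLE_ANNOTATION"):
            start = slots.get(ann.get("TIME_SLOT_REF1"))
            end = slots.get(ann.get("TIME_SLOT_REF2"))
            text = ann.findtext("ANNOTATION_VALUE", default="")
            rows.append((start, end, text))
        tiers[tier.get("TIER_ID")] = rows
    return tiers


# Synthetic stand-in for a recording: one second of 16-bit mono silence.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as out:
    out.setnchannels(1)
    out.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
    out.setframerate(44100)
    out.writeframes(b"\x00\x00" * 44100)
buffer.seek(0)
wav_summary = describe_wav(buffer)

# Synthetic stand-in for an .eaf file; the tier name is hypothetical.
SAMPLE_EAF = """<ANNOTATION_DOCUMENT>
  <TIME_ORDER>
    <TIME_SLOT TIME_SLOT_ID="ts1" TIME_VALUE="0"/>
    <TIME_SLOT TIME_SLOT_ID="ts2" TIME_VALUE="1500"/>
  </TIME_ORDER>
  <TIER TIER_ID="transcription">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1"
          TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>example utterance</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>"""
eaf_tiers = read_eaf_tiers(SAMPLE_EAF)
```

With a downloaded corpus, `describe_wav` could be given the path to the .wav file, and `read_eaf_tiers` the text of the .eaf file read with UTF-8 encoding. Times in .eaf files are given in milliseconds.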

The following table lists all corpora published so far. Their number will steadily increase as the project progresses. Clicking on the name of a corpus takes you directly to the corresponding page in the repository of the FU, where you can download the files for each experiment individually.

Corpus | Region | Researchers | Experiment types included | Language(s)
Quechua 1 | Conchucos, Peru | Bendezú Araujo, Raúl; Reich, Uli | Memoria (7x), Maptask (7x), Cuento (7x), Quién (4x), Cajas (4x), Condir (1x) |


Recommendation for how to cite (using the example of Quechua 1):

Bendezú Araujo, Raúl, Timo Buchholz & Uli Reich. 2019. Corpora of American languages: Interactive language games from multilingual Latin America (Quechua 1). Berlin: Freie Universität. http://dx.doi.org/10.17169/refubium-25510

What's happening: News and publications

  • October 28, 2019: The first part of the Conchucos data (Quechua 1) is now online in the Refubium and freely available to all
  • August-October 2019: Uli Reich was in Asunción (Paraguay) together with Élodie Blestel (Paris III), and in São Gabriel da Cachoeira (Brazil) with Antônio Lessa, to make recordings of Guaraní and Nheengatu for this project
  • Buchholz, Timo & Uli Reich. 2018. The realizational coefficient: Devising a method for empirically determining prominent positions in Conchucos Quechua. In Ingo Feldhausen, Jan Fliessbach & Maria d. M. Vanrell (eds.), Methods in prosody: A Romance language perspective (Studies in Laboratory Phonology 6), 123–164. Berlin: Language Science Press.

     (available here) (publication website)

  • Reich, Uli. 2018. Presupposed Modality. In Marco García García & Melanie Uth (eds.), Focus realization in Romance and beyond (Studies in language companion series Volume 201), 203–227. Amsterdam, Philadelphia: John Benjamins.

     (publication website)

Project staff

Timo Buchholz, Raúl Italo Bendezú Araujo, Elizabeth Pankratz

The creation and publication of language corpora is time-consuming and requires many steps. The experiments have to be designed, tested, and adapted with the help of local experts. Materials have to be selected and created. Care must be taken to produce technically clean recordings. Then follows the linguistic processing of the raw data: transcription, translation, and morphological glossing, according to uniform and comprehensible criteria (as far as possible based on the Leipzig Glossing Rules). Finally, the data must be technically processed so that they are suitable for online publication. All of this requires many different participants, each of whom makes a contribution. Our thanks go to everyone!

Contributors Conchucos Quechua and Spanish

Academic direction | Technical direction | Local cooperation (Huaraz and Huari) | Transcription and translation Quechua (Huaraz) | Glosses Quechua (Lima) | Transcription and translation Spanish (Berlin)

Raúl Bendezú Araujo

Timo Buchholz

Uli Reich

Elizabeth Pankratz

Gabriel Barreto

Leonel Menacho Lopez

Yuli Alicia Cadillo Tarazona

Merlín de la Cruz Huayanay

Efraín Rodolfo Montes Palacios

Leidy Felyna Rosales Gonzales

Jeny Elvira Rosas Julca

Nelson Yonatan Sánchez Evaristo

Marco Antonio Trigoso Aching

Loreta Alva Mansilla

Claudia Arbaiza Varela

Minerva Lucero Cerna Maguiña

Freyda Nisbeth Schuler Tovar

Alonso Vásquez Aguilar

Magalí Bertola

Catalina Torres Orjuela