Bilingual prosody: Meter, rhythm, and intonation in multilingual contact situations
About the project
The project has two main objectives. On the one hand, we provide acoustic speech data of American languages in a way that will enable linguists worldwide to integrate these languages into their research perspectives. In this way, we aim to contribute to broadening the typological basis of theory formation. At the same time, by giving them a more prominent position in the scientific linguistic discourse, we intend to make those languages and the unique forms of human cultural and social experience they express more visible in times when they are under extreme threat (http://gbs.uni-koeln.de/wordpress/).
The second objective concerns the focus of our own descriptive and theoretical work. We want to achieve a better understanding of prosody under the conditions of multilingual competence. The phonological form of an utterance is to be explained from the interplay of metrical, rhythmical and intonational rules or constraints, which are determined in different ways by the individual and universal competencies of multilingual speakers. We achieve a high degree of comparability of the data by using the same recording procedures in all communities, which aim to elicit typical conversational moves for processing the common ground in a controlled fashion. We give priority to communicative games, in which the type of discourse, the incremental development of the context and lexical material are controlled, over procedures that exert a very tight control at the sentence level, such as elicitation from completely scripted utterances or Discourse Completion Tasks, since such experiments constitute a highly unfamiliar situation for the speakers and we want to create speech data that is as natural as possible.
In the project's first phase (2015-2017), we recorded spoken Conchucos Quechua in the Ancash region of Peru, explored its prosody empirically, and described it theoretically.
For the second, currently running phase (2018-2021), we are expanding the project through our network of French, Chilean, and Mexican partners, who work in regions where other American indigenous languages as well as Spanish or Portuguese are spoken.
The resulting data from all of these projects is transcribed, morphologically glossed, and annotated with metadata in English, Spanish, and Portuguese, and it will be made freely available in an online repository.
The project is led by Prof. Dr. Uli Reich and supported by the Deutsche Forschungsgemeinschaft (DFG project number 274614727).
Cooperations
Language | Country | Project headed by | Collaborators |
Coordination and repository | Germany | Raúl Italo Bendezú Araujo, Timo Buchholz, Elizabeth Pankratz | |
Tepehuano | Mexico | Nadiezdha Torres | |
Otomí | Mexico | Alonso Guerrero Galván | Ewald Hekking, Aurelio Nuñez López, Lorena Gamper |
Maya Yucateco | Mexico | to be determined | to be determined |
Nheengatú | Brazil | Uli Reich | Antônio Lessa |
Quechua de Conchucos | Peru | Uli Reich | Raúl Italo Bendezú Araujo, Timo Buchholz |
Guaraní | Paraguay | Élodie Blestel | Hedy Penner, Uli Reich |
Mapudungún | Chile | Magaly Ruiz | Aldo Olate, Jaqueline Caniguan |
Type and structure of the recordings
The language data gathered in this project are recordings of elicitation experiments. In each experiment, the speakers are recorded while solving different communicative tasks in the form of a game. All bilingual speakers carry out each experiment twice, once in their local non-Romance variety and once in their local Romance variety. In none of these elicitation experiments are the utterances productions of a scripted text. Through the selection of the materials (see "Metrical control for experiment materials" in Spanish or English), which are carefully adapted for each language and region with the indispensable help of local experts, and through the rules of the different games, we retain some control over the content of the conversation as a whole, but the content and form of each individual utterance are spontaneously chosen by the speakers. Together with our cooperation partners in the respective countries and regions, we have agreed on a core of common experiments. Some of these are known from the literature, and all of them have been tested through our own experience in Conchucos. They are to be carried out in all of the cooperating projects in order to achieve a very high degree of comparability. In addition, you will find here recordings of particular experiments that are carried out in only one or a few regions. This reflects the specific research interests of the cooperation partners as well as local conditions.
General information and guidelines for a smooth execution of the recordings can be found here, in Spanish or English.
The individual experiments are briefly presented below. More detailed experiment descriptions (in Spanish and English) are given in the respective links.
Common Experiments
Imagenes (spa / eng): The speakers name objects that are shown to them on picture cards.
Memoria (spa / eng): The speakers play a version of the well-known game Memory, in which they have to identify and remember the positions of cards with certain images.
Maptask (spa / eng): The speakers conduct a conversation that simulates the giving and receiving of directions, but the maps they use do not match. The Maptask experiment was originally developed by Anderson et al. (1991), whose English-language corpus is located at the University of Edinburgh: http://groups.inf.ed.ac.uk/maptask/
Cuento (spa / eng): In an adapted version of the game Chinese Whispers, the speakers tell and retell a story invented by the researchers.
Quién (spa / eng): The speakers play a version of the game "Who am I?", in which one of them has to guess the identity of a person only the other knows about.
Particular Experiments
Cajas (spa / eng): The speakers try to guess the contents of several boxes without opening them. They discuss and negotiate until a consensus is reached on what is inside. (Conchucos)
Condir (spa / eng): An adapted version of a sociolinguistic interview with a monolingual speaker in which the interviewer contradicts what the speaker says. (Conchucos Quechua)
Available Corpora
The speech recordings produced as part of this research project are stored centrally in the online repository of the Freie Universität Berlin (Refubium), together with transcriptions and translations as well as metadata, and are thus freely accessible to the public as well as free for non-commercial use (Creative Commons License CC BY-NC-SA 4.0). You are very welcome to conduct your own linguistic research with our data, and we would be pleased to receive any feedback.
Each individual corpus is represented by 4 files in the repository:
1. The speech recording itself, usually the recording of a single experiment, in 16-bit PCM .wav format.
2. A file in .eaf format containing a transcription and morphological glosses aligned with the audio recording at the utterance level as well as a translation into Spanish and English (if the recording itself is in Spanish or Portuguese, no glosses are provided). The .eaf format belongs to the ELAN annotation programme, which was developed by the Max Planck Institute for Psycholinguistics and is freely available here: https://tla.mpi.nl/tools/tla-tools/elan/. Tutorials and guides on how to use the programme can also be found there.
3. A file with the same information in .TextGrid format. The TextGrid format belongs to the Praat programme, which was developed by Paul Boersma and David Weenink at the University of Amsterdam and which is widely used for analyzing speech data in phonetics and phonology. Praat is also freely available at http://www.fon.hum.uva.nl/praat/. Tutorials and guides on how to use the programme can also be found there.
4. A file in .pdf format with metadata on the recording, containing information on the experiment and the speaker.
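As a minimal sketch of how the files described above might be inspected programmatically, the following Python code uses only the standard library. It checks that a recording has 16-bit samples and collects the time-aligned annotations from an .eaf file, which is an XML document. The function names are illustrative choices and not part of the project's tooling, and the tier names depend on the individual file. Annotations that are stored as references to a parent annotation (REF_ANNOTATION) are ignored in this sketch.

```python
import wave
import xml.etree.ElementTree as ET


def is_16bit_pcm(wav_path):
    """Return True if the WAV file uses 2-byte (16-bit) samples."""
    # wave.open raises wave.Error for files it cannot read as WAV.
    with wave.open(wav_path, "rb") as wav_file:
        return wav_file.getsampwidth() == 2


def read_eaf_tiers(source):
    """Map each tier ID to a list of (start_ms, end_ms, text) tuples.

    `source` is a path or a binary file object containing ELAN XML.
    Only time-aligned annotations are collected.
    """
    root = ET.parse(source).getroot()
    # TIME_SLOT elements translate slot IDs into millisecond offsets.
    slots = {
        slot.get("TIME_SLOT_ID"): int(slot.get("TIME_VALUE"))
        for slot in root.iter("TIME_SLOT")
        if slot.get("TIME_VALUE") is not None
    }
    tiers = {}
    for tier in root.iter("TIER"):
        rows = []
        for ann in tier.iter("ALIGNABLE_ANNOTATION"):
            start = slots.get(ann.get("TIME_SLOT_REF1"))
            end = slots.get(ann.get("TIME_SLOT_REF2"))
            text = ann.findtext("ANNOTATION_VALUE", default="")
            rows.append((start, end, text))
        tiers[tier.get("TIER_ID")] = rows
    return tiers
```

Calling `read_eaf_tiers("recording.eaf")` (a placeholder file name) would return a dictionary keyed by tier ID, from which the utterance-level transcription can be looked up together with its start and end times in milliseconds.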
The following table lists all corpora published so far. Their number will steadily increase as the project progresses. Clicking on the name of a corpus takes you directly to the corresponding page in the repository of the FU, where you can download the files for each experiment individually.
Corpus | Region | Researchers | Experiment types included | Language(s) |
Quechua 1 | Conchucos, Peru | Bendezú Araujo, Raúl; Buchholz, Timo; Reich, Uli | common: Memoria (7x), Maptask (7x), Cuento (7x), Quién (4x); particular: Cajas (4x), Condir (1x) | Quechua |
Recommendation for how to cite (using the example of Quechua 1):
Bendezú Araujo, Raúl, Timo Buchholz & Uli Reich. 2019. Corpora of American languages: Interactive language games from multilingual Latin America (Quechua 1). Berlin: Freie Universität. http://dx.doi.org/10.17169/refubium-25510
What's happening: News and publications
- October 28, 2019: The first part of the Conchucos data (Quechua 1) is now online in the Refubium and freely available to all
- August-October 2019: Uli Reich was in Asunción (Paraguay) together with Élodie Blestel (Paris III), and in São Gabriel da Cachoeira (Brazil) with Antônio Lessa, to make recordings of Guaraní and Nheengatú for this project
- April 23, 2019: Publication in the FU-Tagesspiegel-Beilage (in German) about indigenous languages and the work done by this project
- Buchholz, Timo & Uli Reich. 2018. The realizational coefficient: Devising a method for empirically determining prominent positions in Conchucos Quechua. In Ingo Feldhausen, Jan Fliessbach & Maria d. M. Vanrell (eds.), Methods in prosody: A Romance language perspective (Studies in Laboratory Phonology 6), 123–164. Berlin: Language Science Press.
(available here) (publication website)
- Reich, Uli. 2018. Presupposed Modality. In Marco García García & Melanie Uth (eds.), Focus realization in Romance and beyond (Studies in language companion series Volume 201), 203–227. Amsterdam, Philadelphia: John Benjamins.
Project staff
Timo Buchholz, Raúl Italo Bendezú Araujo, Elizabeth Pankratz
The creation and publication of language corpora is time-consuming and requires many steps. The experiments have to be designed, tested, and adapted with the help of local experts. Materials have to be selected and created. Care must be taken to produce technically clean recordings. Then follows the linguistic processing of the raw data: transcription, translation, and morphological glossing according to uniform and comprehensible criteria (as far as possible based on the Leipzig Glossing Rules). Finally, the data must be technically processed so that they are suitable for online publication. All of this requires many different participants, each of whom makes a contribution. Our thanks go to everyone!
Contributors: Conchucos Quechua and Spanish
Role | Contributors |
Academic direction | Raúl Bendezú Araujo, Timo Buchholz, Uli Reich |
Technical direction | Elizabeth Pankratz |
Local cooperation (Huaraz and Huari) | Gabriel Barreto, Leonel Menacho Lopez |
Transcription and translation Quechua (Huaraz) | Yuli Alicia Cadillo Tarazona, Merlín de la Cruz Huayanay, Efraín Rodolfo Montes Palacios, Leidy Felyna Rosales Gonzales, Jeny Elvira Rosas Julca, Nelson Yonatan Sánchez Evaristo, Marco Antonio Trigoso Aching |
Glosses Quechua (Lima) | Loreta Alva Mansilla, Claudia Arbaiza Varela, Minerva Lucero Cerna Maguiña, Freyda Nisbeth Schuler Tovar, Alonso Vásquez Aguilar |
Transcription and translation Spanish (Berlin) | Magalí Bertola, Catalina Torres Orjuela |
Resources
- List of morphological glosses (Spanish and English)
- General considerations about the recording procedure (Spanish / English)
- Manuals on how to conduct the experiments
- Metrical control for experiment materials (Spanish / English)