Using the LARA platform to crowdsource a multilingual, multimodal Little Prince

Authors

DOI:

https://doi.org/10.26881/bp.2022.1.09

Keywords:

Computer Assisted Language Learning, multimedia, crowdsourcing, English, Farsi, French, Icelandic, Irish, Italian, Japanese, Mandarin, Polish

Abstract

We describe an ongoing project in which an informally organised international consortium is using the open source LARA platform to create multimodal annotated editions of Antoine de Saint-Exupéry’s Le petit prince in multiple languages: so far French, English, Italian, Icelandic, Irish, Japanese, Polish, Farsi and Mandarin. LARA versions of the book include integrated audio and translations and an automatically generated lemma-based concordance, and are freely available online. We describe the methods used to construct the various versions. For some languages, the work was simply divided by type, typically with one person adding translations and another recording audio. For other languages, we experimented with crowdsourcing methods, splitting the text into chapter-sized units, using the LARA platform to distribute these to multiple annotators, and combining the results at the end. Finally, we report an initial classroom study in which the French version was used by intermediate-level Australian students of French.

Published

2022-03-14

How to Cite

Akhlaghi, E., Bączkowska, A., Bédi, B., Beedar, H., Chua, C., Cucchiarini, C., … Yao, C. (2022). Using the LARA platform to crowdsource a multilingual, multimodal Little Prince. Beyond Philology An International Journal of Linguistics, Literary Studies and English Language Teaching, (19/1), 245–278. https://doi.org/10.26881/bp.2022.1.09

Section

Articles