Beyond MT metrics in specialised translation: Automated and manual evaluation of machine translation output for freelance translators and small LSPs in the context of EU documents

Authors

Łoboda, K.

DOI:

https://doi.org/10.26881/bp.2020.4.02

Keywords:

machine translation, neural MT, institutional translation, MT evaluation, specialised translation

Abstract

This paper discusses simplified methods of translation evaluation in two seemingly disparate areas: machine translation (MT) technology and translation for EU institutions. It provides a brief overview of methods for evaluating MT output and proposes simplified solutions for small LSPs and freelancers dealing with specialised translation of this kind. After discussing the context of the study and the process of machine translation, the paper analyses fragments of a selected specialist text (an EU regulation). The official English and Polish versions of this document provide the basis for a comparative evaluation of raw machine translation output obtained with selected commercially available (paid) neural machine translation (NMT) engines. Quantitative analysis, including the Damerau-Levenshtein edit distance parameters and the number of erroneous segments in the text, combined with a manual qualitative analysis of errors and terminology, can be a serviceable method for small LSPs and freelance translators to evaluate the usefulness of neural machine translation engines.
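To make the quantitative part of the abstract more concrete, the following is a minimal sketch of a Damerau-Levenshtein-style comparison between a raw MT segment and its official reference translation. It is not taken from the paper: the abstract does not say which variant of the distance or which normalisation was used, so the sketch assumes the restricted variant (optimal string alignment) computed over characters and divides by the length of the longer string. The function names `osa_distance` and `normalised_distance` are hypothetical.

```python
def osa_distance(a: str, b: str) -> int:
    """Restricted Damerau-Levenshtein (optimal string alignment) distance.

    Counts insertions, deletions, substitutions and transpositions of two
    adjacent characters. Assumption: character-level comparison.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all characters of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all characters of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                # transposition of two adjacent characters
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def normalised_distance(mt: str, ref: str) -> float:
    """Edit distance divided by the longer length (assumed normalisation)."""
    longest = max(len(mt), len(ref))
    return osa_distance(mt, ref) / longest if longest else 0.0
```

A freelancer could run `normalised_distance(mt_segment, reference_segment)` over each aligned segment pair and compare the averages across engines. A score of this kind only measures surface similarity to one reference, which is why the abstract pairs it with a manual analysis of errors and terminology.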

Published

2020-09-29

How to Cite

Łoboda, K. (2020). Beyond MT metrics in specialised translation: Automated and manual evaluation of machine translation output for freelance translators and small LSPs in the context of EU documents. Beyond Philology An International Journal of Linguistics, Literary Studies and English Language Teaching, (17/4), 45–73. https://doi.org/10.26881/bp.2020.4.02

Section

Articles