A method of measuring the effort related to post-editing machine-translated outputs produced in the English->Polish language pair by the Google, Microsoft and DeepL MT engines: A pilot study
Keywords: machine translation, English->Polish language pair, post-editing, post-editing effort, pilot study, machine translation engines
This article presents the methodology and results of a pilot study concerning the impact of three popular and widely accessible machine translation engines (developed by Google, Microsoft and DeepL) on the pace of post-editing work and on the overall effort related to post-editing raw MT outputs. Fourteen volunteers were asked to translate and post-edit two source texts of similar character and level of complexity. The results of their work were collected and compared to develop a set of quantitative and qualitative data, which was then used to draw preliminary conclusions about the general rate of post-editing work and the quality of the post-edited sentences produced by the subjects. The aim of the pilot study described below was to determine whether the applied method can be successfully used in more extensive studies on the quality and impact of machine translation in the English->Polish language pair and on the potential of MT solutions on the Polish translation market.