Machine translation of multiword units in the field of culinary tourism: A case study

Isabel Peñuelas Gil
https://orcid.org/0000-0002-4885-5783

Abstract

Machine translation is here to stay and has transformed the way users approach the translation process. Its impact is felt in many areas, but it is especially noticeable in tourism because of the sector's international nature. Companies, particularly small and medium-sized ones, increasingly use machine translation tools to reach a wider, multilingual audience. Despite their popularity, however, these tools can offer very limited results in terms of quality and appropriateness. This work examines the possibilities and limitations of machine translation systems when dealing with multiword units in the field of culinary tourism. To this end, a monolingual corpus (ES) was compiled following the protocol proposed by Seghiri (2017). The corpus comprises thirty culinary tourism brochures and guides from different Spanish regions and is the source of all the multiword units studied, together with their respective contexts. These units were then machine translated using four engines (DeepL, Google Translate, Microsoft Translator, and Yandex), which belong to the most widely used paradigms in machine translation for specific purposes. The results were categorised following a modified version of the human evaluation system proposed by Ortiz Boix (2016), which made it possible to identify performance differences between some of the most popular engines and revealed the communicative obstacles users may face when dealing with phraseology.

How to Cite
Peñuelas Gil, I. (2024). Machine translation of multiword units in the field of culinary tourism: A case study. Hikma, 23(3), 1–27. https://doi.org/10.21071/hikma.v23i3.16992
Section
Articles

References

Álvarez Jurado, M. (2020). Adquisición y transmisión del conocimiento experto a través de la traducción de las guías turísticas de arquitectura. Onomázein, (NE VII), 1-17. https://doi.org/10.7764/onomazein.ne7.01

Anthony, L. (2020). AntConc (Version 3.5.9) [Computer software]. Waseda University. https://www.laurenceanthony.net/software/antconc/

Anthony, L. (2022). AntFileConverter (Version 2.0.2) [Computer software]. Waseda University. https://www.laurenceanthony.net/software

Austermühl, F., & Kortenbuck, A. (2012). A translator’s sword of Damocles? An introduction to machine translation. In F. Austermühl, Electronic tools for translation (3rd ed., pp. 153-176). Routledge.

Biber, D., Johansson, S., Leech, G., Conrad, S., & Finegan, E. (1999). Grammar of spoken and written English. Pearson Education Limited. https://doi.org/10.1075/z.232

Bowker, L. (2023). De-mystifying translation: introducing translation to non-translators. Routledge. https://doi.org/10.4324/9781003217718

Bowker, L., & Buitrago Ciro, J. (2019). Machine translation and global research: towards improved machine translation literacy in the scholarly community. Emerald Publishing. https://doi.org/10.1108/9781787567214

Brown, P., Cocke, J., Della Pietra, S., Della Pietra, V., Jelinek, F., Lafferty, J. D., Mercer, R., & Roossin, P. (1990). A statistical approach to language translation. Computational Linguistics, 16(2), 79-85. https://aclanthology.org/J90-2002

Brown, P., Cocke, J., Della Pietra, S., Della Pietra, V., Jelinek, F., Mercer, R., & Roossin, P. (1988). A statistical approach to language translation. Proceedings of the 12th conference on Computational linguistics, 1, pp. 71-76. https://aclanthology.org/C88-1016 DOI: https://doi.org/10.3115/991635.991651

Carré, A., Kenny, D., Rossi, C., Sánchez-Gijón, P., & Torres-Hostench, O. (2022). Machine translation for language learners. In D. Kenny (Ed.), Machine translation for everyone: empowering users in the age of artificial intelligence (pp. 187-207). Language Science Press. https://doi.org/10.5281/zenodo.6760024

Corpas Pastor, G. (2013). Detección, descripción y contraste de las unidades fraseológicas mediante tecnologías lingüísticas. In I. Olza Moreno & E. Manero Richard (Eds.), Fraseopragmática (pp. 335-374). Frank & Timme.

Costa-Jussà, M. R., & Fonollosa, J. A. (2015). Latest trends in hybrid machine translation and its applications. Computer Speech and Language, 32(1), 3-10. https://doi.org/10.1016/j.csl.2014.11.001

Fernández Nistal, P. (2020). Los corpus como herramienta de traducción para los traductores del siglo XXI: el caso del chorizo ibérico de bellota. In S. Álvarez Álvarez & M. T. Ortego Antón (Eds.), Perfiles estratégicos de traductores e intérpretes. La transmisión de la información experta multilingüe en la sociedad del conocimiento del siglo XXI (pp. 143-160). Comares.

Jackendoff, R. (1997). The architecture of the language faculty. MIT Press. https://doi.org/10.2307/417010

Kenny, D. (2022). Human and machine translation. In D. Kenny (Ed.), Machine translation for everyone: empowering users in the age of artificial intelligence (pp. 23-49). Language Science Press. https://doi.org/10.5281/zenodo.6653406

Kilgarriff, A., Rychlý, P., Smrž, P., & Tugwell, D. (2004). The Sketch Engine. Proceedings of the 11th EURALEX International Congress, pp. 105-116.

Koehn, P., Och, F. J., & Marcu, D. (2003). Statistical phrase-based translation. Proceedings of the 2003 Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics, pp. 127-133. https://aclanthology.org/N03-1017 DOI: https://doi.org/10.3115/1073445.1073462

Ministerio de Industria, Comercio y Turismo. (2023, February 2). Datos de Frontur y Egatur del INE [Press release]. http://www.mincotur.gob.es/es-es/GabinetePrensa/NotasPrensa/2023/Paginas/En-2022-visitaron-España-71,6-millones-de-turistas-internacionales-que-realizaron-un-gasto-de-87.061-millones-de-euros.aspx

Mitkov, R., Seretan, V., Corpas Pastor, G., & Monti, J. (2018). Multiword units in machine translation and translation technology. In R. Mitkov, V. Seretan, G. Corpas Pastor, & J. Monti (Eds.), Multiword units in machine translation and translation technology (pp. 1-38). John Benjamins. https://doi.org/10.1075/cilt.341.01mon

Moorkens, J. (2022). Ethics and machine translation. In D. Kenny (Ed.), Machine translation for everyone: empowering users in the age of artificial intelligence (pp. 121-140). Language Science Press.

Oliver, A. (2016). Herramientas tecnológicas para traductores. Editorial UOC.

Organización Mundial del Turismo. (2023). World tourism barometer - May 2023 (excerpt), 21(2). https://webunwto.s3.eu-west-1.amazonaws.com/s3fs-public/2023-05/UNWTO_Barom23_02_May_EXCERPT_final.pdf DOI: https://doi.org/10.18111/wtobarometereng.2023.21.1.2

Ortego Antón, M. T. (2019). La terminología del sector agroalimentario (español-inglés) en los estudios contrastivos y de traducción especializada basados en corpus: los embutidos. Peter Lang. https://doi.org/10.3726/b15808

Ortego Antón, M. T. (2020). Las fichas descriptivas de embutidos en español y en inglés: un análisis contrastivo de la estructura retórica basado en corpus. Revista Signos, 53(102), 170-194. https://doi.org/10.4067/S0718-09342020000100170

Ortego Antón, M. T. (2024). The design of Torrezno TRAD: the semiautomatic Spanish-English writing and translation aid tool. In I. Peñuelas Gil & M. T. Ortego Antón (Eds.), Interpreting and translation for agri-food professionals in the global marketplace (pp. 69-84). De Gruyter. https://doi.org/10.1515/9783111101729-004

Ortiz Boix, C. (2016). Implementing machine translation and post-editing to the translation of wildlife documentaries through voice-over and off-screen dubbing [Doctoral thesis, Universitat Autònoma de Barcelona]. http://hdl.handle.net/10803/400020

Penadés Martínez, I. (2015). Para un diccionario de locuciones. De la lingüística teórica a la fraseografía práctica. Universidad de Alcalá.

Peñuelas Gil, I. (2024). Estudio contrastivo del tratamiento de las expresiones multiverbales del turismo gastronómico en los sistemas de traducción automática del español al inglés [Doctoral thesis, Universidad de Valladolid]. https://doi.org/10.35376/10324/67810

Pérez Blanco, M., & Izquierdo, M. (2021). Developing a corpus-informed tool for Spanish professionals writing specialized texts in English. In J. Lavid-López, C. Maíz-Arévalo, & J. R. Zamorano-Mansilla (Eds.), Corpora in translation and contrastive research in the digital age (pp. 147-173). John Benjamins. https://doi.org/10.1075/btl.158.06per

Rivera-Trigueros, I. (2022). Machine translation systems and quality assessment: a systematic review. Language Resources and Evaluation, 56, 593–619. https://doi.org/10.1007/s10579-021-09537-5

Sag, I. A., Baldwin, T., Bond, F., Copestake, A., & Flickinger, D. (2002). Multiword expressions: a pain in the neck for NLP. Proceedings of the Third International Conference on Intelligent Text Processing and Computational Linguistics (CICLING 2002), pp. 1-15. https://doi.org/10.1007/3-540-45715-1_1

Sánchez Carnicer, J. (2022). Traducción y discapacidad. Un estudio comparado de la terminología inglés-español en la prensa escrita. Peter Lang. https://doi.org/10.3726/b19567

Sánchez Ramos, M. M., & Rico Pérez, C. (2020). Traducción automática: conceptos clave, procesos de evaluación y técnicas de posedición. Comares.

Seghiri, M. (2017). Metodología de elaboración de un glosario bilingüe y bidireccional (inglés-español/español-inglés) basado en corpus para la traducción de manuales de instrucciones de televisores. Babel, 63(1), 43-64. https://doi.org/10.1075/babel.63.1.04seg

Vieira, L. N. (2020). Machine translation in the news: a framing analysis of the written press. Translation Spaces, 9(1), 98–122. https://doi.org/10.1075/ts.00023.nun

Yamada, K., & Knight, K. (2001). A syntax-based statistical translation model. Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics, pp. 523-530. https://aclanthology.org/P01-1067 DOI: https://doi.org/10.3115/1073012.1073079

Yandex. (2017, September 14). One model is better than two. Yandex.Translate launches a hybrid machine translation system. https://yandex.com/company/blog/one-model-is-better-than-two-yu-yandex-translate-launches-a-hybrid-machine-translation-system

Zens, R., Och, F. J., & Ney, H. (2002). Phrase-based statistical machine translation. In M. Jarke, J. Koehler, & G. Lakemeyer (Eds.), KI 2002. Advances in Artificial Intelligence: 25th Annual German Conference on AI, KI 2002 Aachen, Germany, September 16–20, 2002 Proceedings, 2479 (pp. 18–32). Springer. https://doi.org/10.1007/3-540-45751-8_2