Irish-based Large Language Model with extreme low-resource settings in machine translation
dc.contributor.author | Tran, Khanh-Tung | en |
dc.contributor.author | O'Sullivan, Barry | en |
dc.contributor.author | Nguyen, Hoang D. | en |
dc.contributor.funder | Science Foundation Ireland | en |
dc.date.accessioned | 2024-07-09T09:02:29Z | |
dc.date.available | 2024-07-09T09:02:29Z | |
dc.date.issued | 2024-08-11 | en |
dc.description.abstract | Large Language Models (LLMs) have demonstrated exceptional performance in a wide range of natural language processing tasks. However, their success does not always extend to machine translation, particularly in challenging scenarios such as translating low-resource languages. This study investigates the multilingual capability of LLMs, with a case study on Irish, an extremely low-resource language, focusing on translation tasks between English and Irish. We propose a dynamic, efficient language adaptation framework for English-centric LLMs, which involves layer-specific adjustments and subsequent fine-tuning for machine translation. Our findings highlight several key insights: (1) different layers in the LLM serve distinct functions such as language understanding and task reasoning, (2) effective translation requires extensive pre-training on both source and target languages, and (3) targeted fine-tuning for machine translation leads to significant improvements of 36.7% for English to Irish and 133.4% for Irish to English compared to the previous state-of-the-art. | en |
dc.description.status | Peer reviewed | en |
dc.description.version | Accepted Version | en |
dc.format.mimetype | application/pdf | en |
dc.identifier.citation | Tran, K.-T., O'Sullivan, B. and Nguyen, H. D. (2024) 'Irish-based Large Language Model with Extreme Low-Resource Settings in Machine Translation', LoResMT 2024: The Seventh Workshop on Technologies for Machine Translation of Low-Resource Languages, @ACL2024, Bangkok, Thailand, August 11–16. | en |
dc.identifier.endpage | 10 | en |
dc.identifier.startpage | 1 | en |
dc.identifier.uri | https://hdl.handle.net/10468/16110 | |
dc.language.iso | en | en |
dc.publisher | ACL | en |
dc.relation.project | info:eu-repo/grantAgreement/SFI/SFI Research Centres/12/RC/2289/IE/INSIGHT - Irelands Big Data and Analytics Research Centre/ | en |
dc.relation.project | info:eu-repo/grantAgreement/SFI/SFI Centres for Research Training Programme::Data and ICT Skills for the Future/18/CRT/6223/IE/SFI Centre for Research Training in Artificial Intelligence/ | en |
dc.relation.uri | https://www.loresmt.org | en |
dc.rights | © 2023 Association for Computational Linguistics | en |
dc.subject | Large Language Models (LLMs) | en |
dc.subject | Natural Language Processing (NLP) | en |
dc.subject | Translation | en |
dc.subject | Machine translation | en |
dc.subject | Language technologies | en |
dc.subject | Accessibility | en |
dc.subject | Irish language | en |
dc.title | Irish-based Large Language Model with extreme low-resource settings in machine translation | en |
dc.type | Conference item | en |