Reasoning transfer for an extremely low-resource and endangered language: Bridging languages through sample-efficient language understanding
Files
Accepted Version
Date
2025
Authors
Tran, Khanh-Tung
O’Sullivan, Barry
Nguyen, Hoang D.
Abstract
Recent advances have enabled Large Language Models (LLMs) to tackle reasoning tasks by generating chain-of-thought (CoT) rationales, yet these gains have largely applied to high-resource languages, leaving low-resource languages behind. In this work, we first investigate CoT techniques in extremely low-resource scenarios using existing prompting, model-editing, and fine-tuning approaches. We then introduce English-Pivoted CoT Training, leveraging the insight that LLMs internally operate in a latent space aligned toward the dominant language. Given input in a low-resource language, we perform supervised fine-tuning so that the model generates its CoT in English and outputs the final response in the target language. Across mathematical reasoning benchmarks, our approach outperforms the baselines, with improvements of up to 28.33% in low-resource scenarios. Our analysis and additional experiments, including Mixed-Language CoT and Two-Stage Training, show that explicitly separating language understanding from reasoning enhances cross-lingual reasoning abilities. To facilitate future work, we also release LC2024, the first benchmark for mathematical tasks in Irish, an extremely low-resource and endangered language. Our results and resources highlight a practical pathway to multilingual reasoning that, despite data scarcity, does not require extensive retraining in every extremely low-resource language.
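The data layout implied by English-Pivoted CoT Training (question in the low-resource language, rationale in English, final answer back in the target language) can be illustrated with a minimal sketch. The record fields, the instruction wording, and the "Reasoning:"/"Answer:" markers below are hypothetical choices made for illustration; the abstract does not specify the template the authors used.

```python
from dataclasses import dataclass


@dataclass
class PivotExample:
    """One training record. All field names are hypothetical."""
    question_target: str    # question written in the low-resource language
    rationale_english: str  # chain-of-thought rationale written in English
    answer_target: str      # final answer written in the low-resource language


def build_sft_pair(ex: PivotExample, target_language: str = "Irish") -> dict:
    """Format a record as a prompt/completion pair for supervised fine-tuning.

    The instruction text and the "Reasoning:"/"Answer:" markers are
    illustrative assumptions, not the template used in the paper.
    """
    prompt = (
        f"Question ({target_language}): {ex.question_target}\n"
        f"Reason step by step in English, then answer in {target_language}.\n"
    )
    completion = (
        f"Reasoning: {ex.rationale_english}\n"
        f"Answer: {ex.answer_target}"
    )
    return {"prompt": prompt, "completion": completion}


if __name__ == "__main__":
    # Placeholder strings stand in for real benchmark text.
    example = PivotExample(
        question_target="<question in Irish>",
        rationale_english="<step-by-step reasoning in English>",
        answer_target="<final answer in Irish>",
    )
    pair = build_sft_pair(example)
    print(pair["prompt"] + pair["completion"])
```

Under this assumed layout, only the completion mixes languages: the reasoning span stays in English while the question and final answer remain in the target language, which mirrors the separation of language understanding from reasoning described in the abstract.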
Keywords
Reasoning transfer, Large Language Models (LLMs), Chain-of-thought (CoT)
Citation
Tran, K.-T., O’Sullivan, B. and Nguyen, H. D. (2025) 'Reasoning transfer for an extremely low-resource and endangered language: Bridging languages through sample-efficient language understanding', 39th Annual AAAI Conference on Artificial Intelligence, Philadelphia, Pennsylvania, USA, 25 February - 4 March 2025.
