Applied Sciences, Vol. 13, No. 2 (2023)
ARTICLE
TITLE

Low-Resource Neural Machine Translation Improvement Using Source-Side Monolingual Data

Atnafu Lambebo Tonja    
Olga Kolesnikova    
Alexander Gelbukh and Grigori Sidorov    

Abstract

Despite the many proposals to solve the neural machine translation (NMT) problem of low-resource languages, it continues to be difficult. The issue becomes even more complicated when few resources cover only a single domain. In this paper, we discuss the applicability of a source-side monolingual dataset of low-resource languages to improve the NMT system for such languages. In our experiments, we used Wolaytta–English translation as a low-resource language. We discuss the use of self-learning and fine-tuning approaches to improve the NMT system for Wolaytta–English translation using both authentic and synthetic datasets. The self-learning approach showed +2.7 and +2.4 BLEU score improvements for Wolaytta–English and English–Wolaytta translations, respectively, over the best-performing baseline model. Further fine-tuning the best-performing self-learning model showed +1.2 and +0.6 BLEU score improvements for Wolaytta–English and English–Wolaytta translations, respectively. We reflect on our contributions and plan for the future of this difficult field of study.
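The abstract does not spell out the pipeline, so the following is only a minimal sketch of one common reading of self-learning with source-side monolingual data: a baseline model translates monolingual source sentences, and the resulting synthetic pairs are mixed with the authentic parallel data before retraining. The function name `build_self_learning_corpus`, the `translate` callable, and the toy data are all hypothetical and are not taken from the paper.

```python
from typing import Callable, List, Tuple

Pair = Tuple[str, str]  # (source sentence, target sentence)


def build_self_learning_corpus(
    authentic: List[Pair],
    source_monolingual: List[str],
    translate: Callable[[str], str],
) -> List[Pair]:
    """Return the authentic pairs followed by synthetic pairs.

    Each synthetic pair couples a monolingual source sentence with the
    output of `translate`, which stands in for a trained baseline model.
    """
    synthetic = [(src, translate(src)) for src in source_monolingual]
    return authentic + synthetic


def toy_translate(sentence: str) -> str:
    # Hypothetical stand-in for a baseline NMT model: it only upper-cases
    # the input so that the example runs without any trained model.
    return sentence.upper()


authentic_pairs = [("src sentence one", "tgt sentence one")]
monolingual = ["src sentence two", "src sentence three"]
mixed = build_self_learning_corpus(authentic_pairs, monolingual, toy_translate)
```

In a real setup, `translate` would wrap the best-performing baseline model, the mixed corpus would be used to retrain it, and the retrained model could then be fine-tuned on the authentic pairs alone, which is how the abstract's fine-tuning step is assumed to fit in.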

Similar articles

       
 
Wenbo Zhang, Xiao Li, Yating Yang and Rui Dong    
The pre-training fine-tuning mode has been shown to be effective for low resource neural machine translation. In this mode, pre-training models trained on monolingual data are used to initiate translation models to transfer knowledge from monolingual dat...
Journal: Information

 
Rogelio Bautista-Sánchez, Liliana Ibeth Barbosa-Santillan and Juan Jaime Sánchez-Escobar    
The prediction of vessel maritime navigation has become an exciting topic in the last years, especially considering economics, commercial exchange, and security. In addition, vessel monitoring requires better systems and techniques that help enterprises ...
Journal: Applied Sciences

 
Tessfu Geteye Fantaye, Junqing Yu and Tulu Tilahun Hailu    
Deep neural networks (DNNs) have shown great achievement in acoustic modeling for the speech recognition task. Of these networks, the convolutional neural network (CNN) is an effective network for representing the local properties of the speech formants. Howev...
Journal: Computers

 
Jaco Badenhorst and Febe de Wet    
When the National Centre for Human Language Technology (NCHLT) Speech corpus was released, it created various opportunities for speech technology development in the 11 official, but critically under-resourced, languages of South Africa. Since then, the s...
Journal: Information

 
Mohammad Ali Humayun, Ibrahim A. Hameed, Syed Muslim Shah, Sohaib Hassan Khan, Irfan Zafar, Saad Bin Ahmed and Junaid Shuja    
Automatic Speech Recognition (ASR) has achieved the best results for English, with end-to-end neural network based supervised models. These supervised models need huge amounts of labeled speech data for good generalization, which can be quite a challeng...
Journal: Applied Sciences