Algorithms, Vol. 12, Issue 1 (2019)
Article
Title

Ensemble and Deep Learning for Language-Independent Automatic Selection of Parallel Data

Despoina Mouratidis and Katia Lida Kermanidis    

Abstract

Machine translation is used in many everyday applications. As the number of translated documents that must be sorted into useful or not useful (for building a translation model) grows, the automated categorization of texts (classification) has become a popular research field in machine learning. This kind of information can be quite helpful for machine translation. Our parallel corpora (English-Greek and English-Italian) are based on educational data, which are quite difficult to translate. We apply two state-of-the-art architectures, Random Forest (RF) and Deeplearning4j (DL4J), to our data (which constitute three translation outputs). To our knowledge, this is the first time that deep learning architectures have been applied to the automatic selection of parallel data. We also propose new string-based features that seem to be effective for the classifier, and we investigate whether an attribute selection method could improve classification accuracy. Experimental results indicate an increase of up to 4% (compared to our previous work) using RF and rather satisfactory results using DL4J.
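
The abstract does not list the string-based features themselves. As a purely illustrative sketch, assuming three hypothetical features (a token-length ratio, a token-set overlap, and a character-level similarity from Python's standard `difflib`), a feature vector for one candidate translation and a reference translation might be computed as follows. The function name and the features are assumptions made for illustration and are not taken from the paper.

```python
from difflib import SequenceMatcher


def string_features(candidate: str, reference: str) -> dict:
    """Hypothetical string-based features for one (candidate, reference) pair.

    These features are illustrative assumptions, not the paper's feature set.
    """
    cand_tokens = candidate.lower().split()
    ref_tokens = reference.lower().split()

    # Ratio of token counts; 0.0 when the reference has no tokens.
    length_ratio = len(cand_tokens) / len(ref_tokens) if ref_tokens else 0.0

    # Jaccard overlap of the two token sets; 0.0 when both are empty.
    cand_set, ref_set = set(cand_tokens), set(ref_tokens)
    union = cand_set | ref_set
    token_overlap = len(cand_set & ref_set) / len(union) if union else 0.0

    # Character-level similarity in [0, 1] computed by difflib.
    char_similarity = SequenceMatcher(None, candidate, reference).ratio()

    return {
        "length_ratio": length_ratio,
        "token_overlap": token_overlap,
        "char_similarity": char_similarity,
    }


if __name__ == "__main__":
    print(string_features("the cat", "the cat sat here"))
```

Vectors of this kind could then be passed to a classifier such as a Random Forest; which features actually help would need to be checked empirically, for example with an attribute selection step.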
