Computers, Vol. 11, No. 4 (2022)
ARTICLE
TITLE

A Lite Romanian BERT: ALR-BERT

Dragos Constantin Nicolae, Rohan Kumar Yadav and Dan Tufis

Abstract

Large-scale pre-trained language representations and their promising performance in various downstream applications have become an area of interest in natural language processing (NLP). There has been strong interest in further increasing model size in order to outperform the best previously reported results. However, at some point, increasing the number of model parameters reaches a saturation point imposed by the limited capacity of GPUs/TPUs. In addition, such models are mostly available in English or as shared multilingual models. Hence, in this paper, we propose a lite BERT trained on a large corpus solely in the Romanian language, which we call "A Lite Romanian BERT (ALR-BERT)". Based on comprehensive empirical results, ALR-BERT produces models that scale far better than the original Romanian BERT. Alongside presenting the performance on downstream tasks, we detail the analysis of the training process and its parameters. We also intend to release our code and model as open source, together with the downstream tasks.


Similar articles

       
 
Mahammad Khalid Shaik Vadla, Mahima Agumbe Suresh and Vimal K. Viswanathan    
Understanding customer emotions and preferences is paramount for success in the dynamic product design landscape. This paper presents a study to develop a prediction pipeline to detect the aspect and perform sentiment analysis on review data. The pre-tra...
Journal: Algorithms

 
Fenfang Li, Zhengzhang Zhao, Li Wang and Han Deng    
Sentence Boundary Disambiguation (SBD) is crucial for building datasets for tasks such as machine translation, syntactic analysis, and semantic analysis. Currently, most automatic sentence segmentation in Tibetan adopts the methods of rule-based and stat...
Journal: Applied Sciences

 
Piyush Vyas, Gitika Vyas and Gaurav Dhiman    
The beginning of this decade brought utter international chaos with the COVID-19 pandemic and the Russia-Ukraine war (RUW). The ongoing war has been building pressure across the globe. People have been showcasing their opinions through different communic...
Journal: Algorithms

 
Sakib Shahriar, Noora Al Roken and Imran Zualkernan    
The automatic classification of poems into various categories, such as by author or era, is an interesting problem. However, most current work categorizing Arabic poems into eras or emotions has utilized traditional feature engineering and machine learni...
Journal: Computers

 
Aman Ullah, Khairullah Khan, Aurangzeb Khan and Shoukat Ullah    
The trend of E-commerce and online shopping is increasing rapidly. However, it is difficult to know about the quality of items from pictures and videos available on the online stores. Therefore, online stores and independent products reviews sites share ...
Journal: Computers