Information, Vol. 15, No. 2 (2024)

Comparative Analysis of NLP-Based Models for Company Classification

Maryan Rizinski    
Andrej Jankov    
Vignesh Sankaradas    
Eugene Pinsky    
Igor Mishkovski and Dimitar Trajanov    

Abstract

The task of company classification is traditionally performed using established standards, such as the Global Industry Classification Standard (GICS). However, these approaches rely heavily on laborious manual work by domain experts, resulting in slow, costly, and vendor-specific assignments. We therefore investigate recent advances in natural language processing (NLP) to automate the company classification process. In particular, we employ and evaluate several NLP-based models, including zero-shot learning, One-vs-Rest classification, multi-class classifiers, and ChatGPT-aided classification, and compare them comprehensively to assess their effectiveness on the task. The evaluation uses the Wharton Research Data Services (WRDS) dataset, which consists of textual descriptions of publicly traded companies. Our findings reveal that the RoBERTa and One-vs-Rest classifiers surpass the other methods, achieving F1 scores of 0.81 and 0.80 on the WRDS dataset, respectively. These results demonstrate that deep learning algorithms offer the potential to automate, standardize, and continuously update classification systems in an efficient and cost-effective way. In addition, we introduce several improvements to the multi-class classification techniques: (1) in the zero-shot methodology, we use TF-IDF to enhance sector representation, yielding improved accuracy compared with standard zero-shot classifiers; (2) we use ChatGPT for dataset generation, which shows potential in scenarios where datasets of company descriptions are lacking; and (3) we employ K-Fold to reduce noise in the WRDS dataset and then conduct experiments to assess the impact of noise reduction on the company classification results.
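
To make the One-vs-Rest setup mentioned in the abstract concrete, the following is a minimal sketch in Python with scikit-learn. It is not the authors' implementation: the choice of TF-IDF features with logistic regression, the helper names `build_classifier` and `classify`, and the six toy descriptions with their sector labels are all assumptions made for illustration, and the toy data is not drawn from the WRDS dataset.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline

# Hypothetical toy data for illustration only (NOT the WRDS dataset).
descriptions = [
    "The bank offers retail banking, mortgages and commercial loans.",
    "An insurance company providing life insurance and asset management.",
    "The company develops cloud software and sells semiconductor chips.",
    "A provider of enterprise software and IT consulting services.",
    "The firm explores and produces crude oil and natural gas.",
    "An operator of oil pipelines and natural gas refineries.",
]
sectors = [
    "Financials",
    "Financials",
    "Information Technology",
    "Information Technology",
    "Energy",
    "Energy",
]


def build_classifier():
    """Assumed setup: TF-IDF features feeding one binary
    logistic-regression classifier per sector (One-vs-Rest)."""
    return make_pipeline(
        TfidfVectorizer(stop_words="english"),
        OneVsRestClassifier(LogisticRegression(max_iter=1000)),
    )


def classify(model, texts):
    """Return the predicted sector label for each description."""
    return list(model.predict(texts))


# Fit on the toy data, then classify an unseen description.
model = build_classifier().fit(descriptions, sectors)
predictions = classify(model, ["A regional bank offering mortgages and loans."])
print(predictions)
```

In a One-vs-Rest arrangement, each sector gets its own binary classifier, and the sector whose classifier scores highest is returned. A realistic experiment would replace the toy list with real company descriptions and report F1 on a held-out split.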
