IMPROVING ARABIC SENTIMENT ANALYSIS ON SOCIAL MEDIA: A COMPARATIVE STUDY ON APPLYING DIFFERENT PRE-PROCESSING TECHNIQUES

Essam Kazem Al-Yasiri

Ahmed Al-Azawei

Abstract

Despite the clear growth of Arabic text on social networking sites (SNSs), it remains difficult to understand or summarize users' opinions or perspectives on a specific topic. Arabic text classification is therefore one of the most challenging tasks, owing to several issues related to the nature of the Arabic language and to words whose meanings vary. In this paper, after tokenizing the Arabic words, we investigate the role of several pre-processing techniques applied before classifying Arabic text into different categories. Arabic words were converted into vectors using the term frequency-inverse document frequency (TF-IDF) technique. The findings show that applying a Linear Support Vector Machine (LSVC) with stop words and without stemming can outperform the Decision Tree (DT) and Random Forest (RF) methods. The effectiveness of the proposed LSVC was found to be 99.37%. These outcomes are significant for identifying users' opinions on SNSs and can have many implications for the political, social, economic, and business sectors.
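To make the vectorization step concrete, the following is a minimal sketch of tokenization, an optional stop-word filter, and TF-IDF weighting, using only the Python standard library. It is not the authors' implementation: the toy sentences, the three-word stop-word list, the helper names `tokenize` and `tfidf_vectors`, and the particular weighting variant (relative term frequency multiplied by log(N / df)) are all assumptions chosen for illustration.

```python
import math
from collections import Counter

# Hypothetical toy stop-word list (assumption); a real study would use a full Arabic list.
STOP_WORDS = {"في", "من", "على"}


def tokenize(text, remove_stop_words=False):
    """Split on whitespace and optionally drop stop words."""
    tokens = text.split()
    if remove_stop_words:
        tokens = [t for t in tokens if t not in STOP_WORDS]
    return tokens


def tfidf_vectors(docs, remove_stop_words=False):
    """Return (vocabulary, vectors) with weights tf * log(N / df).

    tf is the term count divided by the document length; df is the
    number of documents that contain the term. This is one common
    TF-IDF variant, assumed here for illustration.
    """
    tokenized = [tokenize(d, remove_stop_words) for d in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    n_docs = len(tokenized)
    df = Counter(t for doc in tokenized for t in set(doc))
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        total = len(doc) or 1  # guard against empty documents
        vectors.append(
            [(counts[t] / total) * math.log(n_docs / df[t]) for t in vocab]
        )
    return vocab, vectors


# Two invented example sentences (assumption), differing in one word.
docs = [
    "الخدمة ممتازة في المطعم",
    "الخدمة سيئة في المطعم",
]

vocab_with, vecs_with = tfidf_vectors(docs, remove_stop_words=False)
vocab_without, vecs_without = tfidf_vectors(docs, remove_stop_words=True)
```

In this toy corpus, words shared by both sentences receive a weight of zero, so only the two words that distinguish the sentences carry signal. The resulting vectors are the kind of input a classifier such as LSVC, DT, or RF would be trained on, and the `remove_stop_words` flag stands in for one of the pre-processing choices the study compares.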

Keywords

Social Networking Sites - Arabic sentiment analysis - Pre-processing techniques - Classifying Arabic text - Data mining algorithms

ISSUE

Volume: 8 Issue: 6 Part: 0 (2019)

SUBJECTS

ENGINEERING AND CIVIL CONSTRUCTION
TECHNOLOGY

DOI

http://dx.doi.org/10.6084/ijact.v8i6.907
