Classification of Virtual Harassment on Social Networks Using Ensemble Learning Techniques

Nureni Ayofe Azeez and Emad Fadhal

Abstract

Background: Internet social media platforms have become very popular, enabling a wide range of users to stay in touch with friends and relatives wherever they are and at any time. This popularity has been accompanied by a significant increase in virtual crime from the inception of these platforms to the present day. Users are harassed online when confidential information about them is stolen, or when another user posts insulting or offensive comments about them. This poses a significant threat to the mental and psychological well-being of online social media users. Methods: This research compares traditional classifiers and ensemble learning for classifying virtual harassment in online social media networks, applying both kinds of model to four different datasets: seven machine learning algorithms (Naive Bayes (NB), Decision Tree (DT), K-Nearest Neighbor (KNN), Logistic Regression (LR), Neural Network (NN), Quadratic Discriminant Analysis (QDA), and Support Vector Machine (SVM)) and four ensemble learning models (Ada Boosting, Gradient Boosting, Random Forest, and Max Voting). The results were compared using twelve evaluation metrics to show the validity of the algorithms: Accuracy, Precision, Recall, F1-measure, Specificity, Matthews Correlation Coefficient (MCC), Cohen's Kappa Coefficient (KAPPA), Area Under Curve (AUC), False Discovery Rate (FDR), False Negative Rate (FNR), False Positive Rate (FPR), and Negative Predictive Value (NPV). Results: For dataset 1, Logistic Regression had the highest accuracy among the machine learning algorithms (0.6923), while the Max Voting ensemble had the highest accuracy among the ensembles (0.7047). For dataset 2, K-Nearest Neighbor, Support Vector Machine, and Logistic Regression shared the highest accuracy among the machine learning algorithms (0.8769), while the Random Forest and Gradient Boosting ensembles both had the highest ensemble accuracy (0.8779). For dataset 3, the Support Vector Machine had the highest accuracy among the machine learning algorithms (0.9243), while the Random Forest ensemble had the highest ensemble accuracy (0.9258). For dataset 4, the Support Vector Machine and Logistic Regression both reached 0.8383, while the Max Voting ensemble obtained an accuracy of 0.8280. A bar chart was used to represent the results, showing the minimum, maximum, and quartile ranges. Conclusions: The approach enabled a systematic comparison of the selected machine learning algorithms and ensemble models for detecting and exposing various forms of cyber harassment in cyberspace, and it revealed the best and weakest algorithms.
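
As an illustration of the evaluation step, eleven of the twelve metrics listed in the abstract can be derived from the four cells of a binary confusion matrix, while AUC needs ranked scores rather than hard labels. The sketch below is a minimal, dependency-free Python version under assumptions not stated in the source: labels are binary with the harassment class encoded as 1, undefined ratios are reported as 0.0, and the function names (confusion_counts, metrics_from_counts, auc) are hypothetical. It is not the authors' implementation.

```python
from math import sqrt


def confusion_counts(y_true, y_pred, positive=1):
    """Count (TP, FP, TN, FN) for binary labels; `positive` marks the target class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def _ratio(num, den):
    """Safe division. Assumption: an undefined ratio is reported as 0.0."""
    return num / den if den else 0.0


def metrics_from_counts(tp, fp, tn, fn):
    """Return the eleven count-based metrics named in the abstract."""
    n = tp + fp + tn + fn
    accuracy = _ratio(tp + tn, n)
    precision = _ratio(tp, tp + fp)
    recall = _ratio(tp, tp + fn)
    # MCC: correlation between predicted and true labels.
    mcc_den = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Cohen's kappa: observed agreement corrected for chance agreement p_e.
    p_e = _ratio((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp), n * n)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": _ratio(2 * precision * recall, precision + recall),
        "specificity": _ratio(tn, tn + fp),
        "mcc": _ratio(tp * tn - fp * fn, mcc_den),
        "kappa": _ratio(accuracy - p_e, 1 - p_e),
        "fdr": _ratio(fp, tp + fp),  # False Discovery Rate
        "fnr": _ratio(fn, tp + fn),  # False Negative Rate
        "fpr": _ratio(fp, fp + tn),  # False Positive Rate
        "npv": _ratio(tn, tn + fn),  # Negative Predictive Value
    }


def auc(y_true, scores, positive=1):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half. This is a quadratic-time sketch for small samples.
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        return 0.0
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Illustrative counts only; these are not results from the paper.
    for name, value in metrics_from_counts(tp=40, fp=10, tn=45, fn=5).items():
        print(f"{name}: {value:.4f}")
```

Working from raw counts keeps the relationships between the metrics visible, for example FDR = 1 - Precision and FNR = 1 - Recall whenever the denominators are non-zero.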

Keywords

harassment - classification - ensemble - metrics - algorithm - learning - classifiers

ISSUE

Volume: 13, Issue: 7 (2023)

SUBJECTS

CIVIL ENGINEERING AND CONSTRUCTION
TECHNOLOGY

DOI

https://doi.org/10.3390/app13074570
