Applied Sciences, Vol. 13, No. 21 (2023)
ARTICLE
TITLE

Comparison of the Effectiveness of Various Classifiers for Breast Cancer Detection Using Data Mining Methods

Noor Kamal Al-Qazzaz    
Iyden Kamil Mohammed    
Halah Kamal Al-Qazzaz    
Sawal Hamid Bin Mohd Ali and Siti Anom Ahmad    

Abstract

Breast cancer (BC) has claimed the lives of countless women and men worldwide. Although researchers have proposed various diagnostic methods for detecting the disease, their accuracy and efficiency can still be improved. This study proposes an approach to the early detection of BC that applies data mining techniques to the levels of prolactin (P), testosterone (T), cortisol (C), and human chorionic gonadotropin (HCG) in the blood and saliva of 20 women with histologically confirmed BC, 20 benign subjects, and 20 age-matched control women. Blood and saliva measurements were used to categorize subjects as normal, benign, or malignant. Ten statistical features were collected, and three classification schemes were evaluated: a decision tree (DT), a support vector machine (SVM), and k-nearest neighbors (KNN). In addition, two dimensionality reduction techniques, factor analysis (FA) and t-distributed stochastic neighbor embedding (t-SNE), were applied to obtain the best hyperparameters. The model was validated using k-fold cross-validation, and performance metrics were applied to gauge its effectiveness. For salivary biomarkers, dimensionality reduction improved the results, particularly with the DT, raising classification accuracy from 66.67% to 93.3% with t-SNE and to 90% with FA. For blood biomarkers, dimensionality reduction likewise improved the results, particularly with the DT, raising classification accuracy from 60% to 80% with FA and to 93.3% with t-SNE.
These findings point to t-SNE as a potentially useful feature selection technique for identifying patients with BC: it consistently improved the discrimination of benign, malignant, and healthy control subjects, and may therefore aid the early detection of breast tumours.
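
The validation step described in the abstract, k-fold cross-validation of a classifier over three balanced classes, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes synthetic Gaussian data in place of the hormone measurements (the function `make_synthetic_data` is hypothetical), four features per subject, a KNN classifier with k = 3, and 5 folds; the abstract specifies none of these values. Only the Python standard library is used.

```python
import math
import random
from collections import Counter


def make_synthetic_data(n_per_class=20, n_features=4, seed=0):
    """Hypothetical stand-in data: 3 classes x 20 subjects (not the study's data)."""
    rng = random.Random(seed)
    X, y = [], []
    for label in range(3):  # 0 = control, 1 = benign, 2 = malignant
        for _ in range(n_per_class):
            X.append([rng.gauss(3.0 * label, 1.0) for _ in range(n_features)])
            y.append(label)
    return X, y


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    nearest = sorted(zip(train_X, train_y), key=lambda p: math.dist(p[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def stratified_folds(y, n_folds, seed=0):
    """Split sample indices into n_folds folds with balanced class counts."""
    rng = random.Random(seed)
    folds = [[] for _ in range(n_folds)]
    for label in sorted(set(y)):
        idx = [i for i, lab in enumerate(y) if lab == label]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % n_folds].append(i)
    return folds


def cross_validate(X, y, n_folds=5, k=3):
    """Return the per-fold accuracy of KNN under stratified k-fold CV."""
    accuracies = []
    for test_idx in stratified_folds(y, n_folds):
        held_out = set(test_idx)
        # Train only on samples outside the current fold.
        train_X = [X[i] for i in range(len(X)) if i not in held_out]
        train_y = [y[i] for i in range(len(y)) if i not in held_out]
        correct = sum(knn_predict(train_X, train_y, X[i], k) == y[i] for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return accuracies


X, y = make_synthetic_data()
fold_acc = cross_validate(X, y)
mean_acc = sum(fold_acc) / len(fold_acc)
print(f"per-fold accuracy: {fold_acc}")
print(f"mean accuracy: {mean_acc:.3f}")
```

Stratifying the folds keeps each held-out set at four subjects per class, which matters with only 60 subjects. If a dimensionality reduction step were added to such a pipeline, fitting it inside each training fold rather than on the full dataset would avoid leaking information from the held-out subjects into the model.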
