ARTICLE
TITLE

Methods to improve the accuracy of machine learning algorithms while reducing the dimensionality of the data set

A.V. Vorobyev    

Abstract

Limited availability of data is a factor hindering the application of high-performance machine learning algorithms. Methods that improve model accuracy while shortening the observation period can be an effective tool for prediction in understudied areas. The paper considers the relationship between the dimensionality of the data set and the predictive capabilities of machine learning models, and determines the impact of the number of observations on the accuracy and robustness of models built on ensemble algorithms and regularized regression algorithms. In the experiments, the change in the weighted mean absolute error as the dimensionality of the data set decreases was examined, and the algorithms most resistant to this reduction were identified. For regression tasks in which the target variable depends non-linearly on the predictors and the data are not strongly affected by anomalies and noise, the lower limit for the use of ensemble algorithms to detect regularities and build a stable model was established. The effect of automated Bayesian hyperparameter optimization on model accuracy when the data set is reduced is also considered, and the models that benefit most from preliminary hyperparameter optimization with the tree-structured Parzen estimator are identified.
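The kind of experiment described in the abstract can be illustrated with a minimal sketch. It is not the author's code or data: it assumes scikit-learn, a synthetic non-linear data set (`make_friedman1`), a random forest as the ensemble model, ridge regression as the regularized model, and the plain (unweighted) mean absolute error. The function name `error_by_sample_size` and all parameter values are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_friedman1
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_error


def error_by_sample_size(sizes=(400, 200, 100, 50, 25), seed=0):
    """Return {model_name: {n_train: MAE}} measured on a fixed held-out test set."""
    # Synthetic non-linear regression data (an assumption, not the paper's data).
    X, y = make_friedman1(n_samples=1000, noise=1.0, random_state=seed)
    X_train, y_train = X[:500], y[:500]
    X_test, y_test = X[500:], y[500:]

    rng = np.random.default_rng(seed)
    # Factories so that every training size gets a freshly initialised model.
    models = {
        "random_forest": lambda: RandomForestRegressor(n_estimators=50, random_state=seed),
        "ridge": lambda: Ridge(alpha=1.0),
    }

    results = {name: {} for name in models}
    for n in sizes:
        # Draw a random subset of n training observations without replacement.
        idx = rng.choice(len(X_train), size=n, replace=False)
        for name, make_model in models.items():
            model = make_model().fit(X_train[idx], y_train[idx])
            results[name][n] = mean_absolute_error(y_test, model.predict(X_test))
    return results


if __name__ == "__main__":
    for name, by_n in error_by_sample_size().items():
        print(name, {n: round(mae, 3) for n, mae in by_n.items()})
```

Keeping the test set fixed while only the training subset shrinks makes the errors comparable across sizes. A hyperparameter search, for example with a tree-structured Parzen estimator, could be run before each fit, but it is left out here to keep the sketch short.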

Similar articles

       
 
Tobias Zeulner, Gerhard Johann Hagerer, Moritz Müller, Ignacio Vazquez and Peter A. Gloor    
Current methods for assessing individual well-being in team collaboration at the workplace often rely on manually collected surveys. This limits continuous real-world data collection and proactive measures to improve team member workplace satisfaction. W...
Journal: Information

 
Yumei Zhang, Jie Zhang, Ye Li, Dan Yao, Yue Zhao, Yi Ai, Weijun Pan and Jiang Li    
Acoustic metamaterials (AMs) composed of periodic artificial structures have extraordinary sound wave manipulation capabilities compared with traditional acoustic materials, and they have attracted widespread research attention. The sound insulation perf...
Journal: Acoustics

 
Yunfei Yang, Zhicheng Zhang, Jiapeng Zhao, Bin Zhang, Lei Zhang, Qi Hu and Jianglong Sun    
Resistance serves as a critical performance metric for ships. Swift and accurate resistance prediction can enhance ship design efficiency. Currently, methods for determining ship resistance encompass model tests, estimation techniques, and computational ...

 
Jie Ren, Changmiao Li, Yaohui An, Weichuan Zhang and Changming Sun    
Few-shot fine-grained image classification (FSFGIC) methods refer to the classification of images (e.g., birds, flowers, and airplanes) belonging to different subclasses of the same species by a small number of labeled samples. Through feature representa...
Journal: AI

 
Carmen Otilia Rusanescu, Maria Ciobanu, Marin Rusanescu and Raluca Lucia Dinculoiu    
This work is a comprehensive study focusing on various methods for processing wheat straw to enhance its suitability for bioethanol production. It delves into mechanical, physical, chemical, and biological pretreatments, each aimed at improving the enzym...
Journal: Applied Sciences