Applied Sciences, Vol. 11, Issue 5 (2021)

Machine Learning-Based Identification of the Strongest Predictive Variables of Winning and Losing in Belgian Professional Soccer

Youri Geurkink, Jan Boone, Steven Verstockt and Jan G. Bourgois

Abstract

This study aimed to identify the strongest predictive variables of winning and losing in the highest Belgian soccer division. A predictive machine learning model based on a broad range of variables (n = 100) was constructed using a dataset of 576 games. To avoid multicollinearity and reduce dimensionality, the Variance Inflation Factor (threshold of 5) and BorutaShap were applied, respectively. A total of 13 variables remained and were used to predict winning or losing with Extreme Gradient Boosting. TreeExplainer was applied to determine feature importance at the global and local levels. The model achieved an accuracy of 89.6% ± 3.1% (precision: 88.9%; recall: 90.1%; F1-score: 89.5%), correctly classifying 516 of 576 games. Shots on target from the attacking penalty box proved to be the best predictor. Several physical indicators were among the best predictors, as were contextual variables such as Elo ratings, the added transfer value of the benched players, and match location. The results show the added value of including a broad spectrum of variables when predicting and evaluating game outcomes. Clubs can use similar modelling approaches to identify the strongest predictive variables for their own leagues and to evaluate and improve their current quantitative analyses.
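The multicollinearity step mentioned in the abstract can be illustrated with a minimal sketch. It assumes an iterative procedure (drop the feature with the largest VIF, recompute, repeat until every VIF is at or below 5); the abstract gives only the threshold, so the iteration scheme, the function names, and the synthetic feature names (`shots`, `passes`, `shots_noisy`) are all hypothetical. It uses the identity that the VIF of feature j equals the j-th diagonal entry of the inverse correlation matrix.

```python
import random
from statistics import mean, pstdev


def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equally long columns."""
    n = len(columns[0])
    z = []
    for col in columns:
        m, s = mean(col), pstdev(col)
        z.append([(v - m) / s for v in col])  # standardise each column
    return [[sum(a * b for a, b in zip(zi, zj)) / n for zj in z] for zi in z]


def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting (small dense matrices only)."""
    k = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(k)]
           for i, row in enumerate(matrix)]
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(aug[r][c]))
        aug[c], aug[pivot] = aug[pivot], aug[c]
        p = aug[c][c]
        aug[c] = [v / p for v in aug[c]]
        for r in range(k):
            if r != c:
                f = aug[r][c]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[c])]
    return [row[k:] for row in aug]


def vif(columns):
    """VIF of each column: the diagonal of the inverse correlation matrix."""
    inv = invert(correlation_matrix(columns))
    return [inv[j][j] for j in range(len(columns))]


def drop_high_vif(data, threshold=5.0):
    """Assumed scheme: repeatedly drop the feature with the largest VIF
    until all remaining VIFs are <= threshold. Returns the kept names."""
    kept = dict(data)
    while len(kept) > 1:
        names = list(kept)
        scores = vif([kept[n] for n in names])
        worst = max(range(len(names)), key=scores.__getitem__)
        if scores[worst] <= threshold:
            break
        del kept[names[worst]]
    return list(kept)


# Synthetic demo data (not from the study): two independent features
# plus a near-duplicate of the first one.
random.seed(0)
shots = [random.gauss(0, 1) for _ in range(200)]
passes = [random.gauss(0, 1) for _ in range(200)]
shots_noisy = [s + random.gauss(0, 0.1) for s in shots]
data = {"shots": shots, "passes": passes, "shots_noisy": shots_noisy}

kept = drop_high_vif(data, threshold=5.0)
print(kept)
```

In this toy setup one of the two near-duplicate columns is removed and the independent column survives. The study's later steps (BorutaShap selection, Extreme Gradient Boosting, TreeExplainer) are not reproduced here.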

Similar articles

Yi Zhao and Song-Kyoo Kim
This paper addresses the enhancement of modern security through the integration of electrocardiograms (ECGs) into biometric authentication systems. As technology advances, the demand for reliable identity authentication systems has grown, given the rise ...
Journal: Information

 
Max Schrötter, Andreas Niemann and Bettina Schnor    
Over the last few years, a plethora of papers presenting machine-learning-based approaches for intrusion detection have been published. However, the majority of those papers do not compare their results with a proper baseline of a signature-based intrusi...
Journal: Information

 
Xiaohui Yan, Tianqi Zhang, Wenying Du, Qingjia Meng, Xinghan Xu and Xiang Zhao    
Water quality prediction, a well-established field with broad implications across various sectors, is thoroughly examined in this comprehensive review. Through an exhaustive analysis of over 170 studies conducted in the last five years, we focus on the a...

 
Saikat Das, Mohammad Ashrafuzzaman, Frederick T. Sheldon and Sajjan Shiva    
The distributed denial of service (DDoS) attack is one of the most pernicious threats in cyberspace. Catastrophic failures over the past two decades have resulted in catastrophic and costly disruption of services across all sectors and critical infrastru...
Journal: Algorithms

 
Eike Blomeier, Sebastian Schmidt and Bernd Resch    
In the early stages of a disaster caused by a natural hazard (e.g., flood), the amount of available and useful information is low. To fill this informational gap, emergency responders are increasingly using data from geo-social media to gain insights fro...
Journal: Information