Article

Battery Remaining Useful Life Prediction Using Machine Learning Models: A Comparative Study

by Vahid Safavi 1,*,†, Arash Mohammadi Vaniar 2,†, Najmeh Bazmohammadi 1,*,†, Juan C. Vasquez 1,† and Josep M. Guerrero 1,3,4,†

1 Center for Research on Microgrids (CROM), AAU Energy, Aalborg University, 9220 Aalborg East, Denmark
2 Department of Electrical and Electronics Engineering, Middle East Technical University, 06800 Ankara, Turkey
3 Center for Research on Microgrids (CROM), Department of Electronic Engineering, Technical University of Catalonia, 08034 Barcelona, Spain
4 ICREA, Pg. Lluís Companys 23, 08010 Barcelona, Spain
* Authors to whom correspondence should be addressed.
† These authors contributed equally to this work.
Information 2024, 15(3), 124; https://doi.org/10.3390/info15030124
Submission received: 6 February 2024 / Revised: 18 February 2024 / Accepted: 21 February 2024 / Published: 22 February 2024
(This article belongs to the Section Information Applications)

Abstract:
Predicting the remaining useful life (RUL) of lithium-ion (Li-ion) batteries is crucial to preventing system failures and enhancing operational performance. Knowing the RUL of a battery enables one to perform preventative maintenance or replace the battery before its useful life expires, which is vital in safety-critical applications. The prediction of the RUL of Li-ion batteries plays a critical role in their optimal utilization throughout their lifetime and in supporting sustainable practices. This paper conducts a comparative analysis to assess the effectiveness of multiple machine learning (ML) models in predicting the capacity fade and RUL of Li-ion batteries. Three case studies are analyzed to assess the performance of state-of-the-art ML models, considering two distinct datasets. Cases 1 and 2 are conducted under various operating conditions, such as temperature, C-rate, state of charge (SOC), and depth of discharge (DOD) of the batteries, while Case 3 relies on a different set of features and charging policies from the second dataset. In Case 3, diverse features extracted from the initial cycles of the second dataset are used to predict the RUL of Li-ion batteries over all cycles. In addition, a multi-feature multi-target (MFMT) feature mapping is introduced to investigate the performance of the developed ML models in predicting the battery capacity fade and RUL over the entire life cycle. The ML models developed for the comparative analysis in the proposed methodology include Random Forest (RF), extreme gradient boosting (XGBoost), light gradient-boosting machine (LightGBM), multi-layer perceptron (MLP), long short-term memory (LSTM), and Attention-LSTM. Furthermore, hyperparameter tuning is applied to improve the performance of the XGBoost and LightGBM models. The results demonstrate that the extreme gradient boosting with hyperparameter tuning (XGBoost-HT) model outperforms the other ML models in terms of the root-mean-squared error (RMSE) and mean absolute percentage error (MAPE) of the battery capacity fade and RUL for all cycles. The RMSE and MAPE values obtained with XGBoost-HT in terms of cycle life are 69 cycles and 6.5%, respectively, for the third case. In addition, the XGBoost-HT model handles the MFMT feature mapping within an acceptable range of RMSE and MAPE compared to the rest of the developed ML models and similar benchmarks.

1. Introduction

With the growing emphasis on sustainable energy development, several nations have implemented new energy policies regarding the use of electric vehicles and zero-carbon power generation [1]. Despite this progress, new electric vehicles continue to encounter obstacles, most notably in battery technology, concerning safety, performance, and lifetime. In addition, lithium-ion (Li-ion) batteries are central to energy storage systems and electric vehicle propulsion [2]. With the increasing adoption of electric vehicles, accurate remaining useful life (RUL) prediction and real-time battery state of health (SOH) monitoring within the battery management system have become increasingly crucial for Li-ion batteries. Accurate RUL prediction is essential for optimizing battery management strategies, enabling timely replacement or maintenance, and reducing costs associated with unanticipated malfunctions and battery aging. It provides valuable insights into battery performance and helps ensure safety and reliability in various applications, thereby ultimately contributing to a more sustainable and efficient use of Li-ion batteries. Predicting the RUL of Li-ion batteries is a challenging task due to their complex degradation process, which is influenced by factors such as state of charge (SOC), SOH, charging and discharging cycles, and battery temperature [3,4].

1.1. Literature Review

In order to optimize the utilization of Li-ion batteries, it is important to model the battery degradation and predict its RUL with an acceptable accuracy level. This objective can be reached by using physics-based and data-driven methods. The physics-based modeling methods provide a deep insight into the degradation and aging mechanisms of the battery under different operating conditions. Physics-based techniques can be classified into three categories, namely, physicochemical models that employ sets of differential equations to simulate various aging processes, semi-empirical models that incorporate physical equations, and empirical models based on data [5,6]. In [7], a semi-empirical model for Li-ion battery degradation is investigated using operating profiles to predict the life loss of battery cells by using basic theories of battery degradation. In [8], the RUL prediction is deployed by integrating a simplified electrochemical model and employing capacity and voltage data. In addition, the prediction robustness of this proposed method is enhanced by incorporating noisy data and considering the discharge profile. In [9], a nonlinear algorithm is developed to derive a physics-based model for predicting the RUL and capacity of a Li-ion battery, considering a half-cell model with a few degradation mechanisms that affect battery performance. The lack of accurate model parameters presents a difficulty in using physics-based electrochemical models in real-time settings, leading to several concerns such as overfitting or local optimization problems. Moreover, although these models are capable of characterizing battery performance deterioration by incorporating a variety of variables such as discharge time and temperature, they encounter a challenge in predicting the RUL due to the dynamic nature of degradation rates over the battery’s lifetime [10]. The RUL prediction is proposed through the utilization of a single charging curve, in which a physics-based model that has the capability of deriving parameters associated with aging is developed. Following this, the aforementioned parameters are utilized to drive a deep neural network (NN) towards generating an RUL prediction. Subsequently, the effectiveness of the trained model is validated across three unique battery types functioning in seven distinct conditions [11]. Also, a similar approach is conducted by combining a physics-informed approach with NN to predict the RUL in [12].
For predicting the RUL of batteries, direct mapping and time series forecasting models are two distinct categories of data-driven models. Strong correlations between input features and cell capacity form the basis of direct mapping, and machine learning (ML) methods can leverage these associations to predict the RUL. By directly utilizing common attributes such as voltage, current, capacity, resistance, and temperature, a health indicator is generated that exhibits a stronger correlation with cell capacity. In [13], a RUL prediction for Li-ion batteries is developed in a real-time framework by utilizing the convolutional neural network (CNN)–long short-term memory (LSTM) model. As a result, this model can outperform the conventional LSTM technique by extracting the time series sequences. Another approach for forecasting the RUL and SOH of Li-ion batteries is presented in [14], in which the interactions between short-term and long-term degradation trends are monitored to reduce the noise in feature maps. A feature extraction model that represents battery aging is developed in [15] and the results demonstrate an enhancement in predicting the RUL. In [16], a model that integrates temporal attention, NN, temporal convolutional network (TCN), and feature attention is proposed to improve the accuracy of battery RUL prediction. The outcomes indicate a 33 and 54 percent reduction in the root-mean-squared error (RMSE) of the RUL prediction on NASA and CALCE Li-ion battery datasets, respectively.
Numerous NN methods, such as the gated recurrent unit (GRU)–recurrent neural network (RNN) [17], autoencoder [18], CNN-LSTM [19], Attention-LSTM [20], transformer-based NN [21], and Bayesian-based NN [22], have been utilized for predicting the RUL of Li-ion batteries. In [23], multiple input data, such as voltage, current, and charging temperature patterns, are fed into a multi-input LSTM model as features with a single output to predict the battery RUL. The results verify the effectiveness of the multi-channel LSTM in terms of mean absolute percentage error (MAPE) compared to the single-channel LSTM. In [20], an Attention-BiLSTM model is developed for battery RUL prediction with four input features of voltage, temperature, current, and battery capacity. The results reveal that the uncertainty associated with the multi-step model decreases. In [24], a RUL prediction methodology is proposed by developing a Random Forest (RF) model using limited data for the degradation of the battery and by considering multiple features such as internal resistance, charging and discharging voltage pattern, cycle time, ambient temperature, and capacity of the battery. The effectiveness of the proposed model is evaluated through a sensitivity analysis.
In [25], ML algorithms are implemented to predict and classify the RUL of Li-ion batteries by utilizing discharge voltage curves from initial cycles of the batteries and considering initial discharge capacity, discharge voltage curve, temperature, and charge time as input features. In [26], an ML model consisting of RF and light gradient-boosting machine (LightGBM) with a parameter optimization strategy is proposed to enhance the performance of Li-ion battery RUL predictions. This study investigates a robust prediction model by incorporating three key features: voltage, current, and temperature. In addition, the simulation results show that, by applying the Harris Hawks optimization method to tune the hyperparameters, the RMSE and MAPE values for RUL prediction on the test data are decreased. In [27], three regression models that utilize a supervised ML regression algorithm, namely, the bagging regressor, linear regressor, and RF regressor models, are investigated to predict the RUL of Li-ion batteries in electric vehicles considering the discharge voltage and temperature as input features. The obtained results highlight the effectiveness of the RF regressor model amongst the others in RUL prediction.
Extreme gradient boosting (XGBoost) is an ML algorithm that is popular for its performance in regression problems and, because of its tree-based nature, it is a promising tool for predictive tasks [28]. Particle swarm optimization is utilized in [29] to improve the efficacy of XGBoost in predicting the SOH and SOC of Li-ion batteries. In addition, acknowledging the dynamic correlation between SOH and SOC in this model yields more accurate predictions for both variables. An RUL prediction framework is proposed in [30] by utilizing XGBoost for anomaly detection and dynamic time warping for RUL prediction. In [31], the correlation confusion matrix, powered by XGBoost and LightGBM, is developed for predicting battery RUL. In [32], an XGBoost model is proposed to predict the battery’s RUL using Kalman and particle filters by incorporating the operational data.

1.2. Novelty and Contribution

Based on the conducted literature review, this paper identifies crucial gaps in training ML models with multi-target values, along with considerations related to datasets. To address these gaps, three case studies are presented, focusing on the challenging aspects of capacity fade and RUL prediction in Li-ion batteries. The first two case studies utilize a synthetic dataset exhibiting linear behavior in capacity fade. The rationale behind this approach is to assess whether the developed ML models can accurately predict linear trends throughout the battery’s life cycles, providing a foundation for predicting nonlinear and realistic capacity fade and RUL. In contrast, the third case study employs a real dataset that represents nonlinear and realistic battery behavior. The features utilized in the first two case studies include SOC, depth of discharge (DOD), C-rate, and temperature, while features extracted from the second dataset are presented in Section 3. Notably, the second and third case studies introduce the multi-feature multi-target (MFMT) feature mapping, resulting in significantly improved performance metrics such as mean absolute error (MAE), RMSE, and MAPE regarding the RUL prediction of Li-ion batteries. The ML models developed in this paper for the comparative analysis encompass RF, multi-layer perceptron (MLP), XGBoost, LightGBM, LSTM, and Attention-LSTM. Additionally, hyperparameter tuning is conducted for XGBoost and LightGBM to enhance their predictive capabilities. The paper’s contributions are summarized as follows:
  • Proposing an MFMT feature mapping for the developed ML models to improve capacity fade prediction for all cycles.
  • Utilizing two distinct battery datasets with different feature extractions for training and testing the proposed three case studies.
  • Performing hyperparameter tuning for the performance improvement of the XGBoost and LightGBM models.
  • Providing an in-depth performance assessment of the developed ML models along with a comparison with other relevant works.

1.3. Paper Organization

The rest of the paper is organized as follows: Section 2 elaborates on the comparative analysis methodology with a focus on the explanation of each developed ML model. Subsequently, Section 3 presents the case study and simulation results. Following this, Section 4 concludes the paper.

2. Proposed Comparative Analysis Methodology

In this section, the proposed methodology for the comparative analysis of multiple ML models for predicting the RUL of Li-ion batteries is presented. Based on the conducted literature review, a comparative framework is developed to assess the performance of the most effective and commonly used ML models for the RUL prediction of Li-ion batteries. An MFMT feature mapping is introduced in the proposed architecture. The proposed methodology is discussed in three case studies using two different datasets. In the proposed framework, data preprocessing is performed on the datasets and multiple features are extracted. The extracted features are fed into the input layers of the ML models for training purposes. Target values are derived from the capacity degradation data of the battery, from which eight parameters are extracted, namely, the remaining life cycle number, mean, standard deviation, minimum, the first (25th percentile), median (50th percentile), and third (75th percentile) quantiles, and maximum of the capacity data. This target dataset is passed to the models as target values. Then, the dataset is divided into training, validation, and testing sets. Eight ML algorithms, namely, XGBoost, extreme gradient boosting with hyperparameter tuning (XGBoost-HT), LightGBM, light gradient-boosting machine with hyperparameter tuning (LightGBM-HT), RF, MLP, LSTM, and Attention-LSTM, are developed to predict the RUL of the batteries. RMSE and MAE metrics are computed for each model to evaluate its accuracy. Figure 1 and Figure 2 provide the general structure of the proposed comparative analysis methodology. The following subsections present the mathematical formulations and detailed explanations of the developed ML models and the steps undertaken in the methodology.
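To make the target construction concrete, the following sketch shows one way to derive the eight target values from a single cell's capacity degradation curve. This is a minimal illustration; the function and key names are not taken from the paper's code.

```python
import numpy as np

def extract_targets(capacity: np.ndarray) -> dict:
    """Derive the eight target values described above from one cell's
    capacity-degradation curve (one capacity value per cycle).
    Key names are illustrative assumptions, not the paper's labels."""
    return {
        "remaining_cycles": len(capacity),          # remaining life cycle number
        "mean": float(np.mean(capacity)),
        "std": float(np.std(capacity)),
        "min": float(np.min(capacity)),
        "q25": float(np.quantile(capacity, 0.25)),  # first quantile
        "q50": float(np.quantile(capacity, 0.50)),  # median
        "q75": float(np.quantile(capacity, 0.75)),  # third quantile
        "max": float(np.max(capacity)),
    }

# Example with a synthetic, linearly fading capacity curve (values are made up)
fake_capacity = np.linspace(1.1, 0.88, 800)
print(extract_targets(fake_capacity))
```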

2.1. XGBoost Model

XGBoost is an open-source extreme gradient-boosting technique that combines decision trees and gradient boosting to create reliable prediction models. Additional randomization, penalization of trees, proportional shrinking of leaf nodes, and Newton boosting are used to improve performance and avoid overfitting: penalizing trees limits overfitting, while Newton boosting decreases the correlation of the classifiers to improve performance. Decision trees are fundamental to XGBoost's predictions, offering interpretable splits for diverse cases. The model uses parallel and distributed computing to navigate datasets quickly and make accurate predictions with less input data, making complex tasks simpler, more efficient, and more versatile. Using gradient boosting, XGBoost optimizes a loss function to turn weak learners into powerful prediction models. It combines a compact column-based data layout for efficient numerical computation with random sampling to reduce overfitting and accelerate training. Its objective depends on the leaf count, the leaf weights, and a regularization coefficient, and the method gradually adds functions with each training iteration to maintain prediction accuracy [28].
The objective function used in the XGBoost model is provided by Equation (1), in which the first term corresponds to the loss function and the second one corresponds to the regularization term [33].
\mathrm{Obj} = \sum_{i=1}^{N} L\big(y_i, \hat{y}_i^{(t)}\big) + \sum_{i=1}^{K} \Omega(f_i) \qquad (1)
where y_i represents the actual value for the i-th observation, ŷ_i^(t) denotes the predicted value, L(y_i, ŷ_i^(t)) represents the loss function corresponding to tree t, N is the number of data points in the training dataset, and K is the total number of individual decision trees in the ensemble. The term Ω(f_i) is the regularization term that controls the complexity of the tree function and is formulated by the following equation [33]:
\Omega(f) = \gamma T + \frac{1}{2} \lambda \lVert \omega \rVert^{2} \qquad (2)
where T represents the number of leaves in the tree, ω denotes the vector of weights assigned to the leaves, γ is the complexity cost assigned to each additional leaf (used for tree pruning), and λ is the regularization coefficient applied to prevent the model from overfitting.
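As a concrete illustration of how such a model could be configured, the snippet below sets up a single-target XGBoost regressor whose reg_lambda and gamma arguments correspond to the λ and γ terms of Equation (2). The feature columns, synthetic data, and hyperparameter values are illustrative assumptions, not the paper's actual settings.

```python
import numpy as np
import pandas as pd
import xgboost as xgb
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the Case 1/2 features: SOC, DOD, C-rate, temperature
rng = np.random.default_rng(0)
n = 945  # same number of aging tests as dataset 1; values here are made up
df = pd.DataFrame({
    "soc": rng.uniform(0.2, 1.0, n),
    "dod": rng.uniform(0.1, 0.9, n),
    "c_rate": rng.uniform(0.5, 3.0, n),
    "temperature": rng.uniform(10, 45, n),
})
# Synthetic cycle-life label, loosely decreasing with DOD, C-rate, and temperature
df["cycle_life"] = (3000 - 1500 * df["dod"] - 300 * df["c_rate"]
                    - 20 * df["temperature"] + rng.normal(0, 50, n))

X_train, X_test, y_train, y_test = train_test_split(
    df[["soc", "dod", "c_rate", "temperature"]], df["cycle_life"],
    test_size=0.2, random_state=42)

model = xgb.XGBRegressor(
    n_estimators=500, max_depth=6, learning_rate=0.05,
    reg_lambda=1.0,   # L2 penalty on leaf weights (lambda in Eq. (2))
    gamma=0.0,        # per-leaf complexity cost (gamma in Eq. (2))
    objective="reg:squarederror",
)
model.fit(X_train, y_train)
print("Test R^2:", model.score(X_test, y_test))
```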

2.2. LightGBM Model

LightGBM, a gradient-boosting framework developed by Microsoft, is known for its fast prediction, low memory usage, and high accuracy, particularly when processing extensive datasets and building precise ML models. It belongs to the family of ensemble learning techniques based on gradient boosting, in which models are constructed sequentially to rectify the errors made by their predecessors. The mathematical formulation of LightGBM in terms of its learner composition and loss function is expressed by Equations (3)–(7) [34]:
H_T(x) = \sum_{t=1}^{T} h_t(x), \quad h_t \in \Theta \qquad (3)

h_t(x) = \arg\min_{h \in \Theta} L\big(y, H_{t-1}(x) + h(x)\big) \qquad (4)

r_t = -\frac{\partial L\big(y, H_{t-1}(x)\big)}{\partial H_{t-1}(x)} \qquad (5)

h_t(x) = \arg\min_{h \in \Theta} \big(r_t - h(x)\big)^{2} \qquad (6)

H_t(x) = H_{t-1}(x) + h_t(x) \qquad (7)
where H_t(x), h_t(x), and Θ represent the ensemble model after t iterations, the weak learner added at iteration t, and the set of all candidate learners, respectively. t is the index of the weak learner, T is the total number of weak learners, and x is the input variable representing the observations for which predictions are made. The ensemble H_{t-1}(x) and the loss function L(y, H_{t-1}(x)) are obtained from the previous iteration. To accelerate convergence, the negative gradient r_t of the loss function is utilized during the iterative process as a substitute for the actual loss function.
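A minimal LightGBM setup in this spirit is sketched below; each boosting round fits a new weak learner to the negative gradient, as in Equations (5)–(7). The synthetic data, hyperparameter values, and early-stopping setting are placeholders, not the configuration used in the paper.

```python
import numpy as np
import lightgbm as lgb
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Synthetic stand-in for the battery features (SOC, DOD, C-rate, temperature)
X, y = make_regression(n_samples=945, n_features=4, noise=0.1, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=0)

# Each boosting round adds a weak learner fitted to the negative gradient;
# early stopping halts training when the validation loss stops improving.
model = lgb.LGBMRegressor(n_estimators=1000, learning_rate=0.05, num_leaves=31)
model.fit(
    X_train, y_train,
    eval_set=[(X_val, y_val)],
    callbacks=[lgb.early_stopping(stopping_rounds=50)],
)
print("Validation RMSE:", np.sqrt(mean_squared_error(y_val, model.predict(X_val))))
```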

2.3. LSTM and Attention-LSTM Models

LSTM is an RNN architecture designed to learn long-term dependencies in sequential data. Unlike traditional RNNs, LSTMs have a memory cell that can store information over long periods, allowing them to capture long-range context in the data. The LSTM architecture consists of four main components: the input gate, the forget gate, the output gate, and the memory cell, whose mathematical representations are provided in Equations (8)–(12), respectively [35].
i_t = \sigma(W_{xi} x_t + W_{hi} h_{t-1} + b_i) \qquad (8)

f_t = \sigma(W_{xf} x_t + W_{hf} h_{t-1} + b_f) \qquad (9)

o_t = \sigma(W_{xo} x_t + W_{ho} h_{t-1} + b_o) \qquad (10)

c_t = f_t \odot c_{t-1} + i_t \odot \tanh(W_{xc} x_t + W_{hc} h_{t-1} + b_c) \qquad (11)

h_t = o_t \odot \tanh(c_t) \qquad (12)
where t represents the current time step, x_t denotes the input vector at time t, h_t represents the hidden state vector at time t, and c_t stands for the memory cell vector at time t. The weights and biases associated with the input gate, forget gate, output gate, and memory cell update are denoted by W_{xi}, W_{hi}, b_i; W_{xf}, W_{hf}, b_f; W_{xo}, W_{ho}, b_o; and W_{xc}, W_{hc}, b_c, respectively. Additionally, σ represents the sigmoid activation function, tanh denotes the hyperbolic tangent activation function, and ⊙ denotes element-wise multiplication.
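For illustration, the following NumPy sketch implements one LSTM time step as written in Equations (8)–(12); stacking the input and previous hidden state into a single vector is an implementation convenience, and all dimensions are arbitrary.

```python
import numpy as np

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step implementing Equations (8)-(12).
    W is a dict of weight matrices and b a dict of bias vectors; each matrix
    acts on the stacked vector [x_t; h_{t-1}] (an implementation convenience)."""
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    z = np.concatenate([x_t, h_prev])                         # [x_t; h_{t-1}]
    i_t = sigmoid(W["i"] @ z + b["i"])                         # input gate,  Eq. (8)
    f_t = sigmoid(W["f"] @ z + b["f"])                         # forget gate, Eq. (9)
    o_t = sigmoid(W["o"] @ z + b["o"])                         # output gate, Eq. (10)
    c_t = f_t * c_prev + i_t * np.tanh(W["c"] @ z + b["c"])    # memory cell, Eq. (11)
    h_t = o_t * np.tanh(c_t)                                   # hidden state, Eq. (12)
    return h_t, c_t

# Tiny smoke test with random weights (4 inputs, 8 hidden units)
rng = np.random.default_rng(0)
n_in, n_hid = 4, 8
W = {k: rng.normal(size=(n_hid, n_in + n_hid)) * 0.1 for k in "ifoc"}
b = {k: np.zeros(n_hid) for k in "ifoc"}
h, c = lstm_step(rng.normal(size=n_in), np.zeros(n_hid), np.zeros(n_hid), W, b)
print(h.shape, c.shape)
```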
Attention-LSTM is an extension of the LSTM architecture that incorporates an attention mechanism. The attention mechanism allows the LSTM to focus on specific parts of the input sequence, which can improve its performance on tasks that require long-term dependencies. The Attention-LSTM architecture consists of an LSTM layer followed by an attention layer. The attention layer calculates a weighted sum of the hidden states of the LSTM layer, where a query vector determines the weights. The output of the attention layer is then used as the input to a fully connected layer to produce the final output, as expressed in Equations (13)–(15) [36]:
e_t = v^{\top} \tanh(W_a h_t) \qquad (13)

c_t = \sum_{t=1}^{T} \alpha_t h_t \qquad (14)

y_t = W_o c_t + b_o \qquad (15)
where e_t represents the attention score for time step t, α_t is the corresponding attention weight obtained by normalizing the scores with the softmax function, v is the query vector, h_t is the hidden state vector of the LSTM layer at time t, W_a is the weight matrix of the attention layer, c_t is the weighted sum of the hidden states (the context vector), W_o and b_o are the weights and biases of the fully connected layer, and y_t is the output of the Attention-LSTM at time t.
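A compact Keras realization of this architecture might look as follows; the layer sizes, sequence length, and the eight-dimensional output (matching the MFMT targets) are assumptions for illustration rather than the paper's reported configuration.

```python
from tensorflow.keras import layers, Model

n_steps, n_features, n_targets = 100, 13, 8   # assumed shapes (Case 3 style)

inputs = layers.Input(shape=(n_steps, n_features))
h = layers.LSTM(64, return_sequences=True)(inputs)   # h_t for every time step
u = layers.Dense(32, activation="tanh")(h)           # tanh(W_a h_t)
e = layers.Dense(1, use_bias=False)(u)               # attention scores e_t, Eq. (13)
alpha = layers.Softmax(axis=1)(e)                    # normalized weights alpha_t
context = layers.Dot(axes=1)([alpha, h])             # weighted sum of h_t, Eq. (14)
context = layers.Flatten()(context)
outputs = layers.Dense(n_targets)(context)           # fully connected output, Eq. (15)

model = Model(inputs, outputs)
model.compile(optimizer="adam", loss="mse")
model.summary()
```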

2.4. Random Forest Model

RF is a supervised learning algorithm that builds a multitude of decision trees at training time and outputs the class that is the mode of the individual trees' classes (classification) or their mean prediction (regression). RFs correct for decision trees' habit of overfitting to their training set: because they average the predictions of multiple decision trees, they are less likely to overfit the training data, and they can achieve higher accuracy than single decision trees because they can learn more complex relationships in the data. The mathematical formulation of the RF prediction is given in Equation (16) [37]:
\hat{y} = \frac{1}{K} \sum_{k=1}^{K} \hat{y}_k \qquad (16)
where ŷ represents the aggregated prediction made by the RF and K is the total number of decision trees in the forest. For a given input data point x, the RF predicts ŷ by combining the individual predictions ŷ_k made by each decision tree.
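As a brief sketch, scikit-learn's RandomForestRegressor implements exactly this averaging over K trees and also accepts multi-dimensional targets, which is convenient for the MFMT mapping; the synthetic data and tree count below are placeholders.

```python
from sklearn.ensemble import RandomForestRegressor
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data with 4 features and 8 targets, mimicking the MFMT shape
X, y = make_regression(n_samples=945, n_features=4, n_targets=8,
                       noise=0.1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

rf = RandomForestRegressor(n_estimators=200, random_state=0)  # K = 200 trees, Eq. (16)
rf.fit(X_train, y_train)
print("Test R^2:", rf.score(X_test, y_test))
```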

2.5. MLP Model

MLP is a feed-forward artificial NN that consists of multiple layers of interconnected nodes. Each node is a simple processing unit that takes a weighted sum of its inputs, applies a nonlinear activation function, and then outputs the result. The layers are typically arranged in a feed-forward manner, meaning that the output of each layer is the input to the next layer. The mathematical formulations for the output of the hidden and output layers in the MLP algorithm are expressed by Equations (17) and (18) [38].
h_j = f\Big( \sum_{i=1}^{n} w_{ij} x_i + b_j \Big) \qquad (17)
where x represents the input vector, h is the hidden layer vector, h_j is the output of the j-th hidden node, x_i is the i-th input, w_{ij} is the weight of the connection from the i-th input node to the j-th hidden node, b_j is the bias of the j-th hidden node, and f is the activation function.
y_k = \mathrm{softmax}\Big( \sum_{j=1}^{m} w_{kj} h_j + b_k \Big) \qquad (18)
where h_j is the output of the j-th hidden node, w_{kj} is the weight of the connection from the j-th hidden node to the k-th output node, and b_k is the bias of the k-th output node.
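A minimal MLP regressor along these lines is sketched below using scikit-learn; note that for the RUL regression task the output layer is linear rather than the softmax of Equation (18), which applies to classification outputs. The layer sizes and synthetic data are assumptions.

```python
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the 4-feature input of Cases 1 and 2
X, y = make_regression(n_samples=945, n_features=4, noise=0.1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Feature scaling plus two hidden layers of 64 nodes each (Eq. (17) per layer)
mlp = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(64, 64), activation="relu",
                 max_iter=2000, random_state=0),
)
mlp.fit(X_train, y_train)
print("Test R^2:", mlp.score(X_test, y_test))
```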

2.6. Hyperparameter Tuning

In developing an ML model, tuning hyperparameters is an important stage in optimizing the model's efficacy. Hyperparameter tuning is the process of finding the best values for the model's hyperparameters, which are the parameters that control the model's behavior, such as the learning rate, the number of trees, the number of estimators, the maximum depth, the minimum child weight, and the regularization parameters. A variety of methodologies are available for optimizing hyperparameters; frequently employed methods include random search, Bayesian optimization, and grid search [39], although these methods can be computationally intensive when applied to expansive search spaces. We have utilized grid search [40] in this research, a method that entails assessing each possible combination of hyperparameters within a predetermined range. To improve the performance of the XGBoost and LightGBM models, the hyperparameter-tuned versions of these models, namely, XGBoost-HT and LightGBM-HT, are developed using scikit-learn's GridSearchCV.
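The sketch below shows a grid search of this kind for XGBoost using scikit-learn's GridSearchCV; the search space, fold count, and synthetic data are illustrative and do not reproduce the grid used in the paper.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV
import xgboost as xgb

# Synthetic stand-in data; in the paper the battery features and targets are used
X, y = make_regression(n_samples=500, n_features=4, noise=0.1, random_state=0)

# Illustrative search space over typical XGBoost hyperparameters
param_grid = {
    "n_estimators": [200, 500],
    "max_depth": [4, 6],
    "learning_rate": [0.05, 0.1],
    "min_child_weight": [1, 5],
}

search = GridSearchCV(
    estimator=xgb.XGBRegressor(objective="reg:squarederror"),
    param_grid=param_grid,
    scoring="neg_root_mean_squared_error",  # evaluates every combination by RMSE
    cv=3,
    n_jobs=-1,
)
search.fit(X, y)
print("Best parameters:", search.best_params_)
print("Best CV RMSE:", -search.best_score_)
```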

2.7. Assessment Criteria for the Proposed Comparative Analysis

The performance criteria used in the comparative analysis of this paper include three widely accepted metrics: MAE, MAPE, and RMSE. These metrics are numerical indicators of the precision with which the model predicts the RUL of Li-ion batteries. The MAPE measures the average absolute percentage error between the actual and predicted values. It is calculated by dividing the absolute difference between the actual and predicted values by the actual value and then averaging these ratios over the entire dataset. The MAPE is expressed as a percentage and ranges from 0 to infinity, with 0 indicating perfect prediction and larger values indicating poorer prediction accuracy. The RMSE measures the average squared difference between the actual and predicted values; it is calculated by taking the square root of the average of the squared differences between the actual and predicted values over the entire dataset. The MAE is another metric that quantifies prediction accuracy by measuring the average absolute difference between the actual and predicted values. Lower MAE values indicate better prediction accuracy, while larger values suggest poorer accuracy. Similarly, the RMSE and MAE both range from 0 to infinity, with 0 indicating perfect prediction and larger values indicating poorer prediction accuracy [41]. The mathematical expressions for MAE, RMSE, and MAPE follow in Equations (19)–(21):
\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right| \qquad (19)
\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^{2}} \qquad (20)
\mathrm{MAPE} = \frac{100\%}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right| \qquad (21)
where y_i and ŷ_i are the actual and predicted values of the remaining useful life of the battery, respectively, and n is the number of samples.
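These three metrics can be computed directly from Equations (19)–(21), for example as in the short sketch below; the example numbers are made up and do not correspond to the paper's results.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error, Eq. (19)."""
    return np.mean(np.abs(y_true - y_pred))

def rmse(y_true, y_pred):
    """Root-mean-squared error, Eq. (20)."""
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

def mape(y_true, y_pred):
    """Mean absolute percentage error in percent, Eq. (21)."""
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))

# Illustrative check on made-up cycle-life values
y_true = np.array([800.0, 1200.0, 1500.0])
y_pred = np.array([760.0, 1250.0, 1430.0])
print(mae(y_true, y_pred), rmse(y_true, y_pred), mape(y_true, y_pred))
```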

3. Case Study and Numerical Results

In this section, the case studies implemented to evaluate the performance of the developed ML models are presented. To this end, eight distinct ML models, namely XGBoost, XGBoost-HT, LightGBM, LightGBM-HT, RF, MLP, LSTM, and Attention-LSTM, are employed for predicting the RUL of Li-ion batteries. Subsequently, we describe the battery datasets, outline the simulation configuration, and present the scenarios considered in the comparative analyses. By exploring these diverse scenarios, we aim to investigate the performance and capabilities of the selected ML models in predicting the RUL of Li-ion batteries for all life cycles, utilizing the newly introduced MFMT feature mapping. The ML models for RUL prediction of Li-ion batteries are implemented in a Python environment (V3.9) on a personal computer with GPU support enabled.

3.1. Data Library

Dataset 1, used in the first two case studies, is sourced from [42] and was created by running a battery degradation simulation in the MATLAB Simulink environment [43]. This dataset consists of 945 different battery aging tests with different parameters. The battery model offers flexibility in adjusting parameters such as SOC, DOD, temperature, and current rate (C-rate). The battery-type configurations are adaptable based on the manufacturer's datasheet, enabling the incorporation of new battery types. Ambient temperature, battery aging, and dynamic internal resistance, which are closely linked to battery degradation behavior, are simulated in this model. The model facilitates the simulation of a wide range of battery types, operational profiles, and conditions and exhibits linear battery capacity degradation; the real-world battery dataset, in contrast, presents the challenge of nonlinear capacity fade. While a single-output model excels at predicting RUL in a linear context, addressing the nonlinearity in capacity fade requires a model capable of tracking these complex patterns. This paper proposes the adoption of XGBoost-HT with an MFMT approach as a promising solution to this problem. We validate the efficacy of this model by applying it to actual battery data, dataset 2, aiming to enhance the accuracy and reliability of predictions in practical scenarios. The battery capacity fade for dataset 1 and dataset 2 is presented in Figure 3a and Figure 3b, respectively.
Dataset 2 consists of 124 commercial lithium iron phosphate/graphite cells (A123 Systems, model APR18650M1A, 1.1 Ah nominal capacity) subjected to various fast-charging conditions until reaching the end of life (EOL), defined as 80% of the initial capacity, within a controlled environment at 30 °C. This dataset is used in the third case study. Various fast-charging policies are applied by varying the C-rate from 3.6 C to 6 C across three separate stages: an initial stage with a C-rate of C1, an intermediate stage with a C-rate of C2, and a final stage, followed by a 4 C discharge until reaching the cut-off voltage (the voltage window is 3.6 V to 2.0 V). As introduced by Severson et al. [25], this dataset is divided into three segments, namely, training, primary test, and secondary test, which encompass 41, 43, and 40 Li-ion batteries, respectively. In Case 3, the training dataset is split into training and validation sets, while the primary test set is used to evaluate the proposed ML model. This dataset includes 72 distinct charging policies and cycle lives ranging from 150 to 2300 cycles. In addition, the rest time considered while charging the batteries varies between 1 and 300 s. At each cycle, temperature, current, voltage, and charge/discharge capacity are recorded, providing insights into battery degradation characteristics. The features extracted from the first 100 cycles of capacity degradation include ΔQ_v, ΔQ_m, the slope of the linear fit to the capacity fade curve, the intercept of the linear fit to the capacity fade curve, the discharge capacity at the second cycle, the average charge time over the first five cycles, the minimum internal resistance, the internal resistance difference between cycles 2 and 100, C1, C2, SOC, rest time 1, and rest time 2. The target values considered for training and testing are the same as those used in the first two cases.
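To illustrate how a few of these early-cycle features could be computed, the sketch below derives the linear-fit slope and intercept of the capacity fade curve and some related quantities for one cell. Argument and key names are illustrative assumptions, and the full feature definitions follow Severson et al. [25].

```python
import numpy as np

def early_cycle_features(capacity, charge_time, internal_resistance):
    """Sketch of a subset of the early-cycle features described above,
    computed from the first 100 cycles of one cell (arrays indexed by cycle)."""
    cycles = np.arange(2, 101)                       # cycle indices 2..100
    cap = capacity[1:100]                            # discharge capacity at those cycles
    slope, intercept = np.polyfit(cycles, cap, deg=1)
    return {
        "slope_capacity_fit": float(slope),          # slope of linear fit to capacity fade
        "intercept_capacity_fit": float(intercept),  # intercept of that linear fit
        "discharge_capacity_cycle2": float(capacity[1]),
        "avg_charge_time_first5": float(np.mean(charge_time[:5])),
        "min_internal_resistance": float(np.min(internal_resistance[1:100])),
        "ir_diff_cycle100_vs_2": float(internal_resistance[99] - internal_resistance[1]),
    }

# Synthetic one-cell example (values are made up; only the shapes matter here)
rng = np.random.default_rng(0)
capacity = 1.1 - 0.0001 * np.arange(120) + rng.normal(0, 1e-4, 120)
charge_time = rng.uniform(9, 11, 120)
internal_resistance = 0.016 + rng.normal(0, 1e-4, 120)
print(early_cycle_features(capacity, charge_time, internal_resistance))
```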
The distribution of battery cycle life in both datasets is illustrated in Figure 4, and the distributions of the four features (temperature, SOC, C-rate, and DOD) for dataset 1 are illustrated in Figure 5.

3.2. Comparative Analysis

In this section, we provide a detailed discussion of the numerical results obtained from three distinct cases aimed at predicting the capacity fade and RUL of Li-ion batteries for dataset 1 and dataset 2, utilizing multiple ML algorithms. In Cases 1 and 2, the models are trained with four essential features, while Case 3 incorporates 13 features serving as input data for the ML models. The primary target variable, or label, for Case 1 is the remaining life cycle number. The RUL predictions generated by the ML models for Case 1 are presented in Section 3.2.1. Unlike Case 1, in Cases 2 and 3, the predictive framework is more intricate, encompassing multiple target variables or labels provided in Section 3.2.2 and Section 3.2.3. These target values include the life cycle, mean, standard deviation, minimum values, maximum values, and the first, median, and third quantiles of the capacity data, reflecting various statistical aspects and characteristics associated with the Li-ion batteries. For a more detailed understanding of the interplay between features and target values in Case 2 and Case 3, refer to Figure 2. An overview of the analysis conducted in each case is given below:
  • Case 1: Training and testing the multi-feature single-target ML models for dataset 1;
  • Case 2: Training and testing the MFMT ML models for dataset 1;
  • Case 3: Validating the proposed methodology with selected ML models for dataset 2 utilizing the proposed MFMT feature mapping.

3.2.1. Case 1

In this case, four features, specifically SOC, DOD, temperature, and C-rate, are fed to the input layer, and the life cycle of the batteries is chosen as the single label of the output layer. Figure 6 shows the RUL prediction of Li-ion batteries for the test dataset using the developed ML models. The figure shows that the XGBoost, LightGBM-HT, and RF models are quite accurate in predicting the battery cycles in comparison to the other models. The LSTM and MLP models perform poorly, as more outliers are associated with their predictions. The LightGBM model performs acceptably, being neither as accurate as LightGBM-HT nor as poor as MLP. Therefore, XGBoost, LightGBM-HT, and RF are suitable choices among the ML models for predicting the capacity fade of Li-ion batteries in the first case. For the capacity fade and RUL prediction of Li-ion batteries in Case 1, Figure 7 illustrates the performance of the XGBoost-HT and XGBoost models, both of which outperform the other ML models. In this case, the RUL prediction with XGBoost-HT closely aligns with the actual values, indicating a degradation rate of −0.000024 Ah per cycle. The RMSE and MAE values obtained for each ML model, recorded at the optimal epoch, are presented in Table 1, where lower RMSE and MAE values indicate better prediction results. For the XGBoost-HT model, the MAE and RMSE values are 0.033 and 0.057, respectively. The other developed ML models can be ranked according to their RMSE values in the following order: XGBoost, RF, LightGBM-HT, Attention-LSTM, LightGBM, LSTM, and MLP. Hence, the XGBoost and RF models also perform well in the capacity fade and RUL prediction of Li-ion batteries in Case 1.

3.2.2. Case 2

In this case, the input layer incorporates four key features: SOC, DOD, temperature, and C-rate. Additionally, the output layer receives eight parameters derived from the capacity degradation data as multiple labels. Figure 8 illustrates the capacity fade and RUL prediction for Li-ion batteries using the developed ML models. When multiple labels are fed into the models for training, the XGBoost-HT model predicts the capacity data accurately. Although the XGBoost-HT, XGBoost, and RF models perform well, the MLP model is the worst performer when the labels have multiple dimensions. It is important to highlight that the LightGBM model cannot be used in this particular case; this limitation arises from its inability to handle multiple labels in the output layer. For the RUL prediction of a sample test Li-ion battery, Figure 9 shows that the XGBoost-HT and RF models perform well, with a predicted degradation rate of −0.000023 Ah per cycle. Additionally, the MAE and RMSE values for this case are provided in Table 1. These values for the XGBoost-HT model are 0.014 and 0.040, respectively, which are lower than those of the other developed ML models. The performance of RF is also notable, with MAE and RMSE values of 0.014 and 0.052, respectively. The outcomes from Cases 1 and 2 indicate that the XGBoost-HT and RF models have the potential to be effective ML tools for predicting the RUL and capacity fade of Li-ion batteries.
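One plausible way to let single-output learners such as XGBoost produce the eight MFMT targets is to wrap them in scikit-learn's MultiOutputRegressor, which fits one regressor per target; the paper does not state which mechanism it used, so the sketch below is only an assumed setup.

```python
import xgboost as xgb
from sklearn.multioutput import MultiOutputRegressor
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the 4-feature / 8-target MFMT mapping of Case 2
X, Y = make_regression(n_samples=945, n_features=4, n_targets=8,
                       noise=0.1, random_state=0)
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.2, random_state=0)

# One XGBoost regressor is fitted per target column
mfmt_model = MultiOutputRegressor(
    xgb.XGBRegressor(n_estimators=300, max_depth=6, learning_rate=0.05)
)
mfmt_model.fit(X_train, Y_train)
print("Prediction shape:", mfmt_model.predict(X_test).shape)   # (n_samples, 8)
```

In contrast, RandomForestRegressor accepts multi-dimensional targets natively, which is consistent with the observation above that RF handles the MFMT mapping while LightGBM does not.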

3.2.3. Case 3

Based on the outcomes of Case 2, four of the developed ML models exhibiting notable performance in terms of RMSE and MAE metrics have been selected for testing on a real dataset. The selected ML models for validation purposes are XGBoost, XGBoost-HT, RF, and Attention-LSTM. Figure 10 provides the capacity fade and RUL predictions obtained from these models compared with the actual values. According to the results, it is evident that both XGBoost and XGBoost-HT outperform Attention-LSTM and RF in terms of both capacity degradation prediction and RUL prediction on this dataset. Table 2 provides the resulting RMSE and MAPE values of the third case in comparison with the available benchmarks. Accordingly, the proposed XGBoost-HT model outperforms not only the ML models developed in this paper but also the other benchmarks, with RMSE and MAPE values of 69 cycles and 6.5%, respectively, for predicting the RUL of Li-ion batteries. This observation underscores the robustness and efficacy of the XGBoost-HT method in capturing the intricate dynamics of battery performance under varying conditions.

4. Conclusions

This paper provided a comprehensive comparative study of eight ML models for predicting the capacity fade and RUL of Li-ion batteries. The developed models include RF, MLP, XGBoost, XGBoost-HT, LSTM, Attention-LSTM, LightGBM, and LightGBM-HT. The significance of this study lies in exploring three case studies in which the MFMT feature mapping was proposed to enhance the performance of ML models in accurately predicting the RUL of Li-ion batteries on two different datasets, one synthetic and one real. To this end, Case 1 uses four input features and a single target, whereas Case 2 employs four input features with multiple target values based on the proposed MFMT feature mapping. These two case studies use the synthetic dataset to train and test the developed ML models. The results revealed that XGBoost-HT has lower MAE and RMSE values than the rest of the ML models in both of the first two cases. The obtained MAE and RMSE values for XGBoost-HT are 0.033 and 0.057 for the first case and 0.014 and 0.040 for the second case, respectively. Based on these results, a third case was investigated to validate, on a real dataset, the performance of the developed ML models that exhibited the best performance in the first two case studies. Accordingly, four ML models were selected and tested on the real dataset to investigate the effectiveness of the proposed methodology and the developed ML models. The results of Case 3 also confirmed the lower RMSE and MAPE values of XGBoost-HT: the obtained RMSE and MAPE values of the third case for XGBoost-HT in terms of cycle life were 69 cycles and 6.5%, respectively. These values are lower than the results reported in the literature for similar benchmarks in the RUL prediction of Li-ion batteries. Therefore, the results highlight the exceptional performance of the XGBoost-HT model in the proposed three case studies. This superior performance is attributed to the ensemble nature of the XGBoost-HT model, which incorporates regularization techniques and has an optimized implementation. In contrast, the LightGBM and LightGBM-HT models demonstrate limitations in handling the proposed MFMT feature mapping, indicating their unsuitability for complex feature relationships. These findings underscore the XGBoost-HT model's efficacy in predicting the RUL of Li-ion batteries, positioning it as a reliable choice for battery management systems due to its high accuracy and robustness. The MFMT feature mapping introduced in Case 2 further adds to the novelty of this research, contributing to the advancement of battery prognostics.

Author Contributions

Conceptualization, V.S., A.M.V., N.B., J.C.V. and J.M.G.; methodology, V.S., A.M.V. and N.B.; software, V.S., A.M.V. and N.B.; validation, V.S., A.M.V. and N.B.; formal analysis, V.S., A.M.V. and N.B.; investigation, V.S., A.M.V. and N.B.; resources, V.S., N.B., J.C.V. and J.M.G.; data curation, V.S., A.M.V., N.B., J.C.V. and J.M.G.; writing—original draft preparation, V.S., A.M.V. and N.B.; writing—review and editing, V.S., A.M.V., N.B., J.C.V. and J.M.G.; visualization, V.S., A.M.V. and N.B.; supervision, J.C.V. and J.M.G.; project administration, J.C.V. and J.M.G.; funding acquisition, J.C.V. and J.M.G. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by VILLUM FONDEN under the VILLUM Investigator Grant (no. 25920): Center for Research on Microgrids (CROM).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Study data available upon request from the corresponding author.

Conflicts of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Abbreviations

The following abbreviations are used in this paper:
Li-ion: Lithium ion
SOH: State of health
SOC: State of charge
DOD: Depth of discharge
RUL: Remaining useful life
MFMT: Multi-feature multi-target
C-rate: Current rate
MC: Monte Carlo
ML: Machine learning
DL: Deep learning
NN: Neural network
DNN: Deep neural network
CNN: Convolutional neural network
CNN-LSTM: Convolutional neural network–long short-term memory
TCNN: Temporal convolutional neural network
RNN: Recurrent neural network
LSTM: Long short-term memory
XGBoost: Extreme gradient boosting
LightGBM: Light gradient-boosting machine
XGBoost-HT: Extreme gradient boosting with hyperparameter tuning
LightGBM-HT: Light gradient-boosting machine with hyperparameter tuning
TCN: Temporal convolutional network
MLP: Multi-layer perceptron
RF: Random Forest
SVR: Support Vector Regression
GRU: Gated recurrent unit
RMSE: Root-mean-squared error
MAE: Mean absolute error
MAPE: Mean absolute percentage error
PSO: Particle swarm optimization

References

  1. Jagadale, A.; Zhou, X.; Xiong, R.; Dubal, D.P.; Xu, J.; Yang, S. Lithium ion capacitors (LICs): Development of the materials. Energy Storage Mater. 2019, 19, 314–329. [Google Scholar] [CrossRef]
  2. Rahimi-Eichi, H.; Ojha, U.; Baronti, F.; Chow, M.Y. Battery management system: An overview of its application in the smart grid and electric vehicles. IEEE Ind. Electron. Mag. 2013, 7, 4–16. [Google Scholar] [CrossRef]
  3. Lipu, M.H.; Hannan, M.; Hussain, A.; Hoque, M.; Ker, P.J.; Saad, M.H.M.; Ayob, A. A review of state of health and remaining useful life estimation methods for lithium-ion battery in electric vehicles: Challenges and recommendations. J. Clean. Prod. 2018, 205, 115–133. [Google Scholar] [CrossRef]
  4. Safavi, V.; Bazmohammadi, N.; Vasquez, J.C.; Guerrero, J.M. Battery State-of-Health Estimation: A Step towards Battery Digital Twins. Electronics 2024, 13, 587. [Google Scholar] [CrossRef]
  5. Ahmadian, A.; Sedghi, M.; Elkamel, A.; Fowler, M.; Golkar, M.A. Plug-in electric vehicle batteries degradation modeling for smart grid studies: Review, assessment and conceptual framework. Renew. Sustain. Energy Rev. 2018, 81, 2609–2624. [Google Scholar] [CrossRef]
  6. Li, Y.; Liu, K.; Foley, A.M.; Zülke, A.; Berecibar, M.; Nanini-Maury, E.; Van Mierlo, J.; Hoster, H.E. Data-driven health estimation and lifetime prediction of lithium-ion batteries: A review. Renew. Sustain. Energy Rev. 2019, 113, 109254. [Google Scholar] [CrossRef]
  7. Xu, B.; Oudalov, A.; Ulbig, A.; Andersson, G.; Kirschen, D.S. Modeling of Lithium-Ion Battery Degradation for Cell Life Assessment. IEEE Trans. Smart Grid 2018, 9, 1131–1140. [Google Scholar] [CrossRef]
  8. El-Dalahmeh, M.; Al-Greer, M.; El-Dalahmeh, M.; Bashir, I. Physics-based model informed smooth particle filter for remaining useful life prediction of lithium-ion battery. Measurement 2023, 214, 112838. [Google Scholar] [CrossRef]
  9. Downey, A.; Lui, Y.H.; Hu, C.; Laflamme, S.; Hu, S. Physics-based prognostics of lithium-ion battery using non-linear least squares with dynamic bounds. Reliab. Eng. Syst. Saf. 2019, 182, 1–12. [Google Scholar] [CrossRef]
  10. Wang, Y.; Tian, J.; Sun, Z.; Wang, L.; Xu, R.; Li, M.; Chen, Z. A comprehensive review of battery modeling and state estimation approaches for advanced battery management systems. Renew. Sustain. Energy Rev. 2020, 131, 110015. [Google Scholar] [CrossRef]
  11. Ma, L.; Tian, J.; Zhang, T.; Guo, Q.; Hu, C. Accurate and efficient remaining useful life prediction of batteries enabled by physics-informed machine learning. J. Energy Chem. 2024, 91, 512–521. [Google Scholar] [CrossRef]
  12. Nascimento, R.G.; Corbetta, M.; Kulkarni, C.S.; Viana, F.A. Hybrid physics-informed neural networks for lithium-ion battery modeling and prognosis. J. Power Sources 2021, 513, 230526. [Google Scholar] [CrossRef]
  13. Xie, Q.; Liu, R.; Huang, J.; Su, J. Residual life prediction of lithium-ion batteries based on data preprocessing and a priori knowledge-assisted CNN-LSTM. Energy 2023, 281, 128232. [Google Scholar] [CrossRef]
  14. Tang, A.; Jiang, Y.; Nie, Y.; Yu, Q.; Shen, W.; Pecht, M.G. Health and lifespan prediction considering degradation patterns of lithium-ion batteries based on transferable attention neural network. Energy 2023, 279, 128137. [Google Scholar] [CrossRef]
  15. Tang, T.; Yuan, H. The capacity prediction of Li-ion batteries based on a new feature extraction technique and an improved extreme learning machine algorithm. J. Power Sources 2021, 514, 230572. [Google Scholar] [CrossRef]
  16. Li, L.; Li, Y.; Mao, R.; Li, L.; Hua, W.; Zhang, J. Remaining useful life prediction for lithium-ion batteries with a hybrid model based on TCN-GRU-DNN and dual attention mechanism. IEEE Trans. Transp. Electrif. 2023, 9, 4726–4740. [Google Scholar] [CrossRef]
  17. Rouhi Ardeshiri, R.; Ma, C. Multivariate gated recurrent unit for battery remaining useful life prediction: A deep learning approach. Int. J. Energy Res. 2021, 45, 16633–16648. [Google Scholar] [CrossRef]
  18. Wei, M.; Ye, M.; Wang, Q.; Twajamahoro, J.P. Remaining useful life prediction of lithium-ion batteries based on stacked autoencoder and gaussian mixture regression. J. Energy Storage 2022, 47, 103558. [Google Scholar] [CrossRef]
  19. Hafizhahullah, H.; Yuliani, A.R.; Pardede, H.; Ramdan, A.; Zilvan, V.; Krisnandi, D.; Kadar, J. A Hybrid CNN-LSTM for Battery Remaining Useful Life Prediction with Charging Profiles Data. In Proceedings of the 2022 International Conference on Computer, Control, Informatics and Its Applications, Virtual Event, Indonesia, 22–23 November 2022; pp. 106–110. [Google Scholar]
  20. Zhang, Z.; Zhang, W.; Yang, K.; Zhang, S. Remaining useful life prediction of lithium-ion batteries based on attention mechanism and bidirectional long short-term memory network. Measurement 2022, 204, 112093. [Google Scholar] [CrossRef]
  21. Han, Y.; Li, C.; Zheng, L.; Lei, G.; Li, L. Remaining useful life prediction of lithium-ion batteries by using a denoising transformer-based neural network. Energies 2023, 16, 6328. [Google Scholar] [CrossRef]
  22. Pugalenthi, K.; Park, H.; Hussain, S.; Raghavan, N. Remaining useful life prediction of lithium-ion batteries using neural networks with adaptive bayesian learning. Sensors 2022, 22, 3803. [Google Scholar] [CrossRef]
  23. Park, K.; Choi, Y.; Choi, W.J.; Ryu, H.Y.; Kim, H. LSTM-based battery remaining useful life prediction with multi-channel charging profiles. IEEE Access 2020, 8, 20786–20798. [Google Scholar] [CrossRef]
  24. Yang, N.; Hofmann, H.; Sun, J.; Song, Z. Remaining Useful Life Prediction of Lithium-ion Batteries with Limited Degradation History Using Random Forest. IEEE Trans. Transp. Electrif. 2023. [Google Scholar] [CrossRef]
  25. Severson, K.A.; Attia, P.M.; Jin, N.; Perkins, N.; Jiang, B.; Yang, Z.; Chen, M.H.; Aykol, M.; Herring, P.K.; Fraggedakis, D.; et al. Data-driven prediction of battery cycle life before capacity degradation. Nat. Energy 2019, 4, 383–391. [Google Scholar] [CrossRef]
  26. Jafari, S.; Byun, Y.C. Optimizing Battery RUL Prediction of Lithium-ion Batteries based on Harris Hawk Optimization Approach using Random Forest and LightGBM. IEEE Access 2023, 11, 87034–87046. [Google Scholar] [CrossRef]
  27. Ha, V.T. Experimental Study on Remaining Useful Life Prediction of Lithium-Ion Batteries Based on Three Regressions Models for Electric Vehicle Applications. Appl. Sci. 2023, 13, 7660. [Google Scholar] [CrossRef]
  28. Ali, Z.A.; Abduljabbar, Z.H.; Taher, H.A.; Sallow, A.B.; Almufti, S.M. Exploring the Power of eXtreme Gradient Boosting Algorithm in Machine Learning: A Review. Acad. J. Nawroz Univ. 2023, 12, 320–334. [Google Scholar]
  29. Liu, X.; Liu, X.; Fang, L.; Wu, M.; Wu, J. Dual particle swarm optimization based data-driven state of health estimation method for lithium-ion battery. J. Energy Storage 2022, 56, 105908. [Google Scholar] [CrossRef]
  30. Que, Z.; Xu, Z. A data-driven health prognostics approach for steam turbines based on xgboost and dtw. IEEE Access 2019, 7, 93131–93138. [Google Scholar] [CrossRef]
  31. Yang, S. Prediction Method of Remaining Service Life of Li-ion Batteries Based on XGBoost and LightGBM. In Proceedings of the 2022 2nd International Conference on Algorithms, High Performance Computing and Artificial Intelligence (AHPCAI), Guangzhou, China, 21–23 October 2022; pp. 324–327. [Google Scholar]
  32. Jafari, S.; Byun, Y.C. XGBoost-Based Remaining Useful Life Estimation Model with Extended Kalman Particle Filter for Lithium-Ion Batteries. Sensors 2022, 22, 9522. [Google Scholar] [CrossRef]
  33. Bentéjac, C.; Csörgo, A.; Martínez-Muñoz, G. A comparative analysis of gradient boosting algorithms. Artif. Intell. Rev. 2021, 54, 1937–1967. [Google Scholar] [CrossRef]
  34. Jiao, Z.; Wang, H.; Xing, J.; Yang, Q.; Yang, M.; Zhou, Y.; Zhao, J. A LightGBM Based Framework for Lithium-Ion Battery Remaining Useful Life Prediction Under Driving Conditions. IEEE Trans. Ind. Inform. 2023, 19, 11353–11362. [Google Scholar] [CrossRef]
  35. Smagulova, K.; James, A.P. A survey on LSTM memristive neural network architectures and applications. Eur. Phys. J. Spec. Top. 2019, 228, 2313–2324. [Google Scholar] [CrossRef]
  36. Li, Y.; Zhu, Z.; Kong, D.; Han, H.; Zhao, Y. EA-LSTM: Evolutionary attention-based LSTM for time series prediction. Knowl. Based Syst. 2019, 181, 104785. [Google Scholar] [CrossRef]
  37. Biau, G.; Scornet, E. A random forest guided tour. Test 2016, 25, 197–227. [Google Scholar] [CrossRef]
  38. Popescu, M.C.; Balas, V.E.; Perescu-Popescu, L.; Mastorakis, N. Multilayer perceptron and neural networks. WSEAS Trans. Circuits Syst. 2009, 8, 579–588. [Google Scholar]
  39. Yang, L.; Shami, A. On hyperparameter optimization of machine learning algorithms: Theory and practice. Neurocomputing 2020, 415, 295–316. [Google Scholar] [CrossRef]
  40. Belete, D.M.; Huchaiah, M.D. Grid search in hyperparameter optimization of machine learning models for prediction of HIV/AIDS test results. Int. J. Comput. Appl. 2022, 44, 875–886. [Google Scholar] [CrossRef]
  41. Chai, T.; Draxler, R.R. Root mean square error (RMSE) or mean absolute error (MAE). Geosci. Model Dev. Discuss. 2014, 7, 1525–1534. [Google Scholar]
  42. Zhao, C.; Li, X. Microgrid Optimal Energy Scheduling Considering Neural Network based Battery Degradation. IEEE Trans. Power Syst. 2023, 39, 1594–1606. [Google Scholar] [CrossRef]
  43. Omar, N.; Monem, M.A.; Firouz, Y.; Salminen, J.; Smekens, J.; Hegazy, O.; Gaulous, H.; Mulder, G.; Van den Bossche, P.; Coosemans, T.; et al. Lithium iron phosphate based battery—Assessment of the aging parameters and development of cycle life model. Appl. Energy 2014, 113, 1575–1585. [Google Scholar] [CrossRef]
  44. Ma, G.; Wang, Z.; Liu, W.; Fang, J.; Zhang, Y.; Ding, H.; Yuan, Y. A two-stage integrated method for early prediction of remaining useful life of lithium-ion batteries. Knowl. Based Syst. 2023, 259, 110012. [Google Scholar] [CrossRef]
  45. Wei, Z.; Liu, C.; Sun, X.; Li, Y.; Lu, H. Two-phase early prediction method for remaining useful life of lithium-ion batteries based on a neural network and Gaussian process regression. Front. Energy 2023, 1–16. [Google Scholar] [CrossRef]
  46. Alipour, M.; Tavallaey, S.S.; Andersson, A.M.; Brandell, D. Improved Battery Cycle Life Prediction Using a Hybrid Data-Driven Model Incorporating Linear Support Vector Regression and Gaussian. ChemPhysChem 2022, 23, e202100829. [Google Scholar] [CrossRef]
Figure 1. Structure of the proposed multi-feature single-target ML model.
Figure 2. Structure of the proposed multi-feature multi-target ML model.
Figure 3. Battery capacity fade for dataset 1 (a) and dataset 2 (b).
Figure 4. Distribution of battery cycle life for dataset 1 (a) and dataset 2 (b).
Figure 5. Distribution of four features (temperature (a), C-rate (b), SOC (c), and DOD (d)) and battery life cycle in dataset 1.
Figure 6. Comparison of developed ML models for predicting the RUL of Li-ion batteries for Case 1: XGBoost (a), XGBoost-HT (b), RF (c), MLP (d), LightGBM (e), LightGBM-HT (f), LSTM (g), Attention-LSTM (h).
Figure 7. Comparison of developed ML models for predicting the capacity degradation of Li-ion batteries for a sample test battery in Case 1: XGBoost (a), XGBoost-HT (b), RF (c), MLP (d), LightGBM (e), LightGBM-HT (f), LSTM (g), Attention-LSTM (h).
Figure 8. Comparison of developed ML models for predicting the RUL of Li-ion batteries for Case 2: XGBoost (a), XGBoost-HT (b), RF (c), MLP (d), LSTM (e), and Attention-LSTM (f).
Figure 9. Comparison of developed ML models for predicting the capacity degradation of Li-ion batteries for a sample test battery in Case 2: XGBoost (a), XGBoost-HT (b), RF (c), MLP (d), LSTM (e), and Attention-LSTM (f).
Figure 10. Comparison of developed ML models for predicting the RUL of Li-ion batteries for validation of the proposed method in Case 2 with dataset 2: XGBoost (a), XGBoost-HT (b), RF (c), MLP (d), LightGBM (e), LightGBM-HT (f), LSTM (g), Attention-LSTM (h).
Table 1. The MAE and RMSE values for RUL prediction of batteries in dataset 1.

| Model          | Case 1 MAE | Case 1 RMSE | Case 2 MAE | Case 2 RMSE |
|----------------|------------|-------------|------------|-------------|
| XGBoost        | 0.029      | 0.065       | 0.013      | 0.055       |
| XGBoost-HT     | 0.033      | 0.057       | 0.014      | 0.040       |
| MLP            | 0.086      | 0.120       | 0.058      | 0.107       |
| Random Forest  | 0.032      | 0.065       | 0.014      | 0.052       |
| LSTM           | 0.058      | 0.093       | 0.050      | 0.103       |
| Attention-LSTM | 0.041      | 0.074       | 0.047      | 0.099       |
| LightGBM       | 0.040      | 0.075       | -          | -           |
| LightGBM-HT    | 0.039      | 0.070       | -          | -           |
Table 2. RMSE and MAPE values for RUL prediction of batteries in dataset 2.

| Benchmark            | Model          | RMSE, training (cycles) | RMSE, test (cycles) | MAPE, training (%) | MAPE, test (%) |
|----------------------|----------------|-------------------------|---------------------|--------------------|----------------|
| Severson et al. [25] | Variance       | 103                     | 138                 | 14.1               | 14.7           |
| Severson et al. [25] | Discharge      | 76                      | 91                  | 9.8                | 13             |
| Severson et al. [25] | Full           | 51                      | 118                 | 5.6                | 14.1           |
| Ma et al. [44]       | CNN            | 51                      | 90                  | 5.1                | 10             |
| Wei et al. [45]      | GPR            | -                       | 180                 | 0.09               | 8              |
| Alipour et al. [46]  | LSVR           | 13.8                    | 177                 | 1.1                | 8.3            |
| This paper           | XGBoost        | 0.9                     | 73.4                | 0.1                | 7.2            |
| This paper           | XGBoost-HT     | 1.6                     | 69                  | 0.2                | 6.5            |
| This paper           | Random Forest  | 36.2                    | 88                  | 3.7                | 9.2            |
| This paper           | Attention-LSTM | 223.6                   | 237.8               | 31.5               | 29.2           |
