Article

Short-Term Energy Consumption Prediction of Large Public Buildings Combined with Data Feature Engineering and Bilstm-Attention

1
School of Information and Control Engineering, Xi’an University of Architecture and Technology, Xi’an 710055, China
2
School of Building Services Science and Engineering, Xi’an University of Architecture and Technology, Xi’an 710055, China
*
Author to whom correspondence should be addressed.
Appl. Sci. 2024, 14(5), 2137; https://doi.org/10.3390/app14052137
Submission received: 25 January 2024 / Revised: 21 February 2024 / Accepted: 1 March 2024 / Published: 4 March 2024
(This article belongs to the Section Computing and Artificial Intelligence)

Abstract

Accurate building energy consumption prediction is a crucial condition for the sustainable development of building energy management systems. However, the highly nonlinear nature of data and complex influencing factors in the energy consumption of large public buildings often pose challenges in improving prediction accuracy. In this study, we propose a combined prediction model that combines signal decomposition, feature screening, and deep learning. First, we employ the Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN) to decompose energy consumption data. Next, we propose the Maximum Mutual Information Coefficient (MIC)-Fast Correlation Based Filter (FCBF) combined feature screening method for feature selection on the decomposed components. Finally, the selected input features and corresponding components are fed into the Bi-directional Long Short-Term Memory Attention Mechanism (BiLSTMAM) model for prediction, and the aggregated results yield the energy consumption forecast. The proposed approach is validated using energy consumption data from a large public building in Shaanxi Province, China. Compared with the other five comparison methods, the RMSE reduction of the CEEMDAN-MIC-FCBF-BiLSTMAM model proposed in this study ranged from 57.23% to 82.49%. Experimental results demonstrate that the combination of CEEMDAN, MIC-FCBF, and BiLSTMAM modeling markedly improves the accuracy of energy consumption predictions in buildings, offering a potent method for optimizing energy management and promoting sustainability in large-scale facilities.

1. Introduction

The building sector is a major contributor to global energy consumption and carbon dioxide emissions, surpassing both industry and transportation, accounting for 46% of total energy consumption and a high share of 36% in carbon emissions [1,2]. Driven by population growth and an increasing demand for comfortable living environments, energy consumption in the building sector has shown an annual growth rate of 1.1% [3]. Consequently, achieving energy efficiency and carbon reduction in the building sector has become a crucial focus in global environmental research [4]. To realize energy efficiency and carbon reduction in the building sector, efficient building energy planning strategies and various intelligent building energy-saving technologies are indispensable [5]. In this context, the establishment of accurate building energy consumption prediction mechanisms is of paramount importance, serving as the foundation for the implementation of smart building energy-saving technologies such as safety monitoring, demand response, and optimization control [6,7].
Standard building energy consumption prediction models mainly fall into three types: white-box models, gray-box models, and black-box models [8]. White-box models utilize the principles of thermodynamics and mathematical equations to accurately describe the building energy consumption process. Although they offer high transparency and prediction accuracy, they face challenges due to high data requirements, complexity, and dependence on deep domain knowledge [9]. Gray-box models, positioned at the intersection of white-box and black-box models in theoretical frameworks, combine physical system principles with empirical knowledge. By incorporating parameter estimation and tuning, gray-box models exhibit data adaptability and robustness, demonstrating ideal transparency and flexibility in handling incomplete data or complex systems. However, they require domain expertise for parameter adjustments, demanding a certain level of understanding and prior knowledge, leading to relatively higher computational costs [10,11]. In the field of large-scale public building energy consumption prediction, black-box models have achieved success due to their simplicity in modeling, high prediction accuracy, and ability to address nonlinear problems [12]. However, feature screening and handling energy consumption data noise are critical factors affecting the prediction accuracy of black-box models. Therefore, many researchers have focused on studying energy consumption prediction methods based on black-box models [13,14].
Over the years, various scholars have conducted research on energy consumption prediction using black-box models, primarily involving statistical methods and machine learning methods [15,16]. Statistical methods [17], including multivariate linear regression [18] and autoregressive integrated moving average (ARIMA) [19], can fit linear relationships in data. However, the models of statistical methods are usually simpler, and it is difficult to capture complex time-series dynamics or complex relationships in high-dimensional spaces when dealing with nonlinear time-series data or high-dimensional data, thus generating large errors. As the internal complexity of buildings continues to increase, machine learning methods demonstrate advantages in dealing with nonlinear, non-stationary, and high-dimensional data [20]. Machine learning-based energy consumption prediction models encompass several types, such as Support Vector Machines (SVMs) [21,22], Artificial Neural Networks (ANNs) [23,24], and deep learning (DL) methods [25,26]. Among them, deep learning methods exhibit advantages in the field of energy consumption prediction compared to ordinary machine learning methods [27]. They can automatically learn complex nonlinear relationships, adapt to large-scale data, effectively capture spatiotemporal relationships, and enable end-to-end learning. Therefore, deep learning methods are widely applied in the predictive research of complex systems, including energy consumption in large-scale public buildings.
Predicting energy consumption in large public buildings is highly challenging due to variations in functional areas, diverse time-series characteristics, complex building structures, sensor data noise, and various influencing factors on energy consumption [28]. To address this, researchers propose methods utilizing feature engineering and signal decomposition techniques for energy consumption prediction in large public buildings, demonstrating promising predictive accuracy [29,30]. Zheng et al. [31] combined Empirical Mode Decomposition (EMD) and Long Short-Term Memory (LSTM) neural networks to construct a short-term load prediction model. They applied the EMD method to decompose the load into several Intrinsic Mode Functions (IMFs) and a residue, while the LSTM neural network was used to predict each IMF and residue, yielding the final prediction. Tao et al. [32] employed an EMD-Elman neural network prediction model for wind speed prediction and compared its performance with other widely used methods, showing a significant improvement in predictive accuracy. Addressing the issue of high feature complexity in the prediction process, Mao et al. [33] combined the entropy weight method with gray relational analysis for feature extraction to enhance the accuracy of cooling load prediction. Torres et al. [34] proposed the Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN) algorithm to enhance the performance of EEMD. Compared to EMD and EEMD, CEEMDAN improves denoising capabilities, preserves the physical meaning of the signal, enhances the physical interpretability of mode functions, and offers more flexible parameter settings and better convergence [35]. While these studies employ signal decomposition to enhance predictive accuracy, none of them individually perform regression feature selection on the decomposed components. 
Therefore, in this study, we utilize the CEEMDAN method for data signal decomposition and concurrently conduct feature importance screening on the different decomposition outcomes. This approach aids in more accurately predicting energy consumption data.
In summary, while some studies have employed ensemble learning methods involving signal decomposition and feature engineering, these approaches often lack feature screening or the use of different machine learning models for the different frequency signals obtained after decomposition. Therefore, this study proposes a model for predicting energy consumption in large public buildings by combining CEEMDAN signal decomposition, composite feature screening, and BiLSTM-attention prediction. This approach aims to enhance the accuracy of energy consumption prediction through signal decomposition and feature screening. The core idea involves decoupling the highly nonlinear energy consumption data of large public buildings using CEEMDAN signal decomposition, obtaining decomposed Intrinsic Mode Function (IMF) components and residuals. For each IMF component and residual, composite feature screening is applied separately, considering the varying importance of features for each component. Finally, these components are input into the BiLSTM-attention model for individual predictions, and the results for each component are summed to obtain the final prediction. The structure diagram of this paper is shown in Figure 1. The experimental results show that the model is effective in predicting the energy consumption of complex large public buildings.
Addressing the limitations of existing methods, this paper proposes a combined prediction model that integrates signal decomposition, feature screening, and deep learning. The key contributions of this study are outlined as follows:
  • To address the complex energy consumption data of large public buildings, the CEEMDAN method is employed to decompose the data, resulting in decomposed IMF components and residuals and effectively decoupling the energy consumption data of large public buildings.
  • A combined feature screening method, MIC-FCBF, is proposed, ranking the regression feature importance for each decomposed IMF component and residual. Simultaneously, the BiLSTMAM model is utilized to predict each component and residual.
  • The accuracy and robustness of the model are validated using data from a large public building in Xi’an, Shaanxi Province, China. The application of the model is discussed, and its superiority is verified through comparisons with multiple benchmark methods.
The remaining sections of this paper are organized as follows. Section 2 describes the methods employed in this study. Section 3 presents the data collection process for the investigated building energy consumption experiments. Section 4 analyzes the experimental results. Finally, Section 5 provides the conclusions and prospects of the research.

2. Methodology

Large public buildings with diverse functional zones like offices, commercial areas, and public spaces exhibit varying energy consumption patterns. This complexity in feature correlations across zones poses challenges in predicting energy consumption accurately. To address the low correlation between energy consumption and features, this study employs signal decomposition for energy consumption data and a combinational feature screening approach. This strategy enhances prediction accuracy by capturing nonlinear energy consumption features and mitigating the impact of low-correlation features on predictions.

2.1. MIC-FCBF Combined Feature Screening

In the context of the intricate relationship between energy consumption data and features in large public buildings, employing a single feature selection method is insufficient even after signal decomposition. Therefore, this study utilizes the Maximum Mutual Information Coefficient (MIC) method to capture intricate interconnections between features effectively. Initially, MIC ensures high-information features, reducing dataset dimensionality and enhancing computational efficiency. Combining MIC with the Fast Correlation-Based Filter (FCBF) method further refines feature selection, considering symmetric uncertainty for more accurate correlation and redundancy information. This integrated feature screening method comprehensively assesses feature importance, improving model generalization and reducing overfitting risks compared to single methods.

2.1.1. Maximum Mutual Information Coefficient

The Maximum Mutual Information Coefficient (MIC) is a correlation analysis method based on mutual information, which measures the dependency between two random variables [36]. MIC is computed by partitioning the scatter plot of the two variables into grids of different resolutions, calculating the mutual information for each grid, normalizing it, and selecting the maximum value. This method quantifies linear or nonlinear correlations, preserving information for strongly correlated variables and addressing nonlinear relationships among weakly correlated ones. The MIC calculation equations are as follows:
$$I(U;O) = \sum_{U,O} p(U,O)\,\log_2 \frac{p(U,O)}{p(U)\,p(O)} \tag{1}$$
$$\mathrm{MIC}(U;O) = \max_{|U||O| < B} \frac{I(U;O)}{\log_2 \min(|U|,|O|)} \tag{2}$$
In these equations, $I$ represents mutual information, $|U||O|$ denotes the number of grid cells partitioned in the U and O directions, and B is the grid constraint variable, where $B = n^{\theta}$ ($\theta$ = 0.55 or 0.6) and n is the number of samples.
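Equations (1) and (2) can be illustrated with a minimal sketch. It is not the full MIC algorithm: the reference method also optimizes where the grid lines are placed, whereas this sketch only tries equal-frequency grids, so it can underestimate the true MIC. The function names and the default θ = 0.6 are assumptions made for illustration.

```python
import math
from collections import Counter

def equal_frequency_bins(values, k):
    """Assign each value a bin index in 0..k-1; tied values share a bin."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    prev_value, prev_bin = None, 0
    for rank, i in enumerate(order):
        b = rank * k // len(values)
        if rank > 0 and values[i] == prev_value:
            b = prev_bin  # keep identical values together
        bins[i] = b
        prev_value, prev_bin = values[i], b
    return bins

def mutual_information(u, o):
    """I(U;O) in bits for two equally long label sequences, as in Equation (1)."""
    n = len(u)
    p_u, p_o, p_uo = Counter(u), Counter(o), Counter(zip(u, o))
    return sum((c / n) * math.log2(c * n / (p_u[a] * p_o[b]))
               for (a, b), c in p_uo.items())

def approx_mic(x, y, theta=0.6):
    """Largest normalized mutual information over grids with |U||O| < B = n**theta."""
    budget = len(x) ** theta
    best = 0.0
    for a in range(2, int(budget) + 1):
        for b in range(2, int(budget) + 1):
            if a * b >= budget:
                break
            mi = mutual_information(equal_frequency_bins(x, a),
                                    equal_frequency_bins(y, b))
            best = max(best, mi / math.log2(min(a, b)))
    return best

# Illustrative data only: a perfectly dependent pair and an unrelated pair.
x = list(range(100))
parity = [i % 2 for i in x]
print(approx_mic(x, x), approx_mic(x, parity))
```

For very short series the budget B can fall below 4, in which case no grid is admissible and the sketch returns 0.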

2.1.2. Fast Correlation-Based Filter Algorithm

The Fast Correlation-Based Filter (FCBF) is an algorithm designed for feature selection, aiming to rapidly and accurately filter out features from high-dimensional feature sets that contribute to classification tasks [37]. This algorithm utilizes the Symmetrical Uncertainty (SU) metric to quantify the correlation between features and performs feature selection by defining relevance and redundancy.
Let the initial feature set be represented by $X = [x_{ij}]_{N \times M}$, denoting N M-dimensional sample vectors and M N-dimensional feature vectors. To more accurately describe the relevant information of Symmetrical Uncertainty (SU), entropy is employed as a measure for random variables.
Assuming that two random variables $X = \{x_1, x_2, \ldots, x_i\},\ i = 1, 2, \ldots, m$ has m feature vectors and $Y = \{y_1, y_2, \ldots, y_j\},\ j = 1, 2, \ldots, n$ has n feature vectors, then the information entropy $H(X)$ of random variable X and the conditional entropy $H(X|Y)$ of random variable X under the condition of variable Y are defined as:
$$H(X) = -\sum_{i} P(x_i)\,\log_2 P(x_i) \tag{3}$$
$$H(X|Y) = -\sum_{j} P(y_j) \sum_{i} P(x_i|y_j)\,\log_2 P(x_i|y_j) \tag{4}$$
$P(x_i)$ and $P(y_j)$ are the probabilities of random variables X and Y taking the values $x_i$ and $y_j$, respectively; $P(x_i|y_j)$ is the probability that the random variable X is $x_i$ given that $Y = y_j$.
The information shared between random variables X and Y is measured by mutual information, which is expressed as follows:
$$I(X;Y) = H(X) - H(X|Y) \tag{5}$$
By combining Equations (3) and (5), SU can be expressed as follows:
$$L_{SU}(X;Y) = 2\,\frac{I(X;Y)}{H(X) + H(Y)} \tag{6}$$
As can be seen from the SU definition, the value range is [0, 1], and a larger value means a stronger correlation between the two variables.
To ensure that the retained features have the maximum correlation with the target while minimizing redundancy with other features, define the prediction target as variable C. Calculate the correlation $L_{SU}(v_i; C)$ between each feature $v_i$ in the original feature space and the target C. Set a threshold m. If $L_{SU}(v_i; C) < m$, feature $v_i$ is eliminated, and all features with $L_{SU}(v_i; C) \geq m$ are retained [38]. After the initial screening, the remaining set of features undergoes a secondary selection. In this step, the correlation between feature $v_i$ and each of the remaining features $v_j$ is computed. If $L_{SU}(v_i; v_j) > L_{SU}(v_j; C)$, the feature $v_j$ is removed as redundant. Following this selection process, the features are arranged in order from most to least relevant, resulting in a detailed feature ranking.
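The two-stage screening described above can be sketched as follows, assuming the features and the target have already been discretized into labels (the text does not state how continuous values are binned). The function names, the toy data, and the threshold value are illustrative assumptions.

```python
import math
from collections import Counter

def entropy(labels):
    """H(X) in bits, Equation (3)."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def conditional_entropy(x, y):
    """H(X|Y) in bits, Equation (4)."""
    n = len(x)
    total = 0.0
    for y_value, y_count in Counter(y).items():
        x_given_y = [xi for xi, yi in zip(x, y) if yi == y_value]
        total += (y_count / n) * entropy(x_given_y)
    return total

def symmetrical_uncertainty(x, y):
    """L_SU(X;Y), Equations (5) and (6)."""
    h_x, h_y = entropy(x), entropy(y)
    if h_x + h_y == 0:
        return 0.0
    return 2 * (h_x - conditional_entropy(x, y)) / (h_x + h_y)

def fcbf(features, target, threshold):
    """Return the selected feature names, most relevant first."""
    relevance = {name: symmetrical_uncertainty(col, target)
                 for name, col in features.items()}
    # Stage 1: keep features whose relevance reaches the threshold m.
    candidates = sorted((n for n in relevance if relevance[n] >= threshold),
                        key=lambda n: relevance[n], reverse=True)
    # Stage 2: drop v_j when an already kept v_i satisfies SU(v_i, v_j) > SU(v_j, C).
    selected = []
    for name in candidates:
        redundant = any(
            symmetrical_uncertainty(features[kept], features[name]) > relevance[name]
            for kept in selected)
        if not redundant:
            selected.append(name)
    return selected

# Made-up toy data: "a_copy" duplicates "a", "noise" is unrelated to the target.
target = [0, 0, 1, 1, 0, 0, 1, 1]
features = {
    "a": [0, 0, 1, 1, 0, 0, 1, 0],
    "a_copy": [0, 0, 1, 1, 0, 0, 1, 0],
    "noise": [0, 1, 0, 1, 0, 1, 0, 1],
}
print(fcbf(features, target, threshold=0.1))
```

The strict inequality in stage 2 follows the wording above; the original FCBF formulation uses a non-strict comparison, which only matters when two SU values are exactly equal.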

2.2. Development of Integrated Energy Consumption Prediction Model

2.2.1. CEEMDAN Signal Decomposition Method

Empirical Mode Decomposition (EMD) is a denoising method used for nonlinear and non-stationary signals [39]. However, EMD is prone to modal aliasing, leading to inaccuracies in the extracted Intrinsic Mode Functions (IMFs) [40]. Modal aliasing is an error phenomenon in EMD decomposition where there is an overlap in certain moments or frequencies in the signal, resulting in less clear IMFs [41]. Ensemble Empirical Mode Decomposition (EEMD) is an improved adaptive denoising method based on EMD, designed to reduce modal aliasing [42]. In this study, the IMF components are denoised using the Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN) method, known for its effectiveness in handling modal aliasing and adaptability to complex signals. The flow structure of this method is shown in Figure 2.
Let $E_i(\cdot)$ be the $i$-th eigenmode component obtained by EMD decomposition, and let the $i$-th eigenmode component obtained by CEEMDAN decomposition be $\overline{C_i(t)}$. $v_j$ is the Gaussian white noise signal satisfying the standard normal distribution, $j = 1, 2, \ldots, N$ indexes the white-noise additions (N in total), $\varepsilon$ is the standard deviation of the white noise, and $y(t)$ is the signal to be decomposed. The CEEMDAN decomposition steps are as follows:
Step 1: White Gaussian noise is added to the signal $y(t)$ to be decomposed to obtain a new signal $y(t) + (-1)^q \varepsilon v_j(t)$, $q = 1, 2$, and the new signal is decomposed by EMD to obtain the first-order intrinsic mode component $C_1$.
$$E_i\big(y(t) + (-1)^q \varepsilon v_j(t)\big) = C_1^j(t) + r^j \tag{7}$$
Step 2: Take the overall average of the generated N mode components to obtain the first Intrinsic Mode Function (IMF) of the CEEMDAN decomposition.
$$\overline{C_1(t)} = \frac{1}{N} \sum_{j=1}^{N} C_1^j(t) \tag{8}$$
Step 3: Calculate the residual by removing the first mode component:
$$r_1(t) = y(t) - \overline{C_1(t)} \tag{9}$$
Step 4: Introduce paired positive and negative Gaussian white noise to the signal $r_1(t)$, perform EMD decomposition on the new signal, and obtain the first-order mode component $D_1$. Consequently, the second intrinsic mode component of the CEEMDAN decomposition can be obtained.
$$\overline{C_2(t)} = \frac{1}{N} \sum_{j=1}^{N} D_1^j(t) \tag{10}$$
Step 5: Calculate the residual by removing the second mode component:
$$r_2(t) = r_1(t) - \overline{C_2(t)} \tag{11}$$
Step 6: Repeat the above steps until the obtained residual signal becomes a monotonic function and further decomposition is not possible, indicating the end of the algorithm. At this point, the number of intrinsic mode components obtained is K, and the original signal $y(t)$ is decomposed into the following:
$$y(t) = \sum_{k=1}^{K} \overline{C_k(t)} + r_K(t) \tag{12}$$
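The loop structure of Steps 1 to 6 can be sketched as below. This is not a real CEEMDAN implementation: `first_mode_stand_in` is a hypothetical placeholder that replaces "first IMF of EMD" with the detail left after a moving average, chosen only to keep the example short and self-contained. The parameter names and default values are likewise assumptions. What the sketch does show faithfully is the paired-noise averaging, the residual update, the monotonic stopping rule, and the reconstruction identity of Equation (12).

```python
import math
import random

def moving_average(x, window=5):
    half = window // 2
    out = []
    for i in range(len(x)):
        lo, hi = max(0, i - half), min(len(x), i + half + 1)
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out

def first_mode_stand_in(x):
    """Hypothetical placeholder for the first EMD mode: signal minus its smooth trend."""
    return [xi - ti for xi, ti in zip(x, moving_average(x))]

def is_monotonic(x):
    diffs = [b - a for a, b in zip(x, x[1:])]
    return all(d >= 0 for d in diffs) or all(d <= 0 for d in diffs)

def ceemdan_like(y, n_noise=20, eps=0.2, max_modes=6, seed=0,
                 first_mode=first_mode_stand_in):
    rng = random.Random(seed)
    residual = list(y)
    modes = []
    while len(modes) < max_modes and not is_monotonic(residual):
        acc = [0.0] * len(y)
        for _ in range(n_noise):
            noise = [rng.gauss(0.0, 1.0) for _ in y]
            for sign in (-1.0, 1.0):  # (-1)^q for q = 1, 2
                noisy = [r + sign * eps * v for r, v in zip(residual, noise)]
                acc = [a + m for a, m in zip(acc, first_mode(noisy))]
        # Average over all 2 * n_noise noisy copies (each noise draw is used twice).
        mean_mode = [a / (2 * n_noise) for a in acc]
        modes.append(mean_mode)
        residual = [r - m for r, m in zip(residual, mean_mode)]
    return modes, residual

# Synthetic signal for illustration only: an oscillation on top of a slow trend.
y = [math.sin(0.5 * t) + 0.05 * t for t in range(60)]
modes, residual = ceemdan_like(y)
reconstructed = [sum(m[i] for m in modes) + residual[i] for i in range(len(y))]
print(len(modes), max(abs(a - b) for a, b in zip(reconstructed, y)))
```

Because the placeholder extractor is linear, the paired noise cancels almost exactly here. With real EMD sifting, which is nonlinear, the noise instead changes which oscillations end up in each mode, and that is what helps against modal aliasing.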

2.2.2. Bilstm-Attention Method

BiLSTM is a recurrent neural network architecture designed to capture sequential patterns in data. Unlike the traditional LSTM depicted in Figure 3, the BiLSTM processes input data in both forward and backward directions, enabling it to capture dependencies in both past and future contexts. This architecture consists of two LSTM layers: one for forward processing and the other for backward processing of the input sequence. The outputs from both directions are then concatenated or otherwise combined, providing a representation of the input sequence that reflects both directions. BiLSTM, with its advantages in capturing temporal relationships and strong adaptability, is chosen in this study for energy consumption prediction [43].
The forget gate is responsible for selectively discarding certain information from the previous time step, while the input gate determines which new information can be passed to the next state [44]. The output gate specifies the output from the current state to the next. In LSTM, these three key components (forget gate, input gate, and output gate) collaborate to precisely control the flow of information.
$$i_t = \sigma(W_{xi} x_t + W_{hi} h_{t-1} + b_i) \tag{13}$$
$$f_t = \sigma(W_{xf} x_t + W_{hf} h_{t-1} + b_f) \tag{14}$$
$$o_t = \sigma(W_{xo} x_t + W_{ho} h_{t-1} + b_o) \tag{15}$$
$$\tilde{c}_t = \tanh(W_{xc} x_t + W_{hc} h_{t-1} + b_c) \tag{16}$$
$$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \tag{17}$$
$$h_t = o_t \odot \tanh(c_t) \tag{18}$$
In these equations, $i_t$, $f_t$, and $o_t$ represent the input gate, forget gate, and output gate, respectively; $x_t$ is the input vector at the current time step; $\sigma$ denotes the sigmoid function; $\tanh$ is the hyperbolic tangent activation function; $W_{xi}$, $W_{xf}$, $W_{xo}$, and $W_{xc}$ are the weight matrices from the input layer to the different gate mechanisms; $b_f$, $b_i$, $b_o$, and $b_c$ are the bias vectors for the different gate mechanisms; $W_{hi}$, $W_{hf}$, $W_{ho}$, and $W_{hc}$ are the weight matrices from the hidden layer to the different gate mechanisms; $c_t$ and $c_{t-1}$ are the storage unit information at the current time t and the previous time, respectively; $\tilde{c}_t$ is the cell state value at the current time step without using the input gate and forget gate for information adjustment; and $\odot$ denotes element-wise multiplication.
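As a concrete reading of Equations (13) to (18), the following sketch performs one LSTM time step with plain Python lists. The parameter dictionary layout and the helper names are assumptions made for illustration; a trained model would hold learned weights rather than the zeros used in the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W_x, x, W_h, h, b):
    """Compute W_x x + W_h h + b, with matrices stored as lists of rows."""
    return [sum(wx * xi for wx, xi in zip(row_x, x)) +
            sum(wh * hi for wh, hi in zip(row_h, h)) + bi
            for row_x, row_h, bi in zip(W_x, W_h, b)]

def lstm_step(x_t, h_prev, c_prev, params):
    def gate(name, activation):
        pre = affine(params["W_x" + name], x_t,
                     params["W_h" + name], h_prev, params["b_" + name])
        return [activation(z) for z in pre]

    i_t = gate("i", sigmoid)        # Equation (13)
    f_t = gate("f", sigmoid)        # Equation (14)
    o_t = gate("o", sigmoid)        # Equation (15)
    c_tilde = gate("c", math.tanh)  # Equation (16)
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_tilde)]  # (17)
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]                      # (18)
    return h_t, c_t

def zero_params(n_in, n_hidden):
    """All-zero parameters, used only to make the example easy to check by hand."""
    params = {}
    for gate_name in "ifoc":
        params["W_x" + gate_name] = [[0.0] * n_in for _ in range(n_hidden)]
        params["W_h" + gate_name] = [[0.0] * n_hidden for _ in range(n_hidden)]
        params["b_" + gate_name] = [0.0] * n_hidden
    return params

h_t, c_t = lstm_step([1.0, 2.0, 3.0], [0.0, 0.0], [1.0, -2.0], zero_params(3, 2))
print(h_t, c_t)
```

With all-zero parameters every gate evaluates to 0.5 and the candidate state to 0, so the new cell state is simply half of the previous one, which makes the gating arithmetic easy to follow.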
BiLSTM is a combination of forward and backward LSTMs, considering historical data in both positive and negative temporal directions. It is equivalent to adding a backward LSTM, allowing the hidden layer to read future sequence information from the input. By incorporating information from both forward and backward directions in the computation, BiLSTM can effectively capture patterns in data bidirectionally. The structure is illustrated in Figure 4, where $\overrightarrow{h}$ and $\overleftarrow{h}$ represent the sequences processed in the forward and backward directions, respectively. The output of BiLSTM can be expressed as follows:
$$h_t = (H_{c,t-1}, H_{c,t}), \quad t \in [1, i] \tag{19}$$
In Equation (19), $H_{c,t}$ represents the output of the CNN layer at time t.
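The bidirectional processing can be sketched independently of the cell type. In the example below, `toy_step` is a hypothetical one-unit recurrent update standing in for an LSTM cell, and its weights are arbitrary illustrative values. The point is only the data flow: one pass over the sequence, one pass over the reversed sequence, and a per-time-step pairing of the two hidden states.

```python
import math

def toy_step(x_t, h_prev, w_in=0.5, w_rec=0.5):
    """Hypothetical stand-in for an LSTM cell: a single tanh recurrent unit."""
    return math.tanh(w_in * x_t + w_rec * h_prev)

def run_direction(xs, step):
    h, outputs = 0.0, []
    for x_t in xs:
        h = step(x_t, h)
        outputs.append(h)
    return outputs

def bidirectional(xs, forward_step=toy_step, backward_step=toy_step):
    forward = run_direction(xs, forward_step)
    # Process the reversed sequence, then flip back so index t lines up again.
    backward = run_direction(xs[::-1], backward_step)[::-1]
    return list(zip(forward, backward))

states = bidirectional([1.0, 0.0, -1.0])
print(states)
```

At each time step the forward state has seen only earlier inputs and the backward state only later ones, so the pair covers the whole sequence.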
Attention Mechanism (AM) is a commonly used mechanism in deep learning models. Its primary objective is to enhance the model’s focus on different parts of the input sequence, enabling the model to more accurately learn and comprehend input information [45]. Attention achieves this by assigning different weights to feature vectors, concentrating attention to highlight key features, and obtaining better results [46]. The structure of the Attention Mechanism is illustrated in Figure 5. Here, $x_t$ ($t \in [1, n]$) represents the input of the BiLSTM network, $h_t$ ($t \in [1, n]$) corresponds to the hidden layer output obtained through BiLSTM for each input, and y is the output of the BiLSTM with the incorporated Attention Mechanism.
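The weighting idea can be illustrated with a small sketch. The text does not specify the scoring function, so dot-product scoring against a query vector is assumed here purely for illustration; in a trained model the query (or an equivalent scoring layer) would be learned.

```python
import math

def softmax(scores):
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(hidden_states, query):
    """Weight each hidden state h_t by softmax(h_t . query) and sum them."""
    scores = [sum(h_k * q_k for h_k, q_k in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [sum(w * h[k] for w, h in zip(weights, hidden_states))
               for k in range(dim)]
    return context, weights

# Illustrative hidden states only; a zero query gives every time step equal weight.
hidden = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
context, weights = attention_pool(hidden, [0.0, 0.0])
print(context, weights)
```

A non-zero query shifts weight toward the time steps whose hidden states align with it, which is how key features are emphasized.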

2.3. Algorithm Framework

In this paper, a prediction model of energy consumption of large public buildings based on CEEMDAN, MIC-FCBF, and BiLSTMAM is proposed. In the first step, the CEEMDAN method is employed to decompose energy consumption data, obtaining multiple Intrinsic Mode Function (IMF) components and residuals. In the second step, the combined MIC-FCBF feature selection method is applied to each IMF component and residual for feature screening, determining the optimal input features for each IMF component and residual. In the third step, the BiLSTMAM model utilizes the selected optimal input features to predict each IMF component and residual. Finally, the predictions from the BiLSTMAM model for all components are summed to obtain the final energy consumption prediction result. The schematic diagram of the proposed model is shown in Figure 6.
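The framework can be summarized as a short orchestration sketch. The three callables `decompose`, `select`, and `fit_predict` are hypothetical placeholders for CEEMDAN, MIC-FCBF, and BiLSTMAM respectively; the trivial stand-ins in the usage example exist only to show the data flow and do not model anything.

```python
def predict_energy(series, features, decompose, select, fit_predict):
    """Decompose, screen features per component, predict per component, then sum."""
    total = None
    for component in decompose(series):       # IMF components and residual
        chosen = select(features, component)  # feature names for this component
        inputs = {name: features[name] for name in chosen}
        forecast = fit_predict(inputs, component)
        total = forecast if total is None else [a + b for a, b in zip(total, forecast)]
    return total

# Trivial stand-ins, for illustration only.
def split_in_half(series):
    return [[v / 2 for v in series], [v / 2 for v in series]]

def keep_all(features, component):
    return list(features)

def echo_component(inputs, component):
    return list(component)

series = [10.0, 12.0, 11.0, 13.0]
features = {"T": [30.0, 31.0, 29.0, 32.0]}
print(predict_energy(series, features, split_in_half, keep_all, echo_component))
```

Keeping the three stages as separate callables mirrors the point made in the text that each component gets its own feature subset before prediction.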

3. Case Study

This section provides an introduction to the target building information and features related to building energy consumption. It lays the groundwork for investigating the correlation between building energy consumption and features. Additionally, the accuracy of energy consumption prediction for this building serves as a validation metric for the proposed model in this study. The predictive model proposed in this research was implemented using MATLAB 2022b on a system running the Windows 11 operating system with an Intel Core i9-12490F processor @ 3.00 GHz. This processor provides sufficient computational performance for efficient execution of the model.

3.1. Building Information

This study utilizes energy consumption data and relevant features from a large public building in Shaanxi Province, China. The energy consumption data employed in this research mainly includes data from the lighting power supply system and the air conditioning power supply system. The building consists of a total of seven floors, including two underground floors and five above-ground floors, with a height of 32.5 m and a footprint of 22,515 square meters. The air-conditioned area constitutes approximately 65% of the total building area, occupying around 14,635 square meters. The building is predominantly composed of commercial and office areas, with a high demand for thermal comfort indoors, resulting in a significant proportion of energy consumption attributed to air conditioning. The building operates throughout the year from 8:00 a.m. to 10:00 p.m., with the air conditioning systems in the commercial and office areas typically running continuously during the open hours to ensure indoor comfort. During the operation time of the building, data sampling was conducted every hour, and a total of 1300 energy consumption data points were collected. As shown in Figure 7, the x-axis represents the sampling points of energy consumption data, and the y-axis represents the specific values of energy consumption data. The building energy consumption data show certain periodicity and volatility.

3.2. Input Data and Data Collection

This study recorded hourly energy consumption data, time-related feature data, and meteorological feature data for the building between 1 June 2022 and 31 August 2022. Considering that the target of the research is a large public building, the activation of internal air conditioning and lighting is influenced by time-related and meteorological features. The feature data points are illustrated in Table 1. Four time-related features were selected for this study, including the date within each month, time within each day, the type of workday, and whether it was a weekend. Meteorological feature data were primarily collected through a local weather station, including temperature, humidity, barometric pressure, wind speed, total solar irradiance, and total solar irradiance at the previous moment; the energy consumption at the previous moment was also used as an input feature. The aforementioned feature datasets were utilized to predict the energy consumption of this public building.
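The "previous moment" features can be built by shifting a column by one sample, as in the sketch below. The row layout and the key names `energy` and `ecp` are assumptions for illustration. The text does not say how the first sample of each day is handled, given that the building is only sampled during opening hours, so this sketch simply uses the preceding record.

```python
def add_previous_value(rows, source_key, lag_key):
    """Return new rows where lag_key holds the previous row's source_key.

    The first row has no predecessor and is dropped.
    """
    out = []
    for prev, cur in zip(rows, rows[1:]):
        new_row = dict(cur)
        new_row[lag_key] = prev[source_key]
        out.append(new_row)
    return out

# Made-up readings, for illustration only.
rows = [{"hour": 8, "energy": 10.0},
        {"hour": 9, "energy": 12.0},
        {"hour": 10, "energy": 11.0}]
lagged = add_previous_value(rows, "energy", "ecp")
print(lagged)
```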

4. Result and Discussion

This section discusses various comparative experiments and evaluation metrics to analyze and demonstrate that the CEEMDAN-MIC-FCBF-BiLSTMAM model outperforms current state-of-the-art data-driven models in accurately predicting energy consumption in large public buildings. The research goal of this paper is to design an energy consumption prediction model suitable for large public buildings with complex functional zones.

4.1. Evaluation Metrics of the Model

To validate the predictive performance of the CEEMDAN-MIC-FCBF-BiLSTMAM model, this paper selects Mean Absolute Percentage Error (MAPE), Root Mean Square Error (RMSE), Mean Absolute Error (MAE), Mean Square Error (MSE), Coefficient of Determination (R2), and Residual Prediction Deviation (RPD) as the primary evaluation metrics for assessing the model’s predictive accuracy.
MAPE measures the average absolute percentage error between predicted and actual values, which is suitable for assessing prediction accuracy but sensitive to outliers. RMSE calculates the root mean square of prediction errors, emphasizing larger errors, and is commonly used for overall prediction accuracy assessment. MAE computes the mean absolute value of prediction errors, is less sensitive to outliers, and reflects the overall average error level. MSE measures the mean square of prediction errors, emphasizing larger errors, and is commonly used in mathematical computations. R2 gauges the model’s ability to explain variance in the target variable, with values ranging from 0 to 1, where closer to 1 indicates better explanatory power. Residual Prediction Deviation evaluates the model’s predictive performance relative to the standard deviation and is utilized to assess the model’s relative performance in a specific context. The metrics are calculated according to Equations (20)–(25):
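$$\mathrm{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{\hat{y}_i - y_i}{y_i} \right| \times 100\% \tag{20}$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (\hat{y}_i - y_i)^2} \tag{21}$$
$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right| \tag{22}$$
$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 \tag{23}$$
$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \tag{24}$$
$$\mathrm{RPD} = \frac{2 \left| y_i - \hat{y}_i \right|}{(y_i + \hat{y}_i)} \times 100\% \tag{25}$$
Here, $\hat{y}_i$ is the predicted value of the ith sample of the test set, $y_i$ is the true value of the ith sample in the test set, n is the total number of samples, $\bar{y}$ is the average of the actual values, and $\sigma(y)$ is the standard deviation of the actual values.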
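Equations (20) to (25) translate directly into code. In the sketch below the function name and the sample values are illustrative. Equation (25) is written per sample, so averaging it over the test set is an assumption made here in order to report a single number.

```python
import math

def evaluate(y_true, y_pred):
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    sq_sum = sum(e * e for e in errors)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "MAPE": 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n,
        "RMSE": math.sqrt(sq_sum / n),
        "MAE": sum(abs(e) for e in errors) / n,
        "MSE": sq_sum / n,
        "R2": 1.0 - sq_sum / ss_tot,
        # Assumption: per-sample values of Equation (25) are averaged.
        "RPD": 100.0 * sum(2 * abs(t - p) / (t + p)
                           for t, p in zip(y_true, y_pred)) / n,
    }

# Made-up values for illustration only.
metrics = evaluate([100.0, 200.0, 300.0, 400.0], [110.0, 190.0, 300.0, 420.0])
print(metrics)
```

MAPE and RPD divide by the actual values (or by their sum with the prediction), so they are undefined when those quantities are zero.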

4.2. Energy Consumption Data CEEMDAN Signal Decomposition

In the signal decomposition section, this study applies the CEEMDAN method to decompose hourly energy consumption data collected from large public buildings. The signal-to-noise ratio in the CEEMDAN method is set to 0.2, the number of noise additions is set to 100, and the iteration count is set to 100. The energy consumption data are decomposed into 10 IMF components and one residual, as shown in Figure 8. The frequency of the IMF components decreases continuously from IMF1 to IMF10, and Figure 9 displays the frequency spectrum of the decomposition results. IMF1’s frequency band is mainly distributed between 0.1 and 0.3, IMF2 decreases to the range of 0.1 to 0.15, and the frequency bands of the other IMFs decrease sequentially. From IMF5 to IMF10, the frequency bands primarily lie between 0 and 0.05. It is evident that the early IMF components exhibit high irregularity, highlighting the effectiveness of the CEEMDAN method in decomposition.
The different frequency bands of IMF components typically imply varying correlations with features during the regression prediction process. The high-frequency components, IMF1 to IMF4, often exhibit strong correlations with input features of similar higher frequencies. On the other hand, IMF5 to IMF10, along with the residual component, usually demonstrate stronger correlations with long-term input features such as dates. Hence, it is necessary to perform separate regression feature importance screening for different IMF components.

4.3. Feature Analysis

In this section, the MIC-FCBF combined feature selection method is employed to analyze the input features for IMF1~10 and the residuals, exploring the correlation between input features and IMF1~10 as well as the residuals. Figure 10 shows the MIC (Maximal Information Coefficient) values between input features and IMF1~10, as well as the residuals. MIC values range between 0 and 1, with higher values indicating stronger correlations between input features and predicted values.
As depicted in Figure 10, since the residuals are obtained after the CEEMDAN signal decomposition, exhibiting stable long-term variations, other features such as WS, CC, and ECP display a certain degree of correlation. Particularly, the feature ECP contains a significant amount of valid information related to the residuals. MIC values for Day, DT, BP, WS, and CC with respect to IMF1~10 are relatively low, indicating a lower correlation, which aligns with the actual scenario. Therefore, input features with MIC values below 0.2 are excluded, and the remaining features undergo further screening using the FCBF method.
Figure 11 presents the SU (Symmetrical Uncertainty) values calculated by the FCBF (Fast Correlation Based Filter) method between the input features and the predicted values. SU measures the association between a feature and the target variable, which helps select the features that are crucial for regression prediction, thereby optimizing model performance and reducing redundant features. For the different IMF (Intrinsic Mode Function) components and the residual, the SU values of TIME, WT, and T are generally high, which is consistent with the real-world relationship between energy consumption and these features. However, owing to noise and nonlinear relationships, the SU values for TIME decrease slightly in IMF5 and IMF7. This observation underscores the practical value of the FCBF method.
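For readers unfamiliar with SU, the following minimal sketch computes it from its standard definition, SU(X, Y) = 2·I(X; Y) / (H(X) + H(Y)), on discretized data. It is not the authors' implementation: the equal-width binning, the bin count, the helper names, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def equal_width_bins(values, n_bins=4):
    """Discretize continuous values into equal-width bins (an assumed scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def symmetrical_uncertainty(x, y):
    """SU = 2 * I(X; Y) / (H(X) + H(Y)), which lies between 0 and 1."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0
    joint = entropy(list(zip(x, y)))
    mutual_information = hx + hy - joint
    return 2.0 * mutual_information / (hx + hy)


# Toy data (invented): a load that is high exactly when a binary flag is set.
flag = [0, 0, 1, 1, 0, 1]
load = [10.0, 12.0, 30.0, 33.0, 11.0, 31.0]
su_dependent = symmetrical_uncertainty(flag, equal_width_bins(load, n_bins=2))

# Two labels that carry no information about each other.
su_independent = symmetrical_uncertainty([0, 0, 1, 1], [0, 1, 0, 1])
```

In this toy case the fully determined pair yields an SU of 1 and the unrelated pair an SU of 0, the two ends of the scale on which the 0.3 threshold is applied.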
Therefore, in the subsequent FCBF filtering process, features with SU values greater than 0.3 are selected as the final input features for IMF1~10 and the residual in the prediction model training. The energy consumption forecasting model is then trained and evaluated on the features retained by this combined selection approach, which ensures that the inputs are relevant to the predicted values.
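The two thresholds described in this section can be chained as in the sketch below. It covers only the thresholding step as stated in the text (MIC below 0.2 excluded, then SU greater than 0.3 retained) and omits any redundancy analysis that a full FCBF implementation may perform. The function name `screen_features` and all score values are hypothetical and are not read from Figures 10 or 11.

```python
def screen_features(mic_scores, su_scores, mic_min=0.2, su_min=0.3):
    """Keep features whose MIC is at least mic_min and whose SU exceeds su_min."""
    # Stage 1: drop features whose MIC with the target component is below the threshold.
    stage_one = [name for name, mic in mic_scores.items() if mic >= mic_min]
    # Stage 2: among the survivors, keep those with SU above the second threshold.
    return [name for name in stage_one if su_scores.get(name, 0.0) > su_min]


# Invented scores for a single component, for illustration only.
example_mic = {"Time": 0.62, "WT": 0.41, "AT": 0.35, "BP": 0.08, "WS": 0.21}
example_su = {"Time": 0.55, "WT": 0.48, "AT": 0.33, "BP": 0.40, "WS": 0.12}

selected = screen_features(example_mic, example_su)
```

With these invented scores, BP is removed in the first stage despite its SU value, and WS passes the MIC stage but is removed in the second. In the procedure described above, the screening is repeated for each of IMF1~10 and the residual, so each component may end up with a different input set.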

4.4. Performance Validation of the Model

This study collected the hourly energy consumption data, time-related features, and meteorological feature data of a large public building in Shaanxi Province from 1 June to 31 August 2022. The dataset was divided into 70% for model training and 30% for model testing. To demonstrate the effectiveness of hybrid prediction models, experiments were conducted using six prediction models: five comparison models (BiLSTM, CNN-BiGRU, CEEMDAN-CNN-BiLSTM, CEEMDAN-BiLSTMAM, and CEEMDAN-FCBF-BiLSTMAM) and the proposed CEEMDAN-MIC-FCBF-BiLSTMAM model. Comparative analyses were performed on the prediction results.
Figure 12 illustrates the results obtained by the proposed method and the comparison methods, and Figure 13 shows the corresponding prediction errors. As shown in the figures, the prediction error of the proposed method is the lowest among all methods compared. In periods of significant energy consumption fluctuation, the predictive performance of a single machine learning model appears to be limited. However, the hybrid prediction model proposed in this study, benefiting from CEEMDAN signal decomposition and MIC-FCBF combined feature selection, fits the data well even during periods of high energy consumption fluctuation, and its predictive performance is markedly superior to that of the other models. This highlights the advantage of the hybrid model in situations with substantial energy consumption fluctuations. By integrating various models and feature selection methods, the accuracy and stability of the predictions have been enhanced.
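To give a sense of the sample sizes implied by the 70/30 division, the sketch below builds the hourly timestamps for the stated period and splits them. Two points are assumptions rather than facts from the text: that the record is complete (no missing hours), and that the split is chronological rather than random. The helper names are hypothetical.

```python
from datetime import datetime, timedelta


def hourly_timestamps(start, end):
    """List every hourly timestamp from start to end, inclusive."""
    stamps = []
    current = start
    while current <= end:
        stamps.append(current)
        current += timedelta(hours=1)
    return stamps


def chronological_split(samples, train_fraction=0.7):
    """Put the earliest train_fraction of the samples in the training set."""
    cut = int(len(samples) * train_fraction)
    return samples[:cut], samples[cut:]


# 1 June 2022 00:00 through 31 August 2022 23:00, assuming no gaps in the record.
stamps = hourly_timestamps(datetime(2022, 6, 1, 0), datetime(2022, 8, 31, 23))
train, test = chronological_split(stamps)
```

Under these assumptions the period contains 2208 hourly samples, of which 1545 fall in the training portion and 663 in the test portion. A chronological split keeps every test sample later than every training sample, which avoids leaking future information into training for a time series task.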
Figure 14 presents a radar chart of prediction accuracy evaluation metrics, offering a clear comparison between the proposed model and other models in terms of prediction performance. The specific evaluation metrics are shown in Table 2. The summary of the prediction results is as follows:
(1) The CEEMDAN-MIC-FCBF-BiLSTMAM model outperforms the other models in all accuracy evaluation metrics, exhibiting the smallest MAE, MSE, RMSE, and MAPE and the highest R² and RPD values. This indicates that the CEEMDAN-MIC-FCBF-BiLSTMAM model excels in the accuracy and stability of energy consumption prediction for large public buildings.
(2) Both the CEEMDAN-MIC-FCBF-BiLSTMAM model and the CEEMDAN-BiLSTMAM model outperform the BiLSTM model in every evaluation metric. This suggests that CEEMDAN signal decomposition and the incorporation of AM (Attention Mechanism) in BiLSTM contribute to enhanced prediction accuracy, especially in scenarios with complex feature coupling for energy consumption prediction in large public buildings.
(3) The RMSE of CEEMDAN-MIC-FCBF-BiLSTMAM is 11.92, a 57.23% improvement compared to CEEMDAN-FCBF-BiLSTMAM without MIC value screening and a 67.21% improvement compared to CEEMDAN-BiLSTMAM without feature selection. This indicates that the MIC-FCBF combined feature selection method can significantly improve the model’s prediction accuracy. The RMSE is also 65.65% lower than that of CEEMDAN-CNN-BiLSTM, which uses a CNN for feature extraction.
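The six metrics and the relative RMSE reductions quoted above can be reproduced with a short calculation. The sketch below uses common textbook definitions, which are assumptions here: MAPE is expressed as a fraction (matching the scale of Table 2), and RPD is taken as the sample standard deviation of the observations divided by the RMSE. The function names and the four-point toy series are illustrative; the RMSE values are the rounded entries of Table 2.

```python
import math


def regression_metrics(actual, predicted):
    """Compute MAE, MSE, RMSE, MAPE (as a fraction), R2, and RPD."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mape = sum(abs(e / a) for e, a in zip(errors, actual)) / n  # needs nonzero actuals
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - (mse * n) / ss_tot
    # Assumption: RPD uses the sample standard deviation (n - 1 denominator).
    rpd = math.sqrt(ss_tot / (n - 1)) / rmse
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "MAPE": mape, "R2": r2, "RPD": rpd}


def rmse_reduction(baseline_rmse, proposed_rmse):
    """Relative RMSE reduction of the proposed model, in percent."""
    return 100.0 * (baseline_rmse - proposed_rmse) / baseline_rmse


# Toy series (invented) to exercise the metric definitions.
toy_metrics = regression_metrics([100.0, 200.0, 300.0, 400.0],
                                 [110.0, 190.0, 310.0, 390.0])

# Rounded RMSE values taken from Table 2.
proposed_rmse = 11.92
table2_rmse = {
    "CEEMDAN-FCBF-BiLSTMAM": 27.83,
    "CEEMDAN-BiLSTMAM": 36.25,
    "CEEMDAN-CNN-BiLSTM": 34.67,
    "CNN-BiGRU": 36.25,
    "BiLSTM": 67.94,
}
reductions = {name: rmse_reduction(value, proposed_rmse)
              for name, value in table2_rmse.items()}
```

With the rounded Table 2 entries, the reductions come out at roughly 57.2%, 67.1%, 65.6%, and 82.5% relative to CEEMDAN-FCBF-BiLSTMAM, CEEMDAN-BiLSTMAM, CEEMDAN-CNN-BiLSTM, and BiLSTM, respectively. These differ marginally from the reported 57.23%, 67.21%, 65.65%, and 82.49%; one possible explanation is that the reported figures were computed from unrounded RMSE values.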
In conclusion, the superiority of the proposed model, CEEMDAN-MIC-FCBF-BiLSTMAM, lies in its use of the CEEMDAN algorithm for energy consumption data decomposition, the application of the combined feature selection MIC-FCBF method for screening IMF components and residuals post-decomposition, and the incorporation of the Attention Mechanism in BiLSTM. The experimental results demonstrate that this model, through combined feature selection, effectively reduces the computational complexity of energy consumption data after signal decomposition, achieving precise predictions for the energy consumption of large public buildings while improving model computation speed.

5. Conclusions

The CEEMDAN-MIC-FCBF-BiLSTMAM method proposed in this study improves the accuracy of energy consumption prediction for large public buildings, which contributes to more precise energy management, increased energy efficiency, and reduced carbon emissions. The research focuses on a large public building in Shaanxi Province and employs time-related and meteorological features for prediction. First, the CEEMDAN method decomposes the energy consumption data into multiple IMF components and a residual. Next, the MIC-FCBF combined feature selection method is applied to each IMF component and the residual to obtain the input features for the prediction model. Finally, the input features and the corresponding IMF components and residual are fed into the BiLSTMAM model, and the predicted results are aggregated to derive the energy consumption forecast for the building. Comparative analysis with five benchmark methods using six evaluation metrics demonstrates the superior predictive performance of the proposed method. This study focuses on hourly energy consumption prediction for the entire building, with accuracy improved through signal decomposition and feature screening. Future research could refine the predictions by considering the internal structure of large public buildings and implementing energy consumption zoning. In addition, the weights and hyperparameters of deep learning models also play a crucial role in prediction accuracy. In future research, we intend to employ swarm intelligence algorithms to optimize both the model weights and hyperparameters, with the aim of further improving the energy consumption prediction precision for large public buildings.

Author Contributions

Conceptualization, Z.T. and D.C.; data curation, Z.T. and L.Z.; formal analysis, D.C. and L.Z.; methodology, Z.T.; writing—original draft, Z.T.; writing—review and editing, D.C. and L.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation Program (51209167; 12002251), the Open Fund Project of Shaanxi Key Laboratory of Geotechnical and Underground Space Engineering (YT202004), the Scientific Research Program of Shaanxi Provincial Education Department (22JC043), the Xi’an Science, Technology Project (23GXFW0045), and the Shaanxi Provincial Natural Science Foundation General Project (2024JC-YBMS-286).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Gao, L.; Liu, T.; Cao, T.; Hwang, Y.; Radermacher, R. Comparing deep learning models for multi energy vectors prediction on multiple types of building. Appl. Energy 2021, 301, 117486. [Google Scholar] [CrossRef]
  2. Yang, Y.; Chen, Y.; Wang, Y.; Li, C.; Li, L. Modelling a combined method based on ANFIS and neural network improved by DE algorithm: A case study for short-term electricity demand forecasting. Appl. Soft Comput. 2016, 49, 663–675. [Google Scholar] [CrossRef]
  3. Gassar, A.A.A.; Cha, S.H. Energy prediction techniques for large-scale buildings towards a sustainable built environment: A review. Energy Build. 2020, 224, 110238. [Google Scholar] [CrossRef]
  4. Cristino, T.M.; Neto, A.F.; Wurtz, F.; Delinchant, B. The Evolution of Knowledge and Trends within the Building Energy Efficiency Field of Knowledge. Energies 2022, 15, 691. [Google Scholar] [CrossRef]
  5. Zhang, L.; Wen, J.; Li, Y.; Chen, J.; Ye, Y.; Fu, Y.; Livingood, W. A review of machine learning in building load prediction. Appl. Energy 2021, 285, 116452. [Google Scholar] [CrossRef]
  6. Wang, Z.; Srinivasan, R.S. A review of artificial intelligence based building energy use prediction: Contrasting the capabilities of single and ensemble prediction models. Renew. Sustain. Energy Rev. 2017, 75, 796–808. [Google Scholar] [CrossRef]
  7. Ye, Y.; Zuo, W.; Wang, G. A comprehensive review of energy-related data for U.S. commercial buildings. Energy Build. 2019, 186, 126–137. [Google Scholar] [CrossRef]
  8. Dagdougui, H.; Bagheri, F.; Le, H.; Dessaint, L. Neural network model for short-term and very-short-term load forecasting in district buildings. Energy Build. 2019, 203, 109408. [Google Scholar] [CrossRef]
  9. Gao, Z.; Yu, J.; Zhao, A.; Hu, Q.; Yang, S. A hybrid method of cooling load forecasting for large commercial building based on extreme learning machine. Energy 2022, 238, 122073. [Google Scholar] [CrossRef]
  10. Ekonomou, L. Greek long-term energy consumption prediction using artificial neural networks. Energy 2010, 35, 512–517. [Google Scholar] [CrossRef]
  11. Zhao, H.; Magoulès, F. A review on the prediction of building energy consumption. Renew Sustain. Energy Rev. 2012, 16, 3586–3592. [Google Scholar] [CrossRef]
  12. Wang, Z.; Wang, Y.; Zeng, R.; Srinivasan, R.S.; Ahrentzen, S. Random Forest based hourly building energy prediction. Energy Build. 2018, 171, 11–25. [Google Scholar] [CrossRef]
  13. Wu, X.; Kumar, V.; Ross Quinlan, J.; Ghosh, J.; Yang, Q.; Motoda, H.; McLachlan, J.G.; Ng, A.; Liu, B.; Yu, P.S.; et al. Top 10 algorithms in data mining. Knowl. Inf. Syst. 2008, 14, 1–37. [Google Scholar] [CrossRef]
  14. Wang, Z.; Srinivasan, R.S. A review of artificial intelligence based building energy prediction with a focus on ensemble prediction models. In Proceedings of the 2015 Winter Simulation Conference (WSC), Huntington Beach, CA, USA, 6–9 December 2016; pp. 3438–3448. [Google Scholar]
  15. Catalina, T.; Iordache, V.; Caracaleanu, B. Multiple regression model for fast prediction of the heating energy demand. Energy Build. 2013, 57, 302–312. [Google Scholar] [CrossRef]
  16. Ji, P.; Xiong, D.; Wang, P.; Chen, J. A Study on Exponential Smoothing Model for Load Forecasting. In Proceedings of the 2012 Asia-Pacific Power and Energy Engineering Conference, Shanghai, China, 27–29 March 2012; pp. 1–4. [Google Scholar]
  17. Domingos, P. A few useful things to know about machine learning. Commun. ACM 2012, 55, 78–87. [Google Scholar] [CrossRef]
  18. Chen, Y.; Huang, M.; Tao, Y. Density-based clustering multiple linear regression model of energy consumption for electric vehicles. Sustain. Energy Technol. Assess. 2022, 53, 102614. [Google Scholar] [CrossRef]
  19. Ma, M.; Wang, Z. Prediction of the Energy Consumption Variation Trend in South Africa based on ARIMA, NGM and NGM-ARIMA Models. Energies 2019, 13, 10. [Google Scholar] [CrossRef]
  20. Hu, J.; Zheng, W.; Zhang, S.; Li, H.; Liu, Z.; Zhang, G.; Yang, X. Thermal load prediction and operation optimization of office building with a zone-level artificial neural network and rule-based control. Appl. Energy 2021, 300, 117429. [Google Scholar] [CrossRef]
  21. Edwards, R.E.; New, J.; Parker, L.E. Predicting future hourly residential electrical consumption: A machine learning case study. Energy Build. 2012, 49, 591–603. [Google Scholar] [CrossRef]
  22. Zhao, H.X.; Magoulès, F. Parallel Support Vector Machines Applied to the Prediction of Multiple Buildings Energy Consumption. J. Algorithms Comput. Technol. 2010, 4, 231–249. [Google Scholar] [CrossRef]
  23. Bui, D.K.; Nguyen, T.N.; Ngo, T.D.; Nguyen-Xuan, H. An artificial neural network (ANN) expert system enhanced with the electromagnetism-based firefly algorithm (EFA) for predicting the energy consumption in buildings. Energy 2020, 190, 116370. [Google Scholar] [CrossRef]
  24. Etemad, A.; Shafaat, A.; Bahman, A.M. Data-driven performance analysis of a residential building applying artificial neural network (ANN) and multi-objective genetic algorithm (GA). Build. Environ. 2022, 225, 109633. [Google Scholar]
  25. Zhang, X.-R.; Lei, M.-Y.; Li, Y. An amplitudes-perturbation data augmentation method in convolutional neural networks for EEG decoding. In Proceedings of the 2018 5th International Conference on Information, Cybernetics, and Computational Social Systems (ICCSS), Hangzhou, China, 16–19 August 2018; pp. 231–235. [Google Scholar]
  26. Somu, N.; MR, G.R.; Ramamritham, K. A deep learning framework for building energy consumption forecast. Renew. Sustain. Energy Rev. 2021, 137, 110591. [Google Scholar] [CrossRef]
  27. Cao, W.; Yu, J.; Chao, M.; Wang, J.; Yang, S.; Zhou, M.; Wang, M. Short-Term Energy Consumption Prediction Method for Educational Buildings Based on Model Integration. Energy 2023, 283, 128580. [Google Scholar] [CrossRef]
  28. Kong, W.; Dong, Z.Y.; Jia, Y.; Hill, D.J.; Xu, Y.; Zhang, Y. Short-Term Residential Load Forecasting Based on LSTM Recurrent Neural Network. IEEE Trans. Smart Grid. C 2019, 10, 841–851. [Google Scholar] [CrossRef]
  29. Karijadi, I.; Chou, S.-Y. A hybrid RF-LSTM based on CEEMDAN for improving the accuracy of building energy consumption prediction. Energy Build. 2022, 259, 111908. [Google Scholar] [CrossRef]
  30. Kilinc, H.C.; Yurtsever, A. Short-term streamflow forecasting using hybrid deep learning model based on grey wolf algorithm for hydrological time series. Sustainability 2022, 14, 3352. [Google Scholar] [CrossRef]
  31. Zheng, H.; Yuan, J.; Chen, L. Short-Term Load Forecasting Using EMD-LSTM Neural Networks with a Xgboost Algorithm for Feature Importance Evaluation. Energies 2017, 10, 1168. [Google Scholar] [CrossRef]
  32. Tao, Q.-Y.; Yu, C.-J.; Li, Y.-L. Wind speed prediction in bridge site area based on empirical value decomposition and Elman neural network. Disaster Sci. 2017, 32, 85–89. [Google Scholar]
  33. Mao, Y.; Yu, J.; Zhang, N.; Dong, F.; Wang, M.; Li, X. A Hybrid Model of Commercial Building Cooling Load Prediction Based on the Improved NCHHO-FENN Algorithm. J. Build. Eng. 2023, 78, 107660. [Google Scholar] [CrossRef]
  34. Torres, M.E.; Colominas, M.A.; Schlotthauer, G.; Flandrin, P. A complete ensemble empirical mode decomposition with adaptive noise. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Prague, Czech Republic, 22–27 May 2011; pp. 4144–4147. [Google Scholar]
  35. Lin, Z. Short-Term Prediction of Building Sub-Item Energy Consumption Based on the CEEMDAN-BiLSTM Method. Front. Energy Res. 2022, 10, 908544. [Google Scholar] [CrossRef]
  36. Xu, H.; Liu, Y.; Li, J.; Yu, H.; An, X.; Ma, K.; Liang, Y.; Hu, X.; Zhang, H. Study on the Influence of High and Low Temperature Environment on the Energy Consumption of Battery Electric Vehicles. Energy Rep. 2023, 9, 835–842. [Google Scholar] [CrossRef]
  37. Zhang, Y.; Zhang, Y.; He, W.; Yu, S.; Zhao, S. Improved Feature Size Customized Fast Correlation-Based Filter for Naive Bayes Text Classification. J. Intell. Fuzzy Syst. 2020, 38, 3117–3127. [Google Scholar] [CrossRef]
  38. Ruiz, R.; Riquelme, J.C.; Aguilar-Ruiz, J.S.; García-Torres, M. Fast Feature Selection Aimed at High-Dimensional Data via Hybrid-Sequential-Ranked Searches. Expert Syst. Appl. 2012, 39, 11094–11102. [Google Scholar] [CrossRef]
  39. Huang, N.E.; Shen, Z.; Long, S.R.; Wu, M.C.; Shih, H.H.; Zheng, Q.; Yen, N.C.; Tung, C.C.; Liu, H.H. The empirical mode decomposition and the hilbert spectrum for nonlinear and non-stationary time series analysis. Proc. Math. Phys. Eng. Sci. 1998, 454, 903–995. [Google Scholar] [CrossRef]
  40. Wu, Z.; Huang, N.E. Ensemble empirical mode decomposition: A noise-assisted data analysis method. Adv. Adapt. Data Anal. 2011, 1, 1–41. [Google Scholar] [CrossRef]
  41. Mao, Q.; Fang, X.; Hu, Y.; Li, G. Chiller sensor fault detection based on empirical mode decomposition threshold denoising and principal component analysis. Appl. Therm. Eng. 2018, 144, 21–30. [Google Scholar] [CrossRef]
  42. Jia, Y.; Li, G.; Dong, X.; He, K. A novel denoising method for vibration signal of hob spindle based on EEMD and grey theory. Measurement 2021, 169, 108490. [Google Scholar] [CrossRef]
  43. Lei, L.; Shao, S.; Liang, L. An evolutionary deep learning model based on EWKM, random forest algorithm, SSA and BiLSTM for building energy consumption prediction. Energy 2024, 288, 129795. [Google Scholar] [CrossRef]
  44. Song, Y.; Xie, H.; Zhu, Z.; Ji, R. Predicting Energy Consumption of Chiller Plant Using WOA-BiLSTM Hybrid Prediction Model: A Case Study for a Hospital Building. Energy Build. 2023, 300, 113642. [Google Scholar] [CrossRef]
  45. Zhang, D.; Chen, B.; Zhu, H.; Goh, H.H.; Dong, Y.; Wu, T. Short-term wind power prediction based on two-layer decomposition and BiTCN-BiLSTM-attention model. Energy 2023, 285, 128762. [Google Scholar] [CrossRef]
  46. Guo, J.; Liu, M.; Luo, P.; Chen, X.; Yu, H.; Wei, X. Attention-based BILSTM for the degradation trend prediction of lithium battery. Energy Rep. 2023, 9, 655–664. [Google Scholar] [CrossRef]
Figure 1. Article structure chart.
Figure 2. CEEMDAN signal decomposition.
Figure 3. The structure of one LSTM neuron.
Figure 4. The structure of BiLSTM model.
Figure 5. Attention Mechanism structure.
Figure 6. Energy consumption prediction model structure flow chart.
Figure 7. Hourly energy consumption data of large public building.
Figure 8. CEEMDAN decomposition of energy consumption.
Figure 9. CEEMDAN energy consumption data decomposition spectrum.
Figure 10. MIC values between features and decomposition results.
Figure 11. Degree of correlation between features and decomposition results SU.
Figure 12. Prediction results of CEEMDAN-MIC-FCBF-BiLSTMAM and comparison methods.
Figure 13. Prediction error of CEEMDAN-MIC-FCBF-BiLSTMAM and comparison methods.
Figure 14. Radar map of prediction accuracy evaluation metrics.
Table 1. Energy consumption prediction input feature list.

| Feature | Abbreviation | Type | Unit |
|---|---|---|---|
| Date | Day | Independent | 1, 2, 3, …, 31 |
| Time | Time | Independent | 1, 2, 3, …, 24 |
| Type of weekday | WT | Independent | Weekday and weekend |
| Day of the week | DT | Independent | Sunday, Monday, Saturday |
| Air temperature | AT | Continuous | °C |
| Air humidity | AH | Continuous | % |
| Barometric pressure | BP | Continuous | hPa |
| Wind speed | WS | Continuous | m/s |
| Cloud cover | CC | Continuous | 0–10 percent |
| Total solar irradiance at the previous moment | RP | Continuous | W/m² |
| Total solar irradiance | TR | Continuous | W/m² |
| Energy consumption at the previous moment | ECP | Continuous | kWh |

Table 2. Prediction accuracy evaluation metrics comparison table.

| Model | MAE | MSE | RMSE | R² | RPD | MAPE |
|---|---|---|---|---|---|---|
| CEEMDAN-MIC-FCBF-BiLSTMAM | 8.41 | 141.21 | 11.92 | 0.996 | 15.54 | 0.011 |
| CEEMDAN-FCBF-BiLSTMAM | 19.62 | 771.92 | 27.83 | 0.975 | 6.33 | 0.027 |
| CEEMDAN-BiLSTMAM | 27.24 | 1313.12 | 36.25 | 0.957 | 5.85 | 0.034 |
| CEEMDAN-CNN-BiLSTM | 22.55 | 1196.53 | 34.67 | 0.961 | 5.12 | 0.032 |
| CNN-BiGRU | 28.37 | 1310.38 | 36.25 | 0.957 | 5.21 | 0.038 |
| BiLSTM | 50.84 | 4604.14 | 67.94 | 0.851 | 2.67 | 0.066 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Tian, Z.; Chen, D.; Zhao, L. Short-Term Energy Consumption Prediction of Large Public Buildings Combined with Data Feature Engineering and Bilstm-Attention. Appl. Sci. 2024, 14, 2137. https://doi.org/10.3390/app14052137