Myocardial Infarction Classification Based on Convolutional Neural Network and Recurrent Neural Network

Feng, Kai; Pi, Xitian; Liu, Hongying; Sun, Kai

doi:10.3390/app9091879

by Kai Feng ¹, Xitian Pi ¹,²,*, Hongying Liu ¹,³,* and Kai Sun ¹

¹ Key Laboratory of Biorheological Science and Technology, Ministry of Education, College of Bioengineering, Chongqing University, Chongqing 400030, China

² Key Laboratory for National Defense Science and Technology of Innovation Micro-Nano Devices and System Technology, Chongqing 400030, China

³ Chongqing Engineering Research Center of Medical Electronics Technology, Chongqing 400030, China

* Authors to whom correspondence should be addressed.

Appl. Sci. 2019, 9(9), 1879; https://doi.org/10.3390/app9091879

Submission received: 25 March 2019 / Revised: 29 April 2019 / Accepted: 2 May 2019 / Published: 7 May 2019

(This article belongs to the Section Applied Biosciences and Bioengineering)


Abstract:

Myocardial infarction is one of the most threatening cardiovascular diseases for human beings. With the rapid development of wearable devices and portable electrocardiogram (ECG) medical devices, it has become feasible to detect and monitor myocardial infarction ECG signals in a timely manner. This paper proposed a multi-channel automatic classification algorithm combining a 16-layer convolutional neural network (CNN) and a long short-term memory (LSTM) network for I-lead myocardial infarction ECG. The algorithm first preprocessed the raw data to extract the heartbeat segments; the segments were then used to train the multi-channel CNN and LSTM, which learned the features automatically and completed the myocardial infarction ECG classification. We utilized the Physikalisch-Technische Bundesanstalt (PTB) database for algorithm verification, and obtained an accuracy rate of 95.4%, a sensitivity of 98.2%, a specificity of 86.5%, and an F1 score of 96.8%, indicating that the model can achieve good classification performance without complex handcrafted features.

Keywords:

myocardial infarction; ECG; convolutional neural network; recurrent neural network; LSTM

1. Introduction

Myocardial infarction is a cardiovascular disease caused by insufficient myocardial blood supply, or even myocardial necrosis, due to coronary artery occlusion. According to statistics from the American Heart Association, nearly 720,000 Americans suffer from myocardial infarction each year [1]. In the early stage of this disease, patients usually show symptoms such as chest pain and chest tightness, but some patients have no obvious symptoms, which makes timely treatment difficult and thus threatens life [2]. Therefore, the early diagnosis of myocardial infarction has significant clinical value and has become a research topic for many scholars.

Electrocardiogram (ECG) is one of the routine examination methods for myocardial infarction [2]. In the field of ECG signal processing, many traditional studies have focused on the feature extraction of myocardial infarction ECG signals including time domain, frequency domain, wavelet transform, and other characteristics. Sun et al. extracted ST segments and combined support vector machine (SVM) and multi-instance learning to complete myocardial infarction ECG classification [3]. Arif et al. started with the three time-domain features of T wave amplitude, Q wave amplitude, and ST segment offset level, and used the K-nearest neighbor (KNN) algorithm to achieve the detection and location of myocardial infarction [4]. Sharma et al. obtained the frequency domain features of ECG such as sample entropy, and applied SVM and the KNN algorithm to classify different types of myocardial infarction ECG [5]. Similar to Arif, Safdarian et al. extracted T wave characteristics from ECG signals, and employed pattern recognition methods for myocardial infarction classification [6]. Although the above algorithms can achieve good results, as ECG signals are very weak and susceptible to noise interference, feature point recognition cannot be guaranteed, which has become a limitation of such methods.

In recent years, with the development of deep learning, convolutional neural network (CNN) and recurrent neural network (RNN) models have achieved great success in image classification, object detection, and speech recognition. Deep learning methods such as CNN automatically learn and extract features through deep neural networks, without relying on handcrafted features or expert knowledge [7,8,9]. In the area of ECG signal processing, the deep learning method avoids handcrafted ECG feature extraction, which simplifies the implementation process to a certain extent compared with the traditional methods, and it has been applied by several scholars. Xiong et al. completed the classification of atrial fibrillation ECG through a 16-layer CNN [10]. Owing to the temporal characteristics of ECG signals, the long short-term memory (LSTM) variant of the RNN also performs well. Saadatnejad et al. used LSTM to complete the classification of arrhythmia ECG [11]. In the classification of myocardial infarction ECG, Reasat et al. extracted ECG signals from leads II, III, and aVF, and performed preprocessing and classification through a shallow CNN [12]; moreover, Acharya et al. established a deep CNN to classify noisy and denoised myocardial infarction ECG [13]. With the popularization of handheld electrocardiographs, smart bands, and smart watches, access to a single-lead ECG is possible in personal and home detection. Thus, it is important to detect and prevent myocardial infarction through single-lead ECG. Since there have been only a few studies on single-lead myocardial infarction ECG, there is still considerable room for exploration.

Therefore, this paper proposed a deep learning method combining CNN and RNN, established a multi-channel CNN-LSTM network structure, segmented the pre-processed ECG signal, extracted spatial features in the multi-channel convolution network, and acquired the temporal characteristics through LSTM. This method unified the feature extraction and classification procedures, realized the automatic classification of single-lead myocardial infarction ECG, and made in-depth research and analysis on the model convolution kernel, optimizer, and other parameters.

2. Methods

2.1. One-Dimensional CNN

CNN is a feedforward neural network with the characteristics of sparse connectivity and weight sharing. A typical CNN model consists of a series of convolutional layers, pooling layers, and fully-connected layers. As an important part of CNN, each convolutional layer convolves the output feature maps of the previous layer with its kernels and passes the result through the activation function to construct its own output feature maps. The mathematical model can be expressed as Equation (1):

x_{j}^{l} = f \left( \sum_{i \in M_{j}} x_{i}^{l-1} * k_{ij}^{l} + b_{j}^{l} \right)

(1)

where M_{j} represents the set of input feature maps; l denotes the layer index; k is the convolution kernel; b is the network bias; and f is the activation function.
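As a rough illustration of Equation (1), the following pure-Python sketch computes one output feature map from several input maps. It makes assumptions that the text does not state: ReLU is used for f, only "valid" positions are computed (no padding), and the kernel is applied as a sliding dot product, as most deep learning frameworks do. The function names and the example values are hypothetical.

```python
def relu(v):
    """ReLU activation: max(0, v)."""
    return v if v > 0.0 else 0.0


def conv1d_output_map(input_maps, kernels, bias):
    """Compute one output feature map x_j^l in the spirit of Equation (1).

    input_maps: list of input feature maps x_i^{l-1} (each a list of floats)
    kernels:    one kernel k_ij^l per input map (all of the same length)
    bias:       scalar bias b_j^l
    """
    k_len = len(kernels[0])
    out_len = len(input_maps[0]) - k_len + 1  # "valid" positions only (assumption)
    output = []
    for t in range(out_len):
        total = bias
        for x, k in zip(input_maps, kernels):
            # Sliding dot product of one input map with its kernel
            total += sum(x[t + n] * k[n] for n in range(k_len))
        output.append(relu(total))
    return output


# Two hypothetical input maps and two kernels of size 3
maps = [[1.0, 2.0, 3.0, 4.0], [0.0, -1.0, 0.0, 1.0]]
kernels = [[-1.0, 0.0, 1.0], [0.5, 0.5, 0.5]]
print(conv1d_output_map(maps, kernels, bias=0.5))  # [2.0, 2.5]
```

A full layer would repeat this computation once per output map j, each with its own set of kernels and its own bias.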

The pooling layers decrease the dimension of the upper layer feature map and achieve the purpose of information filtering. During practical application, max-pooling is often used, and its mathematical model is shown as Equation (2):

P_{i}^{l+1}(j) = \max_{(j-1)W + 1 \leq t \leq jW} \{ q_{i}^{l}(t) \}

(2)

where q_{i}^{l}(t) represents the value of the t-th neuron of the i-th feature map in layer l; W is the size of the pooling area; and P_{i}^{l+1}(j) is the value of the corresponding neuron in layer l + 1.
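Equation (2) describes non-overlapping windows of width W. A minimal sketch in plain Python is given below; the function name is hypothetical, and dropping trailing samples that do not fill a whole window is an assumption, since Equation (2) does not cover that case.

```python
def max_pool1d(feature_map, width):
    """Non-overlapping max-pooling following Equation (2).

    Output element j (1-based) is the maximum of the inputs at positions
    (j - 1) * width + 1 ... j * width. Trailing samples that do not fill a
    whole window are dropped (assumption).
    """
    n_out = len(feature_map) // width
    return [max(feature_map[j * width:(j + 1) * width]) for j in range(n_out)]


q = [0.1, 0.7, 0.3, 0.9, 0.2, 0.4, 0.8, 0.5, 0.6, 0.0]
print(max_pool1d(q, 5))  # [0.9, 0.8]
```

With W = 5, a feature map of 600 points shrinks to 120 points, which is the reduction listed in Table 5.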

In the fully-connected layer, each neuron node is connected to all nodes of the previous layer of neurons, and the feature classification is performed using a specific activation function.

2.2. RNN

Compared with CNNs, RNNs are better suited to processing sequence signals. The output of a neuron in an RNN is determined by its current input and its output at the previous moment. Assume the input x = (x_{1}, x_{2}, \dots, x_{t}), the hidden layer h = (h_{1}, h_{2}, \dots, h_{t}), and the output y = (y_{1}, y_{2}, \dots, y_{t}); then for time t:

h_{t} = H (W_{xh} x_{t} + W_{hh} h_{t-1} + b_{h})

(3)

y_{t} = W_{hy} h_{t} + b_{y}

(4)

where W denotes the weight matrices; b is the bias vector; and H is the activation function in the hidden layer.
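To make the recurrence in Equations (3) and (4) concrete, here is a small sketch using plain Python lists. It assumes H = tanh and a zero initial hidden state, neither of which is specified in the text; the helper names and weight values are hypothetical.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def rnn_forward(xs, W_xh, W_hh, W_hy, b_h, b_y):
    """Run Equations (3) and (4) over a sequence, with H = tanh (assumed)."""
    h = [0.0] * len(b_h)  # h_0 initialised to zeros (assumption)
    ys = []
    for x_t in xs:
        pre = [a + b + c
               for a, b, c in zip(matvec(W_xh, x_t), matvec(W_hh, h), b_h)]
        h = [math.tanh(p) for p in pre]                           # Equation (3)
        ys.append([a + b for a, b in zip(matvec(W_hy, h), b_y)])  # Equation (4)
    return ys


# Hypothetical sizes: 1 input feature, 2 hidden units, 1 output
xs = [[1.0], [0.5], [-1.0]]
W_xh = [[0.5], [-0.3]]
W_hh = [[0.1, 0.2], [0.0, 0.1]]
W_hy = [[1.0, -1.0]]
ys = rnn_forward(xs, W_xh, W_hh, W_hy, b_h=[0.0, 0.0], b_y=[0.0])
print(len(ys))  # 3, one output per time step
```

Because h is carried from one iteration to the next, each output depends on all earlier inputs, which is the property that makes the structure suitable for sequences.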

However, the traditional RNN has a problem called the vanishing gradient. One way to solve the problem is by using the LSTM model. The basic unit of LSTM is represented by cells, and the input, forget, and output gates control the behavior of the cells to achieve the long-term storage of memory information. Figure 1 shows the workflow of an LSTM model [14].

According to the above workflow, we can calculate as follows:

i_{t} = \sigma (W_{xi} x_{t} + W_{hi} h_{t-1} + W_{ci} c_{t-1} + b_{i})

(5)

f_{t} = \sigma (W_{xf} x_{t} + W_{hf} h_{t-1} + W_{cf} c_{t-1} + b_{f})

(6)

c_{t} = f_{t} c_{t-1} + i_{t} \tanh (W_{xc} x_{t} + W_{hc} h_{t-1} + b_{c})

(7)

o_{t} = \sigma (W_{xo} x_{t} + W_{ho} h_{t-1} + W_{co} c_{t-1} + b_{o})

(8)

h_{t} = o_{t} \tanh (c_{t})

(9)

where \sigma represents the sigmoid function; i_{t}, f_{t}, and o_{t} are the input gate, the forget gate, and the output gate, respectively; c_{t} is the cell activation vector, which has the same length as the hidden vector h_{t}; and W_{ci}, W_{cf}, and W_{co} are the weight matrices of the peephole connections.
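Equations (5) to (9) can be traced with a scalar example. The sketch below is an illustration under simplifying assumptions: the input, hidden state, and cell state are single numbers, so every weight is a scalar, and the parameter dictionary with its key names is hypothetical.

```python
import math


def sigmoid(z):
    """Logistic sigmoid function."""
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x_t, h_prev, c_prev, p):
    """One peephole-LSTM step for scalar input and state, Equations (5)-(9).

    p is a dict of scalar weights and biases named after the symbols in the
    text, for example p["W_xi"], p["W_ci"] and p["b_i"].
    """
    i_t = sigmoid(p["W_xi"] * x_t + p["W_hi"] * h_prev
                  + p["W_ci"] * c_prev + p["b_i"])            # Equation (5)
    f_t = sigmoid(p["W_xf"] * x_t + p["W_hf"] * h_prev
                  + p["W_cf"] * c_prev + p["b_f"])            # Equation (6)
    c_t = f_t * c_prev + i_t * math.tanh(
        p["W_xc"] * x_t + p["W_hc"] * h_prev + p["b_c"])      # Equation (7)
    o_t = sigmoid(p["W_xo"] * x_t + p["W_ho"] * h_prev
                  + p["W_co"] * c_prev + p["b_o"])            # Equation (8)
    h_t = o_t * math.tanh(c_t)                                # Equation (9)
    return h_t, c_t


names = ["W_xi", "W_hi", "W_ci", "b_i", "W_xf", "W_hf", "W_cf", "b_f",
         "W_xc", "W_hc", "b_c", "W_xo", "W_ho", "W_co", "b_o"]
zero_params = dict.fromkeys(names, 0.0)

# With all parameters at zero every gate equals 0.5, so the cell keeps
# half of its previous content: c_t = 0.5 * c_prev.
h, c = lstm_step(x_t=0.3, h_prev=0.0, c_prev=1.0, p=zero_params)
print(c)  # 0.5
```

The forget gate f_{t} scales the previous cell state rather than overwriting it, which is how the cell retains information over many time steps.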

2.3. Network Model

After comprehensive analysis and study of the characteristics of the ECG signals, this paper designed the multi-channel CNN-LSTM model structure shown in Figure 2. The input layer received the heartbeat signals extracted by preprocessing: five adjacent heartbeats (i.e., 3 s of ECG data) were selected and fed into five identical CNN channels to obtain feature maps, which were concatenated and passed to the LSTM network to acquire the temporal characteristics between the heartbeats. Finally, the temporal characteristics were output to the fully-connected layer for classification. The network structure had 16 layers: nine convolutional layers, two max-pooling layers, one global average pooling layer, one dropout layer, one LSTM layer, one flatten layer, and one fully-connected layer.

(1): Convolutional layers: since the input ECG signals were one-dimensional signals, the selected numbers of filters of the one-dimensional convolutional layer were 4, 8, and 16, respectively; the convolution kernel size was 5 and the stride was 1; the specific parameter setting process is described in Section 4. Meanwhile, batch normalization was utilized to ensure that the input distribution of each layer of the neural network was the same, and ReLU function was applied as the activation function. Compared with the sigmoid function and tanh function, the ReLU function converges faster and alleviates the over-fitting problem. ReLU activation function can be expressed as Equation (10):

f(x) = \begin{cases} 0, & x < 0 \\ x, & x \geq 0 \end{cases}

(10)

(2): Pooling layers: max-pooling (kernel size 5) was used in the fourth and eighth layers of the network to reduce the dimensionality of the feature maps while retaining the significant features, which accelerates the calculation. The twelfth layer used global average pooling to condense each feature map into a single value and obtain the output of the convolutional network.
(3): Dropout layer: the dropout layer was applied between the global average pooling layer and the LSTM layer to achieve stronger generalization capability by randomly invalidating some network nodes.
(4): LSTM layer: after the convolutional network as described above, due to the strong temporal correlation of the ECG signals, we connected a layer of the LSTM network to obtain the temporal characteristics in the output features of the convolutional network.
(5): Flatten layer: we converted the multi-dimensional output of the LSTM network into a one-dimensional output.
(6): Fully-connected layer: the features after all processes were input to the fully-connected layer for classification, and the classifier was Softmax.
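The multi-channel input described above takes five adjacent heartbeats at a time, one per CNN channel. The sketch below shows one possible way to form such groups. Whether consecutive groups overlap is not stated in the text, so non-overlapping windows are an assumption here, and the function name is hypothetical.

```python
def group_beats(beats, n_channels=5):
    """Group consecutive heartbeats into non-overlapping windows.

    Each window holds n_channels adjacent beats, one per CNN channel.
    Leftover beats that do not fill a window are discarded (assumption).
    """
    n_windows = len(beats) // n_channels
    return [beats[w * n_channels:(w + 1) * n_channels]
            for w in range(n_windows)]


# 12 hypothetical beats of 600 samples each -> 2 windows of 5 beats
beats = [[float(b)] * 600 for b in range(12)]
windows = group_beats(beats)
print(len(windows), len(windows[0]), len(windows[0][0]))  # 2 5 600
```

Each window then corresponds to 5 × 0.6 s = 3 s of ECG data, matching the input length given for the five-channel model.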

3. Experiments

3.1. Data Sources

The myocardial infarction ECG data used in this paper were from the Physikalisch-Technische Bundesanstalt (PTB) database [15], provided by the German National Metrology Institute. The PTB database contains 549 records from 290 subjects. Each record was acquired synchronously from 15 leads, comprising the 12 conventional leads and 3 Frank leads (VCG), and was labeled by a professional medical practitioner. The sampling frequency of the ECG signals in the PTB database is 1000 Hz. The database includes 148 cases of myocardial infarction (368 records) and 52 healthy volunteers (80 records); the remaining records cover heart diseases such as myocarditis, rhythm disorder, and unstable angina. To study single-lead myocardial infarction ECG classification, we extracted the I-lead myocardial infarction and healthy signals, each 30 s long, from the above 15 leads as the experimental data.

3.2. Data Preprocessing

During the acquisition process, ECG signals are subject to three types of noise: myoelectric interference, baseline drift, and power line interference. The wavelet transform is effective at eliminating these three kinds of ubiquitous noise in ECG signals. This paper used the wavelet transform method proposed in [16] to filter the noise from the original ECG, and utilized the Daubechies D6 (‘db6’) wavelet basis function to decompose the ECG signals to 10 levels. Table 1 lists the frequency bands of the wavelet transform components of the ECG signals. The low- and high-frequency components are called the approximation and the details, respectively. We removed the D1 (250–500 Hz), D2 (125–250 Hz), and D3 (62.5–125 Hz) detail components and the A10 (0–0.4875 Hz) approximation component, and reconstructed the signal from the remaining components to obtain a denoised signal. Figure 3 shows the heartbeats before and after the noise was removed.
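The band edges in Table 1 follow from the dyadic structure of the discrete wavelet transform: at a sampling rate fs, detail level k nominally covers fs / 2^(k+1) to fs / 2^k. The short sketch below recomputes these bands for fs = 1000 Hz and derives the band retained after the listed components are removed. It only does the arithmetic and does not perform the wavelet filtering itself; the function name is hypothetical.

```python
def dwt_bands(fs, levels):
    """Nominal frequency band of each DWT component for sampling rate fs.

    Detail D_k covers fs / 2**(k + 1) to fs / 2**k; the final approximation
    A_levels covers 0 to fs / 2**(levels + 1).
    """
    bands = {f"D{k}": (fs / 2 ** (k + 1), fs / 2 ** k)
             for k in range(1, levels + 1)}
    bands[f"A{levels}"] = (0.0, fs / 2 ** (levels + 1))
    return bands


bands = dwt_bands(fs=1000, levels=10)
removed = {"D1", "D2", "D3", "A10"}  # components discarded in the text
kept = [name for name in bands if name not in removed]
low = min(bands[name][0] for name in kept)
high = max(bands[name][1] for name in kept)
print(kept)
print(f"retained band: {low:.3f} to {high:.1f} Hz")  # 0.488 to 62.5 Hz
```

The exact dyadic values differ slightly from the rounded entries in Table 1 (for example 0.488 Hz versus 0.4875 Hz). The retained band of roughly 0.5 to 62.5 Hz excludes the very low frequencies associated with baseline drift and the higher frequencies where myoelectric noise lies.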

After the noise was removed, each ECG record was segmented into fixed-length heartbeats to suit the input requirements of the CNN. First, we used the Pan–Tompkins algorithm to detect the R-peaks [17], and then the R-peak positions were utilized for heartbeat segmentation. After segmentation, each heartbeat contained 600 sampling points (the R-peak itself, 199 sampling points on its left side, and 400 sampling points on its right side), so the length of a single heartbeat was 0.6 s, which basically covered the range of a P–QRS–T wave. The amplitude distributions of the normal and myocardial infarction segments differed, which would slow down training. Therefore, the segmented ECG signals were normalized to improve the convergence speed of the model. As shown in Figure 4, there was a clear distinction between the normal ECG and the myocardial infarction ECG [18].
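The segmentation step can be sketched as follows, assuming the R-peak indices are already available (the Pan–Tompkins detector itself is not implemented here). The normalization method is not specified in the text, so min-max scaling to [0, 1] is an assumption, and both function names are hypothetical.

```python
def segment_beats(signal, r_peaks, left=199, right=400):
    """Cut one fixed-length beat around each R-peak index.

    Each beat spans r - left ... r + right inclusive, that is
    left + 1 + right = 600 samples (0.6 s at 1000 Hz). Peaks too close to
    either end of the record are skipped.
    """
    beats = []
    for r in r_peaks:
        start, stop = r - left, r + right + 1
        if start >= 0 and stop <= len(signal):
            beats.append(signal[start:stop])
    return beats


def min_max_normalize(beat):
    """Scale a beat to [0, 1]; min-max scaling is an assumed choice."""
    lo, hi = min(beat), max(beat)
    if hi == lo:
        return [0.0] * len(beat)
    return [(v - lo) / (hi - lo) for v in beat]


# Hypothetical 3 s record with spikes standing in for R-peaks
signal = [0.0] * 3000
r_peaks = [100, 900, 1700, 2700]
for r in r_peaks:
    signal[r] = 1.5
beats = [min_max_normalize(b) for b in segment_beats(signal, r_peaks)]
print(len(beats), len(beats[0]), beats[0][199])  # 2 600 1.0
```

In this example the first and last peaks are dropped because a full 600-sample window does not fit inside the record, and the R-peak always lands at index 199 of its beat.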

3.3. Balanced Data

The ECG data samples used in the study were not balanced: there were significantly fewer healthy records than myocardial infarction records. To avoid over-fitting during training and improve the generalization ability of the model, the healthy data were randomly oversampled, so that the number of healthy samples after oversampling was approximately the same as the number of myocardial infarction samples.
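Random oversampling of the minority class can be illustrated as below. This is a minimal sketch: sampling with replacement until the two classes are exactly equal in size is an assumption (the text only says "approximately the same"), and the sample names are placeholders.

```python
import random


def oversample_minority(majority, minority, seed=0):
    """Randomly duplicate minority samples until both classes match in size."""
    rng = random.Random(seed)
    n_extra = len(majority) - len(minority)
    extra = [rng.choice(minority) for _ in range(n_extra)]
    return minority + extra


mi_samples = [f"mi_{n}" for n in range(10)]           # hypothetical majority class
healthy_samples = [f"healthy_{n}" for n in range(3)]  # hypothetical minority class
balanced_healthy = oversample_minority(mi_samples, healthy_samples)
print(len(mi_samples), len(balanced_healthy))  # 10 10
```

The original healthy samples are all kept, and only duplicates are appended, so no information is discarded from either class.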

3.4. Cross-Validation

In order to improve the robustness of the algorithm model, 10-fold cross-validation was used in the training process. The pre-processed data were randomly divided into 10 parts. In the calculation of each fold, 90% of the data was used to train the model, and 10% of the data was used as a test set to test the performance of the model. This process was repeated 10 times, and the corresponding evaluation indicator for each calculation was recorded. Meanwhile, in order to observe the parameter variation of the training process and prevent over-fitting, 20% of the 90% training data was taken out as the validation set to test the performance of the model at each epoch. The data partitioning is shown in Figure 5.
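The partitioning described above can be sketched with index lists. This is an illustration only: the shuffling seed, the way folds are formed, and the function name are assumptions; the proportions (10 folds, 20% of the training portion held out for validation) come from the text.

```python
import random


def ten_fold_splits(n_samples, n_folds=10, val_fraction=0.2, seed=0):
    """Yield (train, validation, test) index lists for each fold."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    folds = [indices[k::n_folds] for k in range(n_folds)]
    for k in range(n_folds):
        test = folds[k]
        rest = [i for m, fold in enumerate(folds) if m != k for i in fold]
        n_val = round(len(rest) * val_fraction)  # 20% of the 90% portion
        yield rest[n_val:], rest[:n_val], test


for train, val, test in ten_fold_splits(1000):
    print(len(train), len(val), len(test))  # 720 180 100
    break
```

For 1000 samples, each fold therefore trains on 72% of the data, validates on 18%, and tests on the remaining 10%.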

3.5. Evaluation Index

In the analysis of the classification effect of the model, this paper comprehensively considered the following four indicators: accuracy (Acc), sensitivity (Sen), specificity (Spec), and F1 score (F1). The calculation method and meaning of each indicator are as follows:

Acc = \frac{TP + TN}{TP + TN + FN + FP}

(11)

Sen = \frac{TP}{TP + FN}

(12)

Spec = \frac{TN}{TN + FP}

(13)

F1 = \frac{2 TP}{2 TP + FP + FN}

(14)

where true positive (TP) is the number of myocardial infarction ECG correctly classified as myocardial infarction; true negative (TN) is the number of normal ECG correctly classified as normal; false positive (FP) is the number of normal ECG marked as myocardial infarction; and false negative (FN) is the number of myocardial infarction ECG marked as normal. In addition to the above indicators, because this paper studied a two-class problem, the receiver operating characteristic (ROC) curve and the area under the curve (AUC) were used to describe the performance of the model.
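Equations (11) to (14) translate directly into code. The counts in the example below are hypothetical and are not taken from the paper's confusion matrix; the function name is likewise an illustration.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity, specificity and F1 from Equations (11)-(14)."""
    return {
        "Acc": (tp + tn) / (tp + tn + fn + fp),
        "Sen": tp / (tp + fn),
        "Spec": tn / (tn + fp),
        "F1": 2 * tp / (2 * tp + fp + fn),
    }


# Hypothetical confusion-matrix counts (not taken from the paper)
m = classification_metrics(tp=90, tn=40, fp=10, fn=10)
print({name: round(value, 3) for name, value in m.items()})
# {'Acc': 0.867, 'Sen': 0.9, 'Spec': 0.8, 'F1': 0.9}
```

Note that sensitivity and F1 depend only on how the positive (myocardial infarction) class is handled, whereas specificity depends only on the normal class, so the four indicators can diverge on imbalanced test sets.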

4. Results

4.1. Development Environment

The experimental environment of this article was as follows: an Intel Core CPU, 8 GB of RAM, and a GTX 750 graphics card. The development platform was Python 3.7, using the Keras framework with TensorFlow as the back-end.

4.2. Impact of Channel Numbers on Performance Indicators

During the experiment to test the influence of input length on the accuracy of the model, this paper found that when the number of model channels was set to five, i.e., when the data of five adjacent heartbeats (3 s) were selected as the input, the accuracy was the highest. Table 2 shows the experimental results of different heartbeats. Compared with the input length of 10 heartbeats (6 s), the accuracy, sensitivity, specificity, and F1 increased by 1.1%, 1.7%, 4.6%, and 1.4%, respectively; and compared with the input length of 15 heartbeats (9 s), the accuracy, sensitivity, specificity, and F1 indicators increased by 2.7%, 4.6%, 6.0%, and 3.4%, respectively.

Thus, the model channel was set to five, and we used five heartbeats as the input to obtain the classification effect confusion matrix as shown in Figure 6, which indicates that the model could identify 98% of myocardial infarction ECG. The ROC curve is shown in Figure 7. The AUC value of 0.9868 indicates that the model had excellent classification performance.
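The AUC can be read as the probability that a randomly chosen myocardial infarction sample receives a higher score than a randomly chosen normal sample. The sketch below computes it from that pairwise definition, which is equivalent to the area under the ROC curve; the labels and scores are hypothetical, and this is not necessarily how the authors computed their value.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties count as one half. labels are 1 (myocardial infarction) or
    0 (normal); scores are the predicted probabilities of class 1.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


labels = [1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 4))  # 0.8333, since 5 of the 6 pairs are ordered correctly
```

An AUC of 1.0 would mean every myocardial infarction sample is scored above every normal sample, while 0.5 corresponds to random ranking.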

4.3. Impact of Convolution Kernel Sizes on Classification Results

After determining the number of channels and layers of the network structure, since the size of the convolution kernel has great influence on the classification performance and operation speed, we tested five different convolution kernel sizes for the convolutional network, and the results are shown in Table 3. When the convolution kernel size was between 5 and 9, the AUC value improved, peaking at 0.9908 for a size of 9 before falling back to 0.9868 for a size of 11. Therefore, a kernel size in the range [5,9] gives a better classification effect without wasting computing resources.

4.4. Impact of Different Optimizers and Learning Rates on Performance Index

Model training speed and classification accuracy can be improved by selecting an appropriate optimizer and optimal learning rate. In this paper, three commonly used optimizers—RMSprop, SGD, and Adam—were selected, the model was trained with different learning rates, and the average accuracy of the test set was the evaluation index. As the experimental results show in Table 4, the accuracy was the highest when using the Adam optimizer and the learning rate was set to 0.0001.

4.5. Determination of Model Parameters

After the above-mentioned experiments, the model parameters were determined (shown in Table 5). The numbers of filters in the convolutional layers were 4, 8, and 16, respectively. The convolution kernel size was 5, the stride was 1, and the activation function was ReLU; the size of the max-pooling layers was 5; the dropout rate was 0.5; the optimizer was Adam, with a learning rate of 0.0001; the batch size was 32, and each training run lasted 100 epochs.
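The output shapes listed in Table 5 for a single CNN channel can be reproduced from these settings with a few lines of arithmetic. The sketch assumes "same"-padded convolutions with stride 1 (so the length is unchanged) and max-pooling whose stride equals its size; both are inferred from the shapes in Table 5 rather than stated in the text. The function name is hypothetical.

```python
def trace_channel_shapes(length=600, filters=(4, 8, 16), pool=5):
    """Follow one CNN channel and return (layer, output shape) pairs.

    Assumes "same"-padded convolutions with stride 1 and max-pooling whose
    stride equals its size, which reproduces the shapes in Table 5.
    """
    shapes = [("Input", (length, 1))]
    for block, n_filters in enumerate(filters):
        shapes.append((f"Convolution x3 ({n_filters} filters)",
                       (length, n_filters)))
        if block < len(filters) - 1:
            length //= pool
            shapes.append(("Max-Pooling", (length, n_filters)))
    shapes.append(("Global-Avg-Pooling", (filters[-1],)))
    return shapes


for name, shape in trace_channel_shapes():
    print(f"{name:32s} {shape}")
```

Each channel thus reduces a 600-point heartbeat to a 16-value feature vector, and the five channels together supply the 16 × 5 input that Table 5 lists for the dropout and LSTM layers.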

5. Discussion

This paper used 10-fold cross-validation to train the model, and Table 6 compares the resulting evaluation indicators with those of existing methods. As Table 6 shows, the classification and recognition of myocardial infarction ECG have focused on multi-lead studies. By extracting the time-frequency domain features [3,5] and the wavelet coefficient features [7] of multi-lead ECG signals, myocardial infarction classification can be achieved with a high recognition rate through SVM, KNN, and other methods. Besides the traditional 12 leads, the 3 Frank leads can also be used to derive the vectorcardiogram (VCG) to detect myocardial infarction. Dawson et al. found that the 12-lead ECG could be linearly transformed from a 3-lead VCG [19]. Aiming to classify myocardial infarction with VCG signals, Huang et al. acquired VCG signals from Frank XYZ leads and extracted 64 features to complete the detection [20]. Ge obtained multivariable autoregressive coefficients from the VCG signal to classify myocardial infarction [21]. However, the above methods all rely on a complicated handcrafted feature extraction step, and their calculation process is relatively cumbersome. In addition, the data volume recorded by a multi-lead system is often very large and places more constraints on patients, which makes such systems unsuitable for portable monitoring. In studies of single-lead myocardial infarction, Safdarian [6] and Zewdie [22] used T-wave detection or morphological information as features to classify with the Naive Bayes and SVM methods, respectively; Acharya [13] achieved classification through a CNN structure with a 95.2% accuracy rate. Unlike Acharya [13], this paper considered the ECG signal as a time series, combined CNN and LSTM to extract deeper features, and eliminated steps such as complex feature extraction without decreasing accuracy.

It is worth mentioning that although the deep learning model can learn feature information automatically, it requires more training data and training time than the traditional methods. The model spent about 15 s per epoch during training. However, a trained model does not need to be retrained to classify new data. During the test of 358 datasets, the total test time was 2.2 s, and the average test time was about 60 ms. Although the processing speed of conventional portable devices cannot match that of the computer used in the experiment, the algorithm can be implemented on a cloud platform for real-time processing to meet the requirements of clinical applications.

6. Conclusions

Early diagnosis of myocardial infarction is crucial to reduce patient mortality. To diagnose different types of myocardial infarction, many researchers have focused on 12 lead ECG and Frank lead VCG and have achieved great performances. However, multi-lead ECG devices are cumbersome instruments that are only available in hospitals and clinics. Due to advancements in technology, single-lead ECG devices are available for individual and home use for basic cardiac monitoring. Particularly in recent years, with the prevalence of portable ECG testing equipment, utilizing single-lead ECG to prevent and monitor myocardial infarction has played an important role. This paper proposed a classification model of myocardial infarction ECG based on multi-channel CNN and RNN. The network structure had deep structural features, which could acquire the spatial and temporal characteristics of ECG signals. Without any handcrafted feature extraction, the model could obtain an accuracy of 95.4%, a sensitivity of 98.2%, a specificity of 86.5%, and an F1 of 96.8%. It is an effective solution for the automatic classification of myocardial infarction ECG, which can help clinicians, non-specialists, or individuals achieve the prevention and diagnosis of myocardial infarction. However, the occurrence of myocardial infarction is often accompanied by other types of abnormal ECG, so in the future, we will focus on how to further optimize the model structure. Meanwhile, we will cooperate with clinicians to obtain more types of ECG data, and apply the model to other abnormal ECG for recognition and classification.

Author Contributions

Conceptualization, K.F. and X.P.; Data curation, K.F. and X.P.; Formal analysis, H.L. and K.S.; Writing—Original draft, K.F.; Writing—Review & editing, X.P. and H.L.

Funding

This research was funded by the National Natural Science Foundation of China (81671850), and the Chongqing Technological Innovation and Application Demonstration Project (cstc2018jscx-mszdX0027).

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Benjamin, E.J.; Blaha, M.J.; Chiuve, S.E.; Cushman, M.; Das, S.R.; Deo, R.; De Ferranti, S.D.; Floyd, J.; Fornage, M.; Gillespie, C.; et al. Heart Disease and Stroke Statistics-2017 Update: A Report from the American Heart Association. Circulation 2017, 135, e146–e603.
2. Thygesen, K.; Alpert, J.S.; Jaffe, A.S.; Simoons, M.L.; Chaitman, B.R.; White, H.D.; Katus, H.A.; Lindahl, B.; Morrow, D.A.; Clemmensen, P.M.; et al. Third Universal Definition of Myocardial Infarction. Circulation 2012, 126, 2020–2035.
3. Sun, L.; Lu, Y.; Yang, K.; Li, S. ECG analysis using multiple instance learning for myocardial infarction detection. IEEE Trans. Biomed. Eng. 2012, 59, 3348–3356.
4. Arif, M.; Malagore, I.A.; Afsar, F.A. Detection and localization of myocardial infarction using K-nearest neighbor classifier. J. Med. Syst. 2012, 279–289.
5. Dev Sharma, L.; Kumar Sunkaria, R. Inferior myocardial infarction detection using stationary wavelet transform and machine learning approach. Signal Image Video Process. 2018, 12, 199–206.
6. Safdarian, N.; Dabanloo, N.J.; Attarodi, G. A New Pattern Recognition Method for Detection and Localization of Myocardial Infarction Using T-Wave Integral and Total Integral as Extracted Features from One Cycle of ECG Signal. J. Biomed. Sci. Eng. 2014, 07, 818–824.
7. Farooq, A.; Anwar, S.; Awais, M.; Rehman, S. A deep CNN based multi-class classification of Alzheimer’s disease using MRI. In Proceedings of the 2017 IEEE International Conference on Imaging Systems and Techniques (IST), Beijing, China, 18–20 October 2017; pp. 1–6.
8. He, T.; Droppo, J. Exploiting LSTM structure in deep neural networks for speech recognition. In Proceedings of the 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Shanghai, China, 20–25 March 2016; pp. 5445–5449.
9. Yoo, Y.; Baek, J.-G. A Novel Image Feature for the Remaining Useful Lifetime Prediction of Bearings Based on Continuous Wavelet Transform and Convolutional Neural Network. Appl. Sci. 2018, 8, 1102.
10. Xiong, Z.; Stiles, M.; Zhao, J. Robust ECG Signal Classification for the Detection of Atrial Fibrillation Using Novel Neural Networks. In Proceedings of the 2017 Computing in Cardiology Conference, Rennes, France, 24–27 September 2017; pp. 1–4.
11. Saadatnejad, S.; Oveisi, M.; Hashemi, M. LSTM-Based ECG Classification for Continuous Monitoring on Personal Wearable Devices. arXiv 2018, arXiv:1812.04818.
12. Reasat, T.; Shahnaz, C. Detection of inferior myocardial infarction using shallow convolutional neural networks. In Proceedings of the 2017 IEEE Region 10 Humanitarian Technology Conference, Dhaka, Bangladesh, 21–23 December 2017; pp. 718–721.
13. Acharya, U.R.; Fujita, H.; Oh, S.L.; Hagiwara, Y.; Tan, J.H.; Adam, M. Application of deep convolutional neural network for automated detection of myocardial infarction using ECG signals. Inf. Sci. 2017, 415–416, 190–198.
14. Li, X.; Wu, X. Long short-term memory based convolutional recurrent neural networks for large vocabulary speech recognition. In Proceedings of the Annual Conference of the International Speech Communication Association (INTERSPEECH 2015), Dresden, Germany, September 2015; pp. 3219–3223.
15. Goldberger, A.L.; Amaral, L.A.N.; Glass, L.; Hausdorff, J.M.; Ivanov, P.C.; Mark, R.G.; Mietus, J.E.; Moody, G.B.; Peng, C.-K.; Stanley, H.E. PhysioBank, PhysioToolkit, and PhysioNet. Circulation 2000, 101, E215–E220.
16. Martis, R.J.; Acharya, U.R.; Min, L.C. ECG beat classification using PCA, LDA, ICA and Discrete Wavelet Transform. Biomed. Signal Process. Control 2013, 8, 437–448.
17. Pan, J.; Tompkins, W.J. A Real-Time QRS Detection Algorithm. IEEE Trans. Biomed. Eng. 1985, 32, 230–236.
18. De Luna, A.B.; Zareba, W.; Fiol, M.; Nikus, K.; Birnbaum, Y.; Baranowski, R.; Goldwasser, D.; Kligfield, P.; Piotrowicz, R.; Breithardt, G.; et al. Negative T wave in ischemic heart disease: A consensus article. Ann. Noninvasive Electrocardiol. 2014, 19, 426–441.
19. Dawson, D.; Yang, H.; Malshe, M.; Bukkapatnam, S.T.S.; Benjamin, B.; Komanduri, R. Linear affine transformations between 3-lead (Frank XYZ leads) vectorcardiogram and 12-lead electrocardiogram signals. J. Electrocardiol. 2009, 42, 622–630.
20. Huang, C.S.; Ko, L.W.; Lu, S.W.; Chen, S.A.; Lin, C.T. A vectorcardiogram-based classification system for the detection of Myocardial infarction. Conf. Proc. IEEE Eng. Med. Biol. Soc. 2011, 2011, 973–976.
21. Ge, D. Detecting myocardial infraction using VCG leads. In Proceedings of the 2008 2nd International Conference on Bioinformatics and Biomedical Engineering, Shanghai, China, 16–18 May 2008; pp. 2217–2220.
22. Zewdie, G.; Xiong, M. Fully Automated Myocardial Infarction Classification using Ordinary Differential Equations. arXiv 2019, arXiv:1812.04818.
23. Strodthoff, N.; Strodthoff, C. Detecting and interpreting myocardial infarction using fully convolutional neural networks. Physiol. Meas. 2019, 40.

Figure 1. The workflow of an LSTM model.

Figure 2. The network structure diagram.

Figure 3. Heartbeats before and after noises were removed: (a) The original ECG signal; (b) ECG filtered signal by wavelet transform.

Figure 4. Heartbeats after normalization: (a) Normalized healthy heartbeat; (b) Normalized myocardial infarction heartbeat.

Figure 5. Data partitioning diagram.

Figure 6. Confusion matrix of classification effect.

Figure 7. ROC curve of classification effect.

Table 1. The frequency band of wavelet transform components.

Components	Frequency Band (Hz)	Components	Frequency Band (Hz)
D1	250–500	D7	3.9–7.81
D2	125–250	D8	1.95–3.9
D3	62.5–125	D9	0.975–1.95
D4	31.25–62.5	D10	0.4875–0.975
D5	15.63–31.25	A10	0–0.4875
D6	7.81–15.63

Table 2. Experimental results of different heartbeats.

Input Length	Acc	Sen	Spec	F1
3 s	95.4%	98.2%	86.5%	96.8%
6 s	94.3%	96.5%	81.9%	95.4%
9 s	92.7%	93.6%	80.5%	93.4%

Table 3. Experimental results of different convolution kernel sizes.

Model	Conv (Layers 1–3)	Pooling (Layer 4)	Conv (Layers 5–7)	Pooling (Layer 8)	Conv (Layers 9–11)	AUC
5 Channels Model	1 × 3	1 × 5	1 × 3	1 × 5	1 × 3	0.9841
5 Channels Model	1 × 5	1 × 5	1 × 5	1 × 5	1 × 5	0.9868
5 Channels Model	1 × 7	1 × 5	1 × 7	1 × 5	1 × 7	0.9903
5 Channels Model	1 × 9	1 × 5	1 × 9	1 × 5	1 × 9	0.9908
5 Channels Model	1 × 11	1 × 5	1 × 11	1 × 5	1 × 11	0.9868

Table 4. Experimental results of different optimizers and learning rates.

Optimizer	Acc (Learning Rate 0.0001)	Acc (Learning Rate 0.001)	Acc (Learning Rate 0.01)	Acc (Learning Rate 0.1)
Adam	95.4%	93.6%	91.0%	90.5%
RMSprop	94.2%	93.2%	90.5%	89.6%
SGD	94.8%	92.0%	89.4%	88.7%

Table 5. Parameter setting of the model.

Layers	Type		Filter Number	Kernel Size	Output Shape
0	Input				600 × 1
1–3	Convolution	×5	4	5	600 × 4
4	Max-Pooling		-	5	120 × 4
5–7	Convolution		8	5	120 × 8
8	Max-Pooling		-	5	24 × 8
9–11	Convolution		16	5	24 × 16
12	Global-Avg-Pooling		-	-	16
	Concatenate		-		16 × 5
13	Dropout		0.5		16 × 5
14	LSTM		-		16 × 1
15	Flatten		-		16
16	Fully Connected		-		2

Table 6. Comparison of the classification effects between the method proposed in this paper and the others.

Author	Leads	Methods	Performance
Sun et al. [3]	12 leads	ST segment features using multiple instance learning and SVM	Sen = 92.6%; Spec = 82.4%
Sharma et al. [5]	II, III, aVF	SVM + KNN on frequency domain features	Sen = 98.7%; Spec = 98.7%
Remya et al. [7]	II, III, aVF, V2, V3, V5	ANN based on wavelet features	Acc = 86%; Sen = 83%; Spec = 88%
Reasat et al. [12]	II, III, aVF	Shallow convolutional neural networks	Acc = 85%; Sen = 85%; Spec = 84%
Strodthoff et al. [23]	12 leads	Fully convolutional neural networks	Acc = 94.1%; Sen = 93.7%; Spec = 96.1%
Huang et al. [20]	Frank leads	Time and statistics features with KNN + SVM	Sen = 99.8%; Spec = 92.5%
Ge [21]	Frank leads	Multivariable autoregressive coefficients of VCG features	Acc = 99.1%
Safdarian et al. [6]	II lead	Naïve Bayes with T wave detection	Acc = 94.7%
Acharya et al. [13]	II lead	11-layer convolutional neural network	Acc = 95.2%; Sen = 95.5%; Spec = 94.2%
Zewdie et al. [22]	I lead	Morphological features with SVM	Acc = 97%
The proposed	I lead	CNN-LSTM	Acc = 95.4%; Sen = 98.2%; Spec = 86.5%; F1 = 96.8%

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
