Article

A Fault Diagnosis Model for Tennessee Eastman Processes Based on Feature Selection and Probabilistic Neural Network

College of Information Engineering, Nanchang University, Nanchang 330031, China
*
Author to whom correspondence should be addressed.
Appl. Sci. 2022, 12(17), 8868; https://doi.org/10.3390/app12178868
Submission received: 2 July 2022 / Revised: 28 August 2022 / Accepted: 1 September 2022 / Published: 4 September 2022

Abstract

The classification methods examined in previous studies can no longer meet the accuracy requirements of fault diagnosis in large-scale chemical industries and are therefore gradually being phased out. This research offers a probabilistic neural network (PNN) based on feature selection and a bio-heuristic optimizer as an artificial-intelligence fault diagnosis approach for chemical industries. The sample characteristics are first simplified using heuristic feature selection and support vector machine recursive feature elimination (SVM-RFE). Using PNN as the principal classifier of the fault diagnosis model and employing a modified salp swarm algorithm (MSSA) as the bio-heuristic optimizer to optimize the hidden-layer smoothing factor ($\sigma$) of the PNN further improves its classification performance. The MSSA introduces the Lévy flight method, greatly enhancing exploration capability and convergence speed compared with the standard SSA. To validate the engineering applicability of the suggested method, a PSO-SVM-RFE-MSSA-PNN model is created, and TE process data are utilized in the tests. The model's performance is evaluated by comparing its accuracy and F1-score with those of other regularly used classification models. The results indicate that PSO-SVM-RFE simplifies the data samples and eliminates redundant features more effectively than other feature selection techniques, that the optimization capability of the MSSA surpasses that of conventional optimization techniques, and that the PNN network is well suited for fault detection and classification in the chemical industry. These three considerations make it evident that the proposed approach can greatly help in identifying TE process faults.

1. Introduction

Due to the rapid growth of mechanical automation and artificial intelligence, chemical control systems no longer rely solely on manual operations to manage complicated chemical conditions. However, the products of the chemical process are frequently toxic, combustible, and explosive due to its tedious procedures and changing environment. Once a breakdown occurs, the risk is significantly greater than in other industries, and it can easily cause mass casualties, environmental degradation, and economic losses [1,2]. In the past few years, academic and industrial circles around the world have become increasingly interested in chemical process fault detection and diagnosis, which is one of the most important parts of modern chemical systems [3].
Failure in a chemical process is characterized by the deviation of one or more process variables from their normal state. Fault detection and diagnosis technology monitors the entire system operation and determines the fault type based on the state variables that deviate [4,5]. Currently, fault diagnosis methodologies can be loosely categorized as knowledge-driven, model-driven, and data-driven [6]. However, as the scale of the system continues to increase, the correlations between feature variables become ever tighter. Knowledge-driven and model-driven diagnostic procedures can no longer meet modern industrial systems' speed and accuracy requirements for massive data processing. The data-driven approach builds a decision-making model from process data, mimics how the plant actually operates, detects and diagnoses faults effectively, and is becoming the mainstream method.
In the realm of industrial process fault diagnosis, multivariate statistical approaches, such as independent component analysis (ICA), principal component analysis (PCA), and Fisher discriminant analysis (FDA), have been frequently utilized [7,8,9,10]. As the number of data dimensions increases, however, the complexity of these statistics-based procedures increases exponentially, resulting in a dimensional catastrophe. Shallow learning techniques such as the support vector machine (SVM), k-nearest neighbor (KNN), and artificial neural network (ANN) have gradually been applied to convert fault detection and diagnosis problems into classification problems [11,12,13]. However, they rely on extensive training and fault samples, which are difficult to obtain for chemical diagnostic models.
The problem of troubleshooting chemical processes has been of interest to research scholars. Ragab et al. [14] discovered hidden knowledge in industrial datasets by revealing explainable patterns associated with underlying physical phenomena through logical analysis of data (LAD). These patterns are then combined to build a decision model for diagnosing faults during process operations and explaining the potential causes of these faults. Zhang et al. [15] used bidirectional recurrent neural network (BiRNN) to construct fault detection and diagnosis (FDD) models with complex RNN units and demonstrated the effectiveness of implementing BiRNN in chemical process fault diagnosis. Wang et al. [16] proposed an extended deep belief network (EDBN) to use the valuable information in the raw data fully. The raw data are also combined with hidden features as the input to each extended restricted Boltzmann machine (ERBM). Wang et al. [17] used long short-term memory (LSTM) and convolutional neural network (CNN) to extract features separately and then fused the extracted features. The features are further compressed and extracted by using them as the input of a multilayer perceptron so that the final extracted features of the network have both spatial and temporal features, thus improving the diagnostic performance of the network.
In addition, artificial neural networks have promising applications in fault diagnosis, and many different types of artificial neural networks are available for classification tasks. As a supervised network classifier based on the Bayesian minimum-risk criterion, the probabilistic neural network (PNN) does not require weight adaptation, its learning process is straightforward, its training speed is quick, and it possesses strong robustness and fault tolerance. In addition, even with fewer training data, its insensitivity to noisy data maintains excellent diagnostic accuracy. It has been applied successfully to photovoltaic array fault diagnosis [18], circuit breaker fault diagnosis [19], and distributed generation fault diagnosis [20]. However, PNN has significant limitations, such as a low recognition rate and misclassification due to the use of the same smoothing factor throughout the iterative process, and a complex network structure when the sample set is large. In recent years, meta-heuristic algorithms, such as the grey wolf optimizer (GWO) [21], particle swarm optimization (PSO) [22], and the sparrow algorithm (SA) [23], have grabbed the attention of researchers and been widely applied to further improve the diagnostic performance of PNN. However, the salp swarm algorithm (SSA), created in 2017, has some advantages over conventional optimization algorithms, including a straightforward theory and a rapid search rate. In addition, it has the unique benefit that exploration and exploitation are balanced by a single parameter ($c_1$) [24,25].
Although research into chemical process fault identification has made tremendous strides thanks to machine learning techniques, there are still some issues to be resolved. Numerous studies have demonstrated that, due to noisy features, computations with all feature sets in chemical processes may not always yield ideal results. Feature selection algorithms can be used to remove these superfluous features. The feature selection problem in classification can be characterized as "identifying the smallest subset of features from the whole collection of features that achieves the highest classification accuracy." However, this frequently needs exponential calculation time, which is challenging. To improve classification, researchers apply evolutionary and heuristic methods to feature selection, such as the genetic algorithm (GA) [26], ant colony optimization (ACO) [27], and particle swarm optimization (PSO) [28]. In one such feature selection method, the particle swarm algorithm is combined with a support vector machine, which is used to evaluate the fitness value of the particle swarm. This allows for more effective implementation of the feature selection process, as well as improved processing speed and accuracy. When applied to classification problems, the method can improve accuracy by 2 to 4% [29].
SVM is not only a data-driven classification technique but also an excellent machine learning technique. Numerous novel techniques combine data dimensionality reduction with SVM for process monitoring, fault information extraction, and variable elimination. The SVM can also drive its own dimensionality reduction. For example, recursive feature elimination approaches choose the most important features through accurate category rankings, extract important and useful information from the samples, and rebuild the samples for classification; this method has been used to successfully diagnose chemical faults [30].
This work provides a novel chemical process defect diagnosis model based on Tennessee Eastman (TE) data and prior research expertise. The establishment of the model involves three sequential steps: establishing a two-stage feature selection approach with PSO-SVM and SVM-RFE, updating SSA with Lévy flight, and developing a fault detection method based on PNN with an optimum smoothing factor. The steps are as follows:
  • For the nonlinear, high-dimensional TE process datasets, we use a two-stage feature selection method to eliminate redundant features and reduce memory requirements. This makes fault diagnosis more accurate and effective.
  • The Lévy flight method is incorporated into SSA, and a new algorithm, MSSA, is developed to alleviate SSA's deficiencies, such as its slow convergence speed and propensity to slip into local optima. The approach can iteratively randomize the leader's position and enhance the global search ability. In addition, it provides selective updates to the followers, which accelerates convergence.
  • Using MSSA to optimize the smoothing factor of PNN can improve the reliability, self-correction capability, and accuracy of PNN when dealing with data categorization problems.
The rest of the paper is structured as follows. Section 2 introduces feature selection, machine learning algorithms, and models based on them. Section 3 compares the model proposed in this paper with previous models from three perspectives and demonstrates the superiority of the model proposed in this paper. Section 4 summarizes our contributions and presents our future work.

2. Materials and Methods

The efficient combination of feature selection and neural networks is one of the effective ways to deal with high-dimensional and massive data. After long-term theoretical development and practical exploration, it has some unique advantages. In this section, we first review some algorithms and techniques that are prerequisites for our work. Then, based on these algorithms and techniques, we constructed a fault diagnosis model for Tennessee Eastman chemical processes.

2.1. Feature Selection Phase

Selecting the most relevant features for the training phase is an essential step in many pattern recognition problems. Therefore, the critical question is how to find an optimal subset of features matching the data categories to enhance the performance of pattern recognition models. To tackle this challenging task, many feature selection algorithms have been developed. This section first introduces support vector machine recursive feature elimination (SVM-RFE) and feature selection using particle swarm optimization (PSO)-SVM. Then, a two-level feature selection preprocessing model is constructed based on both.

2.1.1. Support Vector Machine Recursive Feature Elimination

Guyon et al. [31,32] first suggested the support vector machine recursive feature elimination (SVM-RFE) approach for extracting features while identifying cancer cells [33,34]. SVM-RFE is a sequential backward selection algorithm based on the SVM maximum-margin principle. Consequently, the SVM-RFE ranking criteria are closely related to the SVM [35].
Given a training sample set $\{(x_i, y_i)\}_{i=1}^{N}$, $x_i \in \mathbb{R}^D$, $y_i \in \{+1, -1\}$, where $y_i$ is the category label of $x_i$, $N$ is the number of training samples, and $D$ is the feature dimension of the samples, the SVM seeks the optimal classification plane $\omega \cdot x + b = 0$, where $\omega$ is the weight vector of the optimal hyperplane and $b$ is the threshold, so that the plane not only separates the two classes of samples without error but also maximizes the classification margin between the two classes.
In order to calculate the weight vector and threshold, the SVM needs to solve the following optimization problems:
$$\min \ \frac{1}{2}\|\omega\|^{2} + C\sum_{i=1}^{N}\zeta_i$$
and
$$y_i\left(\omega \cdot x_i + b\right) \geq 1 - \zeta_i; \quad i = 1, 2, \ldots, N$$
$$\zeta_i \geq 0; \quad i = 1, 2, \ldots, N$$
where $C > 0$ is the penalty parameter and $\zeta_i$ is the slack (relaxation) variable. The role of the parameter $C$ is to adjust the level of penalty for sample misclassification and to achieve a trade-off between the percentage of misclassified samples and the complexity of the algorithm.
By introducing Lagrange multipliers, the optimization problem of the SVM can be transformed into the following dual programming problem:
$$\min \ \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N}\alpha_i\alpha_j y_i y_j \left(x_i \cdot x_j\right) - \sum_{i=1}^{N}\alpha_i$$
and
$$\sum_{i=1}^{N} y_i \alpha_i = 0; \quad 0 \leq \alpha_i \leq C, \ i = 1, 2, \ldots, N$$
where $\alpha_i$ is the Lagrange multiplier.
The relationship between the weight vector and the solution of the dual optimization problem (2) is:
$$\omega = \sum_{i=1}^{N}\alpha_i y_i x_i$$
In SVM-RFE, the ranking criterion score of the ith feature is defined as:
$$c_i = \omega_i^{2}$$
where $\omega_i$ is the $i$th component of the weight vector of the optimal hyperplane.
Each round of recursive feature elimination requires training the SVM to obtain the ranking scores. The feature with the lowest score has the least impact on classification performance and is eliminated from the current feature set to generate a new one. The next iteration then trains the SVM on the updated feature set. This procedure is repeated until all features have been removed, after which the features are arranged in the reverse order of removal: features eliminated later are more important.
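A minimal sketch of this elimination loop, assuming a linear SVM from scikit-learn and placeholder arrays X (samples × features) and y (class labels), might look as follows; the ranking criterion in the inner step is $c_i = \omega_i^2$ from Equation (4):

import numpy as np
from sklearn.svm import SVC

def svm_rfe_ranking(X, y, C=1.0):
    # Rank features by recursively removing the one with the smallest w_i^2 (Equation (4)).
    remaining = list(range(X.shape[1]))          # indices of features still in play
    removal_order = []                           # least important features are removed first
    while remaining:
        clf = SVC(kernel="linear", C=C).fit(X[:, remaining], y)
        scores = (clf.coef_ ** 2).sum(axis=0)    # c_i = w_i^2, summed over binary sub-problems
        worst = int(np.argmin(scores))           # feature with the least impact on the margin
        removal_order.append(remaining.pop(worst))
    return removal_order[::-1]                   # most important feature first

Reversing the removal order yields the importance ranking described above, with the features eliminated last listed first.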

2.1.2. Feature Selection Using PSO-SVM

This paper uses binary particle swarm optimization (PSO) [36] for feature selection in classification problems. In each iteration, the particles are updated according to their own fitness and the swarm fitness values. An SVM is used to evaluate the fitness value of each particle; by introducing a kernel function, a maximum-margin hyperplane suited to the structure of the classification problem is found in the high-dimensional feature space, thereby improving the effectiveness of the fitness function.
According to the particle swarm optimization rules, we first set the required number of particles and then randomly generate an initial binary-coded string for each particle. For example, when using particle swarm optimization to select features from an eight-dimensional dataset $S_n = \{H_1, H_2, H_3, H_4, H_5, H_6, H_7, H_8\}$ ($n = 8$), we can select any number of features less than $n$; for instance, three features ($m = 3$) chosen at random, here $S_m = \{H_1, H_5, H_7\}$. When calculating the fitness value, these $m$ features of each dataset define the data dimension $d$, which is evaluated by the SVM. When the sample size is large, the SVM fitness value is evaluated according to the holdout method. Moreover, the kernel function of the SVM is the radial basis function (RBF):
$$K(x, y) = \exp\left(-\gamma \|x - y\|^{2}\right), \quad \gamma > 0$$
For different classification problems, support vector machines need different parameter settings; $\gamma$ and $C$ are the important ones. By properly adjusting these parameters, a better classification hyperplane can be obtained and the classification accuracy can be improved. This paper did not optimize the SVM parameter settings but set the parameters to $\gamma = 2^{0}$ and $C = 2^{12}$ according to the relevant literature. Parameter optimization can be a direction for follow-up research.
Each particle update is based on its adaptive value. The fitness function designed in this paper is shown in Equation (6):
$$\mathrm{fitness} = \omega_1 \times \mathrm{accuracy}_{SVM} + \omega_2 \times \left(\sum_{i} f_i\right)^{-1}$$
where $\omega_1$ represents the weight of classification accuracy; $\mathrm{accuracy}_{SVM}$ represents the classification accuracy of the SVM; $\omega_2$ represents the weight of the feature dimension; and $f_i$ represents the state of the $i$th feature dimension in the mask, where $f_i = 1$ means the feature is retained and $f_i = 0$ means the feature is filtered out. The actual problem determines $\omega_1$ and $\omega_2$; in this paper, $\omega_1 = 0.2$ and $\omega_2 = 0.8$.
The best fitness value updated by each particle is pbest, and the best fitness value in a group of pbest is gbest. Once we have pbest and gbest, we can track the location and velocity characteristics of pbest and gbest particles. Each particle is updated according to Equations (7) and (8):
$$v_{id}^{t+1} = \omega v_{id}^{t} + c_1 \times \mathrm{rand}() \times \left(pbest_{id} - x_{id}^{t}\right) + c_2 \times \mathrm{rand}() \times \left(gbest_{id} - x_{id}^{t}\right)$$
$$x_{id}^{t+1} = \begin{cases} 1, & \mathrm{rand} < \mathrm{sigmoid}\left(v_{id}^{t+1}\right) \\ 0, & \mathrm{rand} \geq \mathrm{sigmoid}\left(v_{id}^{t+1}\right) \end{cases}$$
The updated feature states are calculated from the velocity value $v_{id}^{t+1}$ through the function $\mathrm{sigmoid}(v_{id}^{t+1})$. If $\mathrm{sigmoid}(v_{id}^{t+1})$ is greater than a random number drawn from (0, 1), the corresponding position value $H_n$ ($n = 1, 2, \ldots, m$) is set to 1, and this feature is retained in the next iteration. If $\mathrm{sigmoid}(v_{id}^{t+1})$ is less than the random number, $H_n$ is set to 0, and this feature will not appear in the next iteration.
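As an illustration, a single binary-PSO update step under these rules might be sketched as follows; the inertia weight w, acceleration coefficients c1 and c2, and velocity bound v_max are illustrative values rather than the paper's settings, and the fitness of each resulting mask would be obtained by training an SVM on the retained columns and applying Equation (6):

import numpy as np

def bpso_step(x, v, pbest, gbest, w=0.8, c1=2.0, c2=2.0, v_max=4.0):
    # One binary PSO update: velocity by Equation (7), position by Equation (8).
    # x, v, pbest have shape (n_particles, n_features); gbest has shape (n_features,).
    r1 = np.random.rand(*x.shape)
    r2 = np.random.rand(*x.shape)
    v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
    v = np.clip(v, -v_max, v_max)                        # keep velocities bounded
    prob = 1.0 / (1.0 + np.exp(-v))                      # sigmoid(v)
    x = (np.random.rand(*x.shape) < prob).astype(int)    # 1 = feature retained, 0 = filtered
    return x, v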

2.1.3. Two-Level Feature Selection Preprocessing Model

When PSO-SVM makes a “one-to-one” feature selection, a variable number of redundant features are filtered in each iteration, resulting in an unpredictable number of residual features. In addition, the selection of SVM parameters substantially affects classification accuracy, frequently resulting in “missing the mark.” This study develops a two-level feature selection preprocessing model, PSO-SVM-RFE, to eliminate unanticipated mistakes produced by parameters impacting the accuracy of feature selection. Initially, the original data features are filtered using PSO-SVM, and then the filtered features are further filtered by SVM-RFE to produce the final feature selection results. The process of this model is shown in Figure 1.
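A hypothetical sketch of how the two stages could be wired together is given below; run_binary_pso and svm_rfe_ranking stand in for the PSO-SVM and SVM-RFE sketches above, X_train, X_test, and y_train are placeholder arrays, and keeping five features matches the experiments reported in Section 3:

import numpy as np

mask = run_binary_pso(X_train, y_train)                      # stage 1: PSO-SVM pre-screening
pre_selected = np.flatnonzero(mask)                          # features surviving stage 1
rank = svm_rfe_ranking(X_train[:, pre_selected], y_train)    # stage 2: SVM-RFE ranking
final_features = pre_selected[rank[:5]]                      # keep the five top-ranked features
X_train_reduced = X_train[:, final_features]
X_test_reduced = X_test[:, final_features]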

2.2. Classification Phase

An artificial neural network (ANN) is a mathematical model that mimics the structure and function of biological neural networks and is used to estimate or approximate functions. Like other machine learning methods, neural networks have been used to solve a variety of problems. However, the ANN also has "fatal" shortcomings, which require optimizers to make corresponding improvements. In this section, we first introduce the MSSA as an optimizer and then present a variant of the radial basis function neural network, the PNN, which is simple in structure, fast to train, and especially suitable for pattern classification problems.

2.2.1. Modified Salp Swarm Algorithm

Mirjalili et al. presented the salp swarm algorithm (SSA) in 2017 as a heuristic algorithm based on mimicking the group behaviour of salps in nature. SSA developed a salp chain model for solving optimization issues and separated salp groups into the leader and follower categories. The leader is the individual at the front of the chain, while the remaining individuals follow each other directly or indirectly as followers. SSA employs the approach of survival of the fittest. SSA continues to approach the food source position by calculating all individuals’ adaptive values and comparing the current iterations’ adaptive values to the previous optimal ones. Thus, it is possible to model the foraging behaviour of salps to address the optimization problem [37,38,39].
Like other meta-heuristic algorithms, the original SSA has flaws, such as a sluggish convergence rate and a tendency to quickly fall into a local optimum during the optimization procedure. In response, this study offers a salp swarm method that employs the Lévy flight strategy together with a conditional update.
Paul Lévy, a French mathematician, proposed the Lévy flight [40,41]. It is a distinctive random-walk approach. During the walk, Lévy flight combines frequent short steps with occasional long jumps, effectively balancing local exploitation and global exploration capacity.
The random step size of Lévy’s flight obeys Lévy’s distribution, and its simplified form is:
$$\mathrm{Levy}(s) = |s|^{-1-\beta}, \quad 0 < \beta < 2$$
where s is the random step size. Since Lévy flight is very complex, this paper adopts the algorithm proposed by Mantegna to calculate [42], and its equation is as follows:
$$s = \frac{u}{|v|^{1/\beta}}$$
where $u$ and $v$ are normally distributed random numbers, $u \sim N(0, \sigma_u^2)$ and $v \sim N(0, \sigma_v^2)$. $\sigma_u$ and $\sigma_v$ can be obtained from Equation (11):
$$\sigma_u = \left\{\frac{\Gamma(1+\beta) \cdot \sin(\pi\beta/2)}{\Gamma\left[(1+\beta)/2\right] \cdot \beta \cdot 2^{(\beta-1)/2}}\right\}^{1/\beta}, \quad \sigma_v = 1$$
where $\Gamma$ is the gamma function and $\beta$ usually takes the value of 1.5.
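A short sketch of the Mantegna sampling procedure in Equations (10) and (11) might look as follows; the default β = 1.5 follows the value stated above:

import numpy as np
from math import gamma, sin, pi

def levy_step(size, beta=1.5):
    # Draw Lévy-distributed step sizes with the Mantegna algorithm (Equations (10)-(11)).
    sigma_u = (gamma(1 + beta) * sin(pi * beta / 2)
               / (gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = np.random.normal(0.0, sigma_u, size)    # u ~ N(0, sigma_u^2)
    v = np.random.normal(0.0, 1.0, size)        # v ~ N(0, sigma_v^2) with sigma_v = 1
    return u / np.abs(v) ** (1 / beta)          # s = u / |v|^(1/beta)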
The mathematical algorithm of MSSA is described in detail below.
Step 1: Initialization phase
At this stage, MSSA generates scattered initial random locations based on the size of the input dataset.
The target environment is defined as an $N \times D$ dimensional space, where $N$ represents the population size and $D$ represents the dimension of the space. The location of each salp is defined as $X_i = \left(X_{i1}, X_{i2}, X_{i3}, \ldots, X_{iD}\right)$, $i = 1, 2, 3, \ldots, N$, and the target location is defined as $F = \left(F_1, F_2, F_3, \ldots, F_D\right)$. The upper bound of the search range in each dimension is $Ub = \left(ub_1, ub_2, ub_3, \ldots, ub_D\right)$, and the lower bound is $Lb = \left(lb_1, lb_2, lb_3, \ldots, lb_D\right)$. Finally, the initial population positions are obtained randomly according to Equation (12):
$$X_{N \times D} = \mathrm{rand}(N, D) \times (Ub - Lb) + Lb$$
In the population, the value of each dimension of the leader is defined as $X_d^1$, and the value of each dimension of a follower is defined as $X_d^n$, where $n = 2, 3, 4, \ldots, N$ and $d$ represents the dimension.
Step 2: Improve update strategy for leader positions
The Lévy random flight step is used to improve the position update of the leader. The Lévy flight strategy enables the algorithm to alternate randomly between long and short steps, and a small number of long jumps is used to keep the algorithm from falling into local optima and to enhance its global search ability. Leaders update their positions according to Equation (13):
$$X_d^1 = \begin{cases} F_d + c_1\left(\left(u_b - l_b\right)\mathrm{Levy}(\lambda) + l_b\right), & c_3 \geq 0.5 \\ F_d - c_1\left(\left(u_b - l_b\right)\mathrm{Levy}(\lambda) + l_b\right), & c_3 < 0.5 \end{cases}$$
where $F_d$ is the target position, $\mathrm{Levy}(\lambda)$ is the Lévy flight step, and $c_1$ and $c_3$ are control parameters; $c_3$ is a random number in [0, 1], which determines the direction and step size of the leader's position update, and $c_1$ is the convergence factor, which is used to balance the convergence speed of the algorithm in the iterative process, as shown in Equation (14):
$$c_1 = 2e^{-\left(4l/l_{max}\right)^{2}}$$
where $l$ is the current iteration number and $l_{max}$ is the maximum number of iterations.
Step 3: Improve update strategy for follower positions
In the original SSA, a follower blindly follows the previous salp, which makes it miss positions with better fitness. In the improved algorithm MSSA, a conditional "piecewise" location update is adopted: the fitness of the previous salp is compared with that of the current salp to select the new location, so that the new location is inclined toward the side with better fitness. Compared with blind random updating, this mechanism can approach the optimal solution faster, thus speeding up the convergence of the algorithm. The position of the followers is therefore updated in the improved way shown in Equation (15):
$$X_d^n = \varepsilon\left(X_d^{n-1} + X_d^{n}\right)$$
where ε is the coefficient of position offset, and its calculation equation is as follows:
$$\varepsilon = \begin{cases} 0.5 \times \mathrm{rand}(0, 1), & f\left(X_d^{n-1}\right) < f\left(X_d^{n}\right) \\ 0.5, & f\left(X_d^{n-1}\right) = f\left(X_d^{n}\right) \\ 1 - 0.5 \times \mathrm{rand}(0, 1), & f\left(X_d^{n-1}\right) > f\left(X_d^{n}\right) \end{cases}$$
where f is the fitness function.
According to the update method of population individuals described above, MSSA can be obtained, as shown in Algorithm 1.
Algorithm 1 Modified salp swarm algorithm (Pseudo-code)
1:    Initialization parameters: population size $N$, dimension $D$, maximum number of iterations $l_{max}$.
2:    Generate the initial population $X_{N \times D}$ by Equation (12);
3:    Calculate the fitness value of each individual search agent;
4:    while $l \leq l_{max}$ do
5:        Update $c_1$ by Equation (14);
6:        for $i = 1 : N$ do
7:             if $i = 1$ (leader) then
8:                   Update random numbers $c_3$ and $\beta$;
9:                   Update the position of the leader salp as in Equation (13);
10:             else
11:                   Update $\varepsilon$ by Equation (16);
12:                   Update the position of the follower salp as in Equation (15);
13:             end if
14:       end for
15:       Set $l = l + 1$;
16:    end while
Output: Best classification and prediction results.
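A compact Python sketch of this loop is given below for illustration; it assumes the levy_step helper from the previous sketch, scalar bounds lb and ub, and a user-supplied fitness function f to be minimized, and the population and iteration defaults are illustrative rather than the paper's settings:

import numpy as np

def mssa(f, lb, ub, dim, n_pop=20, max_iter=30, beta=1.5):
    # Minimize fitness f with the modified salp swarm algorithm (Algorithm 1).
    X = np.random.rand(n_pop, dim) * (ub - lb) + lb          # Equation (12)
    fitness = np.array([f(ind) for ind in X])
    best = X[fitness.argmin()].copy()                        # food source position F
    best_fit = fitness.min()
    for l in range(1, max_iter + 1):
        c1 = 2 * np.exp(-(4 * l / max_iter) ** 2)            # Equation (14)
        for i in range(n_pop):
            if i == 0:                                       # leader update, Equation (13)
                step = c1 * ((ub - lb) * levy_step(dim, beta) + lb)
                X[i] = best + step if np.random.rand() >= 0.5 else best - step
            else:                                            # follower update, Equations (15) and (16)
                f_prev, f_curr = f(X[i - 1]), f(X[i])
                if f_prev < f_curr:
                    eps = 0.5 * np.random.rand()
                elif f_prev > f_curr:
                    eps = 1 - 0.5 * np.random.rand()
                else:
                    eps = 0.5
                X[i] = eps * (X[i - 1] + X[i])
            X[i] = np.clip(X[i], lb, ub)                     # keep salps inside the bounds
            fit_i = f(X[i])
            if fit_i < best_fit:                             # update the food source
                best, best_fit = X[i].copy(), fit_i
    return best, best_fit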

2.2.2. Probabilistic Neural Network

The probabilistic neural network (PNN) is a radial basis function network based on Bayesian decision theory. The PNN features easy training, fast convergence, and the ability to approximate arbitrary nonlinear mappings. Due to the specificity of the functions it relies on, the PNN is highly fault-tolerant and robust.
PNN comprises the input, pattern, summation, and output layers. Nodes in the input layer are a set of predicted values. The pattern layer consists of Gaussian functions centred on the prediction set. The summation layer averages each set of predicted values. The output layer determines the class label associated with it by voting. Figure 2 shows the structure of the PNN.
For the input vector, to match various features of the training set, each unit output is as follows:
$$\phi_{ij}(X) = \frac{1}{(2\pi)^{d/2}\sigma^{d}}\exp\left[-\frac{\left(X - x_{ij}\right)^{T}\left(X - x_{ij}\right)}{2\sigma^{2}}\right]$$
where $X = \left(x_1, x_2, x_3, \ldots, x_n\right)^{T}$; $d$ is the dimension of the feature vector; $l$ is the number of training classes ($i = 1, 2, \ldots, l$); $x_{ij}$ is the $j$th center of the $i$th class; and $\sigma$ is the smoothing factor.
The output weights of summation layer neurons are calculated as follows:
$$v_i = \frac{\sum_{j=1}^{L}\phi_{ij}}{L}$$
where $v_i$ is the output of the $i$th class and $L$ is the number of pattern-layer neurons belonging to class $i$.
The output layer takes the type corresponding to the maximum output weight obtained by the summing layer as the output type, and the result is as follows:
$$\mathrm{Type}\left(v_i\right) = \arg\max_{i}\left(v_i\right)$$
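A minimal sketch of this forward pass for a single input sample is shown below; train_X, train_y, and sigma are placeholders, and the Gaussian kernel follows Equation (17) with the conventional 2σ² in the exponent:

import numpy as np

def pnn_predict(x, train_X, train_y, sigma):
    # Classify one sample with a PNN: pattern layer (Equation (17)),
    # summation layer (Equation (18)), and output layer (Equation (19)).
    d = train_X.shape[1]
    norm = (2 * np.pi) ** (d / 2) * sigma ** d
    classes = np.unique(train_y)
    votes = []
    for c in classes:
        centers = train_X[train_y == c]                  # training centers x_ij of class i
        dist2 = ((x - centers) ** 2).sum(axis=1)         # (X - x_ij)^T (X - x_ij)
        phi = np.exp(-dist2 / (2 * sigma ** 2)) / norm   # pattern-layer outputs
        votes.append(phi.mean())                         # summation layer: v_i
    return classes[int(np.argmax(votes))]                # output layer: arg max v_i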

2.3. The Proposed PSO-SVM-RFE-MSSA-PNN Model

The PNN structure diagram reveals that the smoothing factor directly affects the classification performance of the PNN. If it is too large or too small, the network converges too quickly or too slowly, preventing the optimal solution from being identified. As a result, the diagnostic accuracy and classification performance of the PNN are drastically diminished. Since MSSA has significant benefits over other optimization algorithms in terms of global search and population diversity, this research employs MSSA to improve the classification performance of the PNN by locating an appropriate smoothing factor. To save storage capacity and improve the diagnostic accuracy, we use PSO-SVM-RFE for feature selection. It eliminates redundant features and minimizes the feature dimension of the input, thus obtaining highly representative features for the various fault types. The design process of PSO-SVM-RFE-MSSA-PNN is shown in Figure 3, and the specific steps are as follows:
Step 1:
The data samples are entered into PSO-SVM-RFE, which ranks the features in order of importance.
Step 2:
Select the specified number of features to construct feature subsets based on the ranking results, and obtain the simplified sample dataset based on the feature subsets.
Step 3:
The simplified data samples are preprocessed and then randomly input to PNN.
Step 4:
The initial parameters of MSSA are set as follows: the number of populations N, dimension d, and the maximum number of iterations l m a x . In addition, the population positions of MSSA are initialized by Equation (12).
Step 5:
The fitness of salp individuals in the initial population is calculated and ranked. The fitness function in this paper is set as the mean square error function, as shown in Equation (20):
$$f(x) = \frac{1}{N}\sum_{i=1}^{N}\left(Y_i - O_i\right)^{2}$$
where $Y_i$ and $O_i$ are the training accuracy and testing accuracy, respectively, under the effect of a particular smoothing factor.
Step 6:
The location of the salp individual with the best fitness is taken as the current food location. Of the remaining $N-1$ salp individuals, the fittest salp is considered the leader, and the rest are considered followers.
Step 7:
Update $c_1$ according to Equation (14).
Step 8:
Update the leader position by Equation (13) and the follower position by Equation (15).
Step 9:
If the maximum number of iterations is reached or the preset conditions are met, continue with the following step; if not, return to Step 6.
Step 10:
At the end of the training process, the optimized smoothing factor $\sigma$ is input into the PNN to obtain a PNN with globally optimized performance. Then, the test sample data are input into the PNN to obtain the final diagnosis results.
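For illustration, a hypothetical end-to-end flow of these steps, reusing the helpers sketched in the earlier subsections, could be written as follows; the σ search range [0.01, 2], the pnn_accuracy helper, and the data placeholders are assumptions, and with a single candidate σ the sum in Equation (20) reduces to one squared term:

import numpy as np

Xtr = X_train[:, final_features]                            # Steps 1-3: simplified samples
Xte = X_test[:, final_features]

def pnn_accuracy(train_X, train_y, eval_X, eval_y, sigma):
    preds = [pnn_predict(s, train_X, train_y, sigma) for s in eval_X]
    return np.mean(np.array(preds) == eval_y)

def sigma_fitness(candidate):                               # Equation (20): squared gap between
    sigma = float(candidate[0])                             # training and testing accuracy
    y_acc = pnn_accuracy(Xtr, y_train, Xtr, y_train, sigma)
    o_acc = pnn_accuracy(Xtr, y_train, Xte, y_test, sigma)
    return (y_acc - o_acc) ** 2

best_sigma, _ = mssa(sigma_fitness, lb=0.01, ub=2.0, dim=1)                 # Steps 4-9
y_pred = [pnn_predict(s, Xtr, y_train, float(best_sigma[0])) for s in Xte]  # Step 10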

3. Results and Discussions

3.1. Tennessee Eastman Process

The Eastman Chemical Company, an American chemical corporation, established the TE process in 1993 as a chemical modeling and simulation platform. The TE process is a classic chemical process, commonly employed for process monitoring and fault diagnosis [43,44,45,46]. Figure 4 depicts an approximation of the TE process flowchart. There are four gaseous reactants in the TE process, A, D, C, and E, and two products, G and H.
Table 1 displays the fault categories in the TE process database. Fault categories 1–7 are step-change faults, categories 8–12 are random-variation faults, category 13 is a slow drift fault in the chemical reaction dynamics, categories 14–15 are valve sticking faults, categories 16–20 are unknown faults, and category 21 is a constant valve position fault. The TE process contains 41 measured variables and 11 control variables, and Table 2 lists all the variables associated with this process.
The TE process database includes the following data. The sampling interval is set to three minutes. After 48 h of continuous operation under normal process conditions, 960 samples were obtained as the normal data samples. In each fault dataset, the fault was introduced after 8 h of normal operation and persisted for the remainder of the 48 h run. As a result, of the 960 data samples collected during a failure, the first 160 were captured during normal operation, and the remaining 800 were collected after the failure occurred. Readers interested in these datasets can find them at the following address: http://web.mit.edu/braatzgroup/links.html, accessed on 25 June 2022.
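As a rough illustration of how such a fault dataset could be loaded and labeled, a sketch is given below; the file name pattern "dXX_te.dat" is an assumed convention of the commonly distributed TE data and may differ from the actual files behind the link above:

import numpy as np

def load_te_fault(path, fault_id):
    # Load one 960-sample TE fault dataset; the naming convention is an assumption.
    data = np.loadtxt(f"{path}/d{fault_id:02d}_te.dat")    # rows: samples, columns: process variables
    labels = np.zeros(len(data), dtype=int)                # 0 denotes normal operation
    labels[160:] = fault_id                                # first 160 samples precede the fault
    return data, labels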

3.2. Three Experiments to Verify the Validity of the Proposed Model

This paper examines the performance of the PSO-SVM-RFE-MSSA-PNN model from three different angles. First, various feature selection approaches differ in how well they simplify high-dimensional data into low-dimensional features that are strongly connected with the fault categories. Second, different optimization techniques use different updating strategies for the PNN smoothing factor. Third, since different classifiers use different classification rules, the diagnosis results also differ. Accordingly, the following three separate experiments are defined from these viewpoints. The MATLAB software environment was used for all the studies.

3.2.1. Influence of Different Feature Selection Algorithms on the Performance of Fault Diagnosis

TE datasets are high-dimensional with small, imbalanced samples, and three basic feature selection approaches exist: filter, wrapper, and embedded. SVM-RFE, an embedded feature selection method, was utilized by Yang X et al. [47] to score and sort the features, and the best five features were then selected for classification studies. Table 3 shows the feature selection results. Xie Z et al. [48] utilized the filter feature selection method random forest three-bagger (RFtb), randomly divided the dataset, incrementally grew the decision tree set from the given dataset, and measured and sorted the relative value of the 52 features. The first five features were chosen for the classification experiments. The results of this feature selection are shown in Table 4. This study offers a wrapper feature selection approach, PSO-SVM-RFE, which employs PSO-SVM and SVM-RFE to analyze the original dataset and produce a dataset with five features. The results of this feature selection are shown in Table 5.
First, the training samples under the normal state and the 21 fault categories are fed into the PSO-SVM-RFE model for feature selection, and a ranking set of the five highest-priority features is produced. The training and test samples are then simplified using our feature selection ranking set and the feature selection ranking sets of Yang X and Xie Z. Lastly, the three simplified sample sets and the original, unsimplified samples are input into the MSSA-PNN optimization model established in this research to compare the performance of the various feature selection approaches. The population size of the model is set to 20, and the maximum number of iterations is 30.
The diagnostic results obtained with PSO-SVM-RFE and the other feature selection techniques are shown in Table 6. After optimization, the average diagnostic rate over the 21 faults was 91% for PSO-SVM-RFE, 88.8% for SVM-RFE, and 90% for RFtb. The dataset simplified by PSO-SVM-RFE feature selection is more accurate for the following reasons.
For categories 3, 9, 11, 13, 14, 17, and 19, the PSO-SVM-RFE model has a significant advantage over the SVM-RFE model in feature selection for these seven fault categories. Although the PSO-SVM-RFE model does not have an advantage over the SVM-RFE model for categories 8, 10, 15, and 16, its diagnostic rate is still above 80.00%. In the other fault categories, the difference between them is not significant. Thus, the PSO-SVM-RFE model with initial pre-screening by PSO-SVM performs better in feature selection than the single SVM-RFE model.
PSO-SVM-RFE had an average diagnostic rate of 91%, whereas RFtb had an average diagnostic rate of 90%. Although the difference in the diagnostic rates over the 21 fault categories is small, the former has a slight edge in average accuracy, and PSO-SVM-RFE achieves a better diagnostic rate than RFtb for several fault types. Figure 5 shows that the PSO-SVM-RFE algorithm gives a better diagnosis rate for the 21 fault categories while keeping the accuracy rate relatively smooth across categories.

3.2.2. Performance Analysis of Fault Classification for Different Optimized PNN Schemes

To compare the optimization performance of the various heuristic methods on the smoothing factor of the PNN, the training sample set and the test sample set are generated using the simplified feature set resulting from PSO-SVM-RFE feature selection, as shown in Table 5. The training set is then input into the MSSA-PNN, SSA-PNN, genetic algorithm (GA)-PNN, cuckoo search (CS)-PNN, PSO-PNN, seagull optimization algorithm (SOA)-PNN, multi-verse optimizer (MVO)-PNN, and unoptimized PNN models. Lastly, the test set is utilized to compare the respective fault correctness rates. All models use a population size of 20 and 50 iterations. GA-PNN sets the crossover probability to 0.7 and the mutation probability to 0.01. CS-PNN sets the discovery probability Pa to 0.25, λ to 1, and the step size α to 0.4. PSO sets the maximum speed to 1, the minimum speed to −1, the solution space to [−5, 5], and the learning factor to 1.49445 [48]. The smoothing factor of the unoptimized PNN is 0.8.
The results of the classification are provided in Table 7. As shown in the table, the average MSSA-PNN diagnostic rate is 88%, which is higher than those of the other approaches: 77% for the PNN, 86% for the SSA-PNN, 83% for the CS-PNN, 81% for the GA-PNN, 84% for the PSO-PNN, 82% for the MVO-PNN, and 86% for the SOA-PNN. The results are analyzed as follows.
Compared to the other optimization models, the diagnostic rate of PNN is lower on average, which demonstrates the inadequacies of the common PNN and the importance of the optimization model.
MSSA-PNN has a higher average diagnostic rate than SSA-PNN, demonstrating the advantages of optimized SSA and the soundness of this theory.
MSSA-PNN has the highest average diagnosis rate among the aforementioned optimization models, and its fault diagnosis rate is superior to those of the other optimization models. In particular, for categories 3, 9, 13, and 17 it outperforms all the other models. Although it is inferior in a few fault categories, the difference is minor. This further demonstrates that utilizing MSSA to optimize the PNN can boost performance.
"G at reactor feed" (characteristic 35) can indicate product quality. Failure categories 1, 2, 5, 6, 7, 8, 12, 13, 20, and 21 have a stronger influence on product quality than the other failure categories [48]. The actual TE process is imitated, i.e., the situation in which the chemical process fails during otherwise normal operation is reproduced. In this paper, 270 random samples are taken from the original normal dataset, and 570 random samples are taken from the category 12 (13) fault data to form a "simulated sample set." The chemical process represented by this sample set is as follows: the sampling interval is 3 min, and the TE process runs normally when $t \in (0, 3 \times 270)$ min; the category 12 (13) fault begins at $t = (3 \times 270 + 1)$ min and persists until $t = 3 \times (270 + 570)$ min.
Figure 6 and Figure 7 show detailed process monitoring graphs for two typical faults (category 12 and category 13). These graphs combine the raw time trends with the dynamics of the TE process. Because of this, it is easier to see how well the various optimization models monitor the dynamics of the TE process for the category 12(13) failure, which has a significant impact on the product quality, and it is possible to visualize the data in Table 7, making the “numerical” experimental results more informative.
For category 12, Figure 6 shows that the original PNN exhibits overfitting in diagnosing category 12 faults and fails to separate the normal-state data. The GA-PNN also suffers from overfitting when optimizing the smoothing factor but behaves in exactly the opposite way to the original PNN, failing to isolate the fault-state data. SSA-PNN and SOA-PNN do not identify the normal-state data very well, and CS-PNN does not identify the fault-state data very well. MSSA-PNN, PSO-PNN, and MVO-PNN are generally close to each other in terms of fault detection performance. However, a closer look reveals that MSSA-PNN is more sensitive than PSO-PNN to fault-state data and more sensitive than MVO-PNN to normal-state data.
Similar to category 12, category 13 has a detrimental effect on the overall quality of the product. As can be seen in Figure 7, the CS-PNN, GA-PNN, PSO-PNN, MVO-PNN, and the original PNN fail to achieve the desired results for monitoring category 13 faults. The fault detection capabilities of the MSSA-PNN, SSA-PNN, and SOA-PNN are generally comparable. However, MSSA-PNN has a slightly higher recognition rate for fault-state data than SSA-PNN, with almost the same recognition rate for normal-state data. Although SOA-PNN has a slightly higher recognition rate for normal-state data than MSSA-PNN, its recognition rate for fault-state data is significantly lower than that of MSSA-PNN.
In summary, MSSA-PNN has high robustness and good diagnostic accuracy. Moreover, it is more sensitive than other optimization models and can accurately classify fault and non-fault data. It is also not prone to overfitting problems that degrade the model’s performance. Therefore, on the whole, the fault detection performance of MSSA-PNN is superior.

3.2.3. Analysis of Fault Diagnosis Performance Indicators of Different Classifiers

In the third experiment, the performance of the MSSA-PNN classification model was evaluated by diagnosing the 21 types of faults in the TE process. The corresponding simplified data samples were generated based on the feature selection ranking set provided in Table 5 and then fed into the MSSA-PNN model and other commonly used classifiers to compare the classification outcomes. The comparison classifiers include linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), KNN, SVM, and the maximum entropy model (MaxEnt), all of which are widely used in the literature [49,50,51,52,53]. In addition, we compare the model with a hybrid back propagation (BP) neural network model described in the literature, cuckoo search BP (CS-BP) [54]. The MSSA algorithm's parameters are consistent with those established in the first experiment. K in the KNN model has a value of 7. The MaxEnt model's maximum step length is set to 10 (Max step = 10), and the probability distribution adopts the empirical marginal distribution probability, which is optimized by a generalized iterative scaling (GIS) algorithm. In the CS-BP model, the population size is set to twenty, and the probability of discovery is set to 0.25. The parameters of the other models retain the default values of the MATLAB tools.
It is important to prevent individual classifiers from converging to a local optimum, i.e., [normal, fault] = [0, 1], and thus producing "false positives" for the fault categories. Therefore, this experiment also reports the diagnosis rate for the normal condition. Table 8 shows the diagnosis rates of the different classifiers for both the normal and fault states.
Table 8 shows that the fault diagnosis rate for MSSA-PNN is 91%, compared with 68% for LDA, 78% for QDA, 83% for KNN, 96% for SVM, 35% for MaxEnt, and 91% for CS-BP. If we only look at the rate of diagnosing each fault state, the diagnostic performance of the MSSA-PNN model is almost the same as that of the CS-BP model and is even worse than that of the SVM model. However, Table 8 also shows that the SVM has "false positives" in 14 fault categories, such as categories 3, 8, 9, 10, and 11, and that categories 9, 10, 15, 16, and 19 show "local optimization" in CS-BP. Thus, judging the diagnostic performance of different classifiers only by the rate at which each fault category is found is unfair and insufficient.
In order to completely analyze the defect diagnostic performance of all classifiers and demonstrate the efficacy of the proposed model, the study chose the accuracy rate and F1-score as evaluation indices. The confusion matrix is a crucial metric for assessing the performance of classification models. Table 9 shows that it has four values: true positive (TP), true negative (TN), false positive (FP), and false-negative (FN). TP denotes the number of correctly predicted positive samples; TN denotes the number of correctly predicted negative samples; FP denotes the number of predicted positive samples that are actually negative; FN denotes the number of predicted negative samples that are actually positive.
Precision refers to the ratio between the number of samples that are correctly predicted as positive labels and the total number of samples that are predicted as positive labels, as shown in Equation (21):
$$\mathrm{precision} = \frac{TP}{TP + FP}$$
The recall is the ratio of the number of samples that were correctly predicted to have positive labels to the number of samples that were labeled positive, as shown in Equation (22):
$$\mathrm{recall} = \frac{TP}{TP + FN}$$
The two key evaluation indicators required in this paper can also be obtained from Table 9, and the calculation formula is as follows:
$$\mathrm{accuracy} = \frac{TP + TN}{TN + TP + FN + FP}$$
$$F1\text{-}\mathrm{score} = \frac{\left(1 + \beta^{2}\right)\,\mathrm{precision} \times \mathrm{recall}}{\beta^{2} \cdot \mathrm{precision} + \mathrm{recall}}$$
In this paper, $\beta = 1$, indicating that precision and recall are given equal weight.
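A small helper computing these indicators from the four confusion-matrix counts might look as follows:

def classification_metrics(tp, tn, fp, fn, beta=1.0):
    # Precision, recall, accuracy, and F-score from the confusion-matrix counts
    # (Equations (21)-(24)); beta = 1 weights precision and recall equally.
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    f_score = (1 + beta ** 2) * precision * recall / (beta ** 2 * precision + recall)
    return precision, recall, accuracy, f_score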
The accuracy of MSSA-PNN and the other classifiers is displayed in Table 10. MSSA-PNN's diagnostic accuracy is 88%, which is greater than LDA's 69%, QDA's 81%, KNN's 70%, SVM's 75%, MaxEnt's 52%, and CS-BP's 84%. In addition, MSSA-PNN has a considerable advantage over the other classifiers in seven fault categories: 3, 9, 10, 11, 15, 16, and 19. Although it falls short in a few fault categories, the disparity is negligible. Therefore, MSSA-PNN provides more accurate fault diagnosis performance.
The F1-scores of MSSA-PNN and the other classifiers are displayed in Table 11. Comparing Table 8 and Table 11, it is evident that certain classifiers may have a higher fault diagnosis rate than MSSA-PNN while their F1-scores remain lower. For instance, in the SVM's categories 8, 9, 14, etc., the fault diagnosis rates are higher than those of MSSA-PNN, but the corresponding F1-scores are significantly lower than those of MSSA-PNN. This shows that the SVM exhibits overfitting in category 8 and other fault categories. Consequently, the F1-score and the accuracy rate can be used together to give a complete picture of a classifier's ability to find faults. Meanwhile, the average F1-score of MSSA-PNN is 91%, which is higher than that of the other classification models: LDA is at 74%, QDA at 83%, KNN at 79%, SVM at 84%, MaxEnt at 46%, and CS-BP at 88%.
In conclusion, the fault diagnosis rate, accuracy rate, and F1-score of MSSA-PNN are consistent with one another, indicating that there is nearly no overfitting issue with this model, the results are trustworthy, and the diagnosis model is persuasive. In general, the model suggested in this research does a better job of diagnosing problems than the commonly used classifiers.

4. Conclusions

This paper develops a basic fault diagnosis model based on feature selection and PNN. The main innovation of this paper is to use PSO-SVM to pre-screen the SVM-RFE feature selection and MSSA to optimize the PNN. PSO-SVM-RFE performs feature selection by two-level screening, which can remove redundant features, simplify the samples, and indirectly improve the classification performance. MSSA uses a unique optimization mechanism to update the parameters, giving the algorithm better global search capabilities.
In this paper, the fault diagnosis performance of the PSO-SVM-RFE-MSSA-PNN model is experimentally validated using experimental data provided during the TE chemistry process. The analysis of the experimental results shows that the PSO-SVM-RFE feature selection method can improve the classification accuracy, and MSSA can enhance the local convergence of PNN, making the combined model have better fault diagnosis performance. Therefore, PSO-SVM-RFE-MSSA-PNN is suitable for fault prediction and diagnostic classification of the Tennessee Eastman process.
Although the model proposed in this paper has achieved good results to some extent, there are still some limitations that need further improvement in future work:
  • This paper uses a two-stage feature selection algorithm to delete redundant features. Experiments verify the practicability and superiority of the algorithm, but the influence on running time is ignored in the experiments. In future work, further simplification of the structure of the feature selection algorithm, such as adopting NSGA-II, should be considered to streamline the data preprocessing stage.
  • The quality of the original TE process data directly affects the diagnostic performance of the fault diagnosis model. By observing the sample data, it can be seen that each characteristic variable of the TE process fluctuates even in the normal state, which can easily cause misjudgment and may be the reason for the low diagnosis rate of some fault types. Therefore, it is necessary to screen out abnormal data before feature selection in practical applications.
  • The feature selection algorithm in this paper only filters redundant features at the data level. Next, we can combine the characteristics of the TE chemical process itself, explore the chemical connection between the characteristic variables, and ignore unnecessary variables from the chemical direction, which will be a new cross-optimization direction.
  • In a real chemical process, faults can be divided into process faults and sensor faults. Process faults are characterized by multivariate coordination, while sensor faults are variable-independent, and the faulty variable is unique. The occurrence of a process fault means that the system's operating state deviates from its normal value. In contrast, sensor faults, such as drift, jitter, and stepping of the data, interfere with the system's stability and affect the operator's judgment, which may lead to failures. This paper makes no distinction between the two; both are uniformly classified as faults. Therefore, we should distinguish between process and sensor faults in chemical processes in subsequent research.

Author Contributions

Conceptualization, H.X.; methodology, H.X.; software, H.X.; validation, Z.M. and T.R.; formal analysis, H.X.; writing—original draft preparation, H.X.; writing—review and editing, X.Y.; visualization, Z.M. and T.R.; funding acquisition, X.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the National Natural Science Foundation of China (51765042, 61963026).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Soui, M.; Mansouri, N.; Alhamad, R.; Kessentini, M.; Ghedira, K. NSGA-II as feature selection technique and AdaBoost classifier for COVID-19 prediction using patient’s symptoms. Nonlinear Dyn. 2021, 106, 1453–1475. [Google Scholar] [CrossRef] [PubMed]
  2. Nor, N.M.; Hassan, C.R.C.; Hussain, M.A. A review of data-driven fault detection and diagnosis methods: Applications in chemical process systems. Rev. Chem. Eng. 2020, 36, 513–553. [Google Scholar] [CrossRef]
  3. Wu, D.; Zhao, J. Process topology convolutional network model for chemical process fault diagnosis. Process Saf. Environ. Prot. 2021, 150, 93–109. [Google Scholar] [CrossRef]
  4. Yu, W.; Dillon, T.; Mostafa, F.; Rahayu, W.; Liu, Y. A global manufacturing big data ecosystem for fault detection in predictive maintenance. IEEE Trans. Ind. Inform. 2020, 16, 183–192. [Google Scholar] [CrossRef]
  5. Li, B.; Delpha, C.; Diallo, D.; Migan-Dubois, A. Application of Artificial Neural Networks to photovoltaic fault detection and diagnosis: A review. Renew. Sustain. Energy Rev. 2021, 138, 110512. [Google Scholar] [CrossRef]
  6. Huang, K.; Wu, Y.; Wang, C.; Xie, Y.; Yang, C.; Gui, W. A projective and discriminative dictionary learning for high-dimensional process monitoring with industrial applications. IEEE Trans. Ind. Inform. 2021, 17, 558–568. [Google Scholar] [CrossRef]
  7. Stief, A.; Ottewill, J.R.; Baranowski, J.; Orkisz, M. A PCA and two-stage Bayesian sensor fusion approach for diagnosing electrical and mechanical faults in induction motors. IEEE Trans. Ind. Electron. 2019, 66, 9510–9520. [Google Scholar] [CrossRef]
  8. Zhou, P.; Zhang, R.; Xie, J.; Liu, J.; Wang, H.; Chai, T. Data-driven monitoring and diagnosing of abnormal furnace conditions in blast furnace ironmaking: An integrated PCA-ICA method. IEEE Trans. Ind. Electron. 2021, 68, 622–631. [Google Scholar] [CrossRef]
  9. Wolf, R.C.; Rashidi, M.; Fritze, S.; Kubera, K.M.; Northoff, G.; Sambataro, F.; Calhoun, V.D.; Geiger, L.S.; Tost, H.; Hirjak, D. A neural signature of parkinsonism in patients with schizophrenia spectrum disorders: A multimodal MRI study using parallel ICA. Schizophr. Bull. 2020, 46, 999–1008. [Google Scholar] [CrossRef]
  10. Lina; Arisandi, D. Vowel Recognition Based on Face Images Using Fisher Linear Discriminant Analysis. IOP Conf. Ser. Mater. Sci. Eng. 2020, 852, 012130. [Google Scholar] [CrossRef]
11. Liu, L.; Chu, M.; Gong, R.; Zhang, L. An improved nonparallel support vector machine. IEEE Trans. Neural Netw. Learn. Syst. 2021, 32, 5129–5143.
12. Li, W.; Chen, Y.; Song, Y. Boosted K-nearest neighbor classifiers based on fuzzy granules. Knowl.-Based Syst. 2020, 195, 105606.
13. Acheampong, A.O.; Boateng, E.B. Modelling carbon emission intensity: Application of artificial neural network. J. Clean. Prod. 2019, 225, 833–856.
14. Ragab, A.; El-Koujok, M.; Poulin, B.; Amazouz, M.; Yacout, S. Fault diagnosis in industrial chemical processes using interpretable patterns based on Logical Analysis of Data. Expert Syst. Appl. 2018, 95, 368–383.
15. Zhang, S.; Bi, K.; Qiu, T. Bidirectional recurrent neural network-based chemical process fault diagnosis. Ind. Eng. Chem. Res. 2019, 59, 824–834.
16. Wang, Y.; Pan, Z.; Yuan, X.; Yang, C.; Gui, W. A novel deep learning based fault diagnosis approach for chemical process with extended deep belief network. ISA Trans. 2020, 96, 457–467.
17. Wang, N.; Yang, F.; Zhang, R.; Gao, F. Intelligent Fault Diagnosis for Chemical Processes Using Deep Learning Multimodel Fusion. IEEE Trans. Cybern. 2022, 52, 7121–7135.
18. Zhu, H.; Lu, L.; Yao, J.; Dai, S.; Hu, Y. Fault diagnosis approach for photovoltaic arrays based on unsupervised sample clustering and probabilistic neural network model. Sol. Energy 2018, 176, 395–405.
19. Yao, Y.; Wang, N. Fault diagnosis model of adaptive miniature circuit breaker based on fractal theory and probabilistic neural network. Mech. Syst. Signal Process. 2020, 142, 106772.
20. Ahmadipour, M.; Hizam, H.; Othman, M.L.; Radzi, M.A. Islanding detection method using ridgelet probabilistic neural network in distributed generation. Neurocomputing 2019, 329, 188–209.
21. Zhou, Y.; Yang, X.; Tao, L.; Yang, L. Transformer Fault Diagnosis Model Based on Improved Gray Wolf Optimizer and Probabilistic Neural Network. Energies 2021, 14, 3029.
22. Lashkari Ahangarani, M.; Ostadmahdi Aragh, N.; Mojeddifar, S.; Hemmati Chegeni, M. A combination of probabilistic neural network (PNN) and particle swarm optimization (PSO) algorithms to map hydrothermal alteration zones using ASTER data. Earth Sci. Inform. 2020, 13, 929–937.
23. Chen, M.; Shi, H.; Wu, J. Research on Transformer Fault Diagnosis Based on Sparrow Algorithm Optimization Probabilistic Neural Network. In Proceedings of the ICIIP 2021: 2021 6th International Conference on Intelligent Information Processing, Bucharest, Romania, 29–31 July 2021; Association for Computing Machinery: New York, NY, USA, 2021; pp. 254–259.
24. Qais, M.H.; Hasanien, H.M.; Alghuwainem, S. Enhanced salp swarm algorithm: Application to variable speed wind generators. Eng. Appl. Artif. Intell. 2019, 80, 82–96.
25. Shehab, M.; Abualigah, L.; Al Hamad, H.; Alabool, H.; Alshinwan, M.; Khasawneh, A.M. Moth–flame optimization algorithm: Variants and applications. Neural Comput. Appl. 2020, 32, 9859–9884.
26. Nemati, S.; Basiri, M.E.; Ghasem-Aghaee, N.; Aghdam, M.H. A novel ACO–GA hybrid algorithm for feature selection in protein function prediction. Expert Syst. Appl. 2009, 36, 12086–12094.
27. Aghdam, M.H.; Ghasem-Aghaee, N.; Basiri, M.E. Text feature selection using ant colony optimization. Expert Syst. Appl. 2009, 36, 6843–6853.
28. Huang, C.L.; Dun, J.F. A distributed PSO–SVM hybrid system with feature selection and parameter optimization. Appl. Soft Comput. 2008, 8, 1381–1391.
29. Kothari, V.; Anuradha, J.; Shah, S.; Mittal, P. A Survey on Particle Swarm Optimization in Feature Selection. In Proceedings of the Global Trends in Information Systems and Software Applications, Vellore, TN, India, 9–11 December 2012; Krishna, P.V., Babu, M.R., Ariwa, E., Eds.; Springer: Berlin/Heidelberg, Germany, 2012; pp. 192–201.
30. Xue, Y.; Zhang, L.; Wang, B.; Zhang, Z.; Li, F. Nonlinear feature selection using Gaussian kernel SVM-RFE for fault diagnosis. Appl. Intell. 2018, 48, 3306–3331.
31. Ding, X.; Yang, F.; Ma, F. An Efficient Model Selection for Linear Discrimination Function-based Recursive Feature Elimination. J. Biomed. Inform. 2022, 129, 104070.
32. Zhang, F.; Petersen, M.; Johnson, L.; Hall, J.; O’Bryant, S.E. Recursive Support Vector Machine Biomarker Selection for Alzheimer’s Disease. J. Alzheimer’s Dis. 2021, 79, 1691–1700.
33. Liu, Z.; Mi, M.; Li, X.; Zheng, X.; Wu, G.; Zhang, L. A lncRNA prognostic signature associated with immune infiltration and tumour mutation burden in breast cancer. J. Cell. Mol. Med. 2020, 24, 12444–12456.
34. Naorem, L.D.; Prakash, V.S.; Muthaiyan, M.; Venkatesan, A. Comprehensive analysis of dysregulated lncRNAs and their competing endogenous RNA network in triple-negative breast cancer. Int. J. Biol. Macromol. 2020, 145, 429–436.
35. Shi, Q.; Zhang, H. Fault diagnosis of an autonomous vehicle with an improved SVM algorithm subject to unbalanced datasets. IEEE Trans. Ind. Electron. 2021, 68, 6248–6256.
36. Yu, L.; Han, Y.; Mu, L. Improved quantum evolutionary particle swarm optimization for band selection of hyperspectral image. Remote Sens. Lett. 2020, 11, 866–875.
37. Mirjalili, S.; Gandomi, A.H.; Mirjalili, S.Z.; Saremi, S.; Faris, H.; Mirjalili, S.M. Salp Swarm Algorithm: A bio-inspired optimizer for engineering design problems. Adv. Eng. Softw. 2017, 114, 163–191.
38. Abadi, M.Q.H.; Rahmati, S.; Sharifi, A.; Ahmadi, M. HSSAGA: Designation and scheduling of nurses for taking care of COVID-19 patients using novel method of hybrid salp swarm algorithm and genetic algorithm. Appl. Soft Comput. 2021, 108, 107449.
39. Zhang, H.; Liu, T.; Ye, X.; Heidari, A.A.; Liang, G.; Chen, H.; Pan, Z. Differential evolution-assisted salp swarm algorithm with chaotic structure for real-world problems. Eng. Comput. 2022, 1–35.
40. Kanazawa, K.; Sano, T.G.; Cairoli, A.; Baule, A. Loopy Lévy flights enhance tracer diffusion in active suspensions. Nature 2020, 579, 364–367.
41. Alweshah, M.; Alkhalaileh, S.; Al-Betar, M.A.; Bakar, A.A. Coronavirus herd immunity optimizer with greedy crossover for feature selection in medical diagnosis. Knowl.-Based Syst. 2022, 235, 107629.
42. Tarkhaneh, O.; Shen, H. Training of feedforward neural networks for data classification using hybrid particle swarm optimization, Mantegna Lévy flight and neighborhood search. Heliyon 2019, 5, e01275.
43. Chadha, G.S.; Panambilly, A.; Schwung, A.; Ding, S.X. Bidirectional deep recurrent neural networks for process fault classification. ISA Trans. 2020, 106, 330–342.
44. Hao, H.; Zhang, K.; Ding, S.X.; Chen, Z.; Lei, Y. A data-driven multiplicative fault diagnosis approach for automation processes. ISA Trans. 2014, 53, 1436–1445.
45. Hajihosseini, P.; Anzehaee, M.M.; Behnam, B. Fault detection and isolation in the challenging Tennessee Eastman process by using image processing techniques. ISA Trans. 2018, 79, 137–146.
46. Zou, W.; Xia, Y.; Li, H. Fault diagnosis of Tennessee-Eastman process using orthogonal incremental extreme learning machine based on driving amount. IEEE Trans. Cybern. 2018, 48, 3403–3410.
47. Yang, X.; Zhou, J.; Xie, Z.; Ke, G. Chemical process fault diagnosis based on enchanted machine-learning approach. Can. J. Chem. Eng. 2019, 97, 3074–3086.
48. Xie, Z.; Yang, X.; Li, A.; Ji, Z. Fault diagnosis in industrial chemical processes using optimal probabilistic neural network. Can. J. Chem. Eng. 2019, 97, 2453–2464.
49. Xu, L.; Raitoharju, J.; Iosifidis, A.; Gabbouj, M. Saliency-Based Multilabel Linear Discriminant Analysis. IEEE Trans. Cybern. 2021, 1–14.
50. Wang, X.; Li, X.; Ma, R.; Li, Y.; Wang, W.; Huang, H.; Xu, C.; An, Y. Quadratic discriminant analysis model for assessing the risk of cadmium pollution for paddy fields in a county in China. Environ. Pollut. 2018, 236, 366–372.
51. Zhang, J.; Wang, T.; Ng, W.W.; Pedrycz, W. KNNENS: A k-Nearest Neighbor Ensemble-Based Method for Incremental Learning Under Data Stream With Emerging New Classes. IEEE Trans. Neural Netw. Learn. Syst. 2022, 1–8.
52. Tekin, S.; Guner, E.D.; Cilek, A.; Unal Cilek, M. Selection of renewable energy systems sites using the MaxEnt model in the Eastern Mediterranean region in Turkey. Environ. Sci. Pollut. Res. 2021, 28, 51405–51424.
53. Wu, X.; Zuo, W.; Lin, L.; Jia, W.; Zhang, D. F-SVM: Combination of Feature Transformation and SVM Learning via Convex Relaxation. IEEE Trans. Neural Netw. Learn. Syst. 2018, 29, 5185–5199.
54. Qu, Z.; Mao, W.; Zhang, K.; Zhang, W.; Li, Z. Multi-step wind speed forecasting based on a hybrid decomposition technique and an improved back-propagation neural network. Renew. Energy 2019, 133, 919–929.
Figure 1. The proposed PSO-SVM-RFE method for feature selection.
Figure 2. The structure of PNN.
Figure 3. The proposed PSO-SVM-RFE-MSSA-PNN method for fault diagnosis.
Figure 4. The Tennessee Eastman process diagram.
Figure 5. Stability comparison of PSO-SVM-RFE and RFtb for 21 fault diagnosis rates.
Figure 6. Monitoring performances for Fault 12.
Figure 7. Monitoring performances for Fault 13.
Table 1. Description of fault categories.
| Category | Description | Type |
| --- | --- | --- |
| 1 | A/C feed ratio, B composition constant (Stream 4) | Step |
| 2 | B composition, A/C ratio constant (Stream 4) | Step |
| 3 | D feed temperature (Stream 2) | Step |
| 4 | Reactor cooling water inlet temperature | Step |
| 5 | Condenser cooling water inlet temperature | Step |
| 6 | A feed loss (Stream 1) | Step |
| 7 | C header pressure loss (Stream 4) | Step |
| 8 | A, B, C feed composition (Stream 4) | Random variation |
| 9 | D feed temperature (Stream 2) | Random variation |
| 10 | C feed temperature (Stream 4) | Random variation |
| 11 | Reactor cooling water inlet temperature | Random variation |
| 12 | Condenser cooling water inlet temperature | Random variation |
| 13 | Reaction kinetics | Slow drift |
| 14 | Reactor cooling water valve | Sticking |
| 15 | Condenser cooling water valve | Sticking |
| 16–20 | Unknown | Unknown |
| 21 | Valve (Stream 4) | Constant position |
Table 2. Measured and manipulated variables.
| No. | Process Measurement | No. | Process Measurement |
| --- | --- | --- | --- |
| 1 | A feed | 27 | E in reactor feed |
| 2 | D feed | 28 | F in reactor feed |
| 3 | E feed | 29 | A in reactor feed |
| 4 | Total feed | 30 | B in reactor feed |
| 5 | Recycle flow | 31 | C in reactor feed |
| 6 | Reactor feed rate | 32 | D in reactor feed |
| 7 | Reactor pressure | 33 | E in reactor feed |
| 8 | Reactor level | 34 | F in reactor feed |
| 9 | Reactor temperature | 35 | G in reactor feed |
| 10 | Purge rate | 36 | H in reactor feed |
| 11 | Product separator temperature | 37 | D in product flow |
| 12 | Product separator level | 38 | E in product flow |
| 13 | Product separator pressure | 39 | F in product flow |
| 14 | Product separator underflow | 40 | G in product flow |
| 15 | Stripper level | 41 | H in product flow |
| 16 | Stripper pressure | 42 | D feed flow valve |
| 17 | Stripper underflow | 43 | E feed flow valve |
| 18 | Stripper temperature | 44 | A feed flow valve |
| 19 | Stripper steam flow | 45 | Total feed flow valve |
| 20 | Compressor work | 46 | Compressor recycle valve |
| 21 | Reactor cooling water outlet temperature | 47 | Purge valve |
| 22 | Separator cooling water outlet temperature | 48 | Separator pot liquid flow valve |
| 23 | A in reactor feed | 49 | Stripper liquid product flow valve |
| 24 | B in reactor feed | 50 | Stripper steam valve |
| 25 | C in reactor feed | 51 | Reactor cooling water flow |
| 26 | D in reactor feed | 52 | Condenser cooling water flow |
Table 3. Feature selection results of SVM-RFE.
| Category | Features | Category | Features | Category | Features |
| --- | --- | --- | --- | --- | --- |
| 1 | 18, 16, 7, 46, 44 | 8 | 50, 19, 18, 13, 16 | 15 | 17, 52, 18, 7, 20 |
| 2 | 10, 7, 47, 20, 19 | 9 | 52, 17, 13, 7, 19 | 16 | 17, 52, 48, 12, 7 |
| 3 | 52, 17, 11, 19, 18 | 10 | 13, 7, 50, 19, 18 | 17 | 21, 7, 13, 9, 51 |
| 4 | 51, 9, 21, 18, 19 | 11 | 52, 17, 48, 12, 16 | 18 | 52, 17, 50, 18, 20 |
| 5 | 52, 17, 11, 19, 18 | 12 | 7, 13, 50, 19, 16 | 19 | 52, 17, 48, 12, 20 |
| 6 | 1, 44, 36, 26, 10 | 13 | 50, 19, 13, 52, 17 | 20 | 52, 17, 13, 7, 20 |
| 7 | 45, 7, 35, 25, 16 | 14 | 52, 17, 51, 9, 13 | 21 | 17, 52, 19, 18, 50 |
Table 4. Feature selection results of RFtb.
| Category | Features | Category | Features | Category | Features |
| --- | --- | --- | --- | --- | --- |
| 1 | 1, 20, 22, 44, 46 | 8 | 16, 29, 38, 40, 41 | 15 | 16, 19, 20, 39, 40 |
| 2 | 10, 34, 39, 46, 47 | 9 | 19, 25, 31, 38, 50 | 16 | 18, 19, 38, 46, 50 |
| 3 | 18, 20, 37, 40, 41 | 10 | 18, 19, 31, 38, 50 | 17 | 21, 38, 46, 50, 51 |
| 4 | 19, 38, 47, 50, 51 | 11 | 7, 9, 13, 38, 51 | 18 | 16, 19, 22, 41, 50 |
| 5 | 17, 18, 38, 50, 52 | 12 | 4, 11, 18, 19, 35 | 19 | 5, 13, 20, 46, 50 |
| 6 | 1, 20, 37, 44, 46 | 13 | 7, 18, 19, 39, 50 | 20 | 19, 39, 41, 46, 50 |
| 7 | 19, 38, 45, 46, 50 | 14 | 9, 11, 21, 38, 50 | 21 | 7, 16, 19, 45, 50 |
Table 5. Feature selection results of PSO-SVM-RFE.
| Category | Features | Category | Features | Category | Features |
| --- | --- | --- | --- | --- | --- |
| 1 | 40, 42, 13, 44, 18 | 8 | 17, 24, 28, 34, 52 | 15 | 28, 29, 34, 39, 35 |
| 2 | 5, 29, 40, 47, 51 | 9 | 28, 29, 34, 38, 41 | 16 | 24, 26, 29, 33, 40 |
| 3 | 19, 18, 37, 41, 40 | 10 | 25, 28, 29, 34, 35 | 17 | 7, 50, 20, 23, 38 |
| 4 | 29, 35, 40, 42, 51 | 11 | 19, 7, 27, 39, 16 | 18 | 4, 17, 22, 32, 52 |
| 5 | 8, 17, 22, 35, 52 | 12 | 3, 44, 13, 20, 47 | 19 | 18, 25, 29, 32, 35 |
| 6 | 46, 26, 13, 16, 20 | 13 | 13, 16, 24, 50, 41 | 20 | 7, 37, 14, 41, 35 |
| 7 | 7, 16, 20, 31, 45 | 14 | 21, 51, 43, 29, 25 | 21 | 11, 18, 35, 37, 50 |
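Tables 3–5 list, for each fault category, the five process variables (indexed as in Table 2) retained by the respective feature-selection scheme. For readers who want to reproduce a ranking of this form, the sketch below shows a minimal linear-kernel SVM-RFE selection with scikit-learn. It is a sketch only: the outer PSO loop that tunes the SVM hyperparameters in the proposed PSO-SVM-RFE is omitted, and the function name, the fixed penalty C = 1.0, and the use of standardization are illustrative assumptions rather than the authors' exact settings.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.feature_selection import RFE

def svm_rfe_top_features(X, y, n_keep=5):
    """Return the 1-based indices (cf. Table 2) of the n_keep variables kept by SVM-RFE.

    X: array of shape (samples, 52) with TE measurements/manipulated variables.
    y: binary labels, e.g. 0 = normal operation, 1 = one fault category.
    """
    X_std = StandardScaler().fit_transform(X)       # scale so weight magnitudes are comparable
    svm = SVC(kernel="linear", C=1.0)               # C = 1.0 is a placeholder, not the tuned value
    rfe = RFE(estimator=svm, n_features_to_select=n_keep, step=1)
    rfe.fit(X_std, y)                               # repeatedly refit and drop the smallest-|w| feature
    return np.where(rfe.support_)[0] + 1
```

Running a routine like this once per fault category (normal samples versus that fault's samples) yields one five-variable subset per row, which is the shape of the results reported in the tables above.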
Table 6. The fault diagnosis rates of various feature selection models.
| Category | PSO-SVM-RFE | SVM-RFE | RFtb | Unoptimized |
| --- | --- | --- | --- | --- |
| 1 | 0.99 | 1.00 | 0.93 | 0.99 |
| 2 | 0.99 | 0.99 | 0.99 | 0.99 |
| 3 | 0.98 | 0.69 | 0.98 | 0.85 |
| 4 | 1.00 | 1.00 | 1.00 | 1.00 |
| 5 | 1.00 | 1.00 | 0.93 | 0.83 |
| 6 | 0.99 | 1.00 | 1.00 | 0.99 |
| 7 | 1.00 | 1.00 | 1.00 | 1.00 |
| 8 | 0.83 | 0.96 | 0.99 | 0.91 |
| 9 | 0.87 | 0.71 | 0.82 | 0.84 |
| 10 | 0.82 | 0.91 | 0.96 | 0.77 |
| 11 | 0.84 | 0.64 | 0.91 | 0.75 |
| 12 | 0.94 | 0.95 | 0.97 | 0.90 |
| 13 | 0.98 | 0.87 | 1.00 | 0.94 |
| 14 | 1.00 | 0.92 | 1.00 | 0.83 |
| 15 | 0.82 | 0.94 | 0.76 | 0.80 |
| 16 | 0.86 | 0.93 | 0.62 | 0.96 |
| 17 | 0.87 | 0.73 | 0.70 | 0.86 |
| 18 | 0.88 | 0.86 | 0.95 | 0.86 |
| 19 | 0.86 | 0.71 | 0.94 | 0.77 |
| 20 | 0.80 | 0.84 | 0.85 | 0.84 |
| 21 | 0.84 | 0.79 | 0.54 | 0.82 |
| Mean | 0.91 | 0.88 | 0.90 | 0.88 |
Table 7. Fault diagnosis rates of the PNN under different optimization schemes.
| Category | MSSA-PNN | SSA-PNN | CS-PNN | GA-PNN | PSO-PNN | MOV-PNN | SOA-PNN | PNN |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 1.00 | 1.00 | 1.00 | 0.99 | 0.99 | 0.99 | 0.96 | 0.99 |
| 2 | 0.98 | 0.98 | 0.98 | 0.98 | 0.98 | 0.96 | 0.97 | 0.97 |
| 3 | 0.98 | 0.94 | 0.70 | 0.70 | 0.71 | 0.72 | 0.93 | 0.66 |
| 4 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.99 | 1.00 | 0.97 |
| 5 | 1.00 | 1.00 | 0.99 | 0.99 | 0.99 | 1.00 | 0.99 | 0.67 |
| 6 | 0.99 | 0.94 | 0.98 | 0.99 | 0.89 | 0.98 | 0.90 | 0.95 |
| 7 | 1.00 | 1.00 | 1.00 | 0.97 | 1.00 | 0.66 | 1.00 | 1.00 |
| 8 | 0.80 | 0.80 | 0.82 | 0.83 | 0.84 | 0.78 | 0.80 | 0.67 |
| 9 | 0.82 | 0.79 | 0.75 | 0.82 | 0.77 | 0.70 | 0.79 | 0.65 |
| 10 | 0.79 | 0.73 | 0.81 | 0.81 | 0.82 | 0.67 | 0.73 | 0.67 |
| 11 | 0.78 | 0.79 | 0.69 | 0.63 | 0.67 | 0.74 | 0.74 | 0.67 |
| 12 | 0.88 | 0.84 | 0.59 | 0.35 | 0.94 | 0.91 | 0.76 | 0.66 |
| 13 | 0.98 | 0.96 | 0.89 | 0.76 | 0.79 | 0.90 | 0.95 | 0.82 |
| 14 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.99 | 1.00 |
| 15 | 0.76 | 0.76 | 0.80 | 0.72 | 0.84 | 0.72 | 0.78 | 0.66 |
| 16 | 0.78 | 0.74 | 0.80 | 0.79 | 0.80 | 0.74 | 0.69 | 0.66 |
| 17 | 0.84 | 0.83 | 0.59 | 0.72 | 0.73 | 0.75 | 0.84 | 0.66 |
| 18 | 0.84 | 0.73 | 0.83 | 0.86 | 0.82 | 0.82 | 0.85 | 0.85 |
| 19 | 0.79 | 0.77 | 0.79 | 0.76 | 0.67 | 0.72 | 0.80 | 0.65 |
| 20 | 0.71 | 0.69 | 0.62 | 0.64 | 0.64 | 0.68 | 0.71 | 0.66 |
| 21 | 0.81 | 0.83 | 0.80 | 0.79 | 0.75 | 0.83 | 0.82 | 0.66 |
| Mean | 0.88 | 0.86 | 0.83 | 0.81 | 0.84 | 0.82 | 0.86 | 0.77 |
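Table 7 compares bio-heuristic schemes for tuning the PNN smoothing factor σ. As a reference point, the following numpy sketch shows the Gaussian-kernel PNN classifier that all of these schemes share, with a simple grid search standing in for the MSSA search over σ. The function names, the σ grid, and the validation-accuracy objective are illustrative assumptions, not the exact configuration used in the paper.

```python
import numpy as np

def pnn_predict(X_train, y_train, X_test, sigma):
    """Minimal probabilistic neural network: pattern, summation and decision layers."""
    classes = np.unique(y_train)
    preds = []
    for x in X_test:
        d2 = np.sum((X_train - x) ** 2, axis=1)              # distances to all training patterns
        k = np.exp(-d2 / (2.0 * sigma ** 2))                  # Gaussian pattern-layer activations
        scores = [k[y_train == c].mean() for c in classes]   # class-wise summation layer
        preds.append(classes[int(np.argmax(scores))])         # winner-take-all decision layer
    return np.array(preds)

def tune_sigma(X_tr, y_tr, X_val, y_val, grid=np.linspace(0.05, 2.0, 40)):
    """Stand-in for the MSSA search: keep the sigma with the best validation accuracy."""
    scored = [(np.mean(pnn_predict(X_tr, y_tr, X_val, s) == y_val), s) for s in grid]
    return max(scored)[1]
```

The point of the bio-heuristic optimizers in Table 7 is to replace this brute-force scan with a guided search over σ; the classifier itself is unchanged.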
Table 8. Diagnosis rates for the normal state (N) and the fault state (F) of each fault category under different classifiers.
| Category | MSSA-PNN N | MSSA-PNN F | LDA N | LDA F | QDA N | QDA F | KNN N | KNN F | SVM N | SVM F | MaxEnt N | MaxEnt F | CS-BP N | CS-BP F |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 1.00 | 1.00 | 1.00 | 0.99 | 1.00 | 0.99 | 0.51 | 0.83 | 1.00 | 0.99 | 1.00 | 0.89 | 1.00 | 0.99 |
| 2 | 0.96 | 0.99 | 1.00 | 0.95 | 1.00 | 0.98 | 0.17 | 0.81 | 1.00 | 0.98 | 0.74 | 0.29 | 0.99 | 0.97 |
| 3 | 0.96 | 0.98 | 0.55 | 0.54 | 0.78 | 0.50 | 0.39 | 0.82 | 0.00 | 1.00 | 0.84 | 0.11 | 0.46 | 0.84 |
| 4 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.30 | 0.86 | 1.00 | 1.00 | 0.83 | 0.59 | 1.00 | 1.00 |
| 5 | 1.00 | 1.00 | 1.00 | 0.99 | 1.00 | 1.00 | 0.23 | 0.81 | 1.00 | 1.00 | 0.79 | 0.18 | 1.00 | 1.00 |
| 6 | 1.00 | 0.99 | 1.00 | 0.96 | 1.00 | 0.99 | 0.99 | 0.97 | 1.00 | 1.00 | 1.00 | 0.24 | 0.99 | 0.99 |
| 7 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.44 | 0.77 | 1.00 | 1.00 | 0.89 | 0.19 | 1.00 | 1.00 |
| 8 | 0.75 | 0.83 | 0.69 | 0.50 | 0.97 | 0.90 | 0.18 | 0.81 | 0.00 | 1.00 | 0.83 | 0.22 | 0.95 | 0.90 |
| 9 | 0.72 | 0.87 | 0.53 | 0.61 | 0.54 | 0.61 | 0.32 | 0.75 | 0.00 | 1.00 | 0.84 | 0.58 | 0.25 | 0.88 |
| 10 | 0.73 | 0.82 | 0.63 | 0.47 | 0.80 | 0.57 | 0.35 | 0.81 | 0.00 | 1.00 | 0.93 | 0.53 | 0.29 | 0.81 |
| 11 | 0.66 | 0.84 | 0.50 | 0.52 | 0.71 | 0.55 | 0.30 | 0.74 | 0.00 | 1.00 | 0.84 | 0.18 | 0.46 | 0.76 |
| 12 | 0.76 | 0.94 | 0.73 | 0.50 | 0.98 | 0.95 | 0.63 | 0.80 | 0.01 | 0.67 | 0.87 | 0.10 | 0.96 | 0.94 |
| 13 | 0.97 | 0.98 | 0.80 | 0.57 | 1.00 | 0.97 | 0.87 | 0.88 | 0.23 | 0.60 | 0.91 | 0.18 | 0.98 | 0.97 |
| 14 | 1.00 | 1.00 | 0.51 | 0.50 | 1.00 | 1.00 | 0.93 | 0.93 | 0.00 | 1.00 | 0.97 | 0.15 | 1.00 | 1.00 |
| 15 | 0.64 | 0.82 | 0.58 | 0.61 | 0.68 | 0.52 | 0.30 | 0.83 | 0.00 | 1.00 | 0.87 | 0.54 | 0.13 | 0.95 |
| 16 | 0.64 | 0.86 | 0.51 | 0.52 | 0.69 | 0.49 | 0.36 | 0.84 | 0.00 | 1.00 | 0.95 | 0.58 | 0.22 | 0.85 |
| 17 | 0.79 | 0.87 | 0.48 | 0.56 | 0.79 | 0.61 | 0.30 | 0.80 | 0.00 | 1.00 | 0.89 | 0.26 | 0.56 | 0.82 |
| 18 | 0.77 | 0.88 | 1.00 | 0.81 | 1.00 | 0.87 | 0.87 | 0.86 | 1.00 | 0.88 | 0.96 | 0.62 | 0.95 | 0.88 |
| 19 | 0.63 | 0.86 | 0.53 | 0.52 | 0.65 | 0.53 | 0.23 | 0.81 | 0.00 | 1.00 | 0.74 | 0.24 | 0.22 | 0.89 |
| 20 | 0.51 | 0.80 | 0.52 | 0.57 | 0.79 | 0.59 | 0.40 | 0.82 | 0.00 | 1.00 | 0.81 | 0.34 | 0.55 | 0.83 |
| 21 | 0.74 | 0.84 | 0.67 | 0.55 | 0.90 | 0.72 | 0.34 | 0.79 | 0.00 | 1.00 | 0.74 | 0.28 | 0.71 | 0.87 |
| Mean | 0.82 | 0.91 | 0.72 | 0.68 | 0.87 | 0.78 | 0.45 | 0.83 | 0.35 | 0.96 | 0.87 | 0.35 | 0.70 | 0.91 |
Table 9. Confusion matrix used to evaluate the classification models.
| Actual Class | Predicted Positive | Predicted Negative |
| --- | --- | --- |
| Positive | True positive (TP) | False negative (FN) |
| Negative | False positive (FP) | True negative (TN) |
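The accuracy values in Table 10 and the F1-scores in Table 11 follow from the confusion-matrix counts defined in Table 9 in the standard way. The per-class (one-vs-rest) definitions below are the usual textbook formulas, given here for reference rather than quoted from the paper:

$$
\mathrm{Accuracy}=\frac{TP+TN}{TP+TN+FP+FN},\qquad
\mathrm{Precision}=\frac{TP}{TP+FP},\qquad
\mathrm{Recall}=\frac{TP}{TP+FN},\qquad
F_{1}=\frac{2\cdot\mathrm{Precision}\cdot\mathrm{Recall}}{\mathrm{Precision}+\mathrm{Recall}}
$$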
Table 10. The accuracy of MSSA-PNN and other classification models.
| Category | MSSA-PNN | LDA | QDA | KNN | SVM | MaxEnt | CS-BP |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 1.00 | 0.99 | 1.00 | 0.72 | 0.99 | 0.93 | 0.99 |
| 2 | 0.98 | 0.97 | 0.99 | 0.59 | 0.99 | 0.44 | 0.98 |
| 3 | 0.98 | 0.54 | 0.59 | 0.67 | 0.66 | 0.36 | 0.71 |
| 4 | 1.00 | 1.00 | 1.00 | 0.67 | 1.00 | 0.68 | 1.00 |
| 5 | 1.00 | 0.99 | 1.00 | 0.62 | 1.00 | 0.39 | 1.00 |
| 6 | 0.99 | 0.98 | 1.00 | 0.98 | 1.00 | 0.49 | 0.99 |
| 7 | 1.00 | 1.00 | 1.00 | 0.66 | 1.00 | 0.42 | 1.00 |
| 8 | 0.80 | 0.56 | 0.92 | 0.59 | 0.66 | 0.41 | 0.92 |
| 9 | 0.82 | 0.58 | 0.59 | 0.62 | 0.65 | 0.67 | 0.67 |
| 10 | 0.79 | 0.52 | 0.64 | 0.65 | 0.65 | 0.66 | 0.63 |
| 11 | 0.78 | 0.51 | 0.60 | 0.60 | 0.67 | 0.41 | 0.66 |
| 12 | 0.88 | 0.58 | 0.96 | 0.74 | 0.44 | 0.37 | 0.95 |
| 13 | 0.98 | 0.65 | 0.98 | 0.88 | 0.47 | 0.44 | 0.97 |
| 14 | 1.00 | 0.51 | 1.00 | 0.93 | 0.66 | 0.44 | 1.00 |
| 15 | 0.76 | 0.60 | 0.57 | 0.65 | 0.66 | 0.66 | 0.64 |
| 16 | 0.78 | 0.52 | 0.56 | 0.68 | 0.65 | 0.71 | 0.63 |
| 17 | 0.84 | 0.53 | 0.67 | 0.63 | 0.65 | 0.49 | 0.74 |
| 18 | 0.84 | 0.88 | 0.91 | 0.86 | 0.92 | 0.74 | 0.90 |
| 19 | 0.79 | 0.53 | 0.57 | 0.62 | 0.63 | 0.40 | 0.65 |
| 20 | 0.71 | 0.56 | 0.66 | 0.67 | 0.66 | 0.49 | 0.74 |
| 21 | 0.81 | 0.59 | 0.78 | 0.64 | 0.65 | 0.43 | 0.82 |
| Mean | 0.88 | 0.69 | 0.81 | 0.70 | 0.75 | 0.52 | 0.84 |
Table 11. The F1-score of MSSA-PNN and other classification models.
| Category | MSSA-PNN | LDA | QDA | KNN | SVM | MaxEnt | CS-BP |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 1.00 | 0.99 | 1.00 | 0.80 | 1.00 | 0.94 | 1.00 |
| 2 | 0.98 | 0.97 | 0.99 | 0.72 | 0.99 | 0.41 | 0.98 |
| 3 | 0.98 | 0.61 | 0.63 | 0.77 | 0.79 | 0.18 | 0.79 |
| 4 | 1.00 | 1.00 | 1.00 | 0.78 | 1.00 | 0.70 | 1.00 |
| 5 | 1.00 | 1.00 | 1.00 | 0.74 | 1.00 | 0.28 | 1.00 |
| 6 | 1.00 | 0.98 | 1.00 | 0.98 | 1.00 | 0.39 | 0.99 |
| 7 | 1.00 | 1.00 | 1.00 | 0.75 | 1.00 | 0.30 | 1.00 |
| 8 | 0.85 | 0.61 | 0.94 | 0.72 | 0.80 | 0.33 | 0.94 |
| 9 | 0.87 | 0.66 | 0.66 | 0.73 | 0.79 | 0.69 | 0.78 |
| 10 | 0.84 | 0.57 | 0.68 | 0.75 | 0.79 | 0.68 | 0.74 |
| 11 | 0.83 | 0.59 | 0.65 | 0.71 | 0.80 | 0.28 | 0.75 |
| 12 | 0.91 | 0.61 | 0.97 | 0.80 | 0.61 | 0.16 | 0.96 |
| 13 | 0.98 | 0.68 | 0.98 | 0.90 | 0.60 | 0.30 | 0.98 |
| 14 | 1.00 | 0.57 | 1.00 | 0.95 | 0.79 | 0.26 | 1.00 |
| 15 | 0.82 | 0.67 | 0.61 | 0.76 | 0.80 | 0.67 | 0.77 |
| 16 | 0.84 | 0.59 | 0.59 | 0.77 | 0.79 | 0.73 | 0.75 |
| 17 | 0.88 | 0.61 | 0.71 | 0.74 | 0.79 | 0.40 | 0.80 |
| 18 | 0.87 | 0.89 | 0.93 | 0.89 | 0.93 | 0.75 | 0.93 |
| 19 | 0.85 | 0.59 | 0.62 | 0.74 | 0.78 | 0.35 | 0.77 |
| 20 | 0.79 | 0.63 | 0.69 | 0.76 | 0.80 | 0.48 | 0.81 |
| 21 | 0.85 | 0.64 | 0.81 | 0.74 | 0.79 | 0.39 | 0.86 |
| Mean | 0.91 | 0.74 | 0.83 | 0.79 | 0.84 | 0.46 | 0.88 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
