The Detection of Kiwifruit Sunscald Using Spectral Reflectance Data Combined with Machine Learning and CNNs

Wu, Ke; Jia, Zhicheng; Duan, Qifeng

doi:10.3390/agronomy13082137

by Ke Wu ¹, Zhicheng Jia ²,* and Qifeng Duan ²

¹ College of Science, Nanjing Forestry University, Nanjing 210037, China

² Mechanical and Electronic Engineering College, Nanjing Forestry University, Nanjing 210037, China

* Author to whom correspondence should be addressed.

Agronomy 2023, 13(8), 2137; https://doi.org/10.3390/agronomy13082137

Submission received: 14 July 2023 / Revised: 10 August 2023 / Accepted: 11 August 2023 / Published: 15 August 2023

(This article belongs to the Special Issue Food and Agricultural Imaging Systems – An Outlook to the Future)

Abstract

Sunscald in kiwifruit, an environmental stress caused by solar radiation during the summer, reduces fruit quality and yields and causes economic losses. The efficient and timely detection of sunscald and similar diseases is a challenging task but helps to implement measures to control stress. This study provides high-precision detection models and relevant spectral information on kiwifruit physiology for similar statuses, including early-stage sunscald, late-stage sunscald, anthracnose, and healthy. First, in the laboratory, 429 groups of spectral reflectance data for leaves of four statuses were collected and analyzed using a hyperspectral reflection acquisition system. Then, multiple modeling approaches, including combined preprocessing methods, feature extraction algorithms, and classification algorithms, were designed to extract bands and evaluate the performance of the models to detect the statuses of kiwifruit. Finally, the detection of different stages of kiwifruit sunscald under anthracnose interference was accomplished. As influential bands, 694–713 nm, 758–777 nm, 780–799 nm, and 1303–1322 nm were extracted. The overall accuracy, precision, recall, and F1-score values of the best models reached 100%. It was concluded that the combined processing of moving average and standard normal variate transformations (MS) could significantly improve the data; the near-infrared support vector machine and visible convolutional neural network with MS (NIR-MS-SVM and VIS-MS-CNN) were established as high-precision detection techniques for the classification of similar kiwifruit statuses, demonstrating 25.58% higher accuracy than the single support vector machine. The VIS-MS-CNN model reached convergence with a stable cross-entropy loss of 0.75 in training and 0.77 in validation. The techniques developed in this study will improve orchard management efficiency and yields and increase researchers’ understanding of kiwifruit physiology.

Keywords:

kiwifruit; sunscald; anthracnose; spectral reflectance; multilayer perceptron; random forests; support vector machine; convolutional neural network

1. Introduction

The kiwifruit (Actinidia chinensis Planch) belongs to the genus Actinidia in the family Actinidiaceae. According to FAO data for 2021 [1], China produced 2.381 million tons of kiwifruit, out of a world total of approximately 4.46 million tons. However, kiwifruit, an essential component of Chinese agriculture [2,3], is subjected to increasingly severe sunscald in its planted orchards [4,5]. Sunscald is an environmental stress caused by excessive solar radiation that frequently occurs in the summer and is common in a wide range of fruits, such as grapes, apples, and tomatoes [6,7,8]. Sunscald produces patches on the fruit and leaf surfaces [9,10] and reduces fruit yields and quality [11], negatively impacting farmers’ incomes. Due to large-scale planting, large areas of sunscald can occur in orchards under high-temperature conditions, resulting in yield reductions of more than 30% [12]. It is crucial to find methods to effectively reduce sunscald losses.

Field treatments for sunscald include evaporative cooling, shade nets, and inhibitors [13]. Spraying systems can reduce the ambient temperature and thus the potential for sunscald, but are accompanied by water and electricity consumption [14,15]. Shade nets can reduce the intensity of sunlight, lower plant canopy temperatures, and increase the relative humidity but increase the risk of fungal diseases [16]. As part of an integrated orchard management strategy, the timely detection of sunscald at different stages, so that control measures can be applied before large-scale occurrence, is necessary for better resource utilization and disease protection. Manual detection, however, suffers from high labor and time costs, low efficiency, and unsuitability for large areas.

In summer, sunscald occurs after high-temperature exposure; the surface of kiwifruit begins to turn leathery and brown spots appear, and leaf water deficiency is accompanied by curling and drying. As the sunscald deepens, the fruit stops developing, becoming soft and ulcerating [17]. The symptoms of kiwifruit sunscald facilitate the extraction of information on the leaf status from the phenotype. Changes in the physiological status of leaves and fruits with early-stage sunscald are not noticeable compared with healthy ones, posing a challenge for detection. In addition, studies summarizing early plant stress detection have concluded that it is necessary to consider the influence of other disease factors to determine the plant physiological status [18]. For kiwifruit, anthracnose and sunscald have similar symptoms and occur at overlapping times. Anthracnose is a fungal disease with rapid onset, forming irregularly shaped brown spots at the leaf margins or leaf tips that turn grayish-brown or grayish-white in late stages [19]. The need for early detection and the interference of similar diseases therefore make sunscald difficult to detect, and practical field identification requires a long observation process and substantial labor costs.

Hyperspectral assays can capture rich and adequate plant information, which is convenient for the determination of the plant status. In agricultural research, it has been reported that hyperspectral methods can be used to detect sunscald and other physiological statuses. For example, spectral reflectance in the wavelength regions of 500–600 nm, 650–700 nm, and 800–850 nm was used to predict the sunscald grade of ‘Packham’s Triumph’ pears with accuracy of 94% [10]; a prediction model for apple sunscald used VIS-NIR reflectance data to determine the effect of predicting apple sunscald in advance [20]. To study the environmental stress of kiwifruit, Ge used spectral technology and SVM to detect chilling injuries in kiwifruit in 2023 and achieved 94.2% accuracy [21]. However, research on how to use hyperspectral data to detect kiwifruit sunscald is still scarce.

Traditionally, various techniques, such as autoregression (AR) [22,23], moving average (MA) [24,25], exponential smoothing (ES) [26], the hybrid method (HM) [27,28,29], and autoregressive integrated moving average (ARIMA) [30], have been used to construct detection models. In addition, some recent techniques, such as transfer learning and convolutional neural networks [31,32], have been used in the agricultural field for fruit image classification and annotation and disease classification [32,33]. Among these techniques, machine learning and CNN mostly outperform other techniques in terms of precision and accuracy [34,35]. In precision agriculture, classification algorithms are closely combined with spectral technology, generally divided into machine learning and deep learning. Researchers have considered multilayer perceptron, support vector machine, and random forest techniques as classic and popular machine learning methods. The multilayer perceptron is an artificial neural network algorithm [36] that relies on continuously adjusting the parameters in the network to improve the model. It has been used to detect avocado laurel wilt and achieved detection accuracy of 98% [37]. The support vector machine is a typical machine learning algorithm that has been used to distinguish between special and traditional coffee, achieving 96% detection accuracy [38]. Random forest is an algorithm based on ensemble learning that can run in the case of many variable inputs and has strong computational efficiency [39]. It has been used to detect diseases in apples, corn, potatoes, and other plants and has achieved overall accuracy of 96.1% [40]. The convolutional neural network is a popular deep learning algorithm that has been used to detect tomato diseases, including sunscald and anthracnose, with average accuracy of 99.64% [41]. It has also been used to analyze the phenotypes of diseased plant leaves, demonstrating good predictive performance and generalization ability [42].

This study aimed to improve the detection effect in four aspects: data source, preprocessing, feature extraction algorithms, and classification algorithms. The research hypothesis of this study was that, through the specialized design of these four aspects, the detection model could achieve higher accuracy and better detection of kiwifruit sunscald under anthracnose interference than the widely used plant status detection techniques. To determine the physiological statuses of plants, in addition to building an extensive sample database [43], informative bands should be selected from hyperspectral data [37,44]. In one study, the full spectrum (400 nm to 2400 nm), visible spectrum (400 nm to 760 nm), and near-infrared spectrum (761 nm to 2400 nm) were input into PLS-DA to detect ice plants (Aizoaceae), achieving good detection results, with a Kappa value of 0.9 [45]. Preprocessing tools such as MA, SNV, and airPLS have been used to enhance the noise resistance of detection models [46,47,48], which helps to extract information from the spectral data. Spectral feature extraction methods, including PCA and RFE, have been applied to the task of detecting potato chlorophyll content [49].

The contributions of this study are as follows.

(1): Improving kiwifruit yield and quality: by detecting sunscald under the interference of anthracnose, farmers could identify potentially problematic plants early and cool or irrigate the plants to avoid further damage, effectively improving kiwifruit yields and quality in the orchard.
(2): Optimizing resource utilization: by accurately detecting sunscald in kiwifruit, farmers could avoid wasting resources such as water, nutrients, and pesticides on healthy plants, instead providing them to those already affected by sunscald. This will reduce the wastage of resources, lower costs, and contribute to the development of sustainable agriculture and the protection of the ecological environment.
(3): Understanding plant physiological processes: the hyperspectral reflectance technique was used to provide information on the spectral responses of plant leaves in different bands and to extract the relevant bands. Analyzing and interpreting this information helps researchers to understand plants’ growth processes, metabolic activities, and response mechanisms and further advance the research and development of plant biology beyond visual methods.
(4): Developing smart agriculture: this study emphasizes the implementation of plant status detection for early-stage sunscald and similar diseases, which can be implemented in combination with other modern agricultural technologies such as IoT, drones, and data analytics for smart agriculture applications.
(5): Comparison and selection of models: comparing and evaluating the detection effects of various models based on four parts, including the data source, preprocessing, feature extraction algorithms, and classification algorithms, can provide researchers and decision makers with a basis for selecting the best model. This helps to identify the most suitable model to solve a particular problem and can identify directions for the optimization of the model to improve the algorithm further and enhance the model’s performance.

To the best of our knowledge, this study is the first in the field of precision agriculture to describe the spectral characteristics of kiwifruit sunscald based on hyperspectral technology and develop high-precision detection technology.

2. Materials and Methods

2.1. Study Area and Plant Material

The study area was an orchard used for both commercial management and experiments in Liuhe District, Nanjing City, Jiangsu Province, China, as shown in Figure 1. Nanjing is situated between 118°22′–119°14′ E and 31°14′–32°37′ N, with distinct climate changes, abundant light, and an extensive annual range of temperatures; there are 1955.5 total sunshine hours in a year, the extreme maximum temperature is 39.7 °C, and the mean annual precipitation is 1106 mm. Kiwifruits were planted over approximately 1500 acres in the orchard (the main variety was Yuhuang), and their growth was observed and recorded for an extended period. In summer, the orchard often suffers from sunscald stress, which damages the physiological status of the fruit trees and restricts production.

From July to September 2022, sunscald appeared on the kiwifruits in the orchard, as shown in Figure 1a,b. One hundred mature kiwifruit trees growing in an independent area within a radius of 2 m were selected, and it was ensured that all fruit trees grew healthily for more than one year. These principles ensured the consistency of the trees’ physiological status and structure. Due to the maintenance of orchards by fruit growers, kiwifruit trees are not exposed to other stresses or disease infections. After the visual observation of trees by experienced local farmers and botanists and PCR testing to assess the severity of kiwifruit sunscald and other diseases, a number of independent samples of healthy, early-stage sunscald, late-stage sunscald, and anthracnose kiwifruits were obtained.

Leaves were collected from 12:00 to 14:00 on a sunny day at the end of the month. The collection method was as follows: randomly collecting leaves from four places of the same tree and picking at least five leaves of similar size in each tree. The influence of the leaf growth position and size on the experimental results had to be avoided as much as possible. The collected leaves were placed in sealed bags to reduce the possibility of leaf deterioration and water loss, and then frozen in a sealed container at −20 °C, a standard method for storing plant leaf samples [50]. The time from picking leaf samples in the orchard to using the instrument to collect spectral data in the laboratory was controlled to within four hours, which ensured that the physiological status of the kiwifruit leaves was consistent with that of field leaves. The types and numbers of sample leaves are shown in Figure 2 and Table 1.

2.2. Experiment Apparatus and Data Acquisition

In this study, a hyperspectral data acquisition system was used to collect the spectral reflection data of kiwifruit leaves. The system consisted of a spectrometer, lamp, scanning table, whiteboard, power supply, laptop computer, and control software, as shown in Figure 3a. The spectrometer was an ASD FieldSpec 3 Spectroradiometer (ASD, Inc., Falls Church, VA, USA) with spectral coverage in the wavelength range of 350–2500 nm, 2151 spectral acquisition points, and a spectral resolution of 1.0 nm. The system provides high-resolution data in the visible and infrared bands. The light source power was 75 W, and the vertical incidence angle was 15 degrees. The calibrated reflectance panel was used for light intensity correction, which helped to reduce noise in the spectral data, and for sensitivity optimization, which helped to reduce the effect of noise in the circuit itself on the results. The scanning table was a clean, solid black, velvet-covered bench, and the ViewSpecPro 6.20 control software operated the spectrometer’s shooting process.

The laboratory in which the measurements were taken was a dark environment. The spectrometer and computer were fully charged before each collection of reflectance data. Before collecting spectral data, the spectrometer was first powered on, and the instrument was allowed to warm up for 15 min; subsequently, the lens was aligned to the calibration panel under illumination conditions, and the calibration panel covered the field of view of the lens to complete light intensity correction and spectrometer sensitivity optimization. Each kiwifruit leaf sample was fixed in the center of the scanning table, with the probe located 30 cm above the leaf surface. This ensured that the collected spectra were the reflectance spectra of the leaf’s central part and a fixed area. The obtained spectral reflectance data are shown in Figure 3b.

2.3. Data Analysis

In this study, a method was designed to determine the different statuses of kiwifruit based on the hyperspectral reflectance data of the leaves. The model-building process used a combination of band segmentation, preprocessing, feature extraction algorithms, and classification algorithms to establish multiple detection models. The data analysis process consisted of the following eight steps, as shown in the flowchart in Figure 4.

(1): Reflectance curve analysis: exploring the spectral characteristics of different physiological statuses by calculating the average reflectance and sensitivity.
(2): Band segmentation: taking 400–780 nm in the range of 350–2500 nm as the visible band (VIS) and 780–2500 nm as the near-infrared band (NIR), and comparing the effect of input data from different bands on the model detection results.
(3): Preprocessing: combining MA, SNV, and airPLS into the MS and MAS methods to process the raw data, and comparing the effect of the raw (unprocessed) data and the different preprocessing procedures on the model detection results.
(4): Feature extraction: three schemes were designed, namely no feature extraction (unprocessed), PCA, and RFE, which were used to select the essential variables and improve the model’s prediction performance.
(5): Machine learning classification: based on MLP, RF, and SVM.
(6): Deep learning classification: building a convolutional neural network (CNN).
(7): Model evaluation: the OA, recall, precision, and F1-score were selected to evaluate the accuracy of detection and thus select the preferred detection models.
(8): Variable analysis: analyzing the variables obtained from the preferred models and explaining the reasons for the improved detection performance of the models from the perspectives of data distribution and significance analysis.

The data analysis process was implemented in Python.
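The band segmentation step can be illustrated with a minimal sketch. It assumes a wavelength axis of 350–2500 nm at 1 nm steps (from the instrument description) and, because the text names 780 nm as a boundary of both bands, it assigns 780 nm to the NIR band; that choice, the function name `split_bands`, and the synthetic data are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

# Assumed wavelength axis: 350-2500 nm sampled every 1 nm (2151 points).
wavelengths = np.arange(350, 2501)

def split_bands(spectra, wavelengths, vis=(400, 780), nir=(780, 2500)):
    """Split spectra (n_samples x n_wavelengths) into VIS and NIR blocks.

    Assumption: the shared boundary wavelength (780 nm) goes to the NIR block.
    """
    spectra = np.asarray(spectra, dtype=float)
    vis_mask = (wavelengths >= vis[0]) & (wavelengths < vis[1])
    nir_mask = (wavelengths >= nir[0]) & (wavelengths <= nir[1])
    return spectra[:, vis_mask], spectra[:, nir_mask]

# Synthetic stand-in for measured reflectance (5 leaves).
rng = np.random.default_rng(0)
spectra = rng.random((5, wavelengths.size))
vis_block, nir_block = split_bands(spectra, wavelengths)
print(vis_block.shape, nir_block.shape)
```

Keeping the band limits as parameters makes it easy to test other boundaries.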

3. Theoretical Foundations

3.1. Reflection Curve Analysis

In this study, parameters were utilized to explore the spectral differences of leaves in different statuses: (1) average reflectance, which was the average of the reflectance of all samples with the same status; and (2) sensitivity, which was calculated by dividing the average reflectance of a stressed or diseased leaf by the average reflectance of healthy leaves.
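As a small worked example of these two parameters, the sketch below computes the average reflectance per wavelength and the sensitivity as the ratio of a stressed status to the healthy status. The array layout (rows are samples, columns are wavelengths), the function names, and the toy values are assumptions for illustration only.

```python
import numpy as np

def average_reflectance(spectra):
    """Mean reflectance per wavelength over all samples of one status."""
    return np.asarray(spectra, dtype=float).mean(axis=0)

def sensitivity(stressed_spectra, healthy_spectra):
    """Average reflectance of a stressed status divided by that of healthy leaves."""
    return average_reflectance(stressed_spectra) / average_reflectance(healthy_spectra)

# Toy data: two samples per status, two wavelengths (hypothetical values).
healthy = np.array([[0.10, 0.50], [0.12, 0.54]])
sunscald = np.array([[0.08, 0.40], [0.06, 0.38]])
ratio = sensitivity(sunscald, healthy)
print(ratio)
```

A sensitivity below 1 at a wavelength means the stressed leaves reflect less than healthy leaves there; a value above 1 means they reflect more.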

3.2. Preprocessing

The main preprocessing methods for spectra include the moving average (MA), standard normal variate transformation (SNV), and adaptive iteratively reweighted penalized least squares (airPLS) [46,47,48]. In this study, these methods were combined with the aim of enhancing the model’s predictive performance; the two combined preprocessing methods were MA-SNV (MS) and MA-airPLS-SNV (MAS).

MA is a digital signal processing method for smoothing spectral curves. Within 400–2500 nm, the reflectance curve may fluctuate in parts of the spectrum due to signal noise. For the wavelength $t$, the value $x_{t}^{*}$ after MA processing can be calculated from the raw reflectance values $x_{t - 1}$, $x_{t}$, and $x_{t + 1}$ as shown in Equation (1). Here, $\gamma_{1}$, $\gamma_{2}$, and $\gamma_{3}$ are weighting factors, each taken as one third for a sliding period of 3.

$$x_{t}^{*} = \gamma_{1} x_{t - 1} + \gamma_{2} x_{t} + \gamma_{3} x_{t + 1} \tag{1}$$
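A minimal sketch of the three-point moving average in Equation (1), with every weight equal to one third, is given below. The treatment of the two end points (repeating the edge value) is an assumption, since the text does not say how the boundaries were handled, and the function name is illustrative.

```python
import numpy as np

def moving_average(spectrum, window=3):
    """Smooth a 1-D spectrum with equal weights 1/window.

    Assumption: the end points are padded by repeating the edge values
    so that the output has the same length as the input.
    """
    x = np.asarray(spectrum, dtype=float)
    kernel = np.full(window, 1.0 / window)
    half = window // 2
    padded = np.pad(x, half, mode="edge")
    return np.convolve(padded, kernel, mode="valid")

smoothed = moving_average([1.0, 2.0, 3.0, 4.0, 5.0])
print(smoothed)
```

For interior points the result is exactly the mean of the point and its two neighbours; only the first and last values depend on the padding choice.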

SNV is a preprocessing method that standardizes the spectral data to correct spectral errors due to scattering. The value $x_{t}^{snv}$ after SNV was calculated as shown in Equation (2): the corrected value was obtained by subtracting the mean value of the reflectance of the band and dividing by its standard deviation. When the reflectance at wavelength $t$ was $x_{t}$, the reflectance of all samples constituted the column vector $X = (x_{t}^{1}, x_{t}^{2}, \dots, x_{t}^{n})^{T}$, where $n$ is the total number of samples.

$$x_{t}^{snv} = \frac{x_{t} - \mathrm{mean}(X)}{\sqrt{\frac{\sum_{i = 1}^{n} (x_{t}^{i} - \mathrm{mean}(X))^{2}}{n - 1}}} \tag{2}$$
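The standardization in Equation (2) can be sketched as follows. With `axis=0` the function follows the definition in the text (mean and sample standard deviation taken over all samples at one wavelength). Note that SNV is more commonly applied per spectrum, which corresponds to `axis=1` here; the `axis` parameter and the function name are illustrative assumptions, not part of the original method description.

```python
import numpy as np

def snv(spectra, axis=0):
    """Standardize reflectance values (rows: samples, columns: wavelengths).

    axis=0: per wavelength across samples, as Equation (2) is written.
    axis=1: per spectrum across wavelengths, the more usual SNV convention.
    The standard deviation uses n - 1 in the denominator (ddof=1).
    """
    x = np.asarray(spectra, dtype=float)
    mean = x.mean(axis=axis, keepdims=True)
    std = x.std(axis=axis, ddof=1, keepdims=True)
    return (x - mean) / std

X = np.array([[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]])
Z = snv(X)
print(Z)
```

After the transformation, each standardized column (or row, for `axis=1`) has zero mean and unit sample standard deviation.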

airPLS is based on least squares estimation to achieve the removal of the noise effect of baseline drift on spectral data without any a priori information and improve the accuracy of the detection performance of spectral signal peaks. It uses a least squares algorithm to fit the baseline of the spectral data, sets a penalty function to evaluate the smoothness of the fitted curve, and uses an adaptive iterative weighting strategy to adjust the parameters in the penalty function.

Theoretically, the combined preprocessing used in this study can correct the sensitivity of the spectral data to wavelength deviation and remove the noise generated during data acquisition, transmission, and storage to obtain more accurate and reliable data, and it reduces the impact of excessive noise on subsequent analyses.

3.3. Feature Extraction

Principal component analysis (PCA) is a commonly used feature extraction technique to extract critical data features by finding the significant variances in the data [38], simplifying the dataset, and reducing the effect of data noise. We analyzed the principal components in the raw data to identify the spectral bands with higher information content and were able to understand the data features better, thus enabling the efficient processing and simplification of the hyperspectral data. Principal components with a cumulative contribution of 95% or more were retained to build the detection models and provide input variables for the kiwifruit sunscald detection models.
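The 95% cumulative-contribution rule can be sketched with plain NumPy. The example centers the data, obtains the principal components from a singular value decomposition, and keeps the smallest number of components whose cumulative explained variance reaches the threshold. The function name and the synthetic data are assumptions for illustration; the source does not state which PCA implementation was used.

```python
import numpy as np

def pca_by_variance(X, threshold=0.95):
    """Return scores, explained-variance ratios, and loadings of the
    first k components, where k is the smallest number of components
    whose cumulative explained variance is at least `threshold`."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                      # center each band
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    k = int(np.searchsorted(np.cumsum(explained), threshold)) + 1
    k = min(k, len(s))
    return Xc @ Vt[:k].T, explained[:k], Vt[:k]

# Synthetic data: one dominant direction plus small noise (10 bands).
rng = np.random.default_rng(0)
t = rng.normal(size=(50, 1))
X = t @ np.ones((1, 10)) + 0.01 * rng.normal(size=(50, 10))
scores, ratios, loadings = pca_by_variance(X)
print(scores.shape, ratios)
```

The rows of `loadings` show how strongly each band contributes to a component, which is one way to read off influential bands.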

Recursive feature elimination (RFE) is a method to eliminate non-important features to select an optimal feature subset for feature extraction [51,52,53]. This paper used RFE to rapidly downscale the data to improve the recognition accuracy. RFE removes a specific number of non-important features per generation. When the number of features reaches a predetermined value, the algorithm stops and determines the optimal subset of features for the elimination process based on the test results. In this study, two features were eliminated per generation for visible spectral data until twenty features remained. Ten features were eliminated per generation for near-infrared spectral data until twenty features remained. The subset of features selected by RFE with the best prediction was used to build the final detection models for the detection of the status of kiwifruits.
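A hedged sketch of this elimination schedule using scikit-learn's `RFE` is shown below: two features are removed per iteration until twenty remain, mirroring the setting described for the visible data. The linear SVM ranking estimator and the synthetic data are assumptions; the source does not specify the estimator used for ranking, and selecting the best subset across iterations (as the text describes) would need an additional evaluation loop or a tool such as `RFECV`.

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.svm import SVC

# Synthetic stand-in for 60 spectral bands; labels depend on two bands only.
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 60))
y = (X[:, 5] + X[:, 17] > 0).astype(int)

# Remove 2 features per iteration until 20 remain (assumed linear-SVM ranking).
selector = RFE(SVC(kernel="linear"), n_features_to_select=20, step=2)
selector.fit(X, y)

selected = np.flatnonzero(selector.support_)  # indices of retained bands
print(selected)
```

For the near-infrared setting described in the text, `step=10` would be used instead of `step=2`.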

3.4. Classification Algorithm

In this study, three supervised learning algorithms were applied.

(1): Multilayer perceptron (MLP), a representative artificial neural network [54]. The MLP was used for analysis and training, and its structure includes input, hidden, and output layers. Neurons within each layer calculate the output, $y$, using the following equations:

$$W^{*} = \sum_{i = 1}^{n} w_{i} \cdot x_{i} + b \tag{3}$$

$$y = f(W^{*}) \tag{4}$$

where $w_{i}$ denotes the weight of the input variable $x_{i}$ connected to the neuron, and $b$ denotes the bias term added after the input variables are multiplied by their weights. $W^{*}$ denotes the linear combination of the inputs and their corresponding weights plus the bias term. Then, the activation function, $f$, is applied to introduce nonlinearity and obtain the neuron’s output, $y$. The gradient descent method was used to optimize the network weights in the training process.
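Equations (3) and (4) for a single neuron can be written out as a short sketch. The activation function is not specified in the text, so ReLU is used here purely as an assumption; the function names and input values are illustrative.

```python
import numpy as np

def relu(z):
    """Rectified linear unit, used here as an assumed activation f."""
    return np.maximum(z, 0.0)

def neuron_output(x, w, b, activation=relu):
    """Equation (3): weighted sum plus bias; Equation (4): activation."""
    w_star = np.dot(w, x) + b
    return activation(w_star)

# Hypothetical inputs and weights.
y = neuron_output(x=[1.0, 2.0], w=[0.5, 0.25], b=0.1)
print(y)
```

A full MLP stacks many such neurons per layer and feeds each layer's outputs into the next.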

(2): Random forest (RF) is a classification algorithm based on the concept of ensemble learning [55], which makes a final prediction by constructing multiple decision trees and integrating the prediction results of each decision tree. The Gini coefficient method was used to divide the decision tree nodes and select a more appropriate way to classify the data. The equation for calculating the Gini coefficient is as follows:

$$Gini(p) = 1 - \sum_{i = 1}^{c} (p_{i})^{2} \tag{5}$$

where $p_{i}$ is the probability of each class label and $c$ denotes the number of classes.
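As a worked example of Equation (5), the sketch below computes the Gini value of a node from the class labels it contains, estimating each $p_{i}$ as the class frequency. The function name and label strings are illustrative assumptions.

```python
import numpy as np

def gini(labels):
    """Equation (5): 1 minus the sum of squared class proportions."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p**2)

pure = gini(["healthy"] * 8)
mixed = gini(["healthy", "early", "late", "anthracnose"] * 2)
print(pure, mixed)
```

A node holding a single class gives 0, while four equally frequent classes give 1 - 4 × (1/4)² = 0.75; a split is preferred when it lowers this value in the child nodes.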

(3): Support vector machine (SVM) classifies the data by constructing an optimal hyperplane [56]. In this study, the SVM used a nonlinear function to map the hyperspectral data into a higher-dimensional space and separated the different types of data with the optimal hyperplane constructed there. The SVM was also computed using Equations (3) and (4). However, unlike MLP, which uses a hidden layer structure, SVM finds decision boundaries by solving optimization problems to determine support vectors.

The optimal combinations of hyperparameters were obtained using a grid search optimization method, i.e., traversing a predefined hyperparameter space, thus improving the accuracy and reliability of the classification. Cross-validation was used to evaluate the classifier’s performance and the optimization results after each training round utilizing the validation data.
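The hyperparameter search with cross-validation can be sketched with scikit-learn's `GridSearchCV`. The parameter grid, the five-fold setting, and the synthetic data are assumptions for illustration; the source does not list the hyperparameter ranges or the number of folds.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic two-class data standing in for spectral features.
rng = np.random.default_rng(0)
X = rng.normal(size=(80, 10))
y = (X[:, 0] > 0).astype(int)

# Hypothetical search space; every combination is tried in turn.
param_grid = {
    "C": [0.1, 1, 10],
    "kernel": ["linear", "rbf"],
    "gamma": ["scale", 0.01],
}
search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```

`best_score_` is the mean cross-validated accuracy of the best combination, so the held-out test data stay untouched during tuning.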

Convolutional neural networks (CNNs) are a popular deep learning algorithm whose structures consist of convolutional layers, pooling layers, activation functions, and fully connected layers; they perform well in feature extraction and classification [57]. For the one-dimensional reflection curve data, we set up a one-dimensional convolutional neural network whose structure consisted of three convolutional layers with a kernel (sliding window) size of 3, three max pooling layers, two fully connected layers, and ReLU activation functions to achieve automated feature extraction and feature mapping, as shown in Figure 5. To produce comparable results, all CNNs were trained with the same hyperparameter settings: the number of epochs was set to 250, the batch size was set to 32, the initial learning rate was set to 0.001, and the model parameters were optimized using the Adam optimizer during the training process. Due to the difference in algorithmic construction, there is a difference in feature extraction capability between CNN, MLP, and SVM.
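To make the three convolution and pooling stages concrete, the sketch below runs a single-channel NumPy forward pass and prints how the sequence length shrinks. Several details are assumptions because the text does not give them: unpadded convolution with stride 1, a pooling size of 2, one channel per layer, random kernels instead of trained ones, and an input of 380 bands. It is not the trained network, only an illustration of the layer operations.

```python
import numpy as np

def conv1d(x, kernel):
    """Unpadded 1-D convolution layer (cross-correlation), stride 1."""
    return np.correlate(x, kernel, mode="valid")

def relu(x):
    return np.maximum(x, 0.0)

def max_pool1d(x, size=2):
    """Non-overlapping max pooling; a trailing remainder is dropped."""
    n = len(x) // size
    return x[: n * size].reshape(n, size).max(axis=1)

rng = np.random.default_rng(0)
signal = rng.random(380)          # assumed input length (one spectrum)
lengths = []
for _ in range(3):                # three conv + ReLU + max-pool stages
    kernel = rng.normal(size=3)   # kernel size 3, random (untrained) weights
    signal = max_pool1d(relu(conv1d(signal, kernel)))
    lengths.append(len(signal))
print(lengths)  # [189, 93, 45]
```

Under these assumptions, the flattened output of the last stage would be the input to the two fully connected layers.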

3.5. Model Evaluation

Based on the confusion matrix, the OA, precision, recall, and F1-score were used to evaluate the performance of the models. Overall accuracy (OA) was used to assess the overall performance of the models; precision evaluated the proportion of samples predicted by the models to be in a class that were actually in this class; recall evaluated the proportion of actual samples in a class that were determined to be in this class; and the F1-score is the harmonic mean of precision and recall and was used to evaluate the model’s balanced performance [58,59]. The equations for the OA, precision, recall, and F1-score are shown below for multiclassification.

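$$\mathrm{OA} = \frac{1}{N} \sum_{i = 1}^{N} \frac{TP_{i} + TN_{i}}{TP_{i} + TN_{i} + FP_{i} + FN_{i}} \times 100\% \tag{6}$$

$$\mathrm{Precision} = \frac{1}{N} \sum_{i = 1}^{N} \frac{TP_{i}}{TP_{i} + FP_{i}} \times 100\% \tag{7}$$

$$\mathrm{Recall} = \frac{1}{N} \sum_{i = 1}^{N} \frac{TP_{i}}{TP_{i} + FN_{i}} \times 100\% \tag{8}$$

$$\mathrm{F1\text{-}score} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \times 100\% \tag{9}$$

Here, $N$ is the number of classes, and $TP_{i}$, $TN_{i}$, $FP_{i}$, and $FN_{i}$ are the numbers of true positives, true negatives, false positives, and false negatives for class $i$.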
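Equations (6) to (9) can be evaluated directly from a confusion matrix, as in the following sketch. It assumes that rows are actual classes and columns are predicted classes, and that every class is predicted at least once (otherwise a division by zero occurs); the function name and the example matrix are hypothetical.

```python
import numpy as np

def macro_metrics(cm):
    """Compute OA, precision, recall, and F1-score (in percent) as
    written in Equations (6)-(9). Rows: actual, columns: predicted."""
    cm = np.asarray(cm, dtype=float)
    total = cm.sum()
    tp = np.diag(cm)
    fp = cm.sum(axis=0) - tp
    fn = cm.sum(axis=1) - tp
    tn = total - tp - fp - fn
    oa = np.mean((tp + tn) / (tp + tn + fp + fn))
    precision = np.mean(tp / (tp + fp))
    recall = np.mean(tp / (tp + fn))
    f1 = 2 * precision * recall / (precision + recall)
    return {"OA": 100 * oa, "precision": 100 * precision,
            "recall": 100 * recall, "F1": 100 * f1}

# Hypothetical two-class confusion matrix.
result = macro_metrics([[8, 2], [1, 9]])
print(result)
```

Note that Equation (6) averages per-class accuracies. For two classes this equals the usual fraction of correctly classified samples, but for more than two classes it is never lower than that fraction, so the two definitions should not be mixed when comparing results.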

4. Tests and Results

4.1. Spectral Reflection Curve Analysis

The average reflectance and sensitivity curves for different statuses of leaves were obtained from the spectral curve acquisition system, as shown in Figure 6. For the visible spectrum, the troughs of the average reflectance curves for healthy, early-stage sunscald, late-stage sunscald, and anthracnose status were located at 410 nm, 421 nm, 403 nm, and 400 nm, respectively. The peaks were concentrated at 760 nm. For the near-infrared spectrum, the troughs were located at 1928 nm, 1925 nm, 1932 nm, and 2491 nm, respectively, and the peaks were located at 1126 nm, 1119 nm, 1127 nm, and 1297 nm, respectively. Within the green band (495–570 nm), the maximum reflectance of the healthy leaves was 8.96%, the early-stage sunscald reflectance was 6.44%, the late-stage sunscald reflectance was 7.81%, and the anthracnose reflectance was 8.89%. The average reflectance curves of the four plant statuses exhibited similar trends. In the near-infrared spectrum, the peak position and reflectance of anthracnose (1297 nm, 64.84%) showed significant positional separation and numerical differences from healthy (1126 nm, 57.12%), early-stage sunscald (1119 nm, 48.71%), and late-stage sunscald (1127 nm, 46.39%). The average reflectance curve of anthracnose was higher overall than those of the other statuses.

There was a small range of fluctuation in the sensitivity values for the three unhealthy statuses within 400–530 nm. Early-stage and late-stage sunscald sensitivity ranges were 0.67–1.83 and 0.25–1.04, respectively. Anthracnose sensitivity values ranged from 0.73 to 2.17, were greater than 1.00 overall, and showed multiple peaks in the near-infrared range. The sensitivity of anthracnose increased with increasing wavelength and differed significantly from the sensitivity values of sunscald, indicating a larger spectral difference between anthracnose and healthy leaves.

4.2. Feature Extraction

The results of PCA processing are shown in Table 2. In the visible spectrum, 96.46% of the data variation was explained by PC1 (89.76%) and PC2 (6.70%), and the 758–777 nm and 694–713 nm bands were identified as influential. In the near-infrared spectrum, 97.15% of the data variation was explained by PC1 (80.98%) and PC2 (16.47%), and the 1303–1322 nm and 780–799 nm bands were identified as influential. The scatter distribution of PC1 and PC2 obtained by extraction is shown in Figure 7. The confidence ellipses overlapped over a large region, so many of the leaf statuses could not be separated effectively. The confidence ellipse of anthracnose, however, overlapped only slightly with those of the other statuses and was clearly separated in position.

The accuracy changes of MLP, SVM, and RF during the iterations of RFE are shown in Figure 8. With the RFE iteration, the number of input variables decreased, and the algorithm’s accuracy showed an improving trend. In all preprocessing procedures, the accuracy of SVM and RF tended to be flat after increasing, while the accuracy of MLP showed strong fluctuation after increasing. RFE identified different best features, including VIS-SVM (44), VIS-RF (162), VIS-MLP (28), VIS-MS-SVM (76), VIS-MS-RF (26), VIS-MS-MLP (34), VIS-MAS-SVM (80), VIS-MAS-RF (352), VIS-MAS-MLP (26), NIR-SVM (880), NIR-RF (1700), NIR-MLP (70), NIR-MS-SVM (580), NIR-MS-RF (1590), NIR-MS-MLP (370), NIR-MAS-SVM (1740), NIR-MAS-RF (1620), and NIR-MAS-MLP (80).

As shown in Figure 7, the unsupervised PCA method alone could not achieve good discrimination between different leaf statuses. PCA and RFE were therefore used as feature extraction steps in combination with the machine learning algorithms RF, SVM, and MLP. The ability of the machine learning algorithms to detect kiwifruit statuses was determined by evaluating all methods using the unprocessed detection models as a benchmark.

4.3. Machine Learning

Multiclassification models were built from combinations of preprocessing, feature extraction, and machine learning algorithms for the visible and near-infrared spectral data. The multiclassification detection models under the unprocessed, MS, and MAS methods were as follows: MLP, RF, SVM, PCA-MLP, PCA-RF, PCA-SVM, RFE-MLP, RFE-RF, and RFE-SVM. Model performance results for the test data are shown in Table 3 and Table 4.

Table 3 shows the detection metrics of the models in the visible spectrum (400–760 nm). MS and MAS significantly improved the metrics of all models. In most cases, SVM-based models were both the best and the second-best models in each group; RF (OA: 72.09%, precision: 72.97%, recall: 72.08%, and F1-score: 71.85%) was the only RF-based model to rank second-best. The best model was VIS-MS-RFE-SVM (OA: 97.67%, precision: 97.87%, recall: 97.72%, and F1-score: 97.77%). Comparing VIS-MS-SVM and VIS-MAS-SVM, adding airPLS increased the OA by 0.77%.

Table 4 shows the metrics of the models in the near-infrared spectrum (780–2500 nm). Again, MS and MAS improved overall model performance, especially for SVM and MLP. The SVM-based models were generally the best and second-best in each group, with NIR-RFE-RF (OA: 74.42%, precision: 72.64%, recall: 72.59%, and F1-score: 72.57%) being the only RF-based model to rank second-best. NIR-MS-SVM was the best predictor (OA: 100%, precision: 100%, recall: 100%, and F1-score: 100%). airPLS processing resulted in a 3.88% decrease in OA for NIR-MAS-SVM relative to NIR-MS-SVM.
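
The four metrics reported in Table 3 and Table 4 can all be derived from a confusion matrix. The sketch below assumes macro averaging (per-class values averaged with equal weight), which is an assumption about the convention rather than something stated here; the function name `macro_metrics` and the example matrix are hypothetical.

```python
def macro_metrics(cm):
    """Compute OA and macro-averaged precision, recall, and F1.

    cm[i][j] is the number of samples of true class i predicted as class j.
    """
    k = len(cm)
    total = sum(sum(row) for row in cm)
    oa = sum(cm[i][i] for i in range(k)) / total
    precisions, recalls, f1s = [], [], []
    for i in range(k):
        tp = cm[i][i]
        predicted = sum(cm[r][i] for r in range(k))   # column sum
        actual = sum(cm[i])                           # row sum
        p = tp / predicted if predicted else 0.0
        r = tp / actual if actual else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)
    return {
        "OA": oa,
        "precision": sum(precisions) / k,
        "recall": sum(recalls) / k,
        "F1": sum(f1s) / k,
    }

# Hypothetical 4-class result (healthy, early, late, anthracnose).
cm = [
    [10, 0, 0, 0],
    [0, 7, 2, 0],
    [0, 1, 12, 0],
    [0, 0, 0, 11],
]
metrics = macro_metrics(cm)
print({name: round(100 * value, 2) for name, value in metrics.items()})
```

With macro averaging, confusion between two classes lowers precision and recall even when the other classes are detected perfectly, so these metrics are more sensitive to early-stage versus late-stage errors than OA alone.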

4.4. Deep Learning

Figure 9 shows the convergence curves of all the deep learning models. The validation accuracy of VIS-MS-CNN was 63.33% at epoch one and gradually increased before stabilizing at 100%, while the validation loss decreased and stabilized at 0.74. The validation accuracy of NIR-MS-CNN was 56.66% at epoch one and gradually increased before stabilizing at 96.64%, while the validation loss decreased and stabilized at 0.79. After 200 epochs, VIS-MS-CNN and NIR-MS-CNN had converged on the training and validation data without overfitting. The test results detailed in Table 5 show that the VIS-MS-CNN model achieved optimal performance in plant status detection (OA: 100%, precision: 100%, recall: 100%, and F1-score: 100%). Although MAS is a more sophisticated preprocessing procedure, the predictive metrics of the MAS-processed models were lower overall than those of the MS-processed models. For the deep learning models, the visible spectrum obtained by band segmentation gave the better prediction performance.
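
A validation accuracy of 100% alongside a loss that plateaus well above zero is not contradictory, because accuracy depends only on which class scores highest, whereas cross-entropy also depends on how confident the score is. The sketch below computes the cross-entropy for one four-class sample. The source does not describe how the loss was computed, so the second case is only a hypothetical scenario: if outputs already squashed to the range 0 to 1 were passed to a loss that applies softmax again, the loss could not fall below ln(1 + 3/e), roughly 0.744, even for perfect predictions.

```python
import math

def softmax(z):
    """Numerically stable softmax over a list of scores."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def cross_entropy(scores, target):
    """Cross-entropy (natural log) for one sample, applying softmax to the scores."""
    return -math.log(softmax(scores)[target])

# Case 1: confident, correct raw logits give a loss close to zero.
loss_from_logits = cross_entropy([12.0, 0.0, 0.0, 0.0], 0)

# Case 2 (hypothetical): a perfect one-hot probability vector is treated as
# logits, so softmax is effectively applied twice and the loss is bounded below.
loss_floor = cross_entropy([1.0, 0.0, 0.0, 0.0], 0)

print(round(loss_from_logits, 6), round(loss_floor, 4))
```

Both cases classify the sample correctly, so both would count toward 100% accuracy; only the loss differs.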

4.5. Model Comparison and Analysis

The confusion matrices of the predicted results are shown in Figure 10. The best machine learning model for each preprocessing procedure and the deep learning models for both data sources were obtained by screening, as shown in Figure 11. As analyzed above, the introduction of MS and MAS improved the model prediction metrics. Among the best machine learning models, half used RFE-extracted features and half used the full feature set. PCA was unsuitable for modeling kiwifruit sunscald detection, whereas the validity of the preferred subset of variables selected by RFE was demonstrated. SVM performed well among all machine learning algorithms. The prediction metrics of the deep learning models were higher than 95% for both spectral ranges. Among all models, NIR-MS-SVM and VIS-MS-CNN achieved the highest detection metrics.

To evaluate the effectiveness of the models in detecting the various plant statuses, the detection accuracy for each status was calculated from the confusion matrix, as shown in Figure 12. NIR-MS-SVM and VIS-MS-CNN detected all leaf statuses well (accuracy: 100%). In addition, the NIR-MS-CNN model achieved 100% accuracy in detecting early-stage sunscald, and NIR-MAS-RFE-SVM and VIS-MS-RFE-SVM achieved 100% accuracy in detecting anthracnose. However, some models were poorly suited to the detection problem; the accuracy of NIR-RFE-SVM and VIS-SVM in detecting early-stage sunscald was below 70%.

The distributions of the reflectance values in the significant bands were compared across the kiwifruit statuses (healthy, early-stage sunscald, late-stage sunscald, and anthracnose), taking 695 nm as an example, as shown in Figure 13. The most significant difference between groups was found for the MS-processed data, with an F-value of 248.89. The p-values for all data were below 0.05, indicating statistically significant differences.
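
The F-value from a one-way ANOVA is the ratio of between-group to within-group variance. The sketch below computes it from scratch for several groups of values; the function name `one_way_anova_f` and the reflectance values are hypothetical, and the p-value, which requires the F distribution, is deliberately left out.

```python
def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups of numbers."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within-group sum of squares: spread of samples around their group mean.
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within

# Hypothetical reflectance at one wavelength for four leaf statuses.
healthy = [0.08, 0.09, 0.10, 0.09]
early = [0.14, 0.15, 0.13, 0.16]
late = [0.17, 0.18, 0.16, 0.19]
anthracnose = [0.30, 0.28, 0.31, 0.29]

f_value = one_way_anova_f([healthy, early, late, anthracnose])
print(round(f_value, 2))
```

A larger F-value means the group means are further apart relative to the scatter within each group, which is the sense in which the MS-processed data showed the clearest separation.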

5. Discussion and Future Work

The trends and characteristics of the spectral changes in kiwifruit leaves of each status, as determined in the laboratory, are shown in Figure 6. As sunscald developed, the pigment content of the kiwifruit leaves decreased, increasing the red light reflectance. Typically, healthy plant leaves absorb red light and reflect near-infrared light. Compared with the visible spectrum, the difference in the average reflectance of kiwifruit leaves was greater in the near-infrared band, possibly because sunscald alters the physiological status of leaves in ways that are not visible to the human eye. In addition, the similarity of symptoms between early-stage and late-stage sunscald resulted in close reflectance values, which made it challenging to differentiate the two stages correctly. The most dramatic changes in reflectance were observed in anthracnose, owing to the large, black disease spots along the leaf veins, which affected the appearance and moisture content. As the water content of anthracnose leaves decreased, their reflectance in the near-infrared band increased significantly; Peñuelas obtained results consistent with ours [60]. Early-stage and late-stage sunscald produced significant differences in sensitivity values in the visible band. The sensitivity values were higher overall for anthracnose than for sunscald. Differences in sensitivity between disease statuses were also obtained by Abdulridha during the detection of tomato diseases using hyperspectral reflectance data [54].

PCA can identify features with more valuable data by calculating the entropy information [61]. As shown in Table 2, PCA focuses on specific band ranges among the bands extracted from the raw data. The visible spectrum group consists of bands within the red spectrum (630–760 nm), reflecting the ability of kiwifruit leaves to reflect red light. The near-infrared spectrum group consists of short wavelengths in the near-infrared range (780–1400 nm), reflecting the moisture content of kiwifruit leaves; Balasundaram reported that the spectral region between 500 and 800 nm showed the greatest discriminatory power, with the upper limit of effective wavelengths up to 1100 nm, for the detection of citrus canker [62]. Different physiological processes are reflected in the leaf spectra, resulting in the deviation of the anthracnose confidence ellipse from the others. RFE effectively removes feature redundancy and is a feature extraction method that facilitates improvements in model prediction performance.

The detection accuracy of the SVM, RF, and MLP base models varied widely under different preprocessing types, where MAS processing yielded stable and uniform prediction results. Using MAS-enhanced data, it was easier to obtain relevant information [63,64]. The MLP accuracy results fluctuated drastically, probably due to the inability of the MLP algorithm to converge within a limited number of iterations under limited data conditions to obtain a model with stable prediction results [65,66]. The extraction of wavebands promotes the use of inexpensive equipment to discriminate plant statuses, such as multispectral cameras or drones.

In machine learning, the MS-SVM with full-band inputs of the NIR spectrum had the highest prediction metrics and achieved 100% detection for all plant statuses. Better detection metrics were obtained for the models without feature extraction; other researchers have observed similar findings. Sankaran used full-band data (350–2500 nm) as input features to the detection models and obtained 98% detection accuracy for citrus Huanglongbing [67].

In this study, SVM outperformed RF and MLP in detection. The advantage of SVM is that it is suitable for training on small sample datasets and can perform well with many features [68]. RF has good resistance to overfitting; however, it requires larger samples to meet the training requirements [55]. Gomes collected 1048 samples of special and conventional coffee beans and obtained 97% and 88% prediction accuracy for SVM and RF, respectively [38]. Because a large number of parameters must be computed during the training of the MLP [69], the MLP showed low accuracy under small sample conditions. In addition, the computational efficiency of SVM will promote its application to large survey areas.

Distinguishing early-stage from late-stage sunscald is challenging, so models with poor detection capability, such as VIS-SVM and NIR-RFE-SVM, are prone to misclassification. In field management, treatment that is not timely causes tree destruction and economic losses; misclassifying plant statuses undermines the protection of tree health and can even kill the trees [70,71]. In 2022, Zhao used hyperspectral imaging and continuous wavelet analysis (CWA) to distinguish tea plant sunscald from similar diseases; the accuracy of sunscald detection was 82.50% to 83.91%, and the anthracnose detection accuracy was 94.12% to 94.28% [72]. Additionally, Alhazmi obtained a recall value of 93.94% using a convolutional neural network to detect diseases, including anthracnose [73]. In comparison, even with the effects of early-stage sunscald and disease interference taken into account, the NIR-MS-SVM and VIS-MS-CNN models presented in this study achieved 100% detection of all leaf statuses, with greater robustness. In addition, the accuracy of NIR-MS-SVM and VIS-MS-CNN was 25.58% higher than that of the best model in traditional machine learning (VIS-SVM). After MS and MAS processing, the data distributions of the different classes of spectra were improved, and the significance of their differences was increased by MS, which helped the models find better information for leaf status detection.

In the last few years, many breakthroughs have been made in sensors for plant trait analysis [18], including multispectral sensing and hyperspectral instruments, which provide the basic conditions for the development and extension of acquisition systems. In kiwifruit disease detection, the combination of preprocessing, feature extraction, machine learning, and deep learning algorithms has the advantages of high accuracy, diverse methods, and scalability [74]. Machine learning detection is suitable for small- and medium-sized growers. Deep learning has higher data volume requirements, and in both cases preprocessing is vital.

Furthermore, collecting large datasets that include more kiwifruit varieties, more similar diseases, and more plant species would provide opportunities to improve the detection performance of machine learning and CNN models. To further reduce the number of epochs needed for convergence and the possibility of overfitting, it would be beneficial to compute the parameters using a heuristic optimization algorithm. The more efficient extraction of spectral band information for similar statuses is also a direction for future research.

With global warming, sunscald is more likely to occur and needs more attention as a worldwide environmental stress [75,76]. Despite its wide distribution and abundant germplasm in China, kiwifruit’s quality is affected by stresses that prevent it from entering the global kiwifruit consumer market. For kiwifruit sunscald outbreaks, existing pesticide-spraying helicopters operating autonomously can be used for control [77]. Research on kiwifruit sunscald and related technologies is essential to improve the production and management practices of the kiwifruit industry in China and around the world.

6. Conclusions

Our results demonstrate that NIR-MS-SVM and VIS-MS-CNN are the best detection models for identifying similar plant statuses, namely healthy, early-stage sunscald, late-stage sunscald, and anthracnose. MS and MAS preprocessing can significantly improve the ability of machine learning and deep learning models to predict the physiological status of kiwifruit; MS is considered the best preprocessing method. Information in the visible and near-infrared spectrum bands enabled machine learning and deep learning to distinguish healthy, sunscald, and anthracnose leaves compared with full-band inputs. PCA and RFE machine learning increased the model's detection capability. In addition, we selected information-rich spectrum bands based on PCA; the 694–713 nm, 758–777 nm, 780–799 nm, and 1303–1322 nm bands provided more practical information for the nondestructive determination of the plant status. The results of this study will be used to develop agricultural machines that automatically identify sunscald and similar diseases, improving the efficiency of orchard management. At the same time, the information extracted will support the work of botanists and growers in disease management and plant protection to improve kiwifruit yields and quality.

Author Contributions

Conceptualization, K.W. and Z.J.; Data curation, Z.J.; Formal analysis, K.W.; Funding acquisition, Z.J.; Investigation, K.W. and Q.D.; Methodology, K.W., Z.J. and Q.D.; Project administration, Z.J.; Resources, K.W. and Z.J.; Software, K.W.; Supervision, K.W. and Z.J.; Validation, K.W.; Visualization, K.W.; Writing—original draft, K.W.; Writing—review and editing, K.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Modern Agricultural Machinery Equipment and Technology Demonstration and Promotion Project of Jiangsu Province, grant number NJ2022-12; and the Jiangsu Agricultural Science and Technology Innovation Fund, grant number CX (19)3075.

Data Availability Statement

The datasets generated for this study are available on request to the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

References

FAO. Faostat. Available online: https://www.fao.org/faostat/zh/#data/QCL (accessed on 1 May 2023).
He, H.; Lin, G.; Chen, C.; Yang, Z.; Xu, J. Leveraging Advantages of University to Eradicate Poverty—Practice and Discussion on Targeted Poverty Alleviation Implemented by University of Science and Technology of China in Liuzhi Special District, Guizhou Province, China. Bull. Chin. Acad. Sci. 2020, 35, 371–377.
Zhong, C.; Li, D. Poverty Alleviation Through Kiwifruit Scientific and Technological Achievements of Chinese Academy of Sciences. Bull. Chin. Acad. Sci. 2020, 35, 45–56.
Zhong, M.; Zhang, W.; Zou, L.; Huang, Q.; Chen, L.; Huang, C.; Tao, J.; Xu, X. Diurnal Variation of Photosynthesis and Chlorophyll Fluorescence Characteristics in Kiwifruit Under High Temperature Condition. Acta Agric. Univ. Jiangxiensis 2018, 40, 7.
Dong, X.; Chen, Y.; Chen, L.; Yan, S.; Yu, H.; Zhang, C.; Ma, C.; Wang, L.; Xu, W.; Wang, S. Effect of high temperature and strong light after sustained heavy rainfall on the photosynthesis and root metabolism in kiwi trees. J. Fruit Sci. 2018, 35, 11.
Tarara, J.M.; Spayd, S.D. Tackling ‘sunburn’ in Red Wine Grapes Through Temperature and Sunlight Exposure. Good Fruit Grow. 2005, 56, 40–41.
Parchomchuk, P.; Meheriuk, M. Orchard Cooling with Pulsed Overtree Irrigation to Prevent Solar Injury and Improve Fruit Quality of ‘Jonagold’ Apples. HortScience A Publ. Am. Soc. Hortic. Sci. 1996, 31, 802–804.
Rabinowitch, H.D.; Kedar, N.; Budowski, P. Induction of sunscald damage in tomatoes under natural and controlled conditions. Sci. Hortic. 1974, 2, 265–272.
Yang, X.; Tang, J.; Mustard, J.F.; Wu, J.; Zhao, K.; Serbin, S.; Lee, J.E. Seasonal variability of multiple leaf traits captured by leaf spectroscopy at two temperate deciduous forests. Remote Sens. Environ. 2016, 179, 1–12.
Torres, C.A.; Mogollon, R. Characterization of sun-injury and prediction of sunscald on ‘Packham’s Triumph’ pears using Vis-NIR spectroscopy. Postharvest Biol. Technol. 2022, 184, 111776.
Ambrózy, Z.S.; Daood, H.; Nagy, Z.S.; Ledo, D.H.; Helyes, L. Effect of net shading technology and harvest times on yield and fruit quality of sweet pepper. Appl. Ecol. Environ. Res. 2016, 14, 99–109.
Sun, Y.; Lu, L.; You, W.; Liu, G.; Chen, X.; Liu, Z.; Li, R. Occurrence and control technology of sunburn of Actinidia arguta. China Fruits 2020, 5, 120–121.
Munne-Bosch, S.; Vincent, C. Physiological Mechanisms Underlying Fruit Sunburn. Crit. Rev. Plant Sci. 2019, 38, 140–157.
Schrader, L.; Sun, J.S.; Zhang, J.G.; Felicetti, D.; Tian, J. Heat and Light-Induced Apple Skin Disorders: Causes and Prevention. In Proceedings of the International Symposium on Enhancing Economic and Environmental Sustainability of Fruit Production in a Global Economy held at the 27th International Horticultural Congress, Seoul, Republic of Korea, 13–19 August 2006; pp. 51–58.
Li, G.; Tang, L.; Zhang, X.; Dong, J.; Xiao, M. Factors affecting greenhouse microclimate and its regulating techniques: A review. In Proceedings of the 8th International Conference on Environment Science and Engineering (ICESE), Barcelona, Spain, 11–13 March 2018.
Lal, N.; Sahu, N. Management Strategies of Sun Burn in Fruit Crops-A Review. Int. J. Curr. Microbiol. Appl. Sci. 2017, 6, 1126–1138.
Hao, Y.; Li, W.; Chang, Y. Summary on the Mechanism of Fruit Sunburn Development and Protective Methods. J. Shanxi Agric. Univ. 2006, 26, 4.
Terentev, A.; Dolzhenko, V.; Fedotov, A.; Eremenko, D. Current state of hyperspectral remote sensing for early plant disease detection: A review. Sensors 2022, 22, 757.
Yuan, X.; Liu, B.; Xiong, G.; Li, G.; Li, B.; Tu, G.; Jiang, J. Identification of the pathogen of kiwifruit anthracnose in Fengxin County, Jiangxi Province. Acta Phytopathol. Sin. 2023, 1–4.
Grandón, S.; Sanchez-Contreras, J.; Torres, C.A. Prediction models for sunscald on apples (Malus domestica Borkh.) cv. Granny Smith using Vis-NIR reflectance. Postharvest Biol. Technol. 2019, 151, 36.
Ge, Y.; Tu, S. Identification of Chilling Injury in Kiwifruit Using Hyperspectral Structured-Illumination Reflectance Imaging System (SIRI) with Support Vector Machine (SVM) Modelling. Anal. Lett. 2022, 56, 2040–2052.
Jiang, C.; Jiang, M.; Xu, Q.; Huang, X. Expectile regression neural network model with applications. Neurocomputing 2017, 247, 73–86.
Castañeda-Miranda, A.; Castaño-Meneses, V.M. Internet of things for smart farming and frost intelligent control in greenhouses. Comput. Electron. Agric. 2020, 176, 105614.
Arora, S.; Taylor, J.W. Rule-based autoregressive moving average models for forecasting load on special days: A case study for France. Eur. J. Oper. Res. 2018, 266, 259–268.
Hassan, M.M.; Huda, S.; Yearwood, J.; Jelinek, H.F.; Almogren, A. Multistage fusion approaches based on a generative model and multivariate exponentially weighted moving average for diagnosis of cardiovascular autonomic nerve dysfunction. Inf. Fusion 2018, 41, 105–108.
Barrow, D.K.; Kourentzes, N.; Sandberg, R.; Niklewski, J. Automatic robust estimation for exponential smoothing: Perspectives from statistics and machine learning. Expert Syst. Appl. 2020, 160, 113637.
Baffour, A.A.; Feng, J.; Taylor, E.K. A hybrid artificial neural network-GJR modeling approach to forecasting currency exchange rate volatility. Neurocomputing 2019, 365, 285–301.
Castaneda-Miranda, A.; Castano-Meneses, V.M. Smart frost measurement for anti-disaster intelligent control in greenhouses via embedding IoT and hybrid AI methods. Measurement 2020, 164, 108043.
Pradeepkumar, D.; Ravi, V. Soft Computing Hybrids for FOREX Rate Prediction: A Comprehensive Review. Comput. Oper. Res. 2018, 99, 262–284.
Panigrahi, S.; Behera, H.S. A hybrid ETS ANN model for time series forecasting. Eng. Appl. Artif. Intell. 2017, 66, 49–59.
Aggarwal, S.; Gupta, S.; Gupta, D.; Gulzar, Y.; Juneja, S.; Alwan, A.A.; Nauman, A. An artificial intelligence-based stacked ensemble approach for prediction of protein subcellular localization in confocal microscopy images. Sustainability 2023, 15, 1695.
Gulzar, Y. Fruit image classification model based on MobileNetV2 with deep transfer learning technique. Sustainability 2023, 15, 1906.
Mamat, N.; Othman, M.F.; Abdulghafor, R.; Alwan, A.A.; Gulzar, Y. Enhancing image annotation technique of fruit classification using a deep learning approach. Sustainability 2023, 15, 901.
Dhiman, P.; Kaur, A.; Balasaraswathi, V.; Gulzar, Y.; Alwan, A.A.; Hamid, Y. Image Acquisition, Preprocessing and Classification of Citrus Fruit Diseases: A Systematic Literature Review. Sustainability 2023, 15, 9643.
Hassan, F.; Hussain, S.F.; Qaisar, S.M. Fusion of multivariate EEG signals for schizophrenia detection using CNN and machine learning techniques. Inf. Fusion 2023, 92, 466–478.
Ihsan, M.F.; Sunyoto, A.; Arief, M.R. Gray Level Co-Occurrence Matrix Algorithm and Backpropagation Neural Networks for Herbal Plants Identification. In Proceedings of the 2022 5th International Conference on Information and Communications Technology (ICOIACT), Yogyakarta, Indonesia, 24–25 August 2022; pp. 373–378.
Abdulridha, J.; Ehsani, R.; Castro, A.D. Detection and Differentiation between Laurel Wilt Disease, Phytophthora Disease, and Salinity Damage Using a Hyperspectral Sensing Technique. Agriculture 2016, 6, 56.
Gomes, W.P.C.; Gonçalves, L.; da Silva, C.B.; Melchert, W.R. Application of multispectral imaging combined with machine learning models to discriminate special and traditional green coffee. Comput. Electron. Agric. 2022, 198, 107097.
Mo, Y.; Zhong, R.; Cao, C. Orbita hyperspectral satellite image for land cover classification using random forest classifier. J. Appl. Remote Sens. 2021, 15, 014519.
Singh, A.K.; Sreenivasu, S.; Mahalaxmi, U.; Sharma, H.; Patil, D.D.; Asenso, E. Hybrid feature-based disease detection in plant leaf using convolutional neural network, bayesian optimized SVM, and random forest classifier. J. Food Qual. 2022, 2022, 2845320.
Wang, Q.; Qi, F.; Sun, M.; Qu, J.; Xue, J. Identification of tomato disease types and detection of infected areas based on deep convolutional neural networks and object detection techniques. Comput. Intell. Neurosci. 2019, 2019, 9142753.
Zhou, L.; Xiao, Q.; Taha, M.F.; Xu, C.; Zhang, C. Phenotypic Analysis of Diseased Plant Leaves Using Supervised and Weakly Supervised Deep Learning. Plant Phenomics 2023, 5, 0022.
Maire, G.L.; François, C.; Soudani, K.; Berveiller, D.; Pontailler, J.Y.; Bréda, N.; Genet, H.; Davi, H.; Dufrêne, E. Calibration and validation of hyperspectral indices for the estimation of broadleaved forest leaf chlorophyll content, leaf mass per area, leaf area index and leaf canopy biomass. Remote Sens. Environ. 2008, 112, 3846–3864.
Kawamura, K.; Watanabe, N.; Lee, H.J.; Inoue, Y.; Odagawa, S. Testing genetic algorithm as a tool to select relevant wavebands from field hyperspectral data for estimating pasture mass and quality in a mixed sown pasture using partial least squares regression. Grassl. Sci. 2010, 56, 205–216.
Heim, R.H.-J.; Jürgens, N.; Große-Stoltenberg, A.; Oldeland, J. The effect of epidermal structures on leaf spectral signatures of ice plants (Aizoaceae). Remote Sens. 2015, 7, 16901–16914.
Monowar, M.M.; Hamid, M.A.; Kateb, F.A.; Ohi, A.Q.; Mridha, M. Self-Supervised Clustering for Leaf Disease Identification. Agriculture 2022, 12, 814.
Wang, X.; Xie, F.; Yang, Y.; Zhao, J.; Wu, G.; Wang, S. Rapid Diagnosis of Ductal Carcinoma In Situ and Breast Cancer Based on Raman Spectroscopy of Serum Combined with Convolutional Neural Network. Bioengineering 2023, 10, 65.
Pérez-Roncal, C.; López-Maestresalas, A.; Lopez-Molina, C.; Jarén, C.; Urrestarazu, J.; Santesteban, L.G.; Arazuri, S. Hyperspectral Imaging to Assess the Presence of Powdery Mildew (Erysiphe necator) in cv. Carignan Noir Grapevine Bunches. Agronomy 2020, 10, 88.
Yang, H.; Hu, Y.; Zheng, Z.; Qiao, Y.; Zhang, K.; Guo, T.; Chen, J. Estimation of Potato Chlorophyll Content from UAV Multispectral Images with Stacking Ensemble Algorithm. Agronomy 2022, 12, 2318.
Szkolnik, M. Techniques involved in greenhouse evaluation of deciduous tree fruit fungicides. Annu. Rev. Phytopathol. 1978, 16, 103–129.
Zhang, Z.; Feng, L.; Ma, Y.; Du, Q.; Williams, P.; Drewry, J.; Luck, B. Alfalfa Yield Prediction Using UAV-Based Hyperspectral Imagery and Ensemble Learning. Remote Sens. 2020, 12, 2028.
Elavarasan, D.; Vincent PM, D.R.; Srinivasan, K.; Chang, C.Y. A Hybrid CFS Filter and RF-RFE Wrapper-Based Feature Extraction for Enhanced Agricultural Crop Yield Prediction Modeling. Agriculture 2020, 10, 400.
Guo, J.; Wang, K.; Jin, S. Mapping of Soil pH Based on SVM-RFE Feature Selection Algorithm. Agronomy 2022, 12, 2742.
Abdulridha, J.; Ampatzidis, Y.; Kakarla, S.C.; Roberts, P. Detection of target spot and bacterial spot diseases in tomato using UAV-based and benchtop-based hyperspectral imaging techniques. Precis. Agric. 2020, 21, 955–978.
Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
Ye, H.; Huang, W.; Huang, S.; Cui, B.; Dong, Y.; Guo, A.; Ren, Y.; Jin, Y. Identification of banana fusarium wilt using supervised classification algorithms with UAV-based multi-spectral imagery. Int. J. Agric. Biol. Eng. 2020, 13, 136–142.
Ferentinos, K.P. Deep learning models for plant disease detection and diagnosis. Comput. Electron. Agric. 2018, 145, 311–318.
Sokolova, M.; Lapalme, G. A systematic analysis of performance measures for classification tasks. Inf. Process. Manag. 2009, 45, 427–437.
Grandini, M.; Bagli, E.; Visani, G. Metrics for Multi-Class Classification: An Overview. arXiv 2020, arXiv:2008.05756.
Peñuelas, J.; Gamon, J.A.; Fredeen, A.L.; Merino, J.; Field, C.B. Reflectance indices associated with physiological changes in nitrogen- and water-limited sunflower leaves. Remote Sens. Environ. 1994, 48, 135–146.
Cui, Y.; Fang, Y. Research on PCA Data Dimension Reduction Algorithm Based on Entropy Weight Method. In Proceedings of the 2020 2nd International Conference on Machine Learning, Big Data and Business Intelligence (MLBDBI), Taiyuan, China, 23–25 October 2020.
Balasundaram, D.; Burks, T.F.; Bulanon, D.M.; Schubert, T.; Lee, W.S. Spectral reflectance characteristics of citrus canker and other peel conditions of grapefruit. Postharvest Biol. Technol. 2009, 51, 220–226.
Mahum, R.; Munir, H.; Mughal, Z.-U.-N.; Awais, M.; Sher Khan, F.; Saqlain, M.; Mahamad, S.; Tlili, I. A novel framework for potato leaf disease detection using an efficient deep learning model. Hum. Ecol. Risk Assess. Int. J. 2022, 29, 303–326.
Abenina, M.I.A.; Maja, J.M.; Cutulle, M.; Melgar, J.C.; Liu, H. Prediction of potassium in peach leaves using hyperspectral imaging and multivariate analysis. AgriEngineering 2022, 4, 400–413.
Suresh Babu, C.; Thota, L.S. Optimum Learning Rate for Classification Problem with Mlp in Data Mining. Int. J. Adv. Eng. Technol. 2013, 6, 35.
Perugachi-Diaz, Y.; Tomczak, J.M.; Bhulai, S. Deep learning for white cabbage seedling prediction. Comput. Electron. Agric. 2021, 184, 106059.
Sankaran, S.; Mishra, A.; Maja, J.M.; Ehsani, R. Visible-near infrared spectroscopy for detection of Huanglongbing in citrus orchards. Comput. Electron. Agric. 2011, 77, 127–134.
Zhang, J.; Liu, L.; Chen, Y.; Rao, Y.; Zhang, X.; Jin, X. The Nondestructive Model of Near-Infrared Spectroscopy with Different Pretreatment Transformation for Predicting “Dangshan” Pear Woolliness Disease. Agronomy 2023, 13, 1420.
Harsányi, E.; Bashir, B.; Arshad, S.; Ocwa, A.; Vad, A.; Alsalman, A.; Bácskai, I.; Rátonyi, T.; Hijazi, O.; Széles, A. Data Mining and Machine Learning Algorithms for Optimizing Maize Yield Forecasting in Central Europe. Agronomy 2023, 13, 1297.
Ampatzidis, Y.; De Bellis, L.; Luvisi, A. iPathology: Robotic Applications and Management of Plants and Plant Diseases. Sustainability 2017, 9, 1010.
Andrea, L.; Yiannis, A.; Luigi, D.B. Plant Pathology and Information Technology: Opportunity for Management of Disease Outbreak and Applications in Regulation Frameworks. Sustainability 2016, 8, 831.
Zhao, X.; Zhang, J.; Huang, Y.; Tian, Y.; Yuan, L. Detection and discrimination of disease and insect stress of tea plants using hyperspectral imaging combined with wavelet analysis. Comput. Electron. Agric. 2022, 193, 106717.
Alhazmi, S. Different Stages of Watermelon Diseases Detection Using Optimized CNN. In Soft Computing: Theories and Applications: Proceedings of the SoCTA 2022, Online, 16–12 December 2022; Springer: Berlin/Heidelberg, Germany, 2023; pp. 121–133.
Daniya, T.; Vigneshwari, D.S. A Review on Machine Learning Techniques for Rice Plant Disease Detection in Agricultural Research. System 2019, 28, 49–62.
Laine, A.-L. Plant disease risk is modified by multiple global change drivers. Curr. Biol. 2023, 33, R574–R583.
Sugiura, T. Three climate change adaptation strategies for fruit production. In Climate Smart Agriculture for the Small-Scale Farmers in the Asian and Pacific Region; NARO: Tsukuba, Japan, 2019; pp. 277–292.
Fang, S.; Ru, Y.; Hu, C.; Yang, F. Planning of takeoff/landing site location, dispatch route, and spraying route for a pesticide application helicopter. Eur. J. Agron. 2023, 146, 126814.

Figure 1. Sunscald kiwifruit orchard in Liuhe District, Nanjing, Jiangsu Province, China, showing (a) early-stage sunscald and (b) late-stage sunscald.

Figure 2. Statuses of kiwifruit leaves: (a) healthy; (b) early-stage sunscald; (c) late-stage sunscald; and (d) anthracnose.

Figure 3. Laboratory data acquisition: (a) hyperspectral data acquisition system with all components, including a light, calibrated reflectance panel, scanning table, spectrometer, and laptop computer; (b) spectral reflectance data.

Figure 4. Flowchart.

Figure 5. Structure of the convolutional neural networks. (a) Reflectance data in the visible range were used as inputs to the CNNs; (b) reflectance data in the near-infrared range were used as inputs to the CNNs.

Figure 6. Two parameter curves of kiwifruit leaf samples: (a) average reflectance; (b) sensitivity.

Figure 7. Distributions of principal components and 95% confidence ellipses are shown: (a) principal components of the visible spectrum; (b) principal components of the near-infrared spectrum.

Figure 8. Accuracy variation under different preprocessing procedures and the number of iterations corresponding to the highest accuracy: (a) unprocessed visible spectrum; (b) MS-processed visible spectrum; (c) MAS-processed visible spectrum; (d) unprocessed near-infrared spectrum; (e) MS-processed near-infrared spectrum; and (f) MAS-processed near-infrared spectrum.

Figure 9. Model convergence curves: (a) accuracy of VIS-CNN; (b) accuracy of VIS-MS-CNN; (c) accuracy of VIS-MAS-CNN; (d) loss of VIS-CNN; (e) loss of VIS-MS-CNN; (f) loss of VIS-MAS-CNN; (g) accuracy of NIR-CNN; (h) accuracy of NIR-MS-CNN; (i) accuracy of NIR-MAS-CNN; (j) loss of NIR-CNN; (k) loss of NIR-MS-CNN; and (l) loss of NIR-MAS-CNN.

Figure 10. Confusion matrices: (a) VIS-SVM; (b) VIS-MS-RFE-SVM; (c) VIS-MAS-SVM; (d) NIR-RFE-SVM; (e) NIR-MS-SVM; (f) NIR-MAS-RFE-SVM; (g) VIS-MS-SVM; and (h) NIR-MS-SVM. *Early: early-stage sunscald. *Late: late-stage sunscald.

Figure 11. Comparison of predictive indicators of different models.

Figure 12. Comparison of the prediction accuracy of different leaf statuses.

Figure 13. Distribution of reflectance data at 695 nm in different statuses under different preprocessing procedures and their F and p statistics (obtained by ANOVA): (a) unprocessed; (b) MS-processed; and (c) MAS-processed.
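The MS preprocessing named in Figures 8, 9, and 13 combines a moving average with a standard normal variate (SNV) transformation. The sketch below shows one way such a pipeline could look in NumPy. The window length of 5, the order of the two steps (smoothing first, then SNV), and the function names are assumptions made for illustration, not settings reported in the article.

```python
import numpy as np

def moving_average(spectra, window=5):
    """Smooth each spectrum (row) with a boxcar filter.

    mode="valid" avoids zero-padding artifacts at the edges, so each
    spectrum becomes (window - 1) points shorter.
    """
    kernel = np.ones(window) / window
    return np.apply_along_axis(
        lambda row: np.convolve(row, kernel, mode="valid"), 1, spectra
    )

def snv(spectra):
    """Standard normal variate: centre and scale each spectrum individually."""
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std

def ms_preprocess(spectra, window=5):
    """Hypothetical MS pipeline: moving average followed by SNV."""
    return snv(moving_average(spectra, window))

# Synthetic reflectance values stand in for measured leaf spectra.
rng = np.random.default_rng(0)
demo = rng.random((4, 50)) + 1.0
processed = ms_preprocess(demo)
print(processed.shape)
```

After SNV, every spectrum has zero mean and unit standard deviation, which removes sample-to-sample offset and scale differences before classification.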

Table 1. Sample numbers of different leaf statuses.

Statuses of Leaves	Number of Leaves
Healthy	105
Early-stage sunscald	90
Late-stage sunscald	130
Anthracnose	104
Total	429

Table 2. PCA-selected bands.

Dataset	Principal Component	Contribution Rate	Wavelengths (nm)
VIS	PC1	89.76%	758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777
VIS	PC2	6.70%	694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713
NIR	PC1	80.98%	1303 1304 1305 1306 1307 1308 1309 1310 1311 1312 1313 1314 1315 1316 1317 1318 1319 1320 1321 1322
NIR	PC2	16.47%	780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799
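The wavelength lists in Table 2 are runs of consecutive integers, and the contribution rates of PC1 and PC2 can be summed per dataset. The following sketch, with hypothetical helper names, collapses a band list into its start and end wavelengths and computes the cumulative contribution rate from the values in the table.

```python
def collapse_bands(wavelengths):
    """Collapse a sorted list of integer wavelengths into (start, end) runs."""
    runs = []
    start = prev = wavelengths[0]
    for w in wavelengths[1:]:
        if w != prev + 1:  # a gap ends the current run
            runs.append((start, prev))
            start = w
        prev = w
    runs.append((start, prev))
    return runs

# Contribution rates (%) of PC1 and PC2, copied from Table 2.
contribution = {"VIS": [89.76, 6.70], "NIR": [80.98, 16.47]}
cumulative = {name: round(sum(rates), 2) for name, rates in contribution.items()}

# Bands listed for VIS PC2 in Table 2 (694 to 713 nm inclusive).
vis_pc2 = list(range(694, 714))
print(collapse_bands(vis_pc2))
print(cumulative)
```

Under these values, the first two components account for about 96.46% of the variance in the visible dataset and about 97.45% in the near-infrared dataset.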

Table 3. Performance of machine learning model detection using the visible spectrum. The best models are shown in bold; the second-best models are underlined.

Data	Model	OA (%)	Precision (%)	Recall (%)	F1-score (%)
Unprocessed	MLP	49.61	67.08	50.14	46.29
Unprocessed	RF	72.09	72.97	72.08	71.85
Unprocessed	SVM	74.42	74.08	74.12	73.76
Unprocessed	PCA-MLP	37.98	47.96	39.43	34.11
Unprocessed	PCA-RF	51.16	50.61	50.04	50.13
Unprocessed	PCA-SVM	56.59	59.42	55.22	55.26
Unprocessed	RFE-MLP	72.09	70.61	70.17	70.27
Unprocessed	RFE-RF	71.32	70.43	70.68	70.34
Unprocessed	RFE-SVM	72.87	70.09	68.61	68.70
MS-processed	MLP	86.05	88.68	85.98	86.73
MS-processed	RF	95.35	94.49	94.97	94.64
MS-processed	SVM	96.90	96.32	96.95	96.60
MS-processed	PCA-MLP	66.67	73.26	66.62	63.51
MS-processed	PCA-RF	84.50	84.37	84.01	83.93
MS-processed	PCA-SVM	87.60	89.81	87.40	87.07
MS-processed	RFE-MLP	88.37	88.41	88.85	88.19
MS-processed	RFE-RF	93.02	93.25	92.65	92.89
MS-processed	RFE-SVM	97.67	97.87	97.72	97.77
MAS-processed	MLP	60.63	74.92	63.15	57.02
MAS-processed	RF	90.70	90.26	90.31	90.25
MAS-processed	SVM	97.67	97.41	97.44	97.41
MAS-processed	PCA-MLP	56.59	61.86	54.49	53.17
MAS-processed	PCA-RF	87.60	87.99	87.20	87.26
MAS-processed	PCA-SVM	82.17	80.63	81.67	80.73
MAS-processed	RFE-MLP	89.92	90.55	89.11	89.46
MAS-processed	RFE-RF	88.37	87.79	87.35	87.49
MAS-processed	RFE-SVM	93.80	93.71	92.08	92.45

Table 4. Performance of machine learning model detection using near-infrared spectrum. The best models are shown in bold; the second-best models are underlined.

Data	Model	OA (%)	Precision (%)	Recall (%)	F1-score (%)
Unprocessed	MLP	58.59	67.90	55.73	55.81
Unprocessed	RF	70.54	71.28	71.43	70.28
Unprocessed	SVM	70.54	69.64	70.51	69.49
Unprocessed	PCA-MLP	43.41	54.53	44.15	41.94
Unprocessed	PCA-RF	55.81	57.04	56.18	55.57
Unprocessed	PCA-SVM	59.69	66.62	57.13	51.96
Unprocessed	RFE-MLP	72.09	73.46	69.00	69.25
Unprocessed	RFE-RF	74.42	72.64	72.59	72.57
Unprocessed	RFE-SVM	79.07	78.71	76.73	77.20
MS-processed	MLP	86.82	87.94	85.47	86.53
MS-processed	RF	86.82	86.12	85.74	85.80
MS-processed	SVM	100.00	100.00	100.00	100.00
MS-processed	PCA-MLP	68.22	69.78	67.43	67.62
MS-processed	PCA-RF	84.50	84.16	83.71	83.84
MS-processed	PCA-SVM	78.29	78.01	78.12	77.83
MS-processed	RFE-MLP	84.50	83.26	84.16	83.41
MS-processed	RFE-RF	94.57	94.06	94.59	94.30
MS-processed	RFE-SVM	96.12	96.32	95.88	95.98
MAS-processed	MLP	94.57	95.83	94.66	95.21
MAS-processed	RF	90.70	90.32	91.06	90.30
MAS-processed	SVM	96.12	95.88	96.15	95.86
MAS-processed	PCA-MLP	71.32	75.53	71.91	72.51
MAS-processed	PCA-RF	87.60	90.02	86.53	87.25
MAS-processed	PCA-SVM	94.57	94.13	95.40	94.62
MAS-processed	RFE-MLP	96.12	96.23	96.54	96.31
MAS-processed	RFE-RF	93.80	93.70	94.23	93.81
MAS-processed	RFE-SVM	98.45	98.48	98.32	98.37

Table 5. Detection performance of the deep learning models on the test data. The best models are shown in bold.

Data	Model	OA (%)	Precision (%)	Recall (%)	F1-score (%)
VIS	CNN	77.52	78.33	75.91	76.52
VIS	MS-CNN	100.00	100.00	100.00	100.00
VIS	MAS-CNN	98.45	98.22	98.75	98.45
NIR	CNN	51.16	25.66	48.52	33.56
NIR	MS-CNN	96.90	96.26	97.08	96.55
NIR	MAS-CNN	91.47	91.23	91.90	91.24
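The four indicators reported in Tables 3 to 5 (OA, precision, recall, and F1-score) can all be derived from a confusion matrix such as those in Figure 10. The sketch below assumes macro averaging, that is, each per-class score is computed first and then averaged with equal weight; the article's exact averaging convention is not restated here, so treat this as an assumption. The function name and the example matrix are hypothetical.

```python
def macro_metrics(cm):
    """Compute OA and macro-averaged precision, recall, and F1.

    cm[i][j] is the number of samples of true class i predicted as class j.
    A class with no predictions (or no samples) scores 0 for that metric.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    oa = sum(cm[k][k] for k in range(n)) / total
    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = cm[k][k]
        predicted = sum(cm[i][k] for i in range(n))  # column sum
        actual = sum(cm[k])                          # row sum
        p = tp / predicted if predicted else 0.0
        r = tp / actual if actual else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)
    return {
        "OA": oa,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "F1": sum(f1s) / n,
    }

# Hypothetical two-class matrix for illustration only (not data from the study).
example = [[2, 0],
           [1, 1]]
print(macro_metrics(example))
```

With a purely diagonal confusion matrix, all four indicators equal 1, which corresponds to the 100.00% rows in the tables.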

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Wu, K.; Jia, Z.; Duan, Q. The Detection of Kiwifruit Sunscald Using Spectral Reflectance Data Combined with Machine Learning and CNNs. Agronomy 2023, 13, 2137. https://doi.org/10.3390/agronomy13082137


Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.
