Article

A Novel Algorithm for the Retrieval of Chlorophyll a in Marine Environments Using Deep Learning

1
College of Geomatics and Geoinformation, Guilin University of Technology, Guilin 541006, China
2
Ecological Spatiotemporal Big Data Perception Service Laboratory, Guilin University of Technology, Guilin 541006, China
*
Author to whom correspondence should be addressed.
Water 2023, 15(21), 3864; https://doi.org/10.3390/w15213864
Submission received: 9 September 2023 / Revised: 26 October 2023 / Accepted: 2 November 2023 / Published: 6 November 2023
(This article belongs to the Special Issue Conservation and Monitoring of Marine Ecosystem)

Abstract

Chlorophyll a (Chla) is a crucial pigment in phytoplankton, playing a vital role in determining phytoplankton biomass and water nutrient status. However, in optically complex water bodies, Chla concentration is no longer the primary factor influencing remote sensing spectral reflectance signals, leading to significant errors in traditional Chla concentration estimation methods. With advancements in in situ measurements, synchronized satellite data, and computer technology, machine learning algorithms have become popular in Chla concentration retrieval. Nevertheless, when using machine learning methods to estimate Chla concentration, abrupt changes in Chla values can disrupt the spatiotemporal smoothness of the retrieval results. Therefore, this study proposes a two-stage approach to enhance the accuracy of Chla concentration estimation in optically complex water bodies. In the first stage, a one-dimensional convolutional neural network (1D CNN) is employed for precise Chla retrieval, and in the second stage, the regression layer of the 1D CNN is replaced with support vector regression (SVR). The research findings are as follows: (1) In the first stage, the performance metrics (R2, RMSE, RMSLE, Bias, MAE) of the 1D CNN outperform state-of-the-art algorithms (OCI, SVR, RFR) on the test dataset. (2) After the second stage, the performance further improves, with the metrics achieving values of 0.892, 11.243, 0.052, 1.056, and 1.444, respectively. (3) In mid- to high-latitude regions, the inversion performance of 1D CNN/SVR is superior to other algorithms, exhibiting richer details and higher noise tolerance in nearshore areas. (4) 1D CNN/SVR demonstrates high inversion capabilities in water bodies with medium-to-high nutrient levels.

1. Introduction

Chlorophyll a (Chla) is an important biological indicator of phytoplankton biomass in aquatic ecosystems, and it plays a crucial role in measuring the primary productivity of the ocean and assessing the ecological quality of water bodies [1]. Phytoplankton absorb carbon dioxide and produce oxygen through photosynthesis, and their presence in appropriate amounts can improve water quality, as well as help to reduce greenhouse gas emissions [2]. However, human activities have had a particularly significant impact on coastal waters, leading to local eutrophication and rapid increases in the surface biomass of phytoplankton [3]. Harmful algal blooms caused by marine eutrophication are serious aquatic ecological disasters that can severely damage the ecological environment of water bodies and pose a threat to human society [4,5]. In summary, monitoring and analyzing Chla concentrations can improve the ecological quality of water bodies, achieve sustainable water resource management, and provide essential scientific evidence for addressing climate change and protecting marine ecosystems. Therefore, the construction of a global ocean Chla concentration field is of crucial significance for improving the ecological quality of water bodies.
The traditional method involves collecting water samples and measuring marine environmental parameters through buoys and cruises. However, this approach has several drawbacks, including low temporal and spatial resolution, high costs, and time-consuming processes, which limit its application on large and long-term scales [6]. In contrast, remote sensing technology offers significant advantages over traditional methods, including high spatiotemporal resolution, low cost, and high efficiency. It can effectively overcome these limitations [7,8,9,10]. Currently, commonly used satellite sensors [11] include the Sea-viewing Wide Field-of-view Sensor (SeaWiFS), launched by NASA in 1997, the Moderate Resolution Imaging Spectroradiometer (MODIS), jointly launched by NASA and the US Geological Survey (USGS) in 1999, the Medium Resolution Imaging Spectrometer (MERIS), launched by the European Space Agency (ESA) in 2002, and the Ocean and Land Colour Instrument (OLCI), launched by ESA in 2016 [12]. Generic methods for Chla inversion in open ocean areas using these sensors have been well established, such as the OCx algorithm for Chla concentrations greater than 0.20 mg/m3 [7] and the CI algorithm for Chla concentrations less than 0.15 mg/m3 [13]. However, these algorithms have poor accuracy in Chla inversion in complex water bodies such as coastal waters, which cannot meet application requirements and, therefore, require further research and exploration.
Currently, there are mainly two types of Chla inversion algorithms for coastal waters, including band ratio algorithms [14,15] and fluorescence-based algorithms [16]. Some two-, three-, and four-band ratio algorithms in the band ratio method can consider the impact of water components and perform well in coastal waters, but their models only hold under certain assumptions and are difficult to adapt to highly turbid water bodies [15]. The fluorescence-based algorithms, including the Fluorescence Line Height (FLH), Normalized Fluorescence Height (NFH), and Fluorescence Envelope Area (FEA) methods, can reduce the impact of suspended particles, yellow substances, and aerosols on remote sensing reflectance and achieve good accuracy in regional coastal chlorophyll inversion. However, the fluorescence peak is influenced by chlorophyll concentration, and the rapid changes in the water environment in coastal waters can limit the accuracy of this method [17,18]. The above-mentioned algorithms for Chla in coastal waters only yield ideal results in specific water areas and are difficult to extend to other coastal regions, making it challenging to determine their applicability and limitations on a global scale. To address this issue, classification or segmented inversion algorithms based on water component types have been widely used. For example, Neil et al. [19] divided the global inland and coastal aquatic systems into 13 different optical water types and used a dynamic ensemble algorithm to determine the inversion model parameters for specific water bodies, achieving a correlation coefficient of 0.89 for the inversion results. While this algorithm has high universality, its inversion results are directly limited by the optical water classification criteria and require the establishment of fusion algorithms between different optical water types, making it relatively complex. Therefore, a more objective and simpler algorithm is needed.
Due to the ability of machine learning algorithms to eliminate the limitations of Chla inversion based on water component classification and the fact that they do not require any prior knowledge to be established between response and prediction variables, Chla inversion based on machine learning algorithms has received increasing attention [20,21]. The Chla concentration in water affects the absorption and reflection characteristics of spectra. Based on this feature, remote sensing reflectance (Rrs) can be used as an input feature of machine learning models to predict Chla concentrations [22]. Among them, multilayer perceptron (MLP), Gaussian process regression (GPR), support vector regression (SVR), and random forest regression (RFR) have been proven to have potential in Chla inversion in complex water bodies [23,24,25,26]. However, traditional machine learning algorithms have limitations in dealing with large-scale high-dimensional data, model parameter adjustment, and nonlinear model establishment compared to deep learning algorithms, which have better scalability and the ability to automatically learn feature patterns [27]. Among them, convolutional neural networks (CNNs) are a neural network architecture that can extract high-dimensional or complex features from raw data. As long as the training dataset covers a wide range of data, CNNs can effectively process spectral information in remote sensing data, thereby improving the accuracy of Chla inversion [25]. However, research on a general method based on one-dimensional convolutional neural networks (1D CNN) for Chla inversion is still relatively limited.
This paper proposes a universal method for Chla inversion in coastal waters, which combines 1D CNN and other traditional machine learning algorithms to establish a relationship model between remote sensing reflectance (Rrs) and Chla concentration. We use the original Rrs as input features to predict Chla concentration and demonstrate the performance of the model. Through comparison with other algorithms, we verify the high accuracy of the model in coastal waters with different nutrient levels. Finally, we conduct Chla inversion and relevant analysis in coastal waters based on this model. The proposed method provides an effective solution for global Chla inversion in coastal waters. By conducting higher precision monitoring and analysis of Chla concentrations, it becomes possible to gain a more accurate understanding of water nutrient levels. This, in turn, enables timely resolution of aquatic ecosystem issues, providing a crucial scientific foundation for achieving sustainable water resource management and safeguarding marine ecosystems. Additionally, it plays a proactive role in addressing climate change.

2. Data and Preprocessing

2.1. Data Source

This paper is based on data from the Aerosol Robotic Network-Ocean Color (AERONET-OC) and uses the validation system provided by the NASA Ocean Biology Processing Group (OBPG) through the SeaBASS website (https://seabass.gsfc.nasa.gov) to perform spatiotemporal matching of sensor and in situ data, yielding a matched remote sensing and in situ dataset covering 2002 to 2017. In situ Chla concentrations were obtained through cruise measurements, and the values obtained from the fluorescence and ion chromatography methods were found to be consistent. Therefore, in this paper, the Chla concentration values from both methods are treated as equivalent and are taken as true values. The Rrs values in this dataset were obtained from the Moderate Resolution Imaging Spectroradiometer (MODIS) aboard the Aqua satellite. Figure 1 shows the spatiotemporal distribution of the MODIS-Aqua data matched with true values. The matched data mainly cover open and coastal waters from low to high latitudes in 2003 and 2004. The matched dataset was divided into a training set and a validation set in a 4:1 ratio, with their spatial distribution shown in Figure 2.
The inversion data are sourced from the Ocean Color SMI: Standard Mapped Image MODIS Aqua data provided by OBPG (https://oceancolor.gsfc.nasa.gov), acquired from 2002 to 2017, with a spatial resolution of 4 km and including parameters such as Rrs and Chla concentrations.

2.2. Data Preprocessing

Based on the spectral characteristics of Chla and the complex features of coastal areas, the reflectance data from ten bands (412, 443, 469, 488, 531, 547, 555, 645, 667, and 678 nm) were selected. To reduce noise in the dataset, matched pairs with Chla concentrations greater than 50 mg/m3 or with negative Rrs values were excluded. Through these steps, a more consistent spectral reflectance curve was obtained, and the impact on the true situation was minimized, as shown in Table 1. The reflectance data were then standardized, and the Chla data were transformed using a log10 function. After preprocessing in this manner, both datasets contained no outliers or invalid values and fell within comparable ranges, which was beneficial for building the inversion model. The final dataset contained 1271 matched pairs, reduced from the original 1351.
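The preprocessing steps described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `preprocess`, the toy three-band samples, and the choice of a per-band z-score with the population standard deviation are assumptions, since the text only states that the reflectance data were "standardized". The sketch also assumes Chla values are strictly positive so that log10 is defined.

```python
import math
import statistics


def preprocess(pairs, chla_max=50.0):
    """Filter matched pairs, standardize Rrs per band, and log10-transform Chla.

    pairs: list of (rrs_values, chla) tuples, one Rrs value per band.
    """
    # Drop pairs with Chla above the threshold or with any negative Rrs value.
    kept = [(rrs, chla) for rrs, chla in pairs
            if 0 < chla <= chla_max and all(v >= 0 for v in rrs)]
    n_bands = len(kept[0][0])

    # Per-band mean and population standard deviation (assumed z-score scaling).
    means, stds = [], []
    for b in range(n_bands):
        column = [rrs[b] for rrs, _ in kept]
        means.append(statistics.fmean(column))
        stds.append(statistics.pstdev(column) or 1.0)  # guard against zero spread

    x = [[(rrs[b] - means[b]) / stds[b] for b in range(n_bands)] for rrs, _ in kept]
    y = [math.log10(chla) for _, chla in kept]
    return x, y


# Hypothetical toy samples with three bands each (not real AERONET-OC data).
samples = [
    ([0.004, 0.003, 0.002], 1.0),
    ([0.006, 0.005, 0.001], 10.0),
    ([0.005, -0.001, 0.002], 2.0),   # removed: negative Rrs
    ([0.005, 0.004, 0.003], 80.0),   # removed: Chla > 50 mg/m3
]
features, targets = preprocess(samples)
```

In practice the band statistics would be computed on the training set only and reused for the validation set, which the text does not specify.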

3. Model Development

3.1. 1D CNN/SVR Model Design

The model used in this paper consists of two algorithms, 1D CNN and SVR, to form a 1D CNN/SVR inversion model. The 1D CNN module is responsible for the automatic feature extraction of Rrs, while the SVR module performs regression for fitting Chla concentration. Figure 3 shows the construction process of the inversion model.
Traditional CNN structures usually consist of multiple convolutional layers, pooling layers, and fully connected layers [28]. The convolutional layers are used for feature extraction, while the pooling layers reduce the size and number of features, thereby reducing the computational complexity of the model. Finally, the fully connected layers transform the features into classification or regression results. However, 1D CNNs differ slightly from traditional CNNs in structure, as the input to the convolutional layer is a three-dimensional vector consisting of samples, time steps, and features, and the output is also a three-dimensional vector consisting of samples, time steps, and channels [29]. In short, 1D CNNs are a type of convolutional neural network designed specifically for processing one-dimensional sequence data. The original dataset used in this paper consisted of two-dimensional vectors, including samples and features, which needed to be converted into three-dimensional vectors. In addition, having too few features may limit the number of convolutional layers that can be used. Therefore, this paper adopted a method of adding features by expanding the feature volume through a fully connected layer. Specifically, we used a fully connected layer with n neuron nodes to perform a full connection operation (F, 1) on the input vector, with an output of (F, n). After the reshape operation, the volume can be restored to its original dimension. However, because 1D CNN is a deep learning model, it may overfit the training set, while SVR is a nonlinear regression model based on kernel functions with good generalization capabilities. Therefore, in this paper, the final fully connected layer of the 1D CNN was replaced with an SVR module to achieve regression.
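The feature expansion described above, (F, 1) to (F, n) followed by a reshape, can be illustrated with a small sketch in plain Python. It assumes that the fully connected layer is applied to each band value separately with shared weights and no activation, and that n = 60 (inferred from the expansion of 10 bands to 600 inputs reported in this section). The function name `expand_features` and all weight values are hypothetical.

```python
def expand_features(x, weights, bias):
    """Expand F band values to F * n inputs shaped as (time steps, channels)."""
    # Dense layer applied to each band value separately: (F, 1) -> (F, n).
    expanded = [[value * w + b for w, b in zip(weights, bias)] for value in x]
    # Reshape (F, n) -> (F * n, 1): F * n time steps with one channel each.
    return [[v] for row in expanded for v in row]


# Ten standardized band values (hypothetical numbers).
rrs = [0.5, -1.2, 0.3, 0.0, 1.1, -0.4, 0.8, -0.9, 0.2, 0.6]
n = 60
weights = [0.01 * (j + 1) for j in range(n)]  # placeholder for learned weights
bias = [0.0] * n                              # placeholder for learned biases

sequence = expand_features(rrs, weights, bias)
print(len(sequence), len(sequence[0]))
```

The longer sequence gives the subsequent convolution and pooling layers enough positions to operate on, which is the motivation stated in the text.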
As shown in Figure 4, we constructed the first module of the 1D CNN/SVR inversion model, namely the feature extraction module, based on TensorFlow. The second module, on the other hand, was built using scikit-learn (sklearn) to create the regression module. Firstly, a fully connected layer is used to expand the 10 original features (412, 443, 469, 488, 531, 547, 555, 645, 667, and 678) to 600 new input features. The first convolutional layer uses a kernel size of 5, stride of 5, and 64 filters to extract 1200 × 64 high-dimensional features. Then, three CNN blocks with the same parameters and structure are used to further process the features, each consisting of two convolutional layers and one pooling layer. The kernel size of the convolutional layer is 3, the stride is 1, and there are 32 filters, while the pooling layer uses max pooling with a size of 2. The activation function is uniformly set to the ReLU function. The final regression module is composed of an SVR model, with a regularization parameter of 1 and a kernel function of the polynomial kernel and radial basis function, and the other parameters are set to the default values of sklearn.
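To make the structure of one CNN block concrete (two convolutions with kernel size 3 and stride 1, ReLU activations, then max pooling of size 2), the following toy forward pass uses a single channel and a single filter per layer. The real model uses 32 filters per layer with learned weights; the kernels here are hypothetical, and "valid" (no) padding is an assumption because the text does not state the padding mode.

```python
def conv1d(seq, kernel, stride=1):
    """One-dimensional convolution without padding (single channel, single filter)."""
    k = len(kernel)
    return [sum(seq[i + j] * kernel[j] for j in range(k))
            for i in range(0, len(seq) - k + 1, stride)]


def relu(seq):
    """Rectified linear unit applied element-wise."""
    return [max(0.0, v) for v in seq]


def max_pool(seq, size=2):
    """Non-overlapping max pooling."""
    return [max(seq[i:i + size]) for i in range(0, len(seq) - size + 1, size)]


def cnn_block(seq, kernel_a, kernel_b):
    """Two convolution + ReLU layers followed by one max-pooling layer."""
    hidden = relu(conv1d(seq, kernel_a))
    hidden = relu(conv1d(hidden, kernel_b))
    return max_pool(hidden)


# Toy input of length 10 and hypothetical kernels of size 3.
signal = [0.0, 1.0, 0.0, 2.0, 0.0, 3.0, 0.0, 4.0, 0.0, 5.0]
block_output = cnn_block(signal, [1.0, 1.0, 1.0], [-1.0, 0.0, 1.0])
print(block_output)
```

Under these assumptions each convolution shortens the sequence by two positions and the pooling layer halves it, so a length-10 input leaves the block with three positions.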

3.2. Inversion Model Evaluation Metrics

Due to the limitations of standard statistical metrics in Chla inversion algorithms, this paper uses both raw and log-transformed Chla metrics for performance evaluation. The metrics used are as follows:
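$$R^2 = 1 - \frac{\sum_{i=1}^{n} \left( \log_{10}(M_i) - \log_{10}(E_i) \right)^2}{\sum_{i=1}^{n} \left( \log_{10}(M_i) - \mathrm{mean}(\log_{10}(E_i)) \right)^2}$$
$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( E_i - M_i \right)^2}$$
$$RMSLE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( \log_{10}(E_i) - \log_{10}(M_i) \right)^2}$$
$$MAE = 10^{\frac{1}{n} \sum_{i=1}^{n} \left| \log_{10}(E_i) - \log_{10}(M_i) \right|}$$
$$Bias = 10^{\frac{1}{n} \sum_{i=1}^{n} \left( \log_{10}(E_i) - \log_{10}(M_i) \right)}$$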
In the Chla concentration prediction process after log10 transformation, R2 is used to evaluate the fitting degree of the regression model, RMSLE is used to measure the difference between the predicted values and the true values, and MAE and Bias are used to calculate the mean absolute error between the predicted values and the true values, and the average error of the predicted values, respectively. In addition, the RMSE of the untransformed Chla concentration is also calculated to evaluate the standard deviation.
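The five metrics can be computed with a short helper. This is a sketch under stated assumptions: E and M are read as the estimated and measured Chla concentrations, MAE and Bias are read as powers of ten of the mean log-space differences, and the R2 denominator is centred on the mean of the log-transformed estimates as the formula is written in this section (the more common definition centres on the mean of the measured values). The function name `chla_metrics` is hypothetical.

```python
import math


def chla_metrics(measured, estimated):
    """Return R2, RMSE, RMSLE, MAE, and Bias for paired Chla concentrations."""
    n = len(measured)
    log_m = [math.log10(m) for m in measured]
    log_e = [math.log10(e) for e in estimated]
    diffs = [le - lm for le, lm in zip(log_e, log_m)]

    mean_log_e = sum(log_e) / n
    ss_res = sum(d * d for d in diffs)
    # Denominator follows the formula as written in this section.
    ss_tot = sum((lm - mean_log_e) ** 2 for lm in log_m)

    return {
        "R2": 1 - ss_res / ss_tot,
        # RMSE is computed on the untransformed concentrations.
        "RMSE": math.sqrt(sum((e - m) ** 2 for e, m in zip(estimated, measured)) / n),
        "RMSLE": math.sqrt(ss_res / n),
        # MAE and Bias are multiplicative: 1.0 means no error in log space.
        "MAE": 10 ** (sum(abs(d) for d in diffs) / n),
        "Bias": 10 ** (sum(diffs) / n),
    }


# Hypothetical values in mg/m3.
perfect = chla_metrics([1.0, 10.0, 100.0], [1.0, 10.0, 100.0])
tenfold = chla_metrics([1.0, 10.0, 100.0], [10.0, 100.0, 1000.0])
```

Because MAE and Bias are back-transformed from log space, a Bias above 1 indicates overestimation on average and a value below 1 indicates underestimation.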
To further explore the model’s performance in real space, we take the in situ data as the reference and consider the difference between the model predictions and the actual values as the inversion error. We employ several key metrics to evaluate the performance, including the minimum value $X_{(1)}$, maximum value $X_{(n)}$, median $X_{\left(\frac{n+1}{2}\right)}$, first quartile $X_{\left(\frac{n+1}{4}\right)}$, third quartile $X_{\left(\frac{3(n+1)}{4}\right)}$, outliers, and interquartile range. The formula for calculating the interquartile range is as follows:
$$IQR = Q_3 - Q_1$$
IQR represents the interquartile range, where Q3 stands for the third quartile, and Q1 is the first quartile.
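These order statistics can be obtained with the Python standard library, whose "exclusive" quantile method places the quartiles at positions (n + 1)/4, (n + 1)/2, and 3(n + 1)/4 of the sorted sample. The sketch below is illustrative only: the function name `box_stats` and the error values are hypothetical, and flagging outliers beyond 1.5 × IQR from the quartiles is an assumed convention, since the text does not define its outlier rule.

```python
import statistics


def box_stats(errors):
    """Summarize inversion errors with box-plot statistics."""
    xs = sorted(errors)
    # Quartiles at positions (n + 1)/4, (n + 1)/2, and 3(n + 1)/4.
    q1, median, q3 = statistics.quantiles(xs, n=4, method="exclusive")
    iqr = q3 - q1
    # Assumed outlier rule: more than 1.5 * IQR beyond the quartiles.
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "min": xs[0],
        "max": xs[-1],
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "outliers": [x for x in xs if x < lower or x > upper],
    }


# Hypothetical inversion errors.
summary = box_stats([7, 1, 3, 5, 2, 6, 4])
with_outlier = box_stats([7, 1, 3, 5, 2, 6, 4, 30])
```

When a quartile position falls between two sorted values, the standard library interpolates linearly between them.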

4. Experiments and Results

4.1. Model Performance Evaluation

To evaluate the performance metrics of the 1D CNN/SVR inversion model, it was compared with the OCI algorithm, as well as SVR, RFR, and 1D CNN models, using the product dataset. To ensure the objectivity of the experiments, the original inputs of all the models were kept consistent, i.e., 10 original features, the same training data (N = 1016) and validation data (N = 255), and the validation results are shown in Table 2 and Figure 5.
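The training and validation sizes quoted above follow from the 4:1 split of the 1271 matched pairs. A minimal sketch of such a split is given below; the use of a seeded random shuffle and the function name `split_indices` are assumptions, as the text does not describe how samples were assigned to each set.

```python
import random


def split_indices(n_samples, train_parts=4, val_parts=1, seed=0):
    """Shuffle sample indices and split them in a train_parts:val_parts ratio."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded for repeatability
    n_train = n_samples * train_parts // (train_parts + val_parts)
    return indices[:n_train], indices[n_train:]


train_idx, val_idx = split_indices(1271)
print(len(train_idx), len(val_idx))
```

Applying the same index lists to every model keeps the comparison between OCI, SVR, RFR, 1D CNN, and 1D CNN/SVR on identical samples.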
The performance in the logarithmic space is shown in Figure 5A–E. The OCI algorithm has poor predictive ability because it is based on an empirical model constructed using global ocean data, while the Chla concentration in coastal waters is influenced by many nonlinear factors, leading to limited prediction accuracy. Among the machine learning algorithms, the SVR algorithm has poor predictive ability, with a total prediction error of 1.081. The predictive ability of the RFR algorithm is better than that of the SVR, with an R2 of 0.871 and a reduced prediction error of 1.053. Although the 1D CNN algorithm has better predictive ability than the RFR, the prediction error is as high as 1.144. Finally, the 1D CNN/SVR algorithm has an R2 of 0.892, explaining 89.2% of the variance of the target variable. The RMSE is 11.243 mg/m3, and the RMSLE is 0.052, indicating a small prediction error in logarithmic space; the Bias is 1.056, and the MAE is 1.444. These metrics indicate that the model has the strongest predictive ability, the smallest prediction error, and the smallest deviation between the predicted and actual values overall. Therefore, according to the results of this experiment, the 1D CNN/SVR model is the most suitable for Chla concentration inversion in coastal waters.

4.2. Evaluation of the Inversion Capability of the Model at Different Trophic Levels

In this section, we evaluated the inversion capability of the model at different trophic levels on a global spatial scale using the monthly average data from August 2003. Figure 6A–E shows the global spatial distribution of Chla concentration inversion results based on the OCI, SVR, RFR, and 1D CNN/SVR algorithms. The inversion results of these algorithms exhibit similar spatial patterns, such as the characteristics of upwelling near the equator and high Chla concentrations in coastal waters. However, the SVR algorithm shows overestimated Chla concentrations and noise points in the spatial regions of 30°–60° S and 60°–90° N, indicating that the inversion results of this algorithm are sensitive to noise in mid- to high-latitude regions, and the inversion capability is relatively weak. However, the 1D CNN/SVR algorithm did not exhibit this phenomenon, indicating that the 1D CNN used in the first phase of the model can effectively improve this drawback of the SVR algorithm. In addition, there are some differences in performance among the algorithms, as shown in Table 2, but it is difficult to observe the performance differences between the algorithms in the global Chla inversion results, and it is difficult to observe the spatial smoothness of Chla concentrations across different trophic levels, especially in coastal waters. Coastal areas are influenced by nutrient inputs from rivers, streams, and groundwater, as well as the presence of suspended particles such as sediments and organic matter. These factors contribute to the complexity of the coastal water body. Therefore, we further reduced the spatial scale and selected two regions for inversion, the coastal waters of the North Atlantic (60°–80° W, 30°–50° N) and the southern Indian Ocean (40°–60° E, 30°–50° S), represented as roi_1 (purple box) and roi_2 (blue box), respectively, as shown in Figure 1. 
These two regions were selected because roi_1 and roi_2 are located in mid- to high-latitude regions, contain more in situ measurement points, and include coastal waters with low-, medium-, and high-nutrient (eutrophic) environments [30]. They therefore allow the inversion capability of the model to be verified in mid- to high-latitude regions across different trophic levels and help to ensure the accuracy of the inversion results.
The inversion results of roi_1 are shown in Figure 7A–E. The inversion results of the 1D CNN/SVR model are smoother than those of other algorithms, and there are almost no noise points in the relatively open sea areas. The Chla concentration exhibits a smooth transition in the mesotrophic and eutrophic zones, while the OC3M, SVR, and RFR algorithms show sudden increases. This indicates that other algorithms may be more sensitive to noise, resulting in sudden changes or outliers. In contrast, the 1D CNN/SVR model can better capture and smooth the noise and outliers in the data, thereby improving the accuracy and stability of the inversion results for Chla concentration.
Figure 8 reflects the errors of the four algorithms, OCI, SVR, RFR, and 1D CNN/SVR, in the inversion results of roi_1 compared to the true values, using 100 in situ data points (N = 100). Figure 8A shows that in areas with high Chla concentrations, the OCI algorithm has poor overall fit to the true values, while the SVR and RFR algorithms have improved overall fit but may overestimate Chla concentration. The 1D CNN/SVR algorithm has a higher overall fit and has improved the overestimation of Chla concentration. From Figure 8B and Table 3, it can be seen that the inversion error of RFR is the smallest, but the inversion smoothness of RFR is poor. In addition, the average inversion error of the 1D CNN/SVR algorithm is lower than that of SVR but higher than that of 1D CNN, while the maximum and minimum inversion errors are lower than those of 1D CNN. This indicates that 1D CNN/SVR is a combination of SVR and 1D CNN algorithms, complementing each other’s disadvantages. However, the accuracy of the model still needs to be improved in practical predictions.
The inversion results of roi_2 are shown in Figure 9A–E. The inversion results of the 1D CNN/SVR model are similar to those of OC3M, indicating that the model has some ability to invert Chla concentration in low-nutrient areas. However, the spatial smoothness of the inversion results is greatly reduced. This is because the training data are mostly concentrated in the nearshore areas, and the 1D CNN/SVR model may pay more attention to the features and patterns of the nearshore areas during the training process, resulting in insufficient feature learning for low-nutrient areas and poor inversion results in these areas.
To further observe the inversion errors of the algorithms, we reduced the number of in situ data points (N = 27) and compared the inversion results of the four algorithms to the true values in roi_2, as shown in Figure 10 and Table 4. It can be observed from Figure 10A that the inversion errors of the algorithms are generally low, but their ability to handle outliers is poor, which is consistent with the inversion results in roi_1. From Figure 10B and Table 4, it can be seen that the minimum and maximum prediction errors of 1D CNN/SVR are lower than those of 1D CNN, confirming that 1D CNN/SVR is a combination of SVR and 1D CNN algorithms.
In summary, compared with other algorithms, the 1D CNN/SVR model can better capture and smooth the noise and outliers in the data, improve the accuracy and stability of the inversion results, and may show more details and fluctuations in the spatial smoothness of the inversion results. However, the ability of the model to invert Chla concentration in low-nutrient areas still needs to be improved.

5. Conclusions

In this study, we developed a global ocean surface Chla inversion model based on machine learning. Using ocean Chla concentration as the research object, we standardized and removed outliers from the original Rrs before inputting it into the model for prediction. The performance evaluation experiments demonstrate that the 1D CNN/SVR model outperforms current mainstream algorithms, namely OC3M, SVR, RFR, and 1D CNN. Moreover, the R2, RMSE, RMSLE, Bias, and MAE of the model reached 0.892, 11.243 mg/m3, 0.052, 1.056, and 1.444, respectively. Evaluation of inversion capabilities at different nutrient levels showed that the 1D CNN/SVR model addresses the weakness of SVR in inverting Chla concentration in middle and high latitudes and exhibits richer details and higher noise tolerance in the inversion results in nearshore areas, making it a viable alternative for inverting Chla concentration in nearshore areas. At the same time, the model also has the ability to invert Chla concentration in waters with different nutrient levels, although its performance in low-nutrient areas is slightly weaker, which is a direction for further research. This model not only overcomes the complexity and inefficiency of traditional models but also excels in constructing a high-precision global ocean Chla concentration field. Monitoring and analysis of Chla concentration enables the timely detection of water nutrient levels. Additionally, it provides a crucial scientific basis for addressing climate change and protecting marine ecosystems.

Author Contributions

Conceptualization, D.F.; methodology, Y.Z. and D.F.; software, T.L.; validation, T.L.; resources, D.F. and H.H.; writing—original draft, Y.Z. and T.L.; supervision, D.F.; project administration, H.H.; funding acquisition, D.F. and H.H. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the Natural Science Foundation of Guangxi Province under Grant 2022GXNSFBA035637, in part by the Basic Scientific Research Ability Improvement Project for Young and Middle-Aged Teachers of Universities in Guangxi under Grant 2021KY0255, and in part by the “BaGui Scholars” Program of the Provincial Government of Guangxi.

Data Availability Statement

The matched remote sensing and in situ dataset (1 January 2000 to 31 December 2017) was obtained through the validation system provided by the NASA Ocean Biology Processing Group (OBPG) on the SeaBASS website (https://seabass.gsfc.nasa.gov), which performs spatiotemporal matching of sensor and in situ data.

Acknowledgments

The authors thank all researchers who contributed to the SeaBASS data archive. We also thank the reviewers for their comments and suggestions, which helped to improve the quality of this manuscript.

Conflicts of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Kasprzak, P.; Padisák, J.; Koschel, R.; Krienitz, L.; Gervais, F. Chlorophyll a concentration across a trophic gradient of lakes: An estimator of phytoplankton biomass? Limnologica 2008, 38, 327–338.
  2. Amin, S.A.; Hmelo, L.R.; Van Tol, H.M.; Durham, B.P.; Carlson, L.T.; Heal, K.R.; Morales, R.L.; Berthiaume, C.T.; Parker, M.S.; Djunaedi, B.; et al. Interaction and signalling between a cosmopolitan phytoplankton and associated bacteria. Nature 2015, 522, 98–101.
  3. Ma, J.; Qin, B.; Wu, P.; Zhou, J.; Niu, C.; Deng, J.; Niu, H. Controlling cyanobacterial blooms by managing nutrient ratio and limitation in a large hyper-eutrophic lake: Lake Taihu, China. J. Environ. Sci. 2015, 27, 80–86.
  4. Brooks, B.W.; Lazorchak, J.M.; Howard, M.D.; Johnson, M.V.V.; Morton, S.L.; Perkins, D.A.; Reavie, E.D.; Scott, G.I.; Smith, S.A.; Steevens, J.A. Are harmful algal blooms becoming the greatest inland water quality threat to public health and aquatic ecosystems? Environ. Toxicol. Chem. 2016, 35, 6–13.
  5. Lopez, C.B.; Jewett, E.B.; Dortch QT, W.B.; Walton, B.T.; Hudnell, H.K. Scientific Assessment of Freshwater Harmful Algal Blooms; Interagency Working Group on Harmful Algal Blooms, Hypoxia, and Human Health of the Joint Subcommittee on Ocean Science and Technology: Washington, DC, USA, 2008.
  6. Madrid, Y.; Zayas, Z.P. Water sampling: Traditional methods and new approaches in water sampling strategy. TrAC Trends Anal. Chem. 2007, 26, 293–299.
  7. O’Reilly, J.E.; Werdell, P.J. Chlorophyll algorithms for ocean color sensors—OC4, OC5 & OC6. Remote Sens. Environ. 2019, 229, 32–47.
  8. Hu, C. A novel ocean color index to detect floating algae in the global oceans. Remote Sens. Environ. 2009, 113, 2118–2129.
  9. Li, Y.; Guo, J.; Guo, X.; Hu, Z.; Tian, Y. Plankton Detection with Adversarial Learning and a Densely Connected Deep Learning Model for Class Imbalanced Distribution. J. Mar. Sci. Eng. 2021, 9, 636.
  10. Bayindir, C. Predicting the Ocean Currents using Deep Learning. arXiv 2019, arXiv:1906.08066.
  11. Blondeau-Patissier, D.; Gower, J.F.; Dekker, A.G.; Phinn, S.R.; Brando, V.E. A review of ocean color remote sensing methods and statistical techniques for the detection, mapping and analysis of phytoplankton blooms in coastal and open oceans. Prog. Oceanogr. 2014, 123, 123–144.
  12. Tilstone, G.H.; Pardo, S.; Dall’Olmo, G.; Brewin, R.J.; Nencioli, F.; Dessailly, D.; Kwiatkowska, E.; Casal, T.; Donlon, C. Performance of Ocean Colour Chlorophyll a algorithms for Sentinel-3 OLCI, MODIS-Aqua and Suomi-VIIRS in open-ocean waters of the Atlantic. Remote Sens. Environ. 2021, 260, 112444.
  13. Hu, C.; Lee, Z.; Franz, B. Chlorophyll algorithms for oligotrophic oceans: A novel approach based on three-band reflectance difference. J. Geophys. Res. Ocean. 2012, 117, C1.
  14. Gurlin, D.; Gitelson, A.A.; Moses, W.J. Remote estimation of chl-a concentration in turbid productive waters—Return to a simple two-band NIR-red model? Remote Sens. Environ. 2011, 115, 3479–3490.
  15. Dall’Olmo, G.; Gitelson, A.A. Effect of bio-optical parameter variability on the remote estimation of chlorophyll-a concentration in turbid productive waters: Experimental results. Appl. Opt. 2005, 44, 412–422.
  16. Li, L.; Yin, Q.; Xu, H.; Gong, C.; Chen, Z. Estimating chlorophyll a concentration in lake water using space-borne hyperspectral data. In Proceedings of the 2010 IEEE International Geoscience & Remote Sensing Symposium, Honolulu, HI, USA, 25–30 July 2010.
  17. Liu, F.F.; Chen, C.Q.; Tang, S.L.; Liu, D.Z. Retrieval of chlorophyll a concentration from a fluorescence enveloped area using hyperspectral data. Int. J. Remote Sens. 2011, 32, 3611–3623.
  18. Gower, J. On the use of satellite-measured chlorophyll fluorescence for monitoring coastal waters. Int. J. Remote Sens. 2015, 37, 2077–2086.
  19. Neil, C.; Spyrakos, E.; Hunter, P.D.; Tyler, A.N. A global approach for chlorophyll-a retrieval across optically complex inland waters based on optical water types. Remote Sens. Environ. 2019, 229, 159–178.
  20. Pahlevan, N.; Smith, B.; Schalles, J.; Binding, C.; Cao, Z.; Ma, R.; Alikas, K.; Kangro, K.; Gurlin, D.; Hà, N.; et al. Seamless retrievals of chlorophyll-a from Sentinel-2 (MSI) and Sentinel-3 (OLCI) in inland and coastal waters: A machine-learning approach. Remote Sens. Environ. 2020, 240, 111604.
  21. Hafeez, S.; Wong, M.S.; Ho, H.C.; Nazeer, M.; Nichol, J.; Abbas, S.; Tang, D.; Lee, K.H.; Pun, L. Comparison of Machine Learning Algorithms for Retrieval of Water Quality Indicators in Case-II Waters: A Case Study of Hong Kong. Remote Sens. 2019, 11, 617.
  22. Le, C.; Hu, C.; Cannizzaro, J.; English, D.; Muller-Karger, F.; Lee, Z. Evaluation of chlorophyll-a remote sensing algorithms for an optically complex estuary. Remote Sens. Environ. 2013, 129, 75–89.
  23. Sadaiappan, B.; Balakrishnan, P.; Vishal, C.R.; Vijayan, N.T.; Subramanian, M.; Gauns, M.U. Applications of Machine Learning in Chemical and Biological Oceanography. ACS Omega 2023, 8, 15831–15853.
  24. Yu, B.; Xu, L.; Peng, J.; Hu, Z.; Wong, A. Global chlorophyll-a concentration estimation from moderate resolution imaging spectroradiometer using convolutional neural networks. J. Appl. Remote Sens. 2020, 14, 034520. [Google Scholar] [CrossRef]
  25. Wang, W.; Shi, K.; Zhang, Y.; Li, N.; Sun, X.; Zhang, D.; Zhang, Y.; Qin, B.; Zhu, G. A ground-based remote sensing system for high-frequency and real-time monitoring of phytoplankton blooms. J. Hazard. Mater. 2022, 439, 129623. [Google Scholar] [CrossRef] [PubMed]
  26. Lei, F.; Yu, Y.; Zhang, D.; Feng, L.; Guo, J.; Zhang, Y.; Fang, F. Water remote sensing eutrophication inversion algorithm based on multilayer convolutional neural network. J. Intell. Fuzzy Syst. 2020, 39, 5319–5327. [Google Scholar] [CrossRef]
  27. Zhao, X.; Xu, H.; Ding, Z.; Wang, D.; Deng, Z.; Wang, Y.; Wu, T.; Li, W.; Lu, Z.; Wang, G. Comparing deep learning with several typical methods in prediction of assessing chlorophyll-a by remote sensing: A case study in Taihu Lake, China. Water Supply 2021, 21, 3710–3724. [Google Scholar] [CrossRef]
  28. Li, Z.; Liu, F.; Yang, W.; Peng, S.; Zhou, J. A Survey of Convolutional Neural Networks: Analysis, Applications, and Prospects. IEEE Trans. Neural Netw. Learn. Syst. 2021, 33, 6999–7019. [Google Scholar] [CrossRef]
  29. Tang, W.; Long, G.; Liu, L.; Zhou, T.; Jiang, J.; Blumenstein, M. Rethinking 1D-CNN for Time Series Classification: A Stronger Baseline. arXiv 2020, arXiv:2002.10061. [Google Scholar]
  30. Seegers, B.N.; Stumpf, R.P.; Schaeffer, B.A.; Loftin, K.A.; Werdell, P.J. Performance metrics for the assessment of satellite data products: An ocean color case study. Opt. Express 2018, 26, 7404–7422. [Google Scholar] [CrossRef]
Figure 1. Spatiotemporal distribution of MODIS-Aqua data matched with true values. The purple box marks roi_1 and the blue box marks roi_2.
Figure 2. Spatial distribution of the training set (left) and validation set (right) point locations.
Figure 3. Workflow for constructing the inversion model.
Figure 4. Structure of the 1D CNN/SVR inversion model.
Figure 5. Validation set prediction results.
Figure 6. Global Chla concentration inversion (August 2003).
Figure 7. Inversion results of roi_1 (60°–80° W, 30°–50° N).
Figure 8. Comparison of inversion errors in roi_1. (A) True Chla concentration at pairing points. (B) Error relative to the true Chla concentration at pairing points.
Figure 9. Inversion results of roi_2 (40°–60° E, 30°–50° S).
Figure 10. Comparison of inversion errors in roi_2. (A) True Chla concentration at pairing points. (B) Error relative to the true Chla concentration at pairing points.
Table 1. Statistical results before and after data preprocessing.

| Data Type | Min (before) | Max (before) | Mean (before) | Min (after) | Max (after) | Mean (after) |
|---|---|---|---|---|---|---|
| Rrs_412 (sr⁻¹) | −0.00354 | 0.01914 | 0.00336 | 0.00001 | 0.01914 | 0.00355 |
| Rrs_443 (sr⁻¹) | −0.00201 | 0.02393 | 0.00327 | 0.00009 | 0.02393 | 0.00344 |
| Rrs_469 (sr⁻¹) | −0.00129 | 0.02973 | 0.00373 | 0.00055 | 0.02973 | 0.00388 |
| Rrs_488 (sr⁻¹) | −0.00073 | 0.03174 | 0.00378 | 0.00049 | 0.03174 | 0.00392 |
| Rrs_531 (sr⁻¹) | 0.000883 | 0.02765 | 0.00415 | 0.00088 | 0.02765 | 0.00425 |
| Rrs_547 (sr⁻¹) | 0.000846 | 0.02539 | 0.00418 | 0.00102 | 0.02539 | 0.00427 |
| Rrs_555 (sr⁻¹) | 0.000795 | 0.02306 | 0.00403 | 0.00102 | 0.02306 | 0.00410 |
| Rrs_645 (sr⁻¹) | −0.00047 | 0.01438 | 0.00156 | 0.00001 | 0.01438 | 0.00159 |
| Rrs_667 (sr⁻¹) | −0.00041 | 0.01277 | 0.00127 | 0.00001 | 0.01277 | 0.00130 |
| Rrs_678 (sr⁻¹) | −0.00032 | 0.01226 | 0.00130 | 0.00002 | 0.01226 | 0.00133 |
| Chla (mg/m³) | 0.019 | 58.099 | 4.945 | 0.019 | 46.350 | 4.708 |
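In Table 1, every negative Rrs minimum becomes positive after preprocessing, which is consistent with discarding match-ups that contain non-positive reflectance. The preprocessing steps themselves are not described in this section, so the following is only a minimal sketch under that assumption. The function names, the record layout, and the sample values are hypothetical and are not taken from the paper.

```python
# Hypothetical sketch: assumes preprocessing drops any match-up that has a
# non-positive Rrs value in at least one band. Not the authors' actual code.
RRS_BANDS = (412, 443, 469, 488, 531, 547, 555, 645, 667, 678)


def drop_nonpositive_rrs(records):
    """Keep match-ups whose Rrs value is strictly positive in every band."""
    return [
        rec for rec in records
        if all(rec[f"Rrs_{band}"] > 0 for band in RRS_BANDS)
    ]


def min_max_mean(records, key):
    """Return (min, max, mean) of one column, the statistics listed in Table 1."""
    values = [rec[key] for rec in records]
    return min(values), max(values), sum(values) / len(values)


def make_record(rrs_412, chla):
    """Build an illustrative record; all other bands get a fixed positive value."""
    rec = {f"Rrs_{band}": 0.003 for band in RRS_BANDS}
    rec["Rrs_412"] = rrs_412
    rec["Chla"] = chla
    return rec


# Made-up match-ups: the second one has a negative Rrs_412 and is removed.
matchups = [
    make_record(0.004, 1.0),
    make_record(-0.002, 50.0),
    make_record(0.001, 3.0),
]
clean = drop_nonpositive_rrs(matchups)

print(min_max_mean(matchups, "Rrs_412"))  # statistics before filtering
print(min_max_mean(clean, "Rrs_412"))     # statistics after filtering
```

Filtering whole records rather than individual values also changes the statistics of bands that had no negative values, which matches the shifted minima of Rrs_547 and Rrs_555 in Table 1.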
Table 2. Numerical values of evaluation metrics on the validation set.

| Algorithm | R² | Slope | RMSE (mg/m³) | RMLSE | Bias | MAE |
|---|---|---|---|---|---|---|
| OCI | 0.808 | 0.923 | 22.102 | 0.089 | 0.853 | 1.662 |
| SVR | 0.829 | 0.914 | 16.572 | 0.082 | 1.081 | 1.524 |
| RFR | 0.871 | 0.849 | 12.565 | 0.062 | 1.053 | 1.512 |
| 1DCNN | 0.874 | 0.888 | 18.968 | 0.060 | 1.144 | 1.494 |
| 1DCNN/SVR | 0.892 | 0.879 | 11.243 | 0.052 | 1.056 | 1.444 |
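The Bias and MAE values in Table 2 lie close to 1, which suggests the multiplicative, log-transformed forms recommended by Seegers et al. [30]. The sketch below shows one way such metrics could be computed for paired Chla values. The exact definitions used in the paper are not reproduced in this section, so every formula here is an assumption: R² and RMSE are taken in linear space, RMLSE is taken as the root mean square of the log10 differences, and Slope is omitted. The function name `chla_metrics` and the sample values are hypothetical.

```python
import math


def chla_metrics(observed, predicted):
    """Compute assumed forms of R2, RMSE, RMLSE, Bias and MAE.

    Both inputs are equal-length sequences of strictly positive Chla values
    (mg/m^3). The definitions are assumptions, not the paper's stated formulas.
    """
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")

    # Differences of log10-transformed values (predicted minus observed).
    log_diff = [math.log10(p) - math.log10(o) for o, p in zip(observed, predicted)]

    # Linear-space residual and total sums of squares.
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)

    return {
        "R2": 1.0 - ss_res / ss_tot,
        "RMSE": math.sqrt(ss_res / n),
        "RMLSE": math.sqrt(sum(d * d for d in log_diff) / n),
        # Multiplicative bias: 1.0 means no systematic over- or underestimation.
        "Bias": 10 ** (sum(log_diff) / n),
        # Multiplicative MAE: 1.5 means a typical error of about 50 percent.
        "MAE": 10 ** (sum(abs(d) for d in log_diff) / n),
    }


# Made-up values for illustration only.
obs = [1.0, 10.0, 100.0]
pred = [2.0, 10.0, 50.0]
for name, value in chla_metrics(obs, pred).items():
    print(f"{name}: {value:.3f}")
```

Under these assumed definitions, a Bias of 1.056 would mean retrievals are about 5.6% high on average, and an MAE of 1.444 would mean a typical relative error of about 44%.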
Table 3. Minimum, maximum, and average inversion errors (roi_1).

| | OCI | SVR | RFR | 1DCNN | 1DCNN/SVR |
|---|---|---|---|---|---|
| Min | −12.804 | −5.747 | −5.092 | −5.429 | −4.651 |
| Max | 14.619 | 6.810 | 5.968 | 12.940 | 12.669 |
| Average | −1.416 | −0.296 | −0.130 | −0.154 | −0.190 |
Table 4. Minimum, maximum, and average inversion errors (roi_2).

| | OCI | SVR | RFR | 1DCNN | 1DCNN/SVR |
|---|---|---|---|---|---|
| Min | −11.695 | −2.096 | −4.446 | −3.254 | −2.667 |
| Max | 11.156 | 4.553 | 1.507 | 18.611 | 8.173 |
| Average | −1.007 | −0.219 | −0.009 | −0.254 | −0.268 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Zeng, Y.; Liang, T.; Fan, D.; He, H. A Novel Algorithm for the Retrieval of Chlorophyll a in Marine Environments Using Deep Learning. Water 2023, 15, 3864. https://doi.org/10.3390/w15213864

