Article

Cervical Cancer Diagnosis Based on Multi-Domain Features Using Deep Learning Enhanced by Handcrafted Descriptors

Department of Electronics and Communications Engineering, College of Engineering and Technology, Arab Academy for Science, Technology and Maritime Transport, Alexandria 1029, Egypt
Appl. Sci. 2023, 13(3), 1916; https://doi.org/10.3390/app13031916
Submission received: 26 December 2022 / Revised: 25 January 2023 / Accepted: 26 January 2023 / Published: 2 February 2023
(This article belongs to the Special Issue Artificial Intelligence (AI) in Healthcare)

Abstract
Cervical cancer, among the most frequent cancers in women, can be prevented through routine screening. The Pap smear test is a widespread screening methodology for the timely identification of cervical cancer, but it is susceptible to human error. Artificial intelligence-based computer-aided diagnostic (CAD) methods have therefore been extensively explored for identifying cervical cancer and enhancing the conventional testing procedure. To attain remarkable classification results, most current CAD systems require a pre-segmentation step to extract cervical cells from a Pap smear slide, which is a complicated task. Furthermore, some CAD models use only handcrafted feature extraction methods, which cannot guarantee the sufficiency of the classification phase. In addition, when data samples are few, as in cervical cell datasets, the use of deep learning (DL) alone is not the ideal choice. Moreover, most existing CAD systems obtain attributes from one domain, whereas the integration of features from multiple domains usually increases performance. Hence, this article presents a CAD model based on extracting features from multiple domains rather than a single domain. It does not require a pre-segmentation step and is thus less complex than existing methods. It employs three compact DL models to obtain high-level spatial deep features rather than utilizing an individual DL model with a large number of parameters and layers, as used in current CADs. Moreover, it retrieves several statistical and textural descriptors from multiple domains, including the spatial and time-frequency domains, instead of employing features from a single domain, to provide a clearer representation of cervical cancer features. It examines the influence of each set of handcrafted attributes on diagnostic accuracy independently and in combination. It then examines the consequence of combining each DL feature set obtained from each CNN with the combined handcrafted features. Finally, it uses principal component analysis to merge the entire set of DL features with the combined handcrafted features in order to investigate the effect of merging numerous DL features with various handcrafted features on classification results. With only 35 principal components, the quadratic SVM of the proposed CAD reached an accuracy of 100%. The performance of the described CAD proves that combining several DL features with numerous handcrafted descriptors from multiple domains is able to boost diagnostic accuracy. Additionally, a comparative performance analysis against other present studies shows the competitive capacity of the proposed CAD.

1. Introduction

Cervical cancer continues to be the fourth most common malignant tumor in women worldwide, accounting for 6.6% of all female cancers diagnosed in 2018 [1]. Every year, more than 500 thousand women are diagnosed with cervical cancer, with fewer than 300 thousand dying from the disease worldwide [2]. In 2020, around 604,000 women were diagnosed with cervical cancer, and nearly 90% of the almost 342,000 cervical cancer deaths that year occurred in low- and middle-income countries [3]. More than 85% of cervical cancer patients live in developing countries, with Africa having the most occurrences. Cancer has a greater chance of being fatal in these countries [4] due to a lack of knowledge about the disease and restricted healthcare availability. Developed countries, on the other hand, have strategies in place that permit accurate and efficient screening, allowing pre-cancerous lesions to be discovered and treated at an initial phase [5]. It is widely accepted that early identification and treatment of premalignant lesions could inhibit cancer progression in nearly 90% of cervical cancer patients. As a result, detecting cervical cancer in its early stages is critical.
Based on detailed microscopic examination, the Pap smear test is regarded as a well-known screening tool for the detection of cervical pre-cancerous lesions or premalignant cells. Cervical cancer is diagnosed using either the standard approach or liquid-based cytology (LBC), as recommended by a clinician based on subjective clinical examination. A proper Pap smear test report determines the grade of the tumour, if there is any, and subsequently confirms the type of cervical cancer based on The Bethesda System (TBS) [6]. Because LBC can produce a relatively clean and more homogeneous slide for microstructural analysis than traditional techniques, it is confirmed to be a more effective and convenient method than the standard approach [7,8]. Pap smear cell films may be characterized into various subgroups. The most difficult aspect of identifying these cells is that several of the cell groupings appear nearly identical in terms of the size and appearance of the cell nucleus. A thorough examination of such cells for tumor identification depends on the specialist's expertise and the cancer aetiology, leading to misdiagnosis in certain contexts and delayed treatment [9]. Consequently, professionals find routine examination exhausting and prone to human error. To automate the classification process, a superior resolution to this issue is required.
Recent developments in computer technology have made it possible for pathologists and doctors to detect and diagnose several medical tumors and diseases using computer-aided diagnostic (CAD) systems. These automated systems fall into two families. The first comprises traditional CAD models based on standard machine learning (ML) approaches. Feature extraction is a critical step in any traditional CAD model for diagnosing and classifying abnormalities in medical images. Various groups of features, such as texture features, statistical features, and shape descriptors, must be extracted for better classification of normal and infected Pap smear images. Texture features of an image provide statistics about the spatial arrangement of intensities. Texture descriptors can be extracted with techniques such as the grey-level co-occurrence matrix (GLCM), Gabor wavelets (GW), and the discrete wavelet transform (DWT) [10]. Since each of these techniques focuses on a single aspect of the image, it can fail with certain kinds of images; a method that retrieves shape-based attributes, for instance, might not be capable of extracting other descriptors in the image, such as texture information.
The more recent family of CAD systems is based on modern deep learning (DL) techniques, which are capable of automatically extracting deep features from medical images. DL techniques can handle various sorts of image attributes and characteristics. DL-based methods, in addition to traditional feature extraction techniques, can be used to obtain discriminatory features from raw image data, and when it comes to classification problems, DL-based algorithms perform much better than conventional approaches [11]. Some studies have also shown that combining DL features with traditional handcrafted features can improve diagnostic performance. Among DL architectures, convolutional neural networks (CNNs) have attained significant results in a wide range of health [12,13] and medical imaging applications in recent years [14,15], particularly in mammography [16], facial images [17], histopathology [18,19,20,21,22,23], magnetic resonance imaging [24], fundus imaging [25], and computed tomography scanning [26,27,28]. Motivated by the CNNs' great success in several medical and health domains, they have been adopted in several CAD models for cervical cancer diagnosis. Large amounts of data are required by CNN models in order to avoid over-fitting and poor generalization. Because labeling cervical cell photos is challenging, transfer learning (TL) is intended to transfer knowledge from a source domain to a target domain in order to prevent over-fitting; a CNN previously trained on ImageNet can be applied with TL to cell images [29,30,31].
This study proposes a CAD system based on multiple DL features and traditional handcrafted features. It employs three compact CNNs to obtain DL features from models of distinct structures. The introduced CAD also extracts statistical features, as well as textural descriptors including GLCM, DWT, and Gabor. It examines the impact of each group of features independently on diagnostic accuracy. Then it investigates the influence of fusing each DL feature set attained from every CNN with the combined handcrafted features. Finally, it merges the whole DL features with the combined handcrafted features, using principal component analysis to explore the effect of merging multiple DL features with various handcrafted features on classification accuracy.
The main contributions and novelty of the proposed CAD are as follows:
  • Developing an effective CAD based on multiple compact CNNs with fewer deep layers and fewer parameters, together with several handcrafted feature extraction approaches, instead of using a single methodology like existing CADs, which employ a single CNN with a huge number of deep layers and parameters or one handcrafted approach.
  • The proposed CAD does not need any pre-segmentation or enhancement steps which are required by several existing CADs.
  • Merging features from multiple domains, including spatial DL features and texture features from the time-frequency domain such as DWT and Gabor wavelets (GW), rather than utilizing one type of feature extraction method from a single domain, thus improving classification accuracy.
  • The proposed CAD also obtains texture GLCM features as well as statistical features from the time/spatial domain.
  • Examining the influence of blending multiple handcrafted features with each DL feature set retrieved from every single CNN independently, which is not common in existing CADs.
  • Aggregating the multiple DL feature sets with numerous handcrafted features via a feature reduction technique such as PCA to diminish the size of the features and lower the training duration.

2. Literature Review

An overview of relevant CAD systems used to analyze Pap smear images for diagnosing cervical cancer is given in this section. The section will first illustrate conventional CADs for cervical cancer diagnosis, then discuss DL-based CADs, and finally demonstrate hybrid CADs.

2.1. Conventional CADs for Cervical Cancer Diagnosis

As mentioned before, conventional CAD models utilize classical ML approaches that depend on extracting handcrafted descriptors from Pap smear slides. Among them, the study [32] employed the discrete cosine transform (DCT) and discrete wavelet transform (DWT) to retrieve features. The fractional coefficient approach was then used to reduce the dimension of these merged features. Finally, the reduced features were fed to seven ML classifiers to differentiate between different subgroups of cervical cancer, leading to an accuracy of 81.11%. In another study [33], the authors used C-means clustering to segment cervical cells and then extracted texture features, including GLCM and geometrical descriptors, from these cells. Subsequently, the authors used principal component analysis (PCA) to decrease the size of the features, and KNN was employed to classify cervical cells, reaching an accuracy of 94.86%. Similarly, the research article [34] used C-means clustering to segment cervical cells and then attained shape and textural features such as the binary histogram Fourier (BHF) algorithm. Next, the quantum-based grasshopper computing algorithm (QGH) was utilized to select features, and the selected features were applied as inputs to classifiers. The article [35] presented a CAD based on two phases. The first phase's goal was to extract texture descriptors from the cytoplasm and nucleolus together: the Pap smear slides were segmented using a thresholding method, and a texture descriptor called modified uniform local ternary patterns (MULTP) was proposed to describe the local textural features. Second, these descriptors were fed to an artificial neural network whose parameters were optimized using a genetic algorithm, attaining an accuracy of 98.9%.

2.2. DL-Based CADs for Cervical Cancer Diagnosis

ML-based CAD models are efficient and have lower computational costs, but their accuracy is typically limited. Primary components and complementary clues may be neglected when features are extracted and then selected from [36]. Given the difficulty of detecting abnormal Pap smear slides [37], focusing solely on handcrafted features could be insufficient to capture the interconnections of cell attributes. DL-based CAD methods, unlike ML-based techniques, are not hampered by these drawbacks of feature extraction and descriptor selection. CNN approaches are the most commonly used DL techniques for image analysis [38]. A CAD model presented in [39] is divided into three sections: cervical cell segmentation, DL-based cervical cell identification, and envisioned human-assisted classification. Images are first segmented employing sped-up robust features (SURF) and Otsu thresholding methodologies to retrieve cell images. These images are then forwarded to CompactVGG. Lastly, the envisioned human-assisted diagnosis layer accomplishes classification by incorporating the visualization performance and the classification results of all cell images. Conversely, the study [40] segmented the image contents of cervical cancer slides using morphological operations. The segmented images were then subjected to the dual-tree complex wavelet transform (DTCWT), whose output was fed into an altered ResNet-18 model, which achieved 97.98% accuracy. In contrast, the research article [41] presented an adapted firefly optimization technique with a DL algorithm. The suggested framework initially utilized a filtering method to eliminate noise, and the affected areas were then identified via an entropy-based segmentation technique. The EfficientNet model was employed to derive features. Finally, an image was classified using a stacked sparse denoising autoencoder (SSDA) method. Alternatively, the authors of [30] created a novel network for Pap smear image analysis based on an adaptive pruning deep transfer learning approach. The network was enhanced by altering the convolution layers and removing convolution kernels that could interfere with the intended classification problem. The highest level of accuracy achieved was 98%.

2.3. Hybrid Based CADs for Cervical Cancer Diagnosis

Conversely, other studies employed hybrid CNNs to perform the classification task. The CAD proposed in [42] obtained deep features from four CNNs, including ResNet-50, VGG-16, VGG-19, and Xception, and then concatenated them. The study [8] employed six distinct CNN structures and combined the predictions of the best three CNNs to form an ensemble classification system, achieving an AUC of 97%. Alternatively, the study [43] collected high-level features from ShuffleNet and a custom-designed network named Cervical Net. Canonical correlation analysis (CCA) was then employed to combine these features, resulting in 544 attributes, which were used as inputs to several classifiers, yielding 99.1% accuracy. On the other hand, the study [44] implemented a CAD named CVM-Cervix, which merged DL features of Xception with a visual transformer and used them to train a multilayer perceptron classifier, achieving an accuracy of 91.72%.
Most existing CADs rely on a single classification structure, based either on DL or on handcrafted methods, which suffer from heavy computational complexity or limited accuracy. Few studies have combined DL features with handcrafted descriptors. Among these, the paper [45] combined automated features obtained by VGG-16 with handcrafted geometric and texture features; it employed a correlation-based feature selection method to decrease the feature dimension and then an SVM classifier, reaching an accuracy of 98.7%. Likewise, the study [46] merged 29 different attributes from several domains with deep features to improve classification results. The experimental results showed that merging handcrafted and deep features from multiple domains improved the F1-score by 3.2%.

3. Motivation

Although DL models produce competitive results, the extracted features lack clear physical significance and prior knowledge. The high reliance on massive data and manual labels increases the complexity of medical evaluation. Additionally, DL requires countless parameters to be updated and fine-tuned. Furthermore, current methods are often solitary models operating in a single domain; hybrid systems operating in multiple domains, which can accomplish the classification process more effectively, are not widely used in the cervical cancer diagnosis literature [45,47,48]. This inspired our work to develop a hybrid CAD based on DL and handcrafted descriptors for accurate cervical cancer diagnosis. Moreover, the majority of current methods rely on DL models with many deep layers and a huge number of parameters, which require high computational capacity and extremely long training durations, whereas the proposed CAD is based on lightweight CNN models with fewer layers and parameters. Furthermore, many existing CADs depend on obtaining deep features from a single CNN, although retrieving deep features from CNNs of various architectures is superior. Thus, the proposed CAD extracts multiple DL features from three CNNs having different structures. In addition, it employs a feature reduction method to fuse features obtained from multiple domains (DL spatial features and handcrafted features), resulting in a lower number of features. In contrast to existing CADs, the proposed CAD does not depend on any image pre-segmentation or enhancement steps.

4. Materials and Methods

4.1. Mendeley LBC Pap Smear Slides Dataset

Among the popular cervical screening tools is LBC. The Mendeley LBC dataset [49] includes 963 images split into 4 classes that represent the different classes of pre-cancerous and cancerous cervical lesions according to The Bethesda System. The negative-for-intraepithelial-malignancy (normal) class provides 613 images, whereas the abnormal categories contain the remaining 350 images (as shown in Figure 1). These Pap smear slides were acquired from 460 patients at a 40× magnification factor and were afterward gathered and prepared using the LBC methodology.

4.2. Design of the Introduced CAD

The introduced CAD consists of five steps: pap smear image preparation, DL feature extraction, handcrafted descriptor mining, multi-domain feature combination and reduction, and diagnosis. In the first step, pap smear slides passed through several preparation phases, such as dimension alteration and augmentation. Subsequently, three compact pre-trained CNNs were constructed and retrained using these images, and spatial DL features were obtained from them. In the meantime, these images were also employed to retrieve handcrafted features, including several texture and statistical features from multiple domains. Next, those handcrafted features were integrated with the multiple DL features using PCA to lower their dimension in the multi-domain feature combination and reduction step. Ultimately, multiple SVM classifiers were applied to the reduced descriptors to perform the diagnosis procedure. The workflow of the introduced CAD is displayed in Figure 2.

4.2.1. Pap Smear Image Preparation

Pap smear training images were initially augmented to increase the amount of data available for the training procedure. Augmentation is an important procedure that is usually performed to improve the learning of DL models and prevent overfitting. There are numerous methods for augmentation; in the introduced CAD, flipping, rotation, scaling, and shearing were employed, as sketched below. Next, the dimensions of the augmented images, as well as the original images of the entire Mendeley LBC dataset, were changed to fit the input layer size of the three CNNs, which is 224 × 224 × 3. These CNNs were MobileNet, ShuffleNet, and ResNet-18. The effectiveness of CNN models stems from the complexity and depth of their structure, and the number of parameters in a model grows in proportion to its complexity [50]. During the learning phase, CNN models involve huge numbers of hyperparameter adjustments. Nevertheless, a substantial number of parameters can lower the network's generalization performance and cause overfitting [51]. Reducing parameters and layers by using compact DL models is one way to prevent overfitting caused by model complexity [22,50]. As a result, three compact DL models were utilized in this study.
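To make the preparation step concrete, the following is a minimal sketch of how the described augmentation and resizing could be implemented. It assumes PyTorch/torchvision; the dataset path and the affine-transform ranges are illustrative assumptions, since the paper does not state them.

```python
# Illustrative sketch only, not the authors' code: the dataset path and
# transform ranges are assumptions made for demonstration.
import torchvision.transforms as T
from torchvision.datasets import ImageFolder

# Flipping, rotation, scaling, and shearing, followed by resizing to the
# 224 x 224 x 3 input size shared by MobileNet, ShuffleNet, and ResNet-18.
train_transform = T.Compose([
    T.RandomHorizontalFlip(p=0.5),
    T.RandomAffine(degrees=30, scale=(0.8, 1.2), shear=10),  # assumed ranges
    T.Resize((224, 224)),
    T.ToTensor(),
])

train_set = ImageFolder("mendeley_lbc/train", transform=train_transform)  # hypothetical path
```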
MobileNet was used in this study since it is a compact CNN with fewer parameters and deep layers that can accomplish accurate results despite being compact. The depthwise separable convolution [52] is the fundamental building block of the MobileNet design; its main benefit compared to standard convolution is the lower computation cost when dealing with huge and complex convolutional networks [53] (a minimal sketch follows this paragraph). Similarly, ShuffleNet is a compact CNN; it employs channel shuffle and pointwise group convolution to decrease computation expense while retaining precision. When trained on the ImageNet dataset, ShuffleNet attained a lower top-1 error than MobileNet and obtained a 13× actual speedup over AlexNet while preserving comparable performance [54]. On the other hand, ResNet-18's basic element is the residual module. Residual blocks use a skip linkage, or "shortcut", between every two layers in addition to the direct links between all layers. This enables the network to take the activation of one layer and feed it into a deeper layer in the CNN, thereby preserving the network's ability to learn parameters in deeper layers.
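For illustration, a depthwise separable convolution can be sketched as a depthwise 3 × 3 convolution followed by a pointwise 1 × 1 convolution. This is a generic PyTorch sketch of the building block, not code from the paper.

```python
import torch.nn as nn

def depthwise_separable(c_in: int, c_out: int) -> nn.Sequential:
    """Depthwise 3x3 convolution followed by a pointwise 1x1 convolution."""
    return nn.Sequential(
        nn.Conv2d(c_in, c_in, kernel_size=3, padding=1, groups=c_in),  # depthwise
        nn.Conv2d(c_in, c_out, kernel_size=1),                         # pointwise
    )
```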

4.2.2. DL Feature Extraction

Building a CNN from scratch requires a huge amount of data and updating the network's enormous number of parameters, which increases the complexity of the training procedure. Instead, transfer learning (TL) can be used to address these challenges. TL is a prevalent ML method that permits the reuse of an effective CNN model, constructed to handle one problem using a huge dataset like ImageNet, as a starting point for another classification problem in a relevant area. TL can significantly reduce the required computational power and model development time [55]. For this reason, TL was employed to reuse three CNNs formerly trained on ImageNet for the problem of cervical cancer diagnosis. Initially, TL was utilized to alter the size of the final fully connected (FC) layer of each of the MobileNet, ShuffleNet, and ResNet-18 compact CNNs to 4, corresponding to the categories of the Mendeley LBC dataset. Afterward, these CNNs were retrained with the slides of the Mendeley LBC dataset. When the retraining process was done, TL was further applied to extract DL features from the final FC layer of each DL structure. Each CNN is made up of several deep layers; preliminary layers discover basic components of a photo, while later deep layers learn its high-level detailed characteristics. As a result, the last FC layer before the softmax layer was chosen to retrieve feature representations, so the length of the feature vector obtained from every CNN was 4.
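A minimal sketch of this transfer-learning step is given below, assuming PyTorch and torchvision. ResNet-18 is shown as an example, the fine-tuning loop is omitted, and image_batch is a stand-in for preprocessed pap smear tensors.

```python
import torch
import torchvision.models as models

# Load ImageNet weights and replace the final FC layer with 4 outputs,
# matching the 4 Mendeley LBC classes.
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
model.fc = torch.nn.Linear(model.fc.in_features, 4)

# ... fine-tune on the pap smear images here (hyper-parameters in Section 4.3.2) ...

# After retraining, the 4-unit activation of the final FC layer (before
# softmax) serves as the deep feature vector of each image.
model.eval()
image_batch = torch.randn(8, 3, 224, 224)  # stand-in for preprocessed slides
with torch.no_grad():
    deep_features = model(image_batch)     # shape: (8, 4)
```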

4.2.3. Handcrafted Descriptors Mining

Several textural descriptors were mined from Pap smear photos in the spatial and time-frequency domains. Spatial-domain features include GLCM, whereas time-frequency features involve DWT as well as GW. Additionally, numerous statistical features were obtained from the spatial domain. This section describes the feature extraction methods.

Spatial Statistical and Texture Features

Statistical explanatory feature extraction derives statistical features from a signal or image. Nine statistical features were computed: variance, root mean square (RMS), kurtosis, entropy, mean, skewness, inverse difference moment (IDM) [56], smoothness, and standard deviation (std). Moreover, four texture features were calculated from the spatial domain: contrast, energy, correlation, and homogeneity. The equations used to extract these features from the Pap smear slides are shown below (1)–(13).
$$\text{Mean }(\mu) = \frac{1}{NM}\sum_{i,j=1}^{N,M} A(i,j) \tag{1}$$
$$\text{Variance} = \frac{1}{(N-1)(M-1)}\sum_{i,j=1}^{M,N}\left(A(i,j)-\mu\right)^{2} \tag{2}$$
$$\text{Std }(\sigma) = \sqrt{\text{Variance}} \tag{3}$$
$$\text{Skewness} = \frac{1}{MN}\sum_{i,j=1}^{M,N}\left[\frac{A(i,j)-\mu}{\sigma}\right]^{3} \tag{4}$$
$$\text{Entropy} = -\sum_{g=0}^{G-1} pr_{g}\,\log pr_{g} \tag{5}$$
$$\text{IDM} = \sum_{i}^{M}\sum_{j}^{N} \frac{1}{1+(i-j)^{2}}\,A(i,j) \tag{6}$$
$$\text{Kurtosis}(A_{1}\ldots A_{N}) = \left\{\frac{1}{MN}\sum_{i,j=1}^{M,N}\left[\frac{A(i,j)-\mu}{\sigma}\right]^{4}\right\} - 3 \tag{7}$$
$$\text{RMS} = \sqrt{\frac{1}{MN}\sum_{i,j=1}^{M,N}\left|A(i,j)\right|^{2}} \tag{8}$$
$$\text{Smoothness} = 1 - \frac{1}{1+\sum_{i}^{M}\sum_{j}^{N} A(i,j)} \tag{9}$$
$$\text{Contrast} = \sum_{g=0}^{G-1} g^{2}\left\{\sum_{i}^{M}\sum_{j}^{N} pr_{g}(i,j)\right\} \tag{10}$$
$$\text{Correlation} = \frac{\sum_{i}^{M}\sum_{j}^{N}(ij)\,pr_{g}(i,j) - \mu_{x}\mu_{y}}{\sigma_{x}\sigma_{y}} \tag{11}$$
$$\text{Energy} = \sum_{i}^{M}\sum_{j}^{N}\left(pr_{g}(i,j)\right)^{2} \tag{12}$$
$$\text{Homogeneity} = \sum_{i}^{M}\sum_{j}^{N}\frac{pr_{g}(i,j)}{1+|i-j|} \tag{13}$$
where $A(i,j)$ is the pixel value at location $(i,j)$ in an image, $\mu$ is the mean, $G$ is the number of grey levels, $pr_g$ is the probability of a pixel having grey level $g$, and $N$ and $M$ are the length and width of the image.
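As an illustration, the spatial statistical descriptors above could be computed as in the following sketch, assuming NumPy and SciPy. The logarithm base for entropy and the grey-level binning are assumptions, since the paper does not specify them.

```python
import numpy as np
from scipy.stats import kurtosis, skew

def spatial_statistics(img: np.ndarray, levels: int = 8) -> dict:
    """Statistical descriptors of Equations (1)-(9) for one grayscale image."""
    a = img.astype(np.float64)
    x = a.ravel()
    i, j = np.indices(a.shape)
    hist, _ = np.histogram(a, bins=levels)
    p = hist / hist.sum()                      # grey-level probabilities pr_g
    p = p[p > 0]
    return {
        "mean": x.mean(),
        "variance": x.var(ddof=1),
        "std": x.std(ddof=1),
        "skewness": skew(x),
        "kurtosis": kurtosis(x),               # excess kurtosis (minus 3), Eq. (7)
        "entropy": -(p * np.log2(p)).sum(),    # log base 2 assumed, Eq. (5)
        "idm": (a / (1 + (i - j) ** 2)).sum(), # inverse difference moment, Eq. (6)
        "rms": np.sqrt(np.mean(x ** 2)),       # Eq. (8)
        "smoothness": 1 - 1 / (1 + x.sum()),   # Eq. (9)
    }
```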

Grey Level Co-Occurrence Matrix Texture Features

The GLCM strategy is a second-order statistical method that counts how often pairs of adjacent pixels with given grey levels occur in an image, exploiting the extra information contained in spatial pixel relations [57]. A co-occurrence matrix was used to retrieve textural details about grey-level transitions between two pixels. This co-occurrence matrix describes the joint distribution of grey-level pairs of adjacent pixels given a spatial relationship defined between pixels in a texture. As a result, changing the spatial relationship (different directions or distances between pixels) yields matrices containing different data. Such matrices were used to extract descriptors. The dimension of the co-occurrence matrix is determined solely by the texture's grey levels and not by the image size [58]. The four orientations used in this study were 0°, 45°, 90°, and 135°, and the number of grey levels was 8. Four GLCM texture features were determined, namely contrast, correlation, energy, and homogeneity (Equations (14)–(17)); the statistical features described above were also calculated from the co-occurrence matrix.
$$\text{Contrast} = \sum_{g=0}^{G-1} g^{2}\left\{\sum_{i}^{G}\sum_{j}^{G} P(i,j)\right\}, \quad |i-j| = g \tag{14}$$
$$\text{Correlation} = \frac{\sum_{i}^{G}\sum_{j}^{G}(ij)\,P(i,j) - \mu_{x}\mu_{y}}{\sigma_{x}\sigma_{y}} \tag{15}$$
$$\text{Energy} = \sum_{i}^{G-1}\sum_{j}^{G-1}\left(P(i,j)\right)^{2} \tag{16}$$
$$\text{Homogeneity} = \sum_{i}^{G-1}\sum_{j}^{G-1}\frac{P(i,j)}{1+|i-j|} \tag{17}$$
where $P(i,j)$ is the normalized grey-level co-occurrence matrix, i.e., the joint probability of grey levels $i$ and $j$ occurring at two adjacent pixels $x$ and $y$.
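A hedged sketch of the GLCM computation using scikit-image is shown below. The four orientations and 8 grey levels match the settings stated above; the sketch returns one value per orientation, whereas the paper may average or concatenate them, which it does not specify.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def glcm_features(img_u8: np.ndarray) -> np.ndarray:
    """Contrast, correlation, energy, and homogeneity (Equations (14)-(17))."""
    q = (img_u8 // 32).astype(np.uint8)        # quantize 256 levels down to 8
    glcm = graycomatrix(q, distances=[1],
                        angles=list(np.deg2rad([0, 45, 90, 135])),
                        levels=8, symmetric=True, normed=True)
    props = ["contrast", "correlation", "energy", "homogeneity"]
    return np.hstack([graycoprops(glcm, p).ravel() for p in props])
```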

Discrete Wavelet Transform Textural Features

The discrete wavelet transform (DWT), a popular image processing technique, analyzes images in the time-frequency domain. DWT utilizes filter banks made up of several filters to decompose images into low-pass and high-pass components [59]. The low-pass portion contains details about slowly varying image attributes, whereas the high-pass part captures sharp changes in image characteristics. The coefficients obtained by applying low-pass filtering to both the rows and columns of the image are called low-low (LL); these coefficients reflect the bulk of the energy within the image. If low-pass filtering is applied to the rows and high-pass filtering to the columns, the resulting coefficients are named high-low (HL) and include the image's vertical information. The low-high (LH) coefficients, produced by high-pass filtering of rows and low-pass filtering of columns, encompass the image's horizontal description. Finally, high-pass filtering of both the rows and columns yields the HH coefficients, which hold the image's diagonal description. To obtain the next coarser scale of wavelet coefficients, decomposition is performed on the LL sub-band. In the current study, the "Haar" wavelet function was employed with four decomposition levels. The fourth LL sub-band was further analyzed using GLCM, and then the previous 13 features (4 GLCM features and 9 statistical features) were calculated from this analysis.
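The decomposition itself could look like the following sketch, assuming PyWavelets; the rescaling of LL4 to 8-bit range before re-applying the GLCM routine is our assumption.

```python
import numpy as np
import pywt

def ll4_subband(img: np.ndarray) -> np.ndarray:
    """Four-level 'haar' decomposition; returns the fourth-level LL band."""
    coeffs = pywt.wavedec2(img.astype(np.float64), wavelet="haar", level=4)
    ll4 = coeffs[0]                            # approximation (LL) at level 4
    # Rescale to 0-255 so the GLCM quantization sketched above applies unchanged.
    ll4 = 255 * (ll4 - ll4.min()) / (np.ptp(ll4) + 1e-12)
    return ll4.astype(np.uint8)
```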

Gabor Wavelet Transform Textural Features

The Gabor wavelet (GW) transform is a widely known feature extraction methodology whose outcome includes both real and imaginary components. Because it can yield discriminatory details at a variety of orientations and scales, GW is commonly utilized in medical image analysis. Its main advantage is that, by convolving the Gabor kernels with the image, GW sub-bands at different scales and directions are produced [60,61]. Therefore, the GW transform was used in this study to analyze Pap smear images and extract textural descriptors. The number of features obtained after GW was 42.
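Since the paper does not specify the filter bank that yields its 42 GW features, the following is only an illustrative Gabor feature extractor, assuming scikit-image; the scales, orientations, and summary statistics are assumptions.

```python
import numpy as np
from skimage.filters import gabor

def gabor_features(img: np.ndarray) -> np.ndarray:
    """Mean/std of Gabor response magnitudes over an assumed scale-orientation grid."""
    feats = []
    for frequency in (0.1, 0.2, 0.3):                # assumed scales
        for theta in np.deg2rad([0, 45, 90, 135]):   # assumed orientations
            real, imag = gabor(img, frequency=frequency, theta=theta)
            mag = np.hypot(real, imag)               # response magnitude
            feats += [mag.mean(), mag.std()]
    return np.asarray(feats)
```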

4.2.4. Multi-Domains Feature Combination and Reduction

The multi-domain features obtained in the previous steps were combined in three scenarios. First, all handcrafted descriptors were concatenated. Next, the combined handcrafted descriptors were concatenated independently with each DL feature set acquired from every CNN. Later, the three DL feature sets retrieved from the three CNNs were fused with the merged handcrafted features using PCA. PCA is an unsupervised statistical method for obtaining information from multivariate data sets. The procedure determines the principal components (PCs), which are linear combinations of the original attributes. The first principal component captures the greatest variability of the original multivariate dataset, the second describes the highest variance of the remaining data, the third explains the most significant variability in the subsequent remainder, and so on. In multidimensional data space, the eigenvectors defining the PCs are orthonormal to each other, based on the least-squares hypothesis. For this reason, PCA was employed to fuse the multi-domain features, generating a reduced set of features.
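A minimal fusion-and-reduction sketch with scikit-learn is given below. The placeholder matrices, the standardization step, and the handcrafted feature length (81, inferred from the 85 features reported in Section 5.2 minus the 4 DL features) are assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

n_images = 963                                   # size of the Mendeley LBC dataset
dl_mobilenet = np.random.rand(n_images, 4)       # placeholders for the real
dl_shufflenet = np.random.rand(n_images, 4)      # per-image feature matrices
dl_resnet18 = np.random.rand(n_images, 4)
handcrafted = np.random.rand(n_images, 81)       # inferred handcrafted length

fused = np.hstack([dl_mobilenet, dl_shufflenet, dl_resnet18, handcrafted])
fused = StandardScaler().fit_transform(fused)    # standardization assumed
reduced = PCA(n_components=35).fit_transform(fused)  # 35 PCs, the best setting in Table 3
```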

4.2.5. Diagnosis

For the diagnosis step, several SVM classifiers with different kernels were adopted to classify Pap smear images. These kernels include linear, quadratic, cubic, and gaussian. Note that the diagnosis procedure was executed in three configurations. The first configuration corresponds to using individual and combined handcrafted features to train the SVMs. The second configuration involves feeding the SVMs with each DL feature set individually concatenated with the combined handcrafted features. In the final configuration, the SVMs were constructed with the fused PCA features obtained after combining all DL features with the combined handcrafted features. A 5-fold cross-validation methodology was employed to validate the performance of the proposed CAD.
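The four kernels map naturally onto scikit-learn's SVC, as in the hedged sketch below; the feature matrix and labels are placeholders, not the paper's data.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X = np.random.rand(963, 35)                  # stand-in for the 35 PCA features
y = np.random.randint(0, 4, size=963)        # stand-in for the 4 TBS classes

svms = {
    "linear": SVC(kernel="linear"),
    "quadratic": SVC(kernel="poly", degree=2),
    "cubic": SVC(kernel="poly", degree=3),
    "gaussian": SVC(kernel="rbf"),
}
for name, clf in svms.items():
    scores = cross_val_score(clf, X, y, cv=5)  # 5-fold cross-validation
    print(f"{name} SVM: mean accuracy = {scores.mean():.3f}")
```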

4.3. Evaluation Indices and Networks Hyper-Parameters

4.3.1. Evaluation Indices

To assess the efficiency of the suggested CAD, four indices were computed: true positive (TP), true negative (TN), false negative (FN), and false positive (FP). These indices count the examples that are correctly or incorrectly recognized as positives or negatives. They are used to compute evaluation metrics, namely sensitivity, specificity, accuracy, F1-score, precision, and the Matthews correlation coefficient (MCC). The following equations define the evaluation metrics:
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
$$\text{Sensitivity} = \frac{TP}{TP + FN}$$
$$\text{Precision} = \frac{TP}{TP + FP}$$
$$\text{MCC} = \frac{TP \times TN - FP \times FN}{\sqrt{(TP+FP)(TP+FN)(TN+FP)(TN+FN)}}$$
$$\text{F1-Score} = \frac{2 \times TP}{(2 \times TP) + FP + FN}$$
$$\text{Specificity} = \frac{TN}{TN + FP}$$
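For the multi-class setting, these indices are typically computed per class and averaged; the sketch below assumes scikit-learn with macro-averaging, which the paper does not state explicitly.

```python
import numpy as np
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, matthews_corrcoef, confusion_matrix)

y_true = np.array([0, 1, 2, 3, 1, 0])        # placeholder labels
y_pred = np.array([0, 1, 2, 3, 1, 1])        # placeholder predictions

acc = accuracy_score(y_true, y_pred)
sens = recall_score(y_true, y_pred, average="macro")    # sensitivity
prec = precision_score(y_true, y_pred, average="macro")
f1 = f1_score(y_true, y_pred, average="macro")
mcc = matthews_corrcoef(y_true, y_pred)
# Specificity has no direct scikit-learn helper; it can be derived per class
# from the confusion matrix as TN / (TN + FP).
cm = confusion_matrix(y_true, y_pred)
```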

4.3.2. Networks Hyper-Parameters

The three CNNs had several hyper-parameters that were adjusted before the retraining procedure. The mini-batch size was chosen to be 4, the number of epochs was 30, the learning rate was 0.0001, and the validation frequency was 169. The stochastic gradient descent with momentum (SGDM) approach was utilized to train the three CNNs. Reference [62] revealed that increasing the batch size diminishes a CNN model's effectiveness, as measured by the network's generalisability. In both the testing and training procedures, large batch sizes usually correspond to sharp minimizers, and sharp minima decrease the generalization of the results. Conversely, a very small batch size commonly converges to flat minimizers and generally attains the highest generalization performance [63], so it was selected to be only 4. The learning rate indicates the step size at every iteration while moving toward the minimum of the error function. High learning rates permit the model to train quickly, but at the expense of an unsatisfactory final set of weights; they also lead to large weight updates that change the model's behavior markedly over the training phase, and such weight deviation causes unstable performance. Lower learning rates, on the other hand, may allow the model to find a more optimal, or even globally optimal, set of weights, at the cost of a considerably longer training period, and very low rates may never converge or may become stuck at a suboptimal solution. As a result, the learning rate was set to 0.0001 in the experiments, a value that is neither too low nor too high. Furthermore, the validation frequency was set to 169 so that the validation error is determined once at the end of each training epoch.
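Translated into a PyTorch-style loop, these settings would look roughly like the sketch below; the momentum value is an assumption (the paper states SGDM but not the momentum), and model and train_set are carried over from the earlier sketches.

```python
import torch

# model and train_set are assumed to come from the earlier sketches.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9)  # momentum assumed
loader = torch.utils.data.DataLoader(train_set, batch_size=4, shuffle=True)
criterion = torch.nn.CrossEntropyLoss()

for epoch in range(30):                      # 30 epochs, as stated
    for images, targets in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), targets)
        loss.backward()
        optimizer.step()
```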

5. Results

This section illustrates the three diagnostic configurations executed in the diagnosis step. First, the results of SVM classifiers fed with the individual and combined handcrafted feature sets are demonstrated (Configuration I). Next, the performance of the SVM classifiers trained with each DL feature set obtained from every CNN is compared to that obtained when these sets are concatenated with the combined handcrafted features (Configuration II). Finally, the whole DL feature sets of the three CNNs are fused with the combined handcrafted features using PCA to lower their dimension (Configuration III).

5.1. Configuration I SVM Classifiers Performance

Configuration I SVM classifiers' performance is illustrated in this section. Table 1 compares four SVM classifiers constructed with each handcrafted descriptor set independently and with the fusion of all handcrafted features obtained from the multiple domains. The table demonstrates that, among the individual handcrafted descriptors, the SVM classifiers trained with the statistical features obtained from the spatial domain achieved the highest accuracies of 85.7%, 92.8%, 91.0%, and 90.2% for the linear, quadratic, cubic, and gaussian SVMs, respectively. In contrast to features extracted using DWT, GLCM, and GW, statistics-based features yield fewer but more relevant, non-redundant, easy-to-interpret, and distinctive features [64], which is why the statistical features obtained higher results. These accuracies were boosted to 87.8%, 94.2%, 95.1%, and 91.3% for the same classifiers when trained with the combined handcrafted descriptors. The studies [33,34] showed that combining multiple handcrafted features can enhance classification performance. Thus, these results verify that fusing handcrafted features from multiple domains improves the results and is superior to using features from a single domain.

5.2. Configuration II SVM Classifiers Performance

The results of concatenating each DL feature set with the combined handcrafted features are shown in Table 2. They indicate that merging the DL features of each CNN with the combined handcrafted features enhanced the diagnostic performance. This is clear as the accuracies attained after integrating both feature types were 99.1%, 99.2%, 99.2%, and 98.8% for the linear, quadratic, cubic, and gaussian SVMs, respectively (trained with the DL features of MobileNet and the combined handcrafted features), which are better than those attained before fusion. Similarly, when fusing the DL features of ShuffleNet with the combined handcrafted features, the accuracies reached 99.2%, 99.6%, 99.6%, and 97.9% for the same classifiers. Additionally, when incorporating the DL features of ResNet-18 with the combined handcrafted features, both the quadratic and cubic SVMs attained higher accuracies than those obtained using either group of features alone. Note that these accuracies were attained with 85 features. As indicated in studies [65,66,67], fusing the DL features of an individual CNN with multiple handcrafted features improves classification accuracy. The results in Table 2 confirm that combining deep features obtained from one CNN with ensemble handcrafted features boosts classification performance.

5.3. Configuration III SVM Classifiers Performance

The performance of the SVMs trained with the fusion of the three DL feature sets and the combined handcrafted features using PCA is shown in this section. An ablation study relating the number of PCs to the diagnostic accuracy is displayed in Table 3. As illustrated in Table 3, for the linear SVM the highest accuracy of 99.7% was attained with 20 PCs, whereas for the quadratic SVM the peak accuracy of 100% was achieved with 35 PCs. For the cubic SVM, the maximum accuracy of 99.9% was accomplished with 40 PCs, and for the gaussian SVM a peak accuracy of 99.3% was reached with 10 PCs. These accuracies were greater than those obtained in the previous scenarios, except for the gaussian SVM. The studies [45,68] showed that combining multiple CNN features with multi-domain handcrafted features is capable of enhancing diagnostic performance; accordingly, the results in Table 3 are better than those in Table 1 and Table 2, which were based on either using handcrafted features only or merging deep features from a single CNN with handcrafted features. This is because using handcrafted features alone may prevent the classification model from mining the interrelatedness of cervical cancer features. Furthermore, current CADs mostly rely on single models working in the time/spatial domain, while hybrid approaches operating in multiple domains can attain better performance [45]. On the other hand, despite the fact that DL models produce promising results, depending on deep features alone has limitations: the obtained deep features lack physical significance and prior knowledge, and the CNN structure has many parameters and relies heavily on massive data and manual labels, which adds to the challenges in medical applications [45].
The confusion matrices for the linear, quadratic, cubic, and gaussian SVMs trained with the DL features fused with the combined handcrafted features via PCA are shown in Figure 3. The receiver operating characteristic (ROC) curve and the area under the ROC curve (AUC) for the quadratic SVM classifier trained with the 35 PCs are shown in Figure 4; the AUC is equal to 1.
More evaluation indices were used to assess the performance of the presented CAD, as revealed in Table 4. These indices involve sensitivity, specificity, accuracy, F1-score, precision, and MCC. Table 4 demonstrates that all of these metrics were equal to 1 for the quadratic SVM. These results prove that the proposed system is reliable.

5.4. Comparison between Configurations Results

This section compares the highest accuracy reached in each configuration along with the number of features used to train the SVMs. Figure 5 shows that the maximum accuracy attained in Configuration II was higher than that obtained in Configuration I. These results prove that fusing multiple handcrafted features acquired from several domains with a single DL feature set could improve diagnostic performance. In addition, the peak accuracy attained in Configuration III was greater than that achieved in Configuration II. This confirms that fusing numerous DL features with multiple handcrafted descriptors mined from numerous domains via PCA is superior to employing either an individual DL feature set of a single CNN or one set of handcrafted features. The performance of Configuration III also verifies that PCA is capable of reducing the dimension of the features while enhancing diagnostic accuracy.

6. Discussion

This study proposes a CAD for the automated diagnosis of cervical cancer from Pap smear LBC data. The presented CAD system is built on multiple CNNs, as an alternative to utilizing an individual CNN, thereby benefiting from their distinct structural advantages. Instead of using features from a single domain, the introduced CAD extracts several descriptors from multiple domains: it retrieves multiple DL features from the spatial domain, and it obtains numerous statistical and textural handcrafted features from the spatial and time-frequency domains, including DWT, the GW transform, and GLCM. The diagnosis procedure is carried out in three configurations. The first corresponds to training the SVMs with individual and combined handcrafted features. The second involves feeding the SVMs with each DL feature set individually concatenated with the combined handcrafted features. In the final configuration, the SVMs are built with the fused PCA features obtained by combining all DL features with the combined handcrafted features.

6.1. Comparative Performance Analysis

To prove the efficiency of the introduced CAD and its competitive capacity, its performance was compared with state-of-the-art methods for cervical cancer diagnosis using the Mendeley LBC dataset. The methods described in Table 5 are summarized as follows. The study [69] cropped each photo in the database and used the clipped photos to train DarkNet-19 and DarkNet-53 individually. Because the attributes extracted from these CNNs were numerous, they were fed separately to neighborhood component analysis (NCA) to reduce their size. Lastly, the reduced features were utilized to construct an SVM classifier, which achieved accuracies of 98.26% and 99.47% for two datasets. In addition, the authors of reference [70] proposed a CAD with a two-step dimension reduction strategy that used PCA and the grey wolf optimizer (GWO) to diminish features extracted from numerous CNN architectures; the reduced attributes were used to train an SVM classifier, which generated the final predictions. The reference [71] procured confidence values from the Inception, MobileNet, and InceptionResNet CNNs, and then aggregated these scores using a fuzzy distance-based hybrid approach with multiple distance measures. On the other hand, the study [72] utilized three CNNs, Inception V3, MobileNet V2, and Inception ResNet V2, with extra layers to discover data-specific attributes. The authors suggested a new ensemble methodology based on the minimization of error values to combine the results of these models using three distance measures; to calculate the final predictions, they defuzzified the distance metrics using the product rule, reaching an accuracy of 99.23%. Conversely, in [73], a cervical cell image generation model (CCG-taming transformers) and a classification model using Tokens-to-Token Vision Transformers (T2T-ViT) with transfer learning were introduced, attaining 98.89% accuracy. The study [74] employed GoogleNet and ResNet individually to extract features; a genetic algorithm was then utilized to select features, and the 720 selected features were used to train an SVM classifier, reaching an accuracy of 99.07%.
The results included in Table 5 confirm the superior capacity of the introduced CAD over other state-of-the-art methods. It is obvious from the table that the proposed CAD achieved greater performance measures than all of the competing approaches. The superiority of the proposed CAD is due to using multiple compact CNNs to extract deep features, in contrast to the studies [69,70,71,72,73,74], which employed individual CNNs with more deep layers and huge numbers of parameters. Additionally, it combined features from multiple domains, including the time/spatial and time-frequency domains, unlike methods that relied on extracting features from a single domain [69,70,71,72,73,74]. In addition, it did not necessitate any pre-segmentation process to reach accurate results, unlike the study [69]. Furthermore, it accomplished better results than studies [69,70,74], which used complicated feature reduction methods such as genetic algorithms; these results were attained even with a lower number of features (35 PCs) compared to the 1000, 796, and 730 used in [69,70,74]. Finally, the results show that merging multiple DL features from several CNNs with different handcrafted features from multiple domains, as done by the proposed CAD, is better than using features from a single domain, as done by previous studies [69,70,71,72,73,74].

6.2. Limitations and Upcoming Work

The described CAD has some limitations: it does not apply any fine-tuning or optimization to the CNNs, it has so far been used only for the classification of Pap smear images and has not been tested on other tasks, and no correlation analysis was performed to determine the link between the deep-learned/handcrafted representations and the clinical findings in Pap smear slides. Future work will tackle these issues by first utilizing an optimization approach to fine-tune the DL models' hyper-parameters. Second, upcoming work will conduct a correlation analysis to find the association between handcrafted descriptors/DL features and clinical interpretations. In addition, the presented CAD will be tested on new datasets of different imaging modalities and on other classification problems.

7. Conclusions

Cervical cancer is the fourth most common cancer in women globally. It is critical to discover it as early as possible using low-cost, high-accuracy smart health monitoring systems, particularly in countries with restricted medical resources. The complications of cervical cancer are extremely concerning, and healthcare experts have made considerable attempts to tackle this issue. Nevertheless, due to the growing population, it is critical to investigate CAD methods in order to reduce the likelihood of human mistakes. Hence, this study proposed an automatic CAD to diagnose cervical cancer. The described CAD system was constructed using several CNNs rather than a single CNN, benefiting from their distinct structural advantages. Rather than using attributes from a single domain, the CAD provided here retrieves several descriptors from various domains: it obtains a group of DL features from the spatial domain, and it acquires a plethora of statistical and textural handcrafted features from the spatial and time-frequency domains, such as the DWT, GW transform, and GLCM. The diagnostic procedure was implemented in three configurations. The first is associated with training SVMs with independent and combined handcrafted features. The second entails feeding the SVMs with each DL feature set separately aggregated with the combined handcrafted features. In the final configuration, the SVMs were built with the conjoined PCA features created by merging all DL features with the combined handcrafted attributes. The accuracies obtained in configuration II were superior to those achieved in configuration I, showing that combining multiple handcrafted features from different domains with a single DL feature set can improve diagnostic performance. Furthermore, the performance attained in configuration III was better than that in configuration II, demonstrating that incorporating multiple DL features with numerous handcrafted descriptors mined from multiple domains using PCA outperforms utilizing either one individual DL feature set of a single CNN or one set of handcrafted features. The result of configuration III furthermore demonstrates that PCA is able to decrease the feature dimension while improving diagnostic accuracy. These findings and comparisons indicate that the proposed CAD, based on multi-domain feature fusion, is effective for distinguishing between different kinds of cervical cancer images.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The dataset involved in this study can be found at https://data.mendeley.com/datasets/zddtpgzv63/4 (accessed on 8 August 2022).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Lu, J.; Song, E.; Ghoneim, A.; Alrashoud, M. Machine Learning for Assisting Cervical Cancer Diagnosis: An Ensemble Approach. Future Gener. Comput. Syst. 2020, 106, 199–205. [Google Scholar] [CrossRef]
  2. Jemal, A.; Bray, F.; Center, M.M.; Ferlay, J.; Ward, E.; Forman, D. Global Cancer Statistics. CA Cancer J. Clin. 2011, 61, 69–90. [Google Scholar] [CrossRef] [PubMed]
  3. Sung, H.; Ferlay, J.; Siegel, R.L.; Laversanne, M.; Soerjomataram, I.; Jemal, A.; Bray, F. Global Cancer Statistics 2020: GLOBOCAN Estimates of Incidence and Mortality Worldwide for 36 Cancers in 185 Countries. CA Cancer J. Clin. 2021, 71, 209–249. [Google Scholar] [CrossRef] [PubMed]
  4. Jemal, A.; Center, M.M.; DeSantis, C.; Ward, E.M. Global Patterns of Cancer Incidence and Mortality Rates and TrendsGlobal Patterns of Cancer. Cancer Epidemiol. Biomark. Prev. 2010, 19, 1893–1907. [Google Scholar] [CrossRef]
  5. Shariff, A.; Kangas, J.; Coelho, L.P.; Quinn, S.; Murphy, R.F. Automated Image Analysis for High-Content Screening and Analysis. J. Biomol. Screen. 2010, 15, 726–734. [Google Scholar] [CrossRef]
  6. Nayar, R.; Wilbur, D.C. The Bethesda System for Reporting Cervical Cytology: A Historical Perspective. Acta Cytol. 2017, 61, 359–372. [Google Scholar] [CrossRef]
  7. Zhu, J.; Norman, I.; Elfgren, K.; Gaberi, V.; Hagmar, B.; Hjerpe, A.; Andersson, S. A Comparison of Liquid-Based Cytology and Pap Smear as a Screening Method for Cervical Cancer. Oncol. Rep. 2007, 18, 157–160. [Google Scholar] [CrossRef]
  8. Hussain, E.; Mahanta, L.B.; Das, C.R.; Talukdar, R.K. A Comprehensive Study on the Multi-Class Cervical Cancer Diagnostic Prediction on Pap Smear Images Using a Fusion-Based Decision from Ensemble Deep Convolutional Neural Network. Tissue Cell 2020, 65, 101347. [Google Scholar] [CrossRef]
  9. Birdsong, G.G. Automated Screening of Cervical Cytology Specimens. Hum. Pathol. 1996, 27, 468–481. [Google Scholar] [CrossRef]
  10. Naz, J.; Sharif, M.; Raza, M.; Shah, J.H.; Yasmin, M.; Kadry, S.; Vimal, S. Recognizing Gastrointestinal Malignancies on WCE and CCE Images by an Ensemble of Deep and Handcrafted Features with Entropy and PCA Based Features Optimization. Neural Process. Lett. 2021, 1–26. [Google Scholar] [CrossRef]
  11. Sigirci, I.O.; Albayrak, A.; Bilgin, G. Detection of Mitotic Cells in Breast Cancer Histopathological Images Using Deep versus Handcrafted Features. Multimed. Tools Appl. 2022, 81, 13179–13202. [Google Scholar] [CrossRef]
  12. Attallah, O. ECG-BiCoNet: An ECG-Based Pipeline for COVID-19 Diagnosis Using Bi-Layers of Deep Features Integration. Comput. Biol. Med. 2022, 142, 105210. [Google Scholar] [CrossRef] [PubMed]
  13. Attallah, O. An Intelligent ECG-Based Tool for Diagnosing COVID-19 via Ensemble Deep Learning Techniques. Biosensors 2022, 12, 299. [Google Scholar] [CrossRef] [PubMed]
  14. Attallah, O. GabROP: Gabor Wavelets-Based CAD for Retinopathy of Prematurity Diagnosis via Convolutional Neural Networks. Diagnostics 2023, 13, 171. [Google Scholar] [CrossRef]
  15. Singha, A.; Thakur, R.S.; Patel, T. Deep Learning Applications in Medical Image Analysis. In Biomedical Data Mining for Information Retrieval: Methodologies, Techniques and Applications; Scrivener Publishing LLC: Beverly, MA, USA, 2021; pp. 293–350. [Google Scholar]
  16. Ragab, D.A.; Sharkas, M.; Attallah, O. Breast Cancer Diagnosis Using an Efficient CAD System Based on Multiple Classifiers. Diagnostics 2019, 9, 165. [Google Scholar] [CrossRef]
  17. Attallah, O. A Deep Learning-Based Diagnostic Tool for Identifying Various Diseases via Facial Images. Digital Health 2022, 8, 20552076221124430. [Google Scholar] [CrossRef]
  18. Attallah, O. MB-AI-His: Histopathological Diagnosis of Pediatric Medulloblastoma and Its Subtypes via AI. Diagnostics 2021, 11, 359–384. [Google Scholar] [CrossRef]
  19. Attallah, O. CoMB-Deep: Composite Deep Learning-Based Pipeline for Classifying Childhood Medulloblastoma and Its Classes. Front. Neuroinformatics 2021, 15, 663592. [Google Scholar] [CrossRef]
  20. Attallah, O.; Zaghlool, S. AI-Based Pipeline for Classifying Pediatric Medulloblastoma Using Histopathological and Textural Images. Life 2022, 12, 232. [Google Scholar] [CrossRef]
  21. Attallah, O.; Anwar, F.; Ghanem, N.M.; Ismail, M.A. Histo-CADx: Duo Cascaded Fusion Stages for Breast Cancer Diagnosis from Histopathological Images. PeerJ Comput. Sci. 2021, 7, e493. [Google Scholar] [CrossRef]
  22. Attallah, O.; Aslan, M.F.; Sabanci, K. A Framework for Lung and Colon Cancer Diagnosis via Lightweight Deep Learning Models and Transformation Methods. Diagnostics 2022, 12, 2926. [Google Scholar] [CrossRef] [PubMed]
  23. Ghanem, N.M.; Attallah, O.; Anwar, F.; Ismail, M.A. AUTO-BREAST: A Fully Automated Pipeline for Breast Cancer Diagnosis Using AI Technology. In Artificial Intelligence in Cancer Diagnosis and Prognosis, Volume 2: Breast and Bladder Cancer; IOP Publishing: Bristol, UK, 2022. [Google Scholar]
  24. Attallah, O.; Ragab, D.A. Auto-MyIn: Automatic Diagnosis of Myocardial Infarction via Multiple GLCMs, CNNs, and SVMs. Biomed. Signal Process. Control 2023, 80, 104273. [Google Scholar] [CrossRef]
  25. Attallah, O. DIAROP: Automated Deep Learning-Based Diagnostic Tool for Retinopathy of Prematurity. Diagnostics 2021, 11, 2034. [Google Scholar] [CrossRef] [PubMed]
  26. Attallah, O. Deep Learning-Based CAD System for COVID-19 Diagnosis via Spectral-Temporal Images. In Proceedings of the 2022 the 12th International Conference on Information Communication and Management, London, UK, 13–15 July 2022; pp. 25–33. [Google Scholar]
  27. Attallah, O.; Samir, A. A Wavelet-Based Deep Learning Pipeline for Efficient COVID-19 Diagnosis via CT Slices. Appl. Soft Comput. 2022, 128, 109401. [Google Scholar] [CrossRef] [PubMed]
  28. Attallah, O. RADIC: A Tool for Diagnosing COVID-19 from Chest CT and X-Ray Scans Using Deep Learning and Quad-Radiomics. Chemom. Intell. Lab. Syst. 2023, 233, 104750. [Google Scholar] [CrossRef]
  29. Khobragade, V.; Jain, N.; Sisodia, D.S. Deep Transfer Learning Model for Automated Screening of Cervical Cancer Cells Using Multi-Cell Images. In Proceedings of the International Conference on Applied Informatics, Ota, Nigeria, 29–31 October 2020; Springer: Cham, Switzerland, 2020; pp. 409–419. [Google Scholar]
  30. Wang, P.; Wang, J.; Li, Y.; Li, L.; Zhang, H. Adaptive Pruning of Transfer Learned Deep Convolutional Neural Network for Classification of Cervical Pap Smear Images. IEEE Access 2020, 8, 50674–50683. [Google Scholar] [CrossRef]
  31. Chen, W.; Li, X.; Gao, L.; Shen, W. Improving Computer-Aided Cervical Cells Classification Using Transfer Learning Based Snapshot Ensemble. Appl. Sci. 2020, 10, 7292. [Google Scholar] [CrossRef]
32. Kalbhor, M.; Shinde, S.V.; Jude, H. Cervical Cancer Diagnosis Based on Cytology Pap Smear Image Classification Using Fractional Coefficient and Machine Learning Classifiers. TELKOMNIKA (Telecommun. Comput. Electron. Control) 2022, 20, 1091–1102. [Google Scholar] [CrossRef]
  33. Lavanya Devi, N.; Thirumurugan, P. Cervical Cancer Classification from Pap Smear Images Using Modified Fuzzy C Means, PCA, and KNN. IETE J. Res. 2021, 68, 1591–1598. [Google Scholar] [CrossRef]
  34. Mahmoud, H.A.H.; AlArfaj, A.A.; Hafez, A.M. A Fast Hybrid Classification Algorithm with Feature Reduction for Medical Images. Appl. Bionics Biomech. 2022, 2022, 1367366. [Google Scholar] [CrossRef]
  35. Fekri-Ershad, S.; Ramakrishnan, S. Cervical Cancer Diagnosis Based on Modified Uniform Local Ternary Patterns and Feed Forward Multilayer Network Optimized by Genetic Algorithm. Comput. Biol. Med. 2022, 144, 105392. [Google Scholar] [CrossRef] [PubMed]
  36. Zhang, L.; Lu, L.; Nogues, I.; Summers, R.M.; Liu, S.; Yao, J. DeepPap: Deep Convolutional Networks for Cervical Cell Classification. IEEE J. Biomed. Health Inform. 2017, 21, 1633–1643. [Google Scholar] [CrossRef] [PubMed]
  37. Desai, M. Role of Automation in Cervical Cytology. Diagn. Histopathol. 2009, 15, 323–329. [Google Scholar] [CrossRef]
  38. Anwar, S.M.; Majid, M.; Qayyum, A.; Awais, M.; Alnowami, M.; Khan, M.K. Medical Image Analysis Using Convolutional Neural Networks: A Review. J. Med. Syst. 2018, 42, 226. [Google Scholar] [CrossRef] [PubMed]
  39. Chen, H.; Liu, J.; Wen, Q.-M.; Zuo, Z.-Q.; Liu, J.-S.; Feng, J.; Pang, B.-C.; Xiao, D. CytoBrain: Cervical Cancer Screening System Based on Deep Learning Technology. J. Comput. Sci. Technol. 2021, 36, 347–360. [Google Scholar] [CrossRef]
  40. Sellamuthu Palanisamy, V.; Athiappan, R.K.; Nagalingam, T. Pap Smear Based Cervical Cancer Detection Using Residual Neural Networks Deep Learning Architecture. Concurr. Comput.: Pract. Exp. 2022, 34, e6608. [Google Scholar] [CrossRef]
  41. Vaiyapuri, T.; Alaskar, H.; Syed, L.; Aljohani, E.; Alkhayyat, A.; Shankar, K.; Kumar, S. Modified Metaheuristics with Stacked Sparse Denoising Autoencoder Model for Cervical Cancer Classification. Comput. Electr. Eng. 2022, 103, 108292. [Google Scholar] [CrossRef]
  42. Rahaman, M.M.; Li, C.; Yao, Y.; Kulwa, F.; Wu, X.; Li, X.; Wang, Q. DeepCervix: A Deep Learning-Based Framework for the Classification of Cervical Cells Using Hybrid Deep Feature Fusion Techniques. Comput. Biol. Med. 2021, 136, 104649. [Google Scholar] [CrossRef]
  43. Alquran, H.; Alsalatie, M.; Mustafa, W.A.; Abdi, R.A.; Ismail, A.R. Cervical Net: A Novel Cervical Cancer Classification Using Feature Fusion. Bioengineering 2022, 9, 578. [Google Scholar] [CrossRef]
  44. Liu, W.; Li, C.; Xu, N.; Jiang, T.; Rahaman, M.M.; Sun, H.; Wu, X.; Hu, W.; Chen, H.; Sun, C. CVM-Cervix: A Hybrid Cervical Pap-Smear Image Classification Framework Using CNN, Visual Transformer and Multilayer Perceptron. Pattern Recognit. 2022, 130, 108829. [Google Scholar] [CrossRef]
  45. Zhang, C.; Jia, D.; Li, Z.; Wu, N. Auxiliary Classification of Cervical Cells Based on Multi-Domain Hybrid Deep Learning Framework. Biomed. Signal Process. Control 2022, 77, 103739. [Google Scholar] [CrossRef]
  46. Kupas, D.; Harangi, B. Classification of Pap-Smear Cell Images Using Deep Convolutional Neural Network Accelerated by Hand-Crafted Features. In Proceedings of the 2022 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), Glasgow, Scotland, 11–15 July 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 1452–1455. [Google Scholar]
  47. Alias, N.A.; Mustafa, W.A.; Jamlos, M.A.; Alquran, H.; Hanafi, H.F.; Ismail, S.; Rahman, K.S.A. Pap Smear Images Classification Using Machine Learning: A Literature Matrix. Diagnostics 2022, 12, 2900. [Google Scholar] [CrossRef] [PubMed]
  48. Shanthi, P.B.; Hareesha, K.S.; Kudva, R. Automated Detection and Classification of Cervical Cancer Using Pap Smear Microscopic Images: A Comprehensive Review and Future Perspectives. Eng. Sci. 2022, 19, 20–41. [Google Scholar]
  49. Hussain, E.; Mahanta, L.B.; Borah, H.; Das, C.R. Liquid Based-Cytology Pap Smear Dataset for Automated Multi-Class Diagnosis of Pre-Cancerous and Cervical Cancer Lesions. Data Brief 2020, 30, 105589. [Google Scholar] [CrossRef]
  50. Alzubaidi, L.; Zhang, J.; Humaidi, A.J.; Al-Dujaili, A.; Duan, Y.; Al-Shamma, O.; Santamaría, J.; Fadhel, M.A.; Al-Amidie, M.; Farhan, L. Review of Deep Learning: Concepts, CNN Architectures, Challenges, Applications, Future Directions. J. Big Data 2021, 8, 1–74. [Google Scholar] [CrossRef]
  51. Xu, Q.; Zhang, M.; Gu, Z.; Pan, G. Overfitting Remedy by Sparsifying Regularization on Fully-Connected Layers of CNNs. Neurocomputing 2019, 328, 69–74. [Google Scholar] [CrossRef]
  52. Howard, A.G.; Zhu, M.; Chen, B.; Kalenichenko, D.; Wang, W.; Weyand, T.; Andreetto, M.; Adam, H. Mobilenets: Efficient Convolutional Neural Networks for Mobile Vision Applications. arXiv 2017, arXiv:1704.04861. [Google Scholar]
  53. Ahmed, S.; Bons, M. Edge Computed NILM: A Phone-Based Implementation Using MobileNet Compressed by Tensorflow Lite. In Proceedings of the 5th International Workshop on Non-intrusive Load Monitoring, Virtual, 18 November 2020; pp. 44–48. [Google Scholar]
54. Zhang, X.; Zhou, X.; Lin, M.; Sun, J. ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 6848–6856. [Google Scholar]
  55. Attallah, O. Tomato Leaf Disease Classification via Compact Convolutional Neural Networks with Transfer Learning and Feature Selection. Horticulturae 2023, 9, 149. [Google Scholar] [CrossRef]
  56. Albregtsen, F. Statistical Texture Measures Computed from Gray Level Coocurrence Matrices. 2008, p. 14. Available online: https://www.semanticscholar.org/paper/Statistical-Texture-Measures-Computed-from-Gray-Albregtsen/32538c358410ebce7c9ecf688addddf13f45b75b (accessed on 25 January 2023).
  57. Attallah, O. A Computer-Aided Diagnostic Framework for Coronavirus Diagnosis Using Texture-Based Radiomics Images. Digital Health 2022, 8, 20552076221092544. [Google Scholar] [CrossRef]
  58. De Siqueira, F.R.; Schwartz, W.R.; Pedrini, H. Multi-Scale Gray Level Co-Occurrence Matrices for Texture Description. Neurocomputing 2013, 120, 336–345. [Google Scholar] [CrossRef]
  59. Burger, W.; Burge, M.J. Principles of Digital Image Processing; Springer: London, UK, 2009; Volume 111. [Google Scholar]
  60. He, C.; Zheng, Y.F.; Ahalt, S.C. Object Tracking Using the Gabor Wavelet Transform and the Golden Section Algorithm. IEEE Trans. Multimed. 2002, 4, 528–538. [Google Scholar]
  61. Li, C.; Huang, Y.; Huang, W.; Qin, F. Learning Features from Covariance Matrix of Gabor Wavelet for Face Recognition under Adverse Conditions. Pattern Recognit. 2021, 119, 108085. [Google Scholar] [CrossRef]
  62. Keskar, N.S.; Mudigere, D.; Nocedal, J.; Smelyanskiy, M.; Tang, P.T.P. On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima. arXiv 2016, arXiv:1609.04836. [Google Scholar]
  63. Li, M.; Zhang, T.; Chen, Y.; Smola, A.J. Efficient Mini-Batch Training for Stochastic Optimization. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, NY, USA, 24–27 August 2014; pp. 661–670. [Google Scholar]
64. Aggarwal, N.; Agrawal, R.K. First and Second Order Statistics Features for Classification of Magnetic Resonance Brain Images. J. Signal Inf. Process. 2012, 3, 19553. [Google Scholar] [CrossRef]
  65. Wang, Z.; Li, M.; Wang, H.; Jiang, H.; Yao, Y.; Zhang, H.; Xin, J. Breast Cancer Detection Using Extreme Learning Machine Based on Feature Fusion With CNN Deep Features. IEEE Access 2019, 7, 105146–105158. [Google Scholar] [CrossRef]
  66. Mohammed, B.A.; Senan, E.M.; Alshammari, T.S.; Alreshidi, A.; Alayba, A.M.; Alazmi, M.; Alsagri, A.N. Hybrid Techniques of Analyzing MRI Images for Early Diagnosis of Brain Tumours Based on Hybrid Features. Processes 2023, 11, 212. [Google Scholar] [CrossRef]
  67. Antropova, N.; Huynh, B.Q.; Giger, M.L. A Deep Feature Fusion Methodology for Breast Cancer Diagnosis Demonstrated on Three Imaging Modality Datasets. Med. Phys. 2017, 44, 5162–5171. [Google Scholar] [CrossRef]
  68. Attallah, O.; Sharkas, M. Intelligent Dermatologist Tool for Classifying Multiple Skin Cancer Subtypes by Incorporating Manifold Radiomics Features Categories. Contrast Media Mol. Imaging 2021, 2021, 7192016. [Google Scholar] [CrossRef]
  69. Yaman, O.; Tuncer, T. Exemplar Pyramid Deep Feature Extraction Based Cervical Cancer Image Classification Model Using Pap-Smear Images. Biomed. Signal Process. Control 2022, 73, 103428. [Google Scholar] [CrossRef]
  70. Basak, H.; Kundu, R.; Chakraborty, S.; Das, N. Cervical Cytology Classification Using PCA and GWO Enhanced Deep Features Selection. SN Comput. Sci. 2021, 2, 369. [Google Scholar] [CrossRef]
  71. Pramanik, R.; Biswas, M.; Sen, S.; de Souza Júnior, L.A.; Papa, J.P.; Sarkar, R. A Fuzzy Distance-Based Ensemble of Deep Models for Cervical Cancer Detection. Comput. Methods Programs Biomed. 2022, 219, 106776. [Google Scholar] [CrossRef] [PubMed]
  72. Manna, A.; Kundu, R.; Kaplun, D.; Sinitca, A.; Sarkar, R. A Fuzzy Rank-Based Ensemble of CNN Models for Classification of Cervical Cytology. Sci. Rep. 2021, 11, 14538. [Google Scholar] [CrossRef] [PubMed]
  73. Zhao, C.; Shuai, R.; Ma, L.; Liu, W.; Wu, M. Improving Cervical Cancer Classification with Imbalanced Datasets Combining Taming Transformers with T2T-ViT. Multimed. Tools Appl. 2022, 81, 24265–24300. [Google Scholar] [CrossRef] [PubMed]
  74. Kundu, R.; Chattopadhyay, S. Deep Features Selection through Genetic Algorithm for Cervical Pre-Cancerous Cell Classification. Multimed. Tools Appl. 2022, 1–22. [Google Scholar] [CrossRef]
Figure 1. Instances of the slides available in the Mendeley LBC dataset for each class: (a) high squamous intraepithelial lesion, (b) low squamous intraepithelial lesion, (c) negative for intraepithelial malignancy, and (d) squamous cell carcinoma.
Figure 2. Workflow of the introduced CAD for cervical cancer diagnosis via Pap smear slides.
Figure 3. Confusion matrices for SVMs with different kernels constructed with PCs generated after fusing the entire DL features with the combined handcrafted features: (a) linear kernel, (b) quadratic kernel, (c) cubic kernel, and (d) Gaussian kernel.
Figure 4. ROC curves of the quadratic SVM constructed with PCs generated after fusing the entire DL features with the combined handcrafted features when the positive class is: (a) high squamous intraepithelial lesion, (b) negative for intraepithelial malignancy, (c) low squamous intraepithelial lesion, and (d) squamous cell carcinoma.
Figure 5. A comparison between (a) the highest accuracy reached in each configuration and (b) the number of features used to train the SVMs.
Table 1. The diagnostic accuracy (%) of the SVM classifiers trained with the individual and combined handcrafted feature sets from multiple domains.

| Features | Linear | Quadratic | Cubic | Gaussian |
|---|---|---|---|---|
| Statistical | 85.7 | 92.8 | 93.3 | 90.2 |
| GLCM | 73.7 | 85.0 | 87.7 | 72.6 |
| GW | 63.6 | 62.2 | 57.1 | 63.6 |
| DWT–GLCM | 69.8 | 74.8 | 76.6 | 70.8 |
| GLCM + Statistical + GW + DWT–GLCM | 87.8 | 94.2 | 95.1 | 91.3 |
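For readers who wish to reproduce the handcrafted branch, the following is a minimal sketch of the four descriptor families in Table 1 (first-order statistical, GLCM, Gabor wavelet, and DWT–GLCM), assuming scikit-image, SciPy, and PyWavelets conventions; the co-occurrence distance and angle, Gabor filter-bank parameters, and wavelet choice are illustrative assumptions, not necessarily the exact settings used in this work.

```python
# Sketch of the handcrafted feature families; parameter choices are
# illustrative assumptions, not the paper's exact configuration.
import numpy as np
import pywt
from scipy.stats import kurtosis, skew
from skimage.feature import graycomatrix, graycoprops  # scikit-image >= 0.19
from skimage.filters import gabor

def glcm_descriptors(gray_u8):
    # GLCM at a single distance/angle (assumption), four Haralick-style props.
    glcm = graycomatrix(gray_u8, distances=[1], angles=[0],
                        levels=256, symmetric=True, normed=True)
    props = ["contrast", "correlation", "energy", "homogeneity"]
    return np.array([graycoprops(glcm, p)[0, 0] for p in props])

def statistical_descriptors(gray):
    # First-order statistics of the intensity histogram.
    flat = gray.ravel().astype(float)
    return np.array([flat.mean(), flat.std(), skew(flat), kurtosis(flat)])

def gabor_descriptors(gray):
    # Mean/variance of Gabor responses over a small orientation bank.
    feats = []
    for theta in (0, np.pi / 4, np.pi / 2, 3 * np.pi / 4):
        real, _ = gabor(gray, frequency=0.3, theta=theta)
        feats += [real.mean(), real.var()]
    return np.array(feats)

def dwt_glcm_descriptors(gray):
    # GLCM attributes taken from the DWT approximation sub-band.
    cA, _ = pywt.dwt2(gray.astype(float), "haar")
    cA_u8 = np.uint8(255 * (cA - cA.min()) / (np.ptp(cA) + 1e-9))
    return glcm_descriptors(cA_u8)

def handcrafted_vector(gray_u8):
    gray = gray_u8.astype(float)
    return np.concatenate([statistical_descriptors(gray),
                           glcm_descriptors(gray_u8),
                           gabor_descriptors(gray),
                           dwt_glcm_descriptors(gray)])
```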
Table 2. The diagnostic accuracy (%) of the SVM classifiers trained with each independent DL feature set and with each set concatenated with the combined handcrafted feature set from multiple domains.

| Features | Linear | Quadratic | Cubic | Gaussian |
|---|---|---|---|---|
| ResNet-18 | 98.4 | 98.5 | 98.5 | 98.4 |
| ResNet-18 + Combined Handcrafted | 98.5 | 99.1 | 99.0 | 98.6 |
| DarkNet-19 | 97.6 | 97.0 | 96.9 | 97.7 |
| DarkNet-19 + Combined Handcrafted | 99.2 | 99.6 | 99.6 | 97.9 |
| MobileNet | 97.2 | 96.8 | 96.8 | 97.3 |
| MobileNet + Combined Handcrafted | 99.1 | 99.2 | 99.2 | 98.8 |
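As a rough illustration of this configuration, the sketch below concatenates one CNN's deep feature vector with the combined handcrafted vector before SVM training. It uses torchvision's pretrained MobileNetV2 as a stand-in compact backbone; the model variant, input size, and the degree-2 polynomial SVC used to approximate a quadratic SVM are assumptions for illustration, not the exact pipeline of this work.

```python
# Sketch of deep + handcrafted feature fusion before SVM training.
import numpy as np
import torch
import torchvision.models as models
from torchvision import transforms
from sklearn.svm import SVC

# Pretrained compact backbone; replacing the classifier head with Identity
# exposes the 1280-D pooled embedding as the deep feature vector.
cnn = models.mobilenet_v2(weights="IMAGENET1K_V1")
cnn.classifier = torch.nn.Identity()
cnn.eval()

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def deep_features(pil_image):
    x = preprocess(pil_image).unsqueeze(0)   # (1, 3, 224, 224)
    return cnn(x).squeeze(0).numpy()         # (1280,)

def train_fused_svm(X_deep, X_hand, y):
    # Concatenate deep and handcrafted matrices sample-wise, then fit a
    # degree-2 polynomial SVC as a stand-in for the quadratic SVM.
    X_fused = np.concatenate([X_deep, X_hand], axis=1)
    return SVC(kernel="poly", degree=2).fit(X_fused, y)
```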
Table 3. The diagnostic accuracy (%) of the linear, quadratic, cubic, and Gaussian SVMs trained with the PCs generated after fusing the entire DL features with the combined handcrafted features via PCA.

| Number of PCs | Linear | Quadratic | Cubic | Gaussian |
|---|---|---|---|---|
| 5 | 71.8 | 81.4 | 79.7 | 81.1 |
| 10 | 99.4 | 99.3 | 99.4 | 99.3 |
| 15 | 99.6 | 99.6 | 99.7 | 99.0 |
| 20 | 99.7 | 99.6 | 99.5 | 99.2 |
| 25 | 99.5 | 99.3 | 99.2 | 98.6 |
| 30 | 99.6 | 99.5 | 99.4 | 98.5 |
| 35 | 99.7 | 100 | 99.8 | 98.4 |
| 40 | 99.7 | 100 | 99.9 | 98.4 |
| 45 | 99.7 | 99.8 | 99.6 | 98.6 |
| 50 | 99.6 | 99.7 | 99.6 | 98.4 |
| 55 | 99.6 | 99.7 | 99.6 | 98.6 |
| 60 | 99.6 | 99.8 | 99.7 | 98.6 |
| 65 | 99.6 | 99.8 | 99.7 | 98.5 |
| 70 | 99.6 | 99.8 | 99.7 | 95.5 |
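The sweep over the number of retained principal components in Table 3 can be reproduced along the following lines; a minimal sketch assuming scikit-learn, with five-fold cross-validation standing in for the paper's validation protocol.

```python
# Sketch of the PCA-size sweep over the fused deep + handcrafted matrix.
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def accuracy_per_pc(X_fused, y, pc_grid=range(5, 75, 5)):
    # PCA is fit inside the pipeline so each CV fold learns its own
    # projection; fitting PCA on all data first would leak information.
    scores = {}
    for n_pc in pc_grid:
        clf = make_pipeline(StandardScaler(),
                            PCA(n_components=n_pc),
                            SVC(kernel="poly", degree=2))  # quadratic SVM stand-in
        scores[n_pc] = 100 * cross_val_score(clf, X_fused, y, cv=5).mean()
    return scores  # accuracy (%) keyed by the number of retained PCs
```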
Table 4. Evaluation index values calculated from the SVMs in configuration III.

| Classifiers | Precision | Specificity | F1-Score | Sensitivity | MCC |
|---|---|---|---|---|---|
| Linear | 0.997 | 1 | 0.997 | 0.997 | 0.996 |
| Cubic | 0.995 | 0.999 | 0.997 | 1 | 0.997 |
| Quadratic | 1 | 1 | 1 | 1 | 1 |
| Gaussian | 0.993 | 0.999 | 0.993 | 0.993 | 0.990 |
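The indices in Table 4 follow directly from a multi-class confusion matrix. The sketch below assumes rows index the true class, columns the predicted class, and macro-averaging over the four classes; the averaging scheme is an assumption, as the paper does not restate it here.

```python
# Sketch of the evaluation indices computed from a confusion matrix.
import numpy as np

def indices_from_confusion(cm):
    cm = np.asarray(cm, dtype=float)   # rows: true class, cols: predicted
    tp = np.diag(cm)
    fp = cm.sum(axis=0) - tp
    fn = cm.sum(axis=1) - tp
    tn = cm.sum() - (tp + fp + fn)
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)       # a.k.a. recall
    specificity = tn / (tn + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    mcc = (tp * tn - fp * fn) / np.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {"Precision": precision.mean(),
            "Specificity": specificity.mean(),
            "F1-Score": f1.mean(),
            "Sensitivity": sensitivity.mean(),
            "MCC": mcc.mean()}
```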
Table 5. Comparative analysis between state-of-the-art methods and the introduced CAD constructed with the Mendeley LBC dataset.

| Study | Methods | Number of Features | Precision | Accuracy | Sensitivity | F1-Score |
|---|---|---|---|---|---|---|
| [69] | Preprocessing: cropping the images into multiple sub-images; processing: DarkNet-19 + NCA + SVM | 1000 | 0.9897 | 0.9927 | 0.9763 | 0.9829 |
| [70] | VGG-16 + Inception + DenseNet-121 + ResNet-50 + PCA + GWO | 796 | 0.9914 | 0.9947 | 0.9927 | 0.9920 |
| [71] | MobileNet + Inception + InceptionResNet | N/A | 0.9934 | 0.9968 | 0.9934 | 0.9987 |
| [72] | Inception + Xception + DenseNet-169 + fuzzy rank | N/A | 0.9913 | 0.9923 | 0.9923 | 0.9918 |
| [73] | Taming Transformers + Visual Transformers (T2T-ViT) | N/A | – | 0.9879 | 0.9861 | – |
| [74] | GoogleNet + ResNet-18 + Genetic Algorithm + SVM | 730 | 0.9839 | 0.9907 | 0.9818 | 0.9831 |
| Proposed | Multiple DL and handcrafted features + PCA | 35 | 1 | 1 | 1 | 1 |