1. Introduction
The aging population in numerous countries poses several challenges to already strained healthcare systems [
1]. Globally, the population aged 65 and over is growing faster than any other age group [
2]. This demographic shift toward older populations is likely to trigger a surge in age-related conditions such as cognitive impairment and Alzheimer’s Disease (AD) [
1]. By 2050, it is estimated that around 152 million people worldwide will suffer from dementia [
3]. With a new case of dementia occurring every three seconds globally, the rate is truly alarming [
3]. The dramatic surge in the number of dementia patients affects caregivers and their families not just psychologically but also physically and socioeconomically [
4]. As a result, early-stage screening of dementia patients is of utmost importance [
4]. Early-stage screening can identify dementia symptoms before the condition fully develops [
4], allowing treatment plans to be initiated promptly to control dementia’s progression. Hence, cognitive screening plays a critical role in enhancing the dementia healthcare system [
4].
Dementia can be caused by a variety of diseases, including Alzheimer’s Disease (AD), cerebrovascular dementia, hypothyroidism, and benign brain tumors [
5]. The most common type of dementia is Alzheimer’s disease [
4]. The five major symptoms of dementia include memory loss, issues with visual perception, reduced reasoning and judgment, communication and language issues, and an inability to pay attention [
6]. Cognitive tests (CT), which are neuropsychological assessments administered by clinical experts, are commonly used to evaluate memory capacity, general cognition, and language issues in patients [
7]. As significant cognitive decline is one of the most critical early-stage dementia symptoms, quantifiable measurements through cognitive tests play a vital role in early detection [
8]. Neuroimaging tests such as Magnetic Resonance Imaging (MRI) are also used to examine brain activities and diagnose dementia [
8]. However, standard, state-of-the-art diagnostic procedures for dementia, such as cerebrospinal fluid analysis tests and neuroimaging tests, are often expensive, time-consuming, and carry risk factors [
9].
Therefore, there is growing interest in using Machine Learning (ML) technologies to predict and detect the early stages of dementia. For example, the work reported in [
10] proposed using a Machine Learning algorithm to detect the stages of dementia using screening tests. Kruthika et al. [
11] detailed how machine learning techniques, including the SVM (Support Vector Machines) and KNN (K-Nearest Neighbor) algorithms, are used for predicting and classifying dementia. Veeramuthu et al. [
12] leveraged machine learning to develop a decision-making CAD (Computer-Aided Diagnosis) tool for detecting dementia.
In response to these developments, this paper proposes the use of Machine Learning (ML) for the early-stage detection of Alzheimer’s Disease. The paper is an extended version of the paper “Early Detection of Alzheimer’s Disease: A Novel Cognitive Feature Selection Approach Using Machine Learning” published in the proceedings of the 2021 Conference on Advances in Information, Communication and Cybersecurity [
13]. Our research blends applied ML methods with a novel feature selection technique. Key objectives include:
Utilizing all available features (cognitive, neuroimaging, and combined) from the ADNI-1, ADNI-2, and ADNI-3 datasets for Alzheimer’s Disease detection.
Implementing robust preprocessing techniques, including handling missing values and data normalization.
Proposing and employing the novel NCA-F feature selection method to pinpoint critical and relevant features.
Conducting comparative analyses using various machine learning methods on the selected features.
Building on these objectives, the model proposed in this article implements the AdaBoost Ensemble (AdB), Artificial Neural Network (ANN), Support Vector Machine (SVM), and Naïve Bayes (NB) machine learning algorithms with cognitive and neuroimaging features obtained from a public dataset [
14] consisting of 13,916 patients’ records. The remainder of this paper is organized as follows:
Section 2 discusses the role of machine learning in predicting dementia;
Section 3 outlines the research methodology;
Section 4 analyzes the results of the implemented machine learning models and compares them with existing work;
Section 5 offers concluding remarks; and
Section 6 details the limitations of this work.
2. Related Works
Machine learning (ML) is a promising and emerging technique used for the early detection of dementia. ML has been extensively used in the literature for the prediction of cognitive diseases [
15]. The use of an ensemble classification model to identify patients with high and low dementia risks was proposed in [
16]. The claimed classification accuracy was 94.7% when trained using paralinguistic features only. However, an increase of 2.5% (97.2% accuracy for combined features) in the model’s accuracy was reported when both paralinguistic and episodic memory features were used. Grassi et al. [
17] developed an ensemble of 13 machine learning models (including SVM with radial basis function, SVM with linear kernel, SVM with polynomial kernel, L1-regularized logistic regression, L2-regularized logistic regression, multilayer perceptron, decision tree, k-NN, random forest, Naïve Bayes, and linear regression) to predict the conversion from Mild Cognitive Impairment (MCI) to Alzheimer’s disease. The authors reported that the ensemble achieved an Area Under the Curve (AUC) of 0.88, a specificity of 79.9%, and a sensitivity of 77.7%. Zhou et al. [
18] proposed a novel approach for dementia diagnosis based on a three-stage deep feature learning and fusion system. Yang et al. [
19] proposed a novel feature weighting method based on nearest neighbors called Neighborhood Component Feature Selection (NCFS). This method learns a feature weighting vector that maximizes classification accuracy and was reported to outperform other benchmark techniques.
Recently, a shift in research has been observed, and the use of cognitive features for the prediction of Alzheimer’s disease (AD) has been reported [
20,
21]. Ford et al. [
21] used 18 cognitive features for the prediction of dementia. From the results, the authors reported a 0.74 area under the receiver operating characteristic (ROC) curve. Gill et al. [
20] used both cognitive and neuroimaging features to predict AD. Using only four cognitive features, the authors reported 81.8% accuracy and an AUC of 0.79; for neuroimaging features, they reported 75.7% accuracy and an AUC of 0.77.
ML models use a series of steps for the identification, training, and testing of algorithms to find the feature(s) of interest for a given dataset. The extracted features play an important role in the performance of the prediction model for dementia. In machine learning, the process of selecting the appropriate features from a dataset to train the model is known as “feature selection”, which is very important in dataset cleaning [
22]. In datasets that contain many features, it is challenging to select the features that are most relevant to the model. As such, removing irrelevant and redundant features from a given dataset improves the overall performance of machine-learning models [
23]. Three feature selection approaches are commonly used in the literature: filter, wrapper, and embedded methods [
6]. In the filter approach, features are selected based on statistical test scores and their derived correlation with the target variable. The test scores are based on a correlation coefficient, which quantifies the statistical relationship between variables. Common filter criteria include Pearson’s correlation, Linear Discriminant Analysis (LDA), Analysis of Variance (ANOVA), and the Chi-Square test [
24]. Furthermore, AlShboul et al. utilized machine learning to analyze ADNI data, focusing on classifying dementia stages through cognitive and demographic features [
25]. Their work, based on the TADPOLE challenge, underscored effective algorithms and highlighted the potential of cognitive tests for non-invasive diagnoses. Their reliance on comprehensive assessments, such as the CDR, supported its application in clinical decision-making and achieved an accuracy rate of 89%.
In a similar vein, Lin et al. employed machine learning to pinpoint gene biomarkers crucial for the prediction of stable MCI patients, boasting an AUC value of 0.841. This research emphasizes the importance of early diagnosis and the potential of precision medicine [
26]. Another significant study revealed cognitive and functional markers associated with AD progression [
27]. By comparing various cognitive domains, the authors devised a computational method to monitor AD progression. Evaluations using ADNI data shed light on functional components that are closely tied to the disease’s progression. More recently, deep learning techniques, particularly transfer learning, have shown promise in enhancing AD detection from neuroimaging scans. However, these methods often demand resource-intensive computational models [
28].
Deep learning has also been used for biomarker-based prediction of AD stages and progression from neuroimaging data [
29,
30]. The drawback of these methods is that they rely on high-dimensional 3D neuroimaging data such as PET and MRI scans. In contrast, neuroimaging biomarkers have been used to model AD progression based on TADPOLE challenge data [
14,
31,
32]. A study compares the top methods from the TADPOLE Challenge for predicting AD evolution [
31]. Algorithms forecasted clinical diagnosis, ventricular volume, and cognitive scores. Different algorithms were evaluated, showing significant performance improvements over baselines, and interpretability analysis was conducted using SHAP values. The features CDRSB, AV45-PET, and FDG-PET are identified as the best-performing features [
31]. Similarly, ML algorithms forecasted clinical diagnosis, ADAS-Cog13, and ventricular volume. No single algorithm excelled in predicting all three outcomes [
32]. While some methods outperformed baselines, performance variation and challenges in addressing missing data were observed. In our proposed method, we have utilized cognitive scores and neuroimaging measurements to detect different stages of AD.
3. A Machine Learning Approach for Predicting Dementia
The research proposes a novel approach, referred to as the Neighborhood Component Analysis and Correlation-Based Filtration (NCA-F) method, for selecting the most important features for the prediction of dementia. The research highlights the impact of the selected features on the early-stage detection of dementia. The ML-based model relies on a technique that enhances the reduction and selection of relevant features. Combining cognitive and neuroimaging features resulted in the formulation of dementia biomarkers.
Figure 1 provides an overview of the three-stage methodology used in this research.
As this research focuses on predicting dementia, neuroimaging measurements, along with cognitive scores, have been employed. Neuroimaging measurements and cognitive scores are considered sensitive features [
14]. For the prediction of dementia, a diverse set of ML algorithms was selected based on their compatibility with our data. Specifically:
SVM: This was chosen for its suitability for high-dimensional data.
ANN: An advanced algorithm optimized and parameterized for high feature numbers.
NB: A probabilistic method ideal for smaller datasets due to its assumption of feature independence.
AdBE: An ensemble method that improves upon decision trees by emphasizing corrections from previous iterations.
These algorithms, AdaBoost Ensemble (AdBE), Artificial Neural Network (ANN), Support Vector Machine (SVM), and Naïve Bayes (NB), were trained for multiple combinations of features.
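The idea behind AdBE, "emphasizing corrections from previous iterations", can be illustrated with a single reweighting round. The sketch below assumes the classic discrete AdaBoost update with labels in {-1, +1}; the exact variant used in the experiments is not stated in this paper, and the function name `adaboost_round` is purely illustrative.

```python
from math import exp, log


def adaboost_round(weights, y_true, y_pred):
    """One reweighting step of discrete AdaBoost (labels in {-1, +1}).

    Assumes the weighted error is strictly between 0 and 1.
    """
    # Weighted error of the current weak learner.
    err = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p)
    # Vote strength of the weak learner in the final ensemble.
    alpha = 0.5 * log((1.0 - err) / err)
    # Misclassified samples (t * p == -1) are up-weighted, correct ones down-weighted.
    new = [w * exp(-alpha * t * p) for w, t, p in zip(weights, y_true, y_pred)]
    total = sum(new)
    return alpha, [w / total for w in new]


# Four samples with uniform weights; the weak learner misclassifies the second one.
alpha, weights = adaboost_round([0.25] * 4, [1, 1, -1, -1], [1, -1, -1, -1])
print(round(alpha, 3), [round(w, 3) for w in weights])
```

After the update, the single misclassified sample carries half of the total weight, so the next weak learner is pushed to classify it correctly.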
In addition to neuroimaging measurements, we leveraged cognitive test scores as an integral part of our feature set. While neuroimaging provides detailed structural and functional insights into the brain, cognitive tests offer a more accessible and immediate means of assessing an individual’s cognitive functions. These tests are not only easier to administer but also critical in real-world clinical settings where quick and non-invasive evaluations are often necessary. By combining both of these types of data, we aimed to provide a comprehensive and clinically relevant model for dementia prediction. Furthermore, the inclusion of non-imaging cognitive tests from the ADNI dataset ensures that our research remains pertinent to a broader range of clinical scenarios beyond just those with imaging facilities.
3.1. Data Extraction
The dataset used in this research is provided by The Alzheimer’s Disease Prediction Of Longitudinal Evolution (TADPOLE) challenge [
14]. This research has merged data from all phases of the Alzheimer’s Disease Neuroimaging Initiative (ADNI)* database (adni.loni.usc.edu) [
33,
34,
35], hence the name ADNIMERGE dataset. *Data used in the preparation of this article were obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). As such, the investigators within ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in the analysis or writing of this report. A complete listing of ADNI investigators can be found at:
http://adni.loni.usc.edu/wp-content/uploads/how_to_apply/ADNI_Acknowledgement_List.pdf (accessed on 11 January 2023).
The ADNI was launched in 2003 as a public-private partnership led by Principal Investigator Michael W. Weiner, MD. The primary goal of ADNI has been to test whether serial MRI, positron emission tomography (PET), other biological markers, and clinical and neuropsychological assessment can be combined to measure the progression of MCI and early AD. The ADNIMERGE dataset contains both cognitive test scores and neuroimaging measurement values for 13,892 records of 2132 patients. Many patients visited multiple times, and each visit is stored as a separate record because cognitive test scores change between visits. The dataset contains 4911 Cognitively Normal (CN) records and 8981 Alzheimer’s Disease (AD) records, as shown in
Figure 2. ADNIMERGE standard datasets contain some or all of the eight biomarkers, including (i) the main cognitive tests (ADAS, MMSE, and RAVLT), (ii) MRI ROIs (volumes, areas, and thicknesses), (iii) FDG PET ROI averages, (iv) AV45 PET ROI averages, (v) AV1451 PET ROI averages, (vi) DTI ROI measures (cell radial diffusivity, axonal diffusivity), (vii) CSF biomarkers, and (viii) some other features such as APOE status, demographic information, and diagnosis.
3.2. Data Pre-Processing
Pre-processing is applied to clean noisy data and to reduce the risk of underfitting or overfitting. The steps involved are described below.
3.2.1. Handling Missing Values
Each record of the ADNIMERGE dataset has 113 features, many of which contain missing values. Retaining features with a high percentage of missing values can lead to inaccuracies. We therefore removed every feature with more than 40% missing values, a threshold that yields a more robust dataset while keeping imputation manageable. The remaining missing values are filled using Iterative Imputer, a method that imputes missing values for each feature from all the remaining features in a round-robin manner [
36].
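The 40% removal rule can be sketched in a few lines of plain Python. The record layout (a list of dictionaries with `None` for a missing value), the function name `drop_sparse_features`, and the toy values are assumptions made for illustration only; they are not taken from the ADNIMERGE files.

```python
def drop_sparse_features(records, threshold=0.40):
    """Remove features whose fraction of missing values exceeds `threshold`.

    `records` is a list of dicts; a missing value is represented by None.
    """
    if not records:
        return []
    n = len(records)
    keep = [
        name for name in records[0]
        if sum(1 for r in records if r.get(name) is None) / n <= threshold
    ]
    return [{name: r.get(name) for name in keep} for r in records]


# Toy records (hypothetical values): "ABETA" is missing in 3 of 4 rows (75%).
rows = [
    {"MMSE": 29, "ABETA": None, "AGE": 71},
    {"MMSE": None, "ABETA": None, "AGE": 68},
    {"MMSE": 24, "ABETA": 812.0, "AGE": 80},
    {"MMSE": 27, "ABETA": None, "AGE": 75},
]
cleaned = drop_sparse_features(rows)
print(sorted(cleaned[0]))  # ['AGE', 'MMSE']
```

Gaps that survive this step, such as the single missing MMSE value above, are the ones left for the iterative imputation stage.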
3.2.2. Data Normalization
Data normalization is an important process that impacts the overall performance of the model. Normalizing data ensures stable and efficient optimization and better generalization in ML and DL models. The ADNIMERGE dataset has a diverse range of values in different features. Thus, this dataset was normalized between 0 and 1 using the min-max technique. Equation (1) shows the mathematical expression used for dataset normalization.
$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \qquad (1)$$
where $x$ denotes the input data and $x'$ denotes the normalized data; $x_{\min}$ and $x_{\max}$ are the minimum and maximum values, respectively.
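As a small worked instance of the normalization in Equation (1), the sketch below rescales one feature column to the range [0, 1]. The function name and the handling of a constant column (mapped to zeros) are assumptions for illustration.

```python
def min_max_normalize(values):
    """Rescale a list of numbers to [0, 1] using min-max normalization."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros (assumption).
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


scores = [18.0, 24.0, 30.0]  # hypothetical test scores
print(min_max_normalize(scores))  # [0.0, 0.5, 1.0]
```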
3.3. The Feature Selection Approach
It is well established that removing redundant and irrelevant features from a dataset improves a model’s performance, as the use of irrelevant features may cause the model to underfit or overfit. The filter and wrapper feature selection methods were adopted. In addition, this research explored the benefits of combining the filter and Neighborhood Component Analysis feature selection approaches.
3.3.1. The Filtering Method
By using the Filtering method, a correlation heat map is generated by the correlation coefficient. This is a measure of the linear dependency between two or more variables. The Correlation coefficient matrix is defined as the matrix for each of the pairwise variable combinations [
1], as expressed mathematically in Equations (2) and (3).
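$$\rho(A,B) = \frac{1}{N-1}\sum_{i=1}^{N}\left(\frac{A_i-\mu_A}{\sigma_A}\right)\left(\frac{B_i-\mu_B}{\sigma_B}\right) \qquad (2)$$
$$\rho(A,B) = \frac{\operatorname{cov}(A,B)}{\sigma_A\,\sigma_B} \qquad (3)$$
Here, $\mu$ and $\sigma$ denote the mean and standard deviation of the subscripted variable, while $\operatorname{cov}$ in Equation (3) denotes the covariance function. The filtering method defines a threshold and filters out highly correlated features: in this work, features whose correlation coefficient has an absolute value greater than 0.9 are filtered out.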
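A correlation-based filter with the 0.9 threshold might look like the following sketch. Which member of a highly correlated pair is dropped is not specified in the text; here the feature encountered first is kept, which is an assumption, as are the function names and the toy columns. Columns are assumed to be non-constant so that the standard deviation is non-zero.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length, non-constant lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def correlation_filter(columns, threshold=0.9):
    """Keep a feature only if |r| with every already-kept feature is <= threshold."""
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


columns = {
    "f1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "f2": [2.1, 3.9, 6.2, 8.0, 9.9],  # nearly 2 * f1, so highly correlated with f1
    "f3": [5.0, 1.0, 4.0, 2.0, 3.0],
}
print(correlation_filter(columns))  # ['f1', 'f3']
```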
3.3.2. Wrapper Method
In a wrapper method like Principal Component Analysis (PCA) or Neighborhood Component Analysis (NCA), weights are assigned to features based on the clustering and classification performance of individual and combined features, respectively. In the proposed research, NCA with the Stochastic Gradient Descent (SGD) method as a solver is used for assigning weights to the features. SGD is suitable for handling large datasets, and its stochastic nature makes optimization for NCA more efficient and effective. The least-weighted features are excluded based on performance.
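To make the weighting step more concrete, the sketch below learns one weight per feature by maximizing a regularized leave-one-out nearest-neighbor objective, in the spirit of the NCFS formulation cited earlier [19]. It is a simplified illustration rather than the implementation used in this work: it uses full-batch gradient ascent instead of an SGD solver, and the kernel width `sigma`, regularization `lam`, step size `lr`, iteration count, and toy data are all assumed values.

```python
from math import exp


def nca_feature_weights(X, y, sigma=1.0, lam=0.01, lr=0.1, iters=50):
    """Learn per-feature weights with a simplified NCA-style objective."""
    n, d = len(X), len(X[0])
    w = [1.0] * d
    for _ in range(iters):
        grad = [0.0] * d
        for i in range(n):
            # Kernel similarity of sample i to every other sample under the
            # weighted distance D_w = sum_l w_l^2 * |x_il - x_jl|.
            k = [0.0] * n
            for j in range(n):
                if j != i:
                    dist = sum(w[l] ** 2 * abs(X[i][l] - X[j][l]) for l in range(d))
                    k[j] = exp(-dist / sigma)
            total = sum(k)
            p = [kj / total for kj in k]  # probability of picking j as reference
            # Probability that sample i is classified correctly.
            p_i = sum(p[j] for j in range(n) if j != i and y[j] == y[i])
            for l in range(d):
                all_term = sum(p[j] * abs(X[i][l] - X[j][l]) for j in range(n))
                same_term = sum(p[j] * abs(X[i][l] - X[j][l])
                                for j in range(n) if y[j] == y[i])
                grad[l] += p_i * all_term - same_term
        # Gradient-ascent step on the regularized objective.
        w = [w[l] + lr * 2.0 * w[l] * (grad[l] / sigma - lam) for l in range(d)]
    return w


# Toy data: feature 0 separates the two classes, feature 1 does not.
X = [[0.0, 0.1], [0.1, 0.9], [0.2, 0.4], [0.3, 0.6],
     [0.7, 0.1], [0.8, 0.9], [0.9, 0.4], [1.0, 0.6]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
w = nca_feature_weights(X, y)
print(w[0] > w[1])
```

On this toy data the weight of the discriminative feature grows while the weight of the uninformative feature shrinks, which is the behavior the feature ranking relies on.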
3.3.3. The Proposed Method (NCA-F)
This research proposes a new approach referred to as NCA-F (Neighborhood Component Analysis and Correlation-Based Filtration method). It uses a combination of filtering, based on Pearson’s correlation coefficient, and a wrapper method in the feature selection process. First, features are excluded when their Pearson’s correlation coefficient has an absolute value greater than 0.9. After the filtration process, the selected features are further processed using the NCA method. NCA assigns a weight to each feature based on its classification performance. NCA-F uses the nearest neighbor classifier to check the performance of different combinations of features and then assigns weights to each feature. These weighted features are then sorted in descending order and used in the model for the prediction of dementia. This research separates neuroimaging measures (neuroimaging features) and cognitive test scores (cognitive features), and also combines them (combined features), to analyze their impact on dementia detection both together and separately. These features are shown in
Table 1 and
Table 2 after applying the feature filtration method with their weights. Although in a typical ML model, age is treated as a demographic, in AD prediction, age is an important factor and is treated as a cognitive feature by ADNIMERGE. “Combined features” are the combination of these two tables.
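The final ranking step of NCA-F, sorting the weighted features in descending order and keeping the top N, can be sketched as follows. The weights shown are hypothetical placeholders, not the values reported in Table 1 or Table 2.

```python
def rank_features(weights):
    """Return feature names sorted by weight, highest first."""
    return sorted(weights, key=weights.get, reverse=True)


def top_n(weights, n):
    """Return the n highest-weighted feature names."""
    return rank_features(weights)[:n]


# Hypothetical NCA weights for four features that survived the correlation filter.
weights = {"CDRSB": 2.4, "MMSE": 1.7, "AGE": 0.3, "ADAS13": 1.9}
print(top_n(weights, 2))  # ['CDRSB', 'ADAS13']
```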
4. Results
4.1. Experimental Setup
Before applying the machine learning models to the ADNIMERGE dataset, this research analyzed the effects of neuroimaging, cognitive, and combined features using the AdaBoost Ensemble classifier with 5-fold cross-validation, and the best number of features was selected for each of the three feature types. Performance measures for the top N features of each feature type are shown in
Figure 3,
Figure 4 and
Figure 5. A sample correlation matrix for neuroimaging features is shown in
Figure 6, which contains only the top five weighted neuroimaging features. Similarly, for cognitive features, only 18 out of 27 are selected after 5-fold cross-validation to achieve the best accuracy for the AdaBoost Ensemble classifier. For combined features, 19 features are selected out of 35 features, which are shown in
Table 3.
After the selection of the best features from all three types, four different machine learning models, mentioned in
Section 3, were used to predict dementia, and the results were validated with 5-fold cross-validation. Python 3.0 is used for the implementation of this methodology. The results of the implemented machine learning models for predicting dementia using highly weighted combinations of features are analyzed and discussed in the next sections.
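The 5-fold cross-validation protocol can be sketched with a small index generator. This version shuffles and splits at the record level; whether the original experiments grouped records by patient is not stated, so that choice, along with the function name and the seed, is an assumption.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    # Deal the shuffled indices into k folds of (almost) equal size.
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, test


splits = list(k_fold_indices(10, k=5))
for train, test in splits:
    print(len(train), len(test))  # 8 2 on every fold
```

Each record appears in exactly one test fold, so every model is evaluated once on every record without having seen it during training.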
4.2. Training and Testing
For the training setup, an 80:20 ratio between the training and test datasets was used. In this stage, 5 neuroimaging, 18 cognitive, and 19 combined highly weighted features from the proposed NCA-F were used to analyze the effect of these features on dementia prediction. AdB, ANN, SVM, and NB models are trained and tested, and the results are reported in
Table 3. AdB has outperformed ANN, SVM, and NB models in most cases. For neuroimaging features, the SVM model outperformed the other three models, i.e., the ANN, AdB, and NB, and achieved ~74% classification accuracy, as shown in
Table 4. SVM, ANN, and AdB models have performed almost equally, while NB has achieved the lowest performance measures as compared to other models, as shown in
Figure 7a, which depicts the Receiver Operating Characteristic (ROC) curve for all four models on neuroimaging features. The Area Under the Curve (AUC) for SVM, ANN, and AdB models is ~80% whereas the NB AUC is ~74% after 10-fold cross-validation. For cognitive features, the AdB model has outperformed the remaining models mentioned and achieved ~83% classification accuracy.
Figure 7b also shows that AdB has the optimum results with ~90% AUC. Similarly, for combined features, AdB has outperformed the other three models with ~83% accuracy and ~90% AUC. The combined features contain 7 neuroimaging and 12 cognitive features.
From
Table 4, it is evident that neuroimaging features alone did not perform well for dementia prediction. Cognitive features are more effective than neuroimaging features for prediction. Combined features also performed well; however, their drawback is that neuroimaging measurements are required. Moreover, the AdB model performed well for all three feature types. We explicitly checked the performance of all four models and concluded that AdB gives the best results, as shown in
Figure 7. For further investigations, the overall performance of all three features is checked for all four models, and the results of the AdB model are shown in
Figure 8. Neuro, Cog, and Com are neuroimaging, cognitive, and combined features, respectively.
Figure 9 also depicts that the ROC curve of cognitive and combined features has better results than neuroimaging features.
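For readers unfamiliar with the AUC values quoted above, the area under the ROC curve equals the probability that a randomly chosen positive record receives a higher score than a randomly chosen negative one. A minimal sketch of that pairwise definition follows; the labels and scores are made-up toy values.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(labels, scores))  # 0.75
```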
4.3. Comparative Analysis
This section provides a comparison between the proposed NCA-F and the literature on different benchmark ADNI datasets. This article has explored Gill et al. [
20] for comparison purposes because this work has also focused on the early detection of dementia using neuroimaging (MRI) features, cognitive (clinical) features, and a combination of both of these features. Gill et al. used the ADNI1 dataset; however, at that time, the number of records in the dataset was only 600, and ADNI1 used during this research has 5013 records, while ADNIMERGE has a total of 13,892 records. Further comparative analysis between the proposed work and Gill et al. [
20] has been discussed below.
4.3.1. Comparison of NCA-F on ADNI1 (Updated) with the Literature
Using only 5 neuroimaging features with the updated ADNI1 dataset, an accuracy of 79.40% is achieved. With 12 selected cognitive features, accuracy rises by 8.49 percentage points to 87.89%, whereas the combination of 7 neuroimaging and 9 cognitive features achieves an accuracy of 87.39%. The number of records is the same, 5013, for all of these experiments.
4.3.2. Comparison of NCA-F on ADNIMERGE with Literature
Three different categories of data are used to test the NCA-F method: neuroimaging, cognitive, and a combination of both. On the ADNIMERGE dataset, NCA-F performs best when both feature types are combined, with the 7 and 12 best features selected from the neuroimaging and cognitive features, respectively, giving an accuracy of 83.42%. In contrast, accuracy drops to 74.33% if only neuroimaging data are used with four features. The best 18 cognitive features yield an accuracy of 83.15%, a mere drop of 0.27 percentage points relative to the combined features. The number of records is fixed at 13,892 for all experiments. However, choosing between the two involves a trade-off between the marginal gain in accuracy and the need for neuroimaging measurements.
4.3.3. Comparison between NCA-F on ADNI1 (Updated) and ADNIMERGE
A comparative analysis of the two yields some surprising yet convincing results. The use of mixed features did not yield the expected rise in accuracy; as is evident from the experimental analysis in this paper vis-à-vis others, the result is dominated by the most heavily weighted metrics, most notably the cognitive features rather than the neuroimaging features. The accuracies of the combined features on the ADNI1 (updated) and ADNIMERGE datasets are 87.39% and 83.42%, respectively. This drop of approximately 4 percentage points can perhaps be attributed to the latter’s abundance of records resulting in over-fitting and hence erroneous classification. Additionally, the accuracy drop occurs despite the three extra features used for learning with the ADNIMERGE dataset (19 versus 16), which supports our presumption that the most heavily weighted (significant) features influence accuracy the most.
4.3.4. Benefits of Cognitive and Neuroimaging Features and Their Relevance to DSM-5 Criteria
Cognitive testing establishes a baseline for an individual’s cognitive abilities in the absence of symptomatic indicators, serving as a foundation for future comparative evaluations if cognitive decline is suspected. Cognitive features are crucial for early detection and accurate staging of Alzheimer’s Disease (AD). These features allow for the identification of nuanced cognitive changes before overt symptoms are present, establish benchmarks for monitoring deviations from healthy cognitive patterns, and provide objective measures conducive to accurate diagnoses.
In relation to the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5) criteria, cognitive tests such as the Mini-Mental State Examination (MMSE) and the Montreal Cognitive Assessment (MoCA) offer valuable insights into several domains, including memory, executive function, and attention. These domains correspond with DSM-5 criteria for diagnosing neurocognitive disorders, including AD. Specifically, the decline in one or more cognitive domains, evident through cognitive testing, is a key criterion for the diagnosis according to DSM-5.
Neuroimaging, on the other hand, contributes critical baseline data about brain structure and activity. Subsequent scans can be compared to these baselines to identify structural and functional changes that might be indicative of AD. When cognitive and neuroimaging features are combined, they significantly improve the sensitivity and specificity of detecting AD in both its early and late stages, as demonstrated in
Table 5.
Our research adds a new dimension to the field by identifying a distinct set of cognitive and neuroimaging features critical for AD diagnosis. Unlike existing studies that have emphasized the importance of features such as CDRSB, AV45-PET, FDG-PET, and ADAS-Cog13 [
31,
32], our model isolated 4 neuroimaging and 18 cognitive features as being more critical in the early and accurate detection of AD.
4.3.5. Overall Comparison
The overall comparison between the proposed NCA-F and Gill et al. [
20] is given in
Table 5. This table summarizes the results of the performance comparison in terms of the number of features used, the type of features, the dataset type, the number of records, the accuracy achieved, and the Area under the ROC curve (AUC). From
Table 5, it can be observed that the proposed approach has achieved better performance on both datasets (ADNIMERGE and ADNI-1) as compared to the existing approach of Gill et al. [
20]. We have also validated the performance of Gill et al.’s methodology on the updated ADNI-1 for a fair comparison. Our proposed methodology has outperformed Gill et al.’s. All the results are cross-validated with 5-fold cross-validation. The accuracy of the proposed method for cognitive features of the ADNIMERGE dataset is not the best of all; however, these features are independent of any Magnetic Resonance Imaging (MRI) or neuroimaging tests, and we are interested in the early detection of dementia based on some cognitive tests. The ADNI-1 dataset contains a limited number of records, which creates overfitting. On the other hand, ADNIMERGE has many records that resolve the overfitting problem. Moreover, researchers have identified different features such as CDRSB, AV45-PET, FDG-PET, and ADAS-Cog13 that are important for the progression of AD [
31,
32]. In contrast, our proposed methods have individually identified 4 neuroimaging and 18 cognitive features as more important for the detection of AD.
When comparing our approach against existing models, particularly the methodology by Gill et al., it is crucial to underscore the diversified feature set we utilized. While Gill et al. predominantly relied on biomarker-derived features, our model uniquely integrated both cognitive and neuroimaging features, leading to better performance across both the updated ADNI-1 and ADNIMERGE datasets. Moreover, our approach showed greater flexibility; it can adapt to resource-limited settings by using only cognitive features, thereby serving a broader clinical spectrum. Furthermore, the higher performance of our model substantiates the benefit of our proposed Neighborhood Component Analysis and Correlation-Based Filtration (NCA-F) in feature selection, which was pivotal in identifying the 4 neuroimaging and 18 cognitive features as most critical for AD detection.
4.4. Additional Observations on Robustness, Generalizability, and Limitations
4.4.1. Robustness and Reliability of AUC Values
To assess the robustness and reliability of the reported AUC (Area Under the Curve) values, multiple validation techniques were employed. Additional analyses were conducted using bootstrapping and stratified K-Fold cross-validation. In the bootstrapping analysis, the AUC ranged from 0.88 to 0.93 across 1000 iterations, with a mean AUC of 0.91 and a 95% confidence interval of [0.89, 0.92]. Similarly, in the stratified K-Fold cross-validation (K = 10), the AUC ranged from 0.87 to 0.92 with a mean of 0.90, further reinforcing the reliability of this metric. These additional analyses consistently indicated high AUC values, comparable to those reported in the main experiments. Furthermore, the experiment was repeated under identical conditions. A consistent pattern of results was observed across both the initial and repeated experiments. The high AUC values obtained suggest the model’s strong ability to distinguish between the classes, even if other performance metrics may appear modest.
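A percentile bootstrap of the AUC, as described above, can be sketched as follows. The toy labels and scores, the seed, the reduced iteration count, and the decision to skip resamples that contain only one class are assumptions for illustration; they do not reproduce the reported interval.

```python
import random


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc(labels, scores, n_iter=1000, alpha=0.05, seed=0):
    """Return (mean AUC, lower bound, upper bound) from a percentile bootstrap."""
    rng = random.Random(seed)
    n = len(labels)
    aucs = []
    while len(aucs) < n_iter:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # AUC is undefined when a resample has a single class
        aucs.append(roc_auc(ys, [scores[i] for i in idx]))
    aucs.sort()
    lower = aucs[int((alpha / 2) * n_iter)]
    upper = aucs[int((1 - alpha / 2) * n_iter) - 1]
    return sum(aucs) / n_iter, lower, upper


# Toy predictions (hypothetical): 10 negative and 10 positive records.
labels = [0] * 10 + [1] * 10
scores = [0.10, 0.20, 0.30, 0.35, 0.40, 0.45, 0.50, 0.55, 0.60, 0.70,
          0.30, 0.45, 0.50, 0.60, 0.65, 0.70, 0.75, 0.80, 0.90, 0.95]
mean_auc, lower, upper = bootstrap_auc(labels, scores, n_iter=200)
print(round(mean_auc, 3), round(lower, 3), round(upper, 3))
```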
4.4.2. Feature Independence and Clinical Usability
Independence between cognitive features and neuroimaging tests was observed, which offers benefits for resource-limited settings where advanced neuroimaging facilities may be unavailable.
4.4.3. Model Stability across Datasets
Stability in the performance of the AdaBoost model was noted when tested across different sizes and types of datasets. This suggests a lower susceptibility to overfitting.
4.4.4. Limitations Regarding Neuroimaging Features
Neuroimaging features alone were found to be less effective in predicting dementia than cognitive features, whether used alone or in combination with neuroimaging features. This limitation is noteworthy for clinicians and healthcare systems.
4.4.5. Cross-Validation Reliability
Consistency across different cross-validation folds was observed, reinforcing the reliability of the methodology used.
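To make the notion of stratified folds concrete, the sketch below shows one simple way to build them: each class is shuffled and dealt round-robin into K folds, so every fold keeps roughly the same class ratio as the full dataset. The function name `stratified_kfold`, the round-robin scheme, and the toy label vector are assumptions for illustration, not the procedure used in this study.

```python
import random
from collections import defaultdict

def stratified_kfold(labels, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs whose test folds preserve class ratios."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(k)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        # Deal each class round-robin so every fold receives its share.
        for j, i in enumerate(idxs):
            folds[j % k].append(i)
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test

# Toy labels (NOT ADNI data): 30 positive and 70 negative cases.
toy_labels = [1] * 30 + [0] * 70
for fold_no, (train, test) in enumerate(stratified_kfold(toy_labels, k=10), 1):
    n_pos = sum(toy_labels[i] for i in test)
    print(f"fold {fold_no}: {len(test)} test cases, {n_pos} positive")
```

Fold-to-fold consistency can then be checked by training on each `train` split, scoring on the matching `test` split, and comparing the spread of the resulting metric across the K folds.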
4.4.6. Recommendations
Our research suggests that for early detection and ongoing monitoring of AD, cognitive evaluations remain a cornerstone, aligning with typical clinical approaches. These assessments can often be conducted more frequently and are less invasive than neuroimaging studies. Cognitive features offer early warning signs and may provide a foundation for longitudinal tracking of cognitive health. When significant cognitive decline is suspected or observed, neuroimaging studies, including MRI and PET scans, should be considered for a more comprehensive understanding.
Regarding the choice of medical methods for data-driven approaches, our results indicate that in the early stages, cognitive assessments such as the Mini-Mental State Examination (MMSE) or the Montreal Cognitive Assessment (MoCA) align well with common clinical practices. As the disease progresses, more sophisticated and comprehensive neuroimaging tests may become increasingly important for understanding the extent and nature of degenerative changes.
It is essential to note that the observations and insights offered here are based on data analysis and should not be construed as clinical advice. We are not medical professionals, and these findings are intended to contribute to a scientific understanding that could inform but not replace professional medical evaluations and treatment plans.
4.5. Summary of Findings
This research explored the influence of neuroimaging, cognitive, and combined features on predicting dementia through the AdaBoost Ensemble classifier. Based on the analysis of the ADNIMERGE dataset, the optimal number of features was selected for each type: 5 neuroimaging features, 18 cognitive features, and 19 combined features. The best weighted combined features were identified using Neighborhood Component Analysis and Correlation-Based Filtration (NCA-F), and four machine learning models (AdaBoost (AdB), Artificial Neural Network (ANN), Support Vector Machine (SVM), and Naive Bayes (NB)) were employed to predict dementia. Model performance was evaluated using 10-fold cross-validation.
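The correlation-based filtration idea can be sketched as follows. This is one common greedy form of such a filter and may differ from the NCA-F procedure used in this study: features are visited in descending order of weight, and a feature is kept only if its absolute Pearson correlation with every already-kept feature stays at or below a threshold. The weights are assumed to come from a prior step such as Neighborhood Component Analysis, which is not implemented here; the names `pearson` and `correlation_filter`, the threshold of 0.9, and the feature names and values are all hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)

def correlation_filter(columns, weights, threshold=0.9):
    """Greedy filter: keep the highest-weighted features, dropping any feature
    whose |r| with an already-kept feature exceeds the threshold."""
    order = sorted(columns, key=lambda name: weights[name], reverse=True)
    kept = []
    for name in order:
        if all(abs(pearson(columns[name], columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical features and weights (NOT taken from ADNIMERGE).
columns = {
    "feat_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "feat_b": [2.0, 4.0, 6.0, 8.0, 10.5],  # nearly a rescaled copy of feat_a
    "feat_c": [5.0, 1.0, 4.0, 2.0, 3.0],
}
weights = {"feat_a": 0.9, "feat_b": 0.5, "feat_c": 0.7}  # assumed NCA-style weights
kept = correlation_filter(columns, weights)
print(kept)  # feat_b is dropped because it is almost collinear with feat_a
```

The design choice here is that the weight ranking decides which member of a correlated pair survives, so redundant features are removed without discarding the more informative one.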
In terms of feature types, SVM outperformed the other models on neuroimaging features, achieving approximately 74% classification accuracy. On cognitive features, the AdaBoost model had the highest performance, with approximately 83% accuracy. On the combined features, which comprise 7 neuroimaging and 12 cognitive features, AdaBoost again performed best, with approximately 83% accuracy.
Interestingly, neuroimaging features alone did not yield high performance for dementia prediction. This study found that cognitive features are more effective than neuroimaging features for prediction, and that combined features also performed well. However, the drawback of combined features is that they require neuroimaging data. Among the four models, AdaBoost performed well on all three feature types.
Comparative analysis was also performed between the proposed NCA-F method and previous work by Gill et al., which also focused on the early detection of dementia using neuroimaging and cognitive features. The proposed NCA-F method achieved better performance on both the updated ADNI-1 and ADNIMERGE datasets when compared to the existing approach by Gill et al.
In summary, distinct advantages were observed depending on the types of features used for Alzheimer’s Disease (AD) diagnosis. Models built on biomarkers such as CDRSB, AV45-PET, and FDG-PET offered robustness and were particularly effective at capturing advanced stages of the disease, corroborating previous research [31,32]. However, models relying solely on cognitive features, as assessed through tools such as the Mini-Mental State Examination (MMSE) and Montreal Cognitive Assessment (MoCA), exhibited higher sensitivity in the early detection of AD, a crucial requirement for timely intervention. Interestingly, our combined feature models, incorporating both biomarkers and cognitive features, showed the most balanced performance across the early and late stages. This balance is evidenced by an 83% accuracy rate and consistently high AUC values, suggesting that a multi-dimensional approach is often more comprehensive for diagnosing a complex disorder like AD.
5. Limitations
Despite the promising outcomes of this study, some limitations must be acknowledged. Firstly, the datasets used, ADNIMERGE and ADNI-1, despite being extensive and informative, still represent a specific patient population. Results could potentially vary with other datasets or populations. This study also relies heavily on the accuracy of cognitive tests and the feature selection method. Variations in test administration, subject response, and selection algorithms can introduce errors.
Another limitation is the need for neuroimaging in the combined features. While combined features provided slightly better results, the requirement for neuroimaging makes this less feasible in many real-world clinical settings. Cognitive testing is generally easier to administer, cheaper, and more accessible, especially in low-resource settings. Hence, a focus on further improving performance with cognitive features alone would be beneficial.
Additionally, while the AdaBoost model performed well in this study, it might not be the optimal model for all scenarios. The performance of machine learning models can vary based on the specific characteristics of the dataset. Therefore, exploring other potential models could be valuable. Lastly, the issue of overfitting must be considered, particularly when dealing with a large number of records, as seen with the ADNIMERGE dataset. Techniques to mitigate this problem and ensure the model’s generalizability should be considered in future studies.
6. Conclusions
This research explored the use of machine learning algorithms in the early detection of dementia, with a particular focus on the potential of cognitive features derived from a series of cognitive tests, and contrasted these with neuroimaging features in predictive model training. The unique contribution of this research lies in the implementation of the AdaBoost Ensemble model on cognitive features, yielding an enhanced accuracy rate of approximately 83%. The AdaBoost model demonstrated improved performance compared to other benchmark models, including the Artificial Neural Network, Support Vector Machine, and Naïve Bayes. While the performance metrics improved when we combined cognitive and neuroimaging features, we emphasized cognitive features because of their clinical convenience and ease of execution. Furthermore, our work underscores the significance of cognitive assessments in the early detection of dementia, suggesting that clinicians should prioritize evaluating specific cognitive elements during AD screening. By shedding light on which cognitive areas are pivotal, this research informs clinicians about optimal times for cognitive feature assessment and how it complements the biomarker assessments in the diagnosis trajectory.
Future work should consider refining these machine-learning models, exploring other machine-learning algorithms, and enhancing performance using cognitive features alone. Future work should also consider analyzing different datasets or patient populations to validate the general applicability of the model. Overall, this research paves the way for innovative, machine-learning-assisted strategies for early dementia detection, promoting the use of easily accessible cognitive tests.