Enhancing Fingerprint Forensics: A Comprehensive Study of Gender Classification Based on Advanced Data-Centric AI Approaches and Multi-Database Analysis

Spanier, Assaf B.; Steiner, Dor; Sahalo, Navon; Abecassis, Yoel; Ziv, Dan; Hefetz, Ido; Kimchi, Shimon

doi:10.3390/app14010417

¹ School of Software Engineering and Computer Science, Azrieli College of Engineering, Jerusalem 9103501, Israel

² Latent Fingermark Development Laboratory, Division of Identification and Forensic Sciences, Israel Police, Jerusalem 91906, Israel

³ Fingerprint Database Laboratory, Division of Identification and Forensic Sciences, Israel Police, Jerusalem 91906, Israel

* Author to whom correspondence should be addressed.

Appl. Sci. 2024, 14(1), 417; https://doi.org/10.3390/app14010417

Submission received: 25 November 2023 / Revised: 25 December 2023 / Accepted: 28 December 2023 / Published: 2 January 2024

(This article belongs to the Section Computing and Artificial Intelligence)

Abstract:

Fingerprint analysis has long been a cornerstone in criminal investigations for suspect identification. Beyond this conventional role, recent efforts have aimed to extract additional demographic information from fingerprints, such as gender, age, and nationality. While studies have demonstrated promising accuracy in gender classification based on fingerprints, practical implementation faces challenges, including the often low quality of crime scene fingerprints. This study presents a pioneering comparison of gender classification across diverse datasets, considering variations in fingerprint image quality. We examine the results from four databases, encompassing both public and private sources, employing state-of-the-art Data-Centric AI (DCAI) approaches for enhanced classification. Our findings reveal that a conservative Convolutional Neural Network (CNN)—specifically VGG—proves effective, achieving an accuracy ranging from 70% to 95% based on fingerprint quality. DCAI methods contribute a noteworthy 1–4% improvement. Notably, for partial or low-quality fingerprints, the periphery emerges as a critical determinant of gender classification. This study contributes insights into practical gender classification from fingerprints, emphasizing the significance of the fingerprint periphery. Furthermore, we provide the source code for future research and accessibility in real-world applications.

Keywords:

fingerprint analysis; gender classification; Data-Centric AI (DCAI); forensic science; partial or low-quality fingerprint images

1. Introduction

A fingerprint found at a crime scene is important evidence in a criminal investigation. Fingerprints are collected and compared against the fingerprint images held in the police database. Comparisons are made by finding minutiae features [1,2] and comparing them to those recorded in the database. If a match is found, the suspect’s identity can be established. While minutiae marking is conventionally automated, manual intervention is occasionally necessary, specifically the removal and modification of minutiae owing to the presence of artifacts. Beyond the fact that manual extraction of minutiae features is tedious and error-prone, the main disadvantages of this technique are that (1) fingerprints that are not of sufficient value are excluded and (2) no attempt is made to extract information beyond the comparison with the database, such as gender, age, and other attributes [3].

Using fingerprints for gender classification is an exciting research frontier in forensic science. Research on this topic, pioneered by Acree in 1999 [4], now spans over 20 years. The most recent literature review of this work was published in 2021 [5]. The review concluded that fingerprint ridge density could be a reliable parameter for gender classification: the number of ridges per unit area varies between individuals and differs by gender, with female fingerprints tending to have a higher ridge density than male fingerprints. However, the mean ridge density varies between populations, as demonstrated over different datasets in various studies [6,7]. For instance, the study in [6] used image processing techniques, specifically minutiae detection and feature extraction, and demonstrated 65% accuracy in gender classification, indicating moderate success. This result was further validated by a study published in 2023 [7].

Alternatives to ridge-based gender classification use advanced machine learning and deep learning methods that operate on the whole fingerprint image. In [8], the authors introduce an approach that leverages Fast Fourier Transform (FFT) and Principal Component Analysis (PCA) features with min–max normalization. The models are trained with a Support Vector Machine (SVM) classifier and incorporate sampling techniques such as SMOTE to address dataset imbalance. In that study, the right ring finger emerged as the most suitable finger for gender identification, achieving accuracy rates of 75% and 91% for males and females, respectively. The study did not employ deep learning and relied on a single dataset. Another article [9] reported 99% accuracy on NIST-DB4; however, it did not provide references or specifics on how the train-test split handled the two impressions available for each fingerprint, or on the gender balance, which leaves those results uncertain. In contrast, when the second impression of each finger was taken into account [10], the gender classification score was significantly lower, at 65%.

One of the most common publicly available databases for gender classification is SOCOfing. In [11], gender classification was evaluated on single fingers, resulting in 77% average accuracy. When a weighted approach that considered multiple fingers simultaneously was employed, accuracy improved significantly, reaching 90%. However, gender classification based on multiple fingerprints is theoretical: to our knowledge, it has no practical application in crime scene investigation, primarily because more than one fingerprint is rarely encountered at a crime scene, as supported by prior research [12]. In addition, the authors of [11] employed only an elementary 5-layer CNN [13] architecture, and the network was trained “from scratch”, without the use of any beneficial fine-tuning techniques. Another study conducted on the SOCOfing dataset [14,15] reported an accuracy of 90% on a restricted subset of 2000 of the 6000 fingerprints available in the database. Notably, the criteria used to select those 2000 fingerprints are unclear. A similar phenomenon was also reported in [7]. In summary, research articles focusing on the SOCOfing database have reported gender classification accuracy of approximately 75% when employing neural network-based approaches, and as high as 90% when multiple fingerprints are used (though this is not practically applicable). Hence, a baseline accuracy of around 75% is commonly encountered in gender classification on the SOCOfing dataset, as indicated by [16] and other references.

Among studies applying advanced techniques to private fingerprint datasets, a recent comprehensive publication [17] conducted an extensive comparative analysis of nine widely employed classifiers: CNN, Support Vector Machine (SVM) with three distinct kernels, k-Nearest Neighbors (kNN), Adaboost, J48, ID3, and Linear Discriminant Analysis (LDA). The methodology involved gender classification based on various fingerprint features, among them ridge density, ridges, minutiae, and fingertip size (FTS). CNN achieved the highest success rate for gender classification in this context, with an accuracy of 95%. However, it is important to note that these results were based on private, high-quality fingerprint images collected in controlled laboratory settings (e.g., 500 dpi resolution with each finger scanned three times to ensure image quality), which may not be representative of real-world crime scene scenarios. Moreover, the methodology is somewhat redundant, since it uses a CNN only as a classifier when a CNN is essentially a method for extracting features.

The Data-Centric Artificial Intelligence (DCAI) approach shifts the AI paradigm by placing more emphasis on data than on algorithms or models [18,19]. Instead of concentrating solely on the model or algorithm, this approach analyzes the data to identify which instances have the most significant impact on classification and then re-trains the model based on these influential instances. In recent years, this method has been shown to offer distinct advantages over traditional approaches. By applying a set of margin-based criteria, we filter out uncertain classifications and enable the model to learn from more reliable and informative examples. This demonstrates the efficacy of margin-of-confidence cleaning in improving the performance and generalization capabilities of deep learning models.

In summary, this study’s primary contributions are as follows:

Cross-Dataset Gender Classification Evaluation: For the first time, this research compares gender classification performance across multiple datasets, including three publicly available databases and one proprietary internal database. This cross-dataset evaluation is critical for establishing the robustness and generalizability of the proposed CNN methodology, as it mitigates the biases and limitations inherent to single-dataset studies.
Enhanced Analysis of Low-Quality and Partial Fingerprints: Recognizing the practical challenges in fingerprint investigations, this study specifically targets the classification of gender from fingerprints that are partial or of low quality. By identifying and delineating the region of interest (ROI), this study enhances the ability to classify gender from fingerprints that would otherwise be deemed of little value, thus significantly improving the practical application of fingerprint analysis in forensic contexts.
Application of Data-Centric AI (DCAI) for Performance Improvement: This research pioneers the application of Data-Centric AI approaches within the context of fingerprint-based gender classification. By focusing on the data itself, rather than solely on the model or algorithm, the study leverages DCAI to identify and re-train on the most impactful instances, thereby enhancing classification accuracy. This data-centric approach represents a paradigm shift in artificial intelligence applications.

2. Datasets

In this paper we evaluate four datasets, as detailed in Table 1. Three of these datasets are publicly accessible: (1) NIST-DB4 [20], (2) SOCOfing [21], and (3) NIST-302 [22]. Additionally, we incorporated a privately obtained dataset from the Israeli police, henceforth referred to as IsrPoliceDB. To the best of our knowledge, the three public datasets are the only publicly available resources that include gender information in conjunction with fingerprint images. The NIST-DB4 dataset [20] comprises 4000 plain fingerprints, captured at a resolution of 500 dpi, from 2000 individuals (380 females and 1620 males); each subject contributed two impressions of their index finger. The SOCOfing database [21], established in 2007, consists of 6000 fingerprints collected from 600 African subjects (123 females and 477 males). Each participant provided a single impression of each of their ten fingers. The images in this dataset measure 96 × 103 pixels and have a resolution of 500 dpi. Finally, the relatively recent NIST-302 dataset [23], published by NIST in 2019, includes fingerprint impressions from 200 Americans (132 females and 68 males). Each individual contributed 20 impressions of all ten fingers. The images in this dataset have a resolution of 500 dpi and dimensions of 256 × 360 pixels.

3. Methods

Section 3 is divided into three key parts. The first part discusses the CNN architectures examined, specifically VGG and ResNet. The second part delineates the use cases for evaluating the CNN’s performance on partial or low-quality fingerprint images. In the third part, we discuss the DCAI approaches implemented to improve classification performance.

The input of our method is a single fingerprint image, and the output is a gender classification. This study employed supervised learning with CNN models, in particular VGG [24] and ResNet [25], which have maintained their significance in deep learning for image classification. VGG emphasizes depth through the repeated use of convolutions with small receptive fields and pooling layers, which simplifies implementation and comprehension. ResNet [25] introduces shortcut connections organized into residual blocks; however, VGG’s [24] adaptability remains a significant advantage. With pre-trained weights, both models can be integrated into diverse networks and deep learning tasks, which underlines their versatility. The choice between VGG and ResNet deserves particular care for smaller datasets. The less complex architecture and shallower depth of pre-trained VGG models may help reduce overfitting, which is particularly beneficial for smaller databases. In contrast, ResNet’s depth enables it to learn intricate patterns. The choice hinges on factors such as computational resources, data complexity, and project-specific requirements. More broadly, the choice between VGG and ResNet is emblematic of the ongoing evolution of deep learning, with each architecture offering distinct advantages depending on dataset characteristics and computational constraints.

Our approach involves fine-tuning CNN architectures for the specific image classification objective. To train our CNN models, we employed standard training techniques [13], including data augmentation, normalization, a method for handling class imbalance [26], and stochastic optimization with AdaGrad. The models were trained with a learning rate of 0.0001 and a batch size of 32; all other parameters were left at their defaults. The specific CNN architecture (i.e., VGG or ResNet) was selected through experimentation and validation to ensure optimal performance in classifying gender from fingerprint images. Because the datasets are imbalanced, evaluating the model’s predictive performance posed challenges: plain accuracy can be misleading, since it may simply mirror the class distribution. Therefore, we opted for the F-score as our evaluation metric. It integrates precision and recall and offers a more robust performance measure for imbalanced datasets than either precision or recall alone, which allows findings to be understood and communicated clearly irrespective of class imbalance.
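The argument for preferring the F-score over accuracy on imbalanced data can be illustrated with a short sketch. It assumes the F1 variant (equal weight on precision and recall) computed for a single positive class; the text does not state which F-score variant or averaging scheme was used, and the function names and confusion counts below are hypothetical.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) for the positive class from confusion counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


def accuracy(tp, fp, fn, tn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + fp + fn + tn)


# Hypothetical test set: 20 female (positive class) and 80 male prints.
# A degenerate model that labels every print "male" yields these counts.
always_male = dict(tp=0, fp=0, fn=20, tn=80)
always_male_accuracy = accuracy(**always_male)
_, _, always_male_f1 = precision_recall_f1(tp=0, fp=0, fn=20)

# A hypothetical model that finds 15 of the 20 female prints with 10 false alarms.
useful = dict(tp=15, fp=10, fn=5, tn=70)
useful_accuracy = accuracy(**useful)
_, _, useful_f1 = precision_recall_f1(tp=15, fp=10, fn=5)
```

In this made-up example the two models have similar accuracy (0.80 and 0.85), yet the F1 score separates them clearly (0.0 versus about 0.67), which is the behaviour the paragraph above relies on.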

Because partial or low-quality fingerprint images are commonly found at crime scenes, we explored the VGG network’s performance in classifying gender based on fingerprints’ internal and external cylindrical regions. By examining the specific contributions of the internal and external areas (see Figure 1), we aimed to gain a deeper understanding of the nature of the fingerprint data and its relevance to gender classification in cases of partial or low-quality fingerprints. We evaluated four scenarios: gender classification based on (a) 50% of the inner regions, (b) 60% of the inner regions, (c) 50% of the outer regions, and (d) 40% of the outer regions. The training and testing sets were kept the same in each case.

The methodology employed to extract the inner and outer areas of a fingerprint follows a series of steps. First, the grayscale fingerprint image is binarized with a threshold, and morphological operations are then applied to reduce noise and isolate the fingerprint. Next, an ellipse is fitted around the largest contour, which is assumed to be the fingerprint. Using the ellipse parameters, the inner and outer regions of the fingerprint are extracted. This methodology is designed to handle fingerprints that are off-center or rotated, as it does not depend on a specific alignment or central positioning within the image. It also assumes that the majority of the original fingerprint is visible within the image.
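The final step, splitting a fitted ellipse into an inner core and an outer ring, can be sketched as follows. This is a minimal illustration and not the authors' implementation: it takes the ellipse parameters as given (in practice they would come from the thresholding, morphology, and ellipse-fitting steps), and it assumes that "50% of the inner region" means the concentric ellipse covering 50% of the fitted ellipse's area. The function names and the example ellipse are hypothetical; only the 96 × 103 image size is taken from the SOCOfing description.

```python
import math


def in_ellipse(x, y, cx, cy, a, b, angle_deg, scale=1.0):
    """True if pixel (x, y) lies inside the ellipse centred at (cx, cy) with
    semi-axes a and b, rotated by angle_deg, both axes multiplied by scale."""
    t = math.radians(angle_deg)
    dx, dy = x - cx, y - cy
    # Rotate the offset into the ellipse's own axis frame.
    u = dx * math.cos(t) + dy * math.sin(t)
    v = -dx * math.sin(t) + dy * math.cos(t)
    return (u / (a * scale)) ** 2 + (v / (b * scale)) ** 2 <= 1.0


def region_masks(width, height, cx, cy, a, b, angle_deg, inner_fraction):
    """Return (inner, outer) boolean masks as nested lists.
    Assumption: inner_fraction is the share of the ellipse AREA kept as the core."""
    scale = math.sqrt(inner_fraction)  # area grows with the square of the axes
    inner = [[False] * width for _ in range(height)]
    outer = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if not in_ellipse(x, y, cx, cy, a, b, angle_deg):
                continue  # background pixel: belongs to neither region
            if in_ellipse(x, y, cx, cy, a, b, angle_deg, scale):
                inner[y][x] = True
            else:
                outer[y][x] = True
    return inner, outer


# Hypothetical ellipse, slightly rotated, inside a 96 x 103 image.
inner, outer = region_masks(width=96, height=103, cx=48, cy=51,
                            a=40, b=45, angle_deg=10, inner_fraction=0.5)
inner_px = sum(map(sum, inner))
outer_px = sum(map(sum, outer))
```

Because the masks are defined in the ellipse's own rotated frame, the split does not depend on the print being centred or upright, which matches the property described above. Pixels outside a mask would be blanked before the image is passed to the network.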

Regarding DCAI, we employed five distinct approaches to identify and rectify issues within the most challenging 5% of data instances. We then retrained the model on the refined dataset and evaluated the performance of each newly trained model. The five DCAI approaches we examined [18] were Cleanlab Out of Distribution (cleanlab-OOD), FLIP (Easy and Hard variants), and Margin Of Confidence (MOC; Easy and Hard variants). Each approach is described below.

Out of Distribution (OOD): OOD refers to identifying outliers in test data, such as data samples that do not stem from the distribution of the training data. To evaluate OOD, we utilized the Cleanlab framework [18], which leverages the “Principle of Counting”. This principle uses the model’s predicted probabilities (the confident joint) to estimate the number of examples in each class. By applying the Principle of Counting to these probabilities, we identified examples that fell outside the expected distribution, designating them as cleanlab Out of Distribution (cleanlab-OOD) instances.

Margin Of Confidence (MOC): MOC is a metric for assessing prediction reliability. It quantifies the margin between two prediction scores. A larger MOC signifies a more confident prediction, while a smaller margin implies a less reliable and potentially inaccurate one. To measure MOC, we employed a classification head with two outputs for gender prediction (male and female), which allowed us to assess the confidence of the model’s gender predictions on the given datasets. We distinguished between MOC-Hard and MOC-Easy approaches. The MOC-Hard approach removed the 5% of data instances on which the model showed the highest prediction confidence, which focuses model optimization on more challenging data. In contrast, the MOC-Easy approach targeted potential noise in the dataset by removing the 5% of data with the lowest prediction confidence, which focuses training on informative data.
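As a rough sketch of the two MOC variants, the snippet below computes the margin between the two class probabilities of a two-output head and drops 5% of the samples by margin. It assumes the margin is taken between softmax probabilities (the text does not say whether raw scores or probabilities were used); the function names and the example logits are hypothetical.

```python
import math


def margin_of_confidence(logit_male, logit_female):
    """Absolute difference between the two softmax probabilities."""
    m = max(logit_male, logit_female)  # subtract the max for numerical stability
    e_m, e_f = math.exp(logit_male - m), math.exp(logit_female - m)
    total = e_m + e_f
    return abs(e_m / total - e_f / total)


def filter_by_margin(logits, fraction=0.05, variant="easy"):
    """Return the indices kept after dropping `fraction` of the samples.
    variant="easy": drop the smallest margins (least confident, possible noise).
    variant="hard": drop the largest margins (most confident, easiest samples)."""
    margins = [margin_of_confidence(m, f) for m, f in logits]
    n_drop = round(len(logits) * fraction)
    order = sorted(range(len(logits)), key=lambda i: margins[i],
                   reverse=(variant == "hard"))
    dropped = set(order[:n_drop])
    return [i for i in range(len(logits)) if i not in dropped]


# Hypothetical logits for 20 training samples; confidence grows with the index.
example_logits = [(0.5 * i, 0.0) for i in range(20)]
kept_easy = filter_by_margin(example_logits, variant="easy")  # drops sample 0
kept_hard = filter_by_margin(example_logits, variant="hard")  # drops sample 19
```

The retained indices would then define the refined training set on which the model is retrained.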

FLIP (Classification Prediction Flips): FLIP measures the number of times the classification prediction for an instance changes during training. This metric assesses the robustness and generalizability of models. It provides insights into the effort required to achieve an acceptable level of accuracy on unseen data, and it helps identify the point at which further optimization should be halted because it yields diminishing returns in model accuracy. As with MOC, we distinguished between FLIP-Hard and FLIP-Easy approaches. The FLIP-Hard approach removed the 5% of data with the most consistent predictions, retaining more complex data elements. The FLIP-Easy approach, on the other hand, discarded the 5% of data with the most inconsistent predictions, ensuring a focus on data for which the model’s predictions were persistently clear.
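The FLIP score and its two filtering variants can be sketched in the same spirit. The snippet assumes that the predicted label of every training sample is recorded once per epoch, which the text implies but does not spell out; the function names and the prediction histories are hypothetical.

```python
def count_flips(predictions_per_epoch):
    """Number of epoch-to-epoch changes in one sample's predicted label."""
    return sum(1 for prev, cur in zip(predictions_per_epoch, predictions_per_epoch[1:])
               if prev != cur)


def filter_by_flips(history, fraction=0.05, variant="easy"):
    """history[i] holds the labels predicted for sample i, one per epoch.
    variant="easy": drop the samples whose prediction flips most often.
    variant="hard": drop the samples whose prediction flips least often.
    Returns (kept_indices, flip_counts)."""
    flips = [count_flips(h) for h in history]
    n_drop = round(len(history) * fraction)
    order = sorted(range(len(history)), key=lambda i: flips[i],
                   reverse=(variant == "easy"))
    dropped = set(order[:n_drop])
    kept = [i for i in range(len(history)) if i not in dropped]
    return kept, flips


# Hypothetical history: 20 samples over 6 epochs, most of them stable.
history = [["M"] * 6 for _ in range(20)]
history[3] = ["M", "F", "M", "F", "M", "F"]   # flips at every epoch
history[7] = ["M", "M", "M", "F", "F", "F"]   # flips once
kept_easy, flips = filter_by_flips(history, variant="easy")  # drops sample 3
kept_hard, _ = filter_by_flips(history, variant="hard")      # drops one stable sample
```

With ties among the stable samples, this sketch simply drops the first ones in index order; a real pipeline might break ties differently.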

4. Results

Section 4 is divided into three parts. In the first part, we outline the process of selecting the optimal CNN for classifying gender from fingerprint images and evaluate the selected network across all four databases. In the second part, we identify and evaluate the fingerprint region that contributes most to gender classification. In the final part, we explore several DCAI strategies aimed at enhancing gender classification accuracy by refining the training data, leading to more robust and reliable classifications.

4.1. Part I

To select the network, we used the NIST-302 dataset as the basis for evaluating five variants of the two most widely employed CNN architectures (VGG and ResNet). NIST-302 was chosen for this evaluation because of its size, its relatively high image quality, and its comprehensive coverage, which includes images of all ten fingers. We trained the selected models from networks pre-trained on ImageNet, using the AdaGrad optimizer with a learning rate of 0.0001 and a batch size of 32. For data augmentation, we applied random horizontal flipping with a probability of 0.5 and random rotation within a range of ±15 degrees. This enhanced the diversity of our dataset and helped reduce overfitting. To address class imbalance, we used a weighted random sampler. The sampler calculates the class distribution in the dataset and assigns each sample a weight inversely proportional to the frequency of its class. Samples from the under-represented class are therefore more likely to be picked for a batch, which results in a more balanced training process. Our dataset was divided into a training set and a testing set with an 80–20 split (last column of Table 2). This commonly used ratio provides a significant amount of data for training while leaving an adequate amount for testing.
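The inverse-frequency weighting behind such a sampler can be sketched with the standard library alone. The description resembles PyTorch's `WeightedRandomSampler`, but the text does not name the implementation, so the snippet below is only an illustration of the principle. It assumes sampling with replacement; the function name is hypothetical, and the label counts are borrowed from the SOCOfing description in Section 2 purely as an example.

```python
import random
from collections import Counter


def inverse_frequency_weights(labels):
    """One weight per sample, inversely proportional to its class frequency."""
    counts = Counter(labels)
    return [1.0 / counts[label] for label in labels]


# Illustrative imbalanced label list: 123 female and 477 male entries.
labels = ["F"] * 123 + ["M"] * 477
weights = inverse_frequency_weights(labels)

# Draw indices with replacement according to the weights (seeded for repeatability).
rng = random.Random(0)
drawn_indices = rng.choices(range(len(labels)), weights=weights, k=6000)
drawn = Counter(labels[i] for i in drawn_indices)
female_share = drawn["F"] / len(drawn_indices)
```

Each class carries the same total weight, so the drawn batches contain roughly equal numbers of female and male samples even though the underlying data are about 20% female.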

Table 3 shows that ResNet18 exhibited overfitting and produced suboptimal results. VGG16 and VGG19 achieved a test accuracy of 0.83, which is notably higher than ResNet’s 0.75. VGG19 appears to offer the best trade-off between model complexity and generalization to unseen data, as is evident from the balance between its train and test accuracy. Furthermore, the F-score, which provides a consolidated view of model performance based on precision and recall, favors the VGG models. Specifically, VGG19 achieved the highest test accuracy and F-score, 0.84 and 0.81, respectively.

Next, the selected VGG19 model was evaluated across the four databases we used—one internal and three public.

As can be seen in Table 2, the VGG19 network generalizes to other datasets: the classification results range from 70 to 95%, with no evidence of overfitting. The results correlate with the size and quality of the images in each dataset, as indicated in Table 1. IsrPoliceDB yielded the highest classification accuracy, 96%, in accordance with its having the largest image size and resolution. The moderately sized images of the NIST datasets (DB4 and 302) yielded an accuracy of 80%, while the considerably smaller SOCOfing images yielded 68%. These results indicate that the VGG network is capable of capturing positive samples effectively, while minimizing false positives reliably and stably.

4.2. Part II

Next, we present the outcomes of the second part of our study, which centered on identifying and assessing the fingerprint image region most essential for gender classification, especially in scenarios involving partial or low-quality fingerprint images. Our approach involves identifying and delineating the region of interest (ROI) in fingerprint images, aiming to pinpoint the areas that significantly influence gender classification.

As can be seen in Figure 2, accuracy decreases by 4% when classification is based on the inner region of the fingerprint. In contrast, when classification is based on the outer region, accuracy is generally unaffected: the one exception is NIST-DB4, which showed a 1% decrease, while all other cases either improved or remained unchanged. These results indicate that the outer region of a fingerprint is more significant for gender classification, which is in accordance with previous findings in the literature. This is further supported by the absence of “delta” elements and other additional features in the outer region, so that nothing there interferes with “ridge counting”.

4.3. Part III

Lastly, we examine the DCAI strategies used to improve the results. As discussed in Section 3, DCAI is a paradigm shift in artificial intelligence that focuses on data as the primary source of information. We examined five DCAI approaches: cleanlab-OOD, FLIP (Easy and Hard), and MOC (Easy and Hard). For each of the five approaches, we scored every instance in the training set and filtered out the 5% of instances with the highest scores. We then re-trained each model on the resulting clean dataset and evaluated the newly trained model on the original test set.

As is evident from Figure 3, two of the five approaches, cleanlab-OOD and FLIP-Easy, showed consistent improvements, with average F-score improvements of 2.5% and 2.75%, respectively. The cleanlab-OOD approach improved the F-score on every dataset except NIST-302, where the value was equal to the baseline. The most significant improvement was observed for the NIST-DB4 dataset, where the F-score increased from 0.82 to 0.87. The FLIP-Easy approach showed noticeable improvements on the NIST-DB4 and NIST-302 datasets. These findings provide compelling evidence of the filtering capability of DCAI approaches, which effectively eliminated data that negatively impacted the model’s performance.

5. Conclusions

This paper presents a comprehensive evaluation of gender classification from fingerprint images using CNNs. The study is divided into three main parts: selection of an optimal CNN for gender classification based on the given datasets, identification of critical fingerprint regions for classification, and exploration of DCAI strategies to enhance classification accuracy. This section delves into the implications of these findings.

The results indicate that VGG19 outperforms the other evaluated architectures in terms of accuracy, precision, recall, and F-score. It achieved a test accuracy of 0.84, highlighting its ability to balance model complexity and generalization. VGG19’s successful application is further demonstrated by its results across different datasets, including IsrPoliceDB and three public databases (Table 2). The classification results range from 70% to 95%, showing that the model generalizes across various datasets without overfitting. The results correlate with the size and quality of the images in each dataset. This indicates that VGG19 effectively captures positive samples while minimizing false positives. This ability to generalize across diverse datasets is vital for applying gender classification in practice in a variety of real-world scenarios.

The second part of the study focused on identifying and assessing the critical fingerprint regions that significantly influence gender classification (Figure 2). It became evident that the outer region holds greater importance in gender classification. This conclusion is in harmony with the existing literature and underpinned by the distinct absence of disruptive elements within the outer region [27]. The methodology employed in this study to define the ROI in fingerprint images yields valuable insights for researchers and practitioners dealing with partial or low-quality fingerprint images.

The third part of the study explored DCAI strategies aimed at further enhancing classification accuracy (Figure 3). Among the five DCAI approaches examined, cleanlab-OOD and FLIP-Easy consistently improved F-scores, with average increases of 2.5% and 2.75%, respectively. FLIP-Easy appeared to be the optimal approach for several reasons. First, by eliminating the 5% of the data with the most inconsistent classifications during training, it ensured that the model focused primarily on data for which the classifications were stable, which enhanced the model’s generalization: overfitting was reduced, model reliability was improved, and unseen data could be predicted more accurately. Second, it helped build a leaner, more resource-efficient model. Data points with fluctuating classifications can slow down learning by causing the model to repeatedly adjust its parameters to correct perceived errors; with these unstable elements removed, the model can train faster and use computational resources more efficiently. Lastly, the FLIP-Easy approach encourages a focus on high-quality data. To conclude, FLIP-Easy is the most effective of the examined approaches for constructing a generalizable, efficient, and high-performing model.

This work has additional implications for various security and forensic applications. Fingerprints searched against the database do not always yield a match with the perpetrator, and very low-quality fingerprints may not be sufficient for direct identification. In such cases, gender classification can still play a role in the inclusion or exclusion of suspects and serve as an important investigative lead, focusing the investigator on males or females. Additionally, in cases with limited forensic resources, where not all evidence can be thoroughly analyzed immediately, fingerprint-based gender classification can aid in prioritizing investigative efforts and thereby significantly streamline the investigative process [27].

While this study presents significant findings, it also has limitations. The evaluation of VGG19 could be strengthened by comparing it with other state-of-the-art architectures for a more comprehensive understanding. Additional research can include a broader analysis of real crime scene fingerprint images, beyond the ink fingerprints used in this study. Additionally, the employed methodology can accommodate off-center and rotated fingerprints, but it does not consider scenarios where only a part of the fingerprint is visible. These cases were intentionally omitted from the study, as they were generated during the augmentation process for training the network; investigating partial fingerprints, such as cases of 50% (“half a ring”), is a direction for future research. Overall, the study findings provide a robust foundation for the ongoing exploration of gender classification using fingerprint images.

Author Contributions

Software, D.S. and Y.A.; Validation, S.K.; Formal analysis, N.S.; Investigation, I.H.; Resources, D.Z.; Supervision, A.B.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

Three of the four datasets used in this study are publicly accessible: (1) NIST-DB4 https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=a9278ab9e051084aab0b710e7a17decd464800a8 (accessed on 27 December 2023) [20], (2) SOCOfing https://doi.org/10.48550/arXiv.1807.10609 [21], and (3) NIST-302 https://doi.org/10.6028/NIST.TN.2007 [22]. The fourth is a privately obtained dataset from the Israel Police, referred to as IsrPoliceDB. Permission to use its fingerprint images was granted by the Research and Development unit of the Division of Identification and Forensic Sciences (DIFS). To protect the privacy of the individuals involved, all fingerprint images were used in anonymized form.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Schuch, P. Survey on features for fingerprint indexing. IET Biom. 2019, 8, 1–13.
2. Maltoni, D.; Maio, D.; Jain, A.K.; Prabhakar, S. Handbook of Fingerprint Recognition; Springer: London, UK, 2009.
3. Falohun, A.; Fenwa, O.; Ajala, F. A Fingerprint-based Age and Gender Detector System using Fingerprint Pattern Analysis. Int. J. Comput. Appl. 2016, 136, 43–48.
4. Acree, M.A. Is there a gender difference in fingerprint ridge density? Forensic Sci. Int. 1999, 102, 35–44.
5. Sharma, S.; Shrestha, R.; Krishan, K.; Kanchan, T. Sex estimation from fingerprint ridge density. A review of literature. Acta Biomed. 2021, 92, e2021366.
6. Terhörst, P.; Damer, N.; Braun, A.; Kuijper, A. What can a single minutia tell about gender? In Proceedings of the International Workshop on Biometrics and Forensics (IWBF), Sassari, Italy, 7–8 June 2018; pp. 1–7.
7. Nayak, A.; Nayak, M.T.; Solanki, J.; Mathur, H.; Srivastava, A.; Gupta, A. Comparative analysis of cheiloscopy, pulpal tissue and fingerprint for gender identification. J. Oral Maxillofac. Pathol. 2023, 27, 585–591.
8. Berriche, L. Comparative Study of Fingerprint-Based Gender Identification. Genet. Res. 2022, 2022, 1626953.
9. Jayakala, G. Gender classification based on fingerprint analysis. Turk. J. Comput. Math. Educ. (TURCOMAT) 2021, 12, 1249–1256.
10. Terhörst, P.; Damer, N.; Braun, A.; Kuijper, A. Deep and Multi-Algorithmic Gender Classification of Single Fingerprint Minutiae. In Proceedings of the 21st International Conference on Information Fusion (FUSION), Cambridge, UK, 10–13 July 2018; pp. 2113–2120.
11. Iloanusi, O.N.; Ejiogu, U.C. Gender classification from fused multi-fingerprint types. Inf. Secur. J. 2020, 29, 209–219.
12. Wüllenweber, S.; Giles, S. The effectiveness of forensic evidence in the investigation of volume crime scenes. Sci. Justice 2021, 61, 542–554.
13. Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning; MIT Press: Cambridge, MA, USA, 2016; ISBN 9780262035613.
14. Narayanan, A.; Sajith, K. Gender Detection and Classification from Fingerprints Using Pixel Count. In Proceedings of the International Conference on Systems, Energy & Environment (ICSEE), Kannur, India, 12–13 July 2019.
15. Olufunso, O.S.; Evwiekpaefe, A.E.; Irhebhude, M.E. Determination of gender from fingerprints using dynamic horizontal voting ensemble deep learning approach. Int. J. Adv. Intell. Inform. 2022, 8, 324–336.
16. Shehu, Y.I.; Ruiz-Garcia, A.; Palade, V.; James, A. Detailed Identification of Fingerprints Using Convolutional Neural Networks. In Proceedings of the 17th IEEE Int. Conf. on Machine Learning and Applications (ICMLA), Orlando, FL, USA, 17–20 December 2018; pp. 1161–1165.
17. Qi, Y.; Qiu, M.; Jiang, H.; Wang, F. Extracting Fingerprint Features Using Autoencoder Networks for Gender Classification. Appl. Sci. 2022, 12, 10152.
18. Zha, D.; Bhat, Z.P.; Lai, K.-H.; Yang, F.; Jiang, Z.; Zhong, S.; Hu, X. Data-centric artificial intelligence: A survey. arXiv 2023, arXiv:2303.10158.
19. Jakubik, J.; Vössing, M.; Kühl, N.; Walk, J.; Satzger, G. Data-centric artificial intelligence. arXiv 2022, arXiv:2212.11854.
20. Watson, C.I.; Wilson, C.L. NIST Special Database 4. Fingerprint Database; National Institute of Standards and Technology: Gaithersburg, MD, USA, 1992.
21. Shehu, Y.I.; Ruiz-Garcia, A.; Palade, V.; James, A. Sokoto Coventry Fingerprint Dataset. arXiv 2018, arXiv:1807.10609.
22. Fiumara, G.; Woodgate, B.; Flanagan, P.; Schwarz, M. NIST Technical Note 2007. NIST Special Database 302; National Institute of Standards and Technology: Gaithersburg, MD, USA, 2019.
23. Fiumara, G.; Schwarz, M.; Heising, J.; Peterson, J.; Flanagan, P.; Marshall, K. NIST Special Database 302: Supplemental Release of Latent Annotations; National Institute of Standards and Technology: Gaithersburg, MD, USA, 2019.
24. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556.
25. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
26. Johnson, J.M.; Khoshgoftaar, T.M. Survey on deep learning with class imbalance. J. Big Data 2019, 6, 1–54.
27. Hsiao, C.-T.; Lin, C.-Y.; Wang, P.-S.; Wu, Y.-T. Application of convolutional neural network for fingerprint-based prediction of gender, finger position, and height. Entropy 2022, 24, 475.

Figure 1. Cases of partial or low-quality fingerprints: (a) complete fingerprint, (b) the inner cylindrical region, (c) the outer cylindrical region.

Figure 2. Percentage change in F-score relative to the baseline results in Table 2. For each database, we examined four cases of gender classification based on (a) 50% of the inner region (first row), (b) 60% of the inner region (second row), (c) 50% of the outer region (third row), and (d) 40% of the outer region (fourth row).
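
The inner and outer regions named in this caption can be approximated with a simple image mask. The following is a minimal sketch and not the authors' implementation. It assumes a centered elliptical region and assumes that the percentage refers to a fraction of the image's half-axes; the paper's exact definition may differ. The helper names `region_mask` and `keep_region` and the white fill value are hypothetical.

```python
import numpy as np


def region_mask(height, width, fraction, region="inner"):
    """Boolean mask for a centered elliptical region of an image.

    Assumption (not from the paper): `fraction` scales the half-axes.
    "inner" keeps pixels whose normalized elliptical radius is <= fraction;
    "outer" keeps the band whose radius is > 1 - fraction (corners included).
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    ys, xs = np.mgrid[0:height, 0:width]
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    # Normalized elliptical distance: about 1.0 on the ellipse inscribed in the image.
    r = np.sqrt(((ys - cy) / (height / 2.0)) ** 2 + ((xs - cx) / (width / 2.0)) ** 2)
    if region == "inner":
        return r <= fraction
    if region == "outer":
        return r > 1.0 - fraction
    raise ValueError("region must be 'inner' or 'outer'")


def keep_region(image, fraction, region="inner", fill=255):
    """Return a copy of a grayscale image with everything outside the region set to `fill`."""
    mask = region_mask(image.shape[0], image.shape[1], fraction, region)
    out = np.full_like(image, fill)
    out[mask] = image[mask]
    return out


if __name__ == "__main__":
    # Synthetic all-black image standing in for a fingerprint scan.
    fingerprint = np.zeros((100, 80), dtype=np.uint8)
    inner_half = keep_region(fingerprint, 0.5, "inner")
    outer_band = keep_region(fingerprint, 0.4, "outer")
    print(inner_half.shape, outer_band.shape)
```

Masked images produced this way could then be passed to the trained classifier to compare against the full-fingerprint baseline.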

Figure 3. Percentage change in F-score relative to the baseline results in Table 2. For each database, we examined five DCAI approaches: cleanlab-OOD, FLIP (Easy and Hard), and MOC (Easy and Hard).
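
The percentage change reported in these figures can be computed as sketched below. This is an illustration under an assumption: the change is taken to be relative to the baseline F-score, whereas the paper may instead report a difference in percentage points. The function name `percent_change` is hypothetical, and the example F-score of 0.83 is a made-up value, not a result from the study.

```python
def percent_change(value, baseline):
    """Relative change of `value` with respect to `baseline`, in percent."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return 100.0 * (value - baseline) / baseline


if __name__ == "__main__":
    baseline_f = 0.81      # NIST-302 baseline F-score from Table 2
    hypothetical_f = 0.83  # made-up value for illustration only
    print(f"{percent_change(hypothetical_f, baseline_f):.2f}%")
```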

Table 1. Summary of the fingerprint databases used in this study: IsrPoliceDB, NIST-DB4, SOCOfing, and NIST-302. The ‘X’ and ‘V’ symbols signify the absence and presence, respectively, of specific characteristics within each database.

Database Name	Age	Gender	Ethnic Group	# of Impressions	Image Resolution	Type	Size: Total; Male/Female
IsrPoliceDB	X	V	Israel	1	JPG, 1000 dpi, 688 × 975 px	Plain	1271; 619 (50%)/652 (50%)
NIST-DB4	X	V	Not mentioned	2	PNG, 96 dpi, 512 × 512 px, 8 bit, gray	Plain (2 impressions of each finger)	4000; 3250 (81%)/750 (19%)
SOCOfing	X	V	African	1	BMP, 500 dpi, 96 × 103 px	Plain (600 subjects × 10 fingers × 1 impression)	6000; 4770 (75%)/1230 (25%)
NIST-302	V	V	American	20+	PNG, 500 dpi, 256 × 360 px	Plain, rolled, 4-finger, 2-thumb, palm	2000; 1583 (75%)/396 (25%)

Table 2. Results of gender classification using the selected VGG19 model on the four different datasets: our internal IsrPoliceDB and the three public datasets—NIST-DB4, NIST-302, and SOCOfing.

Database Name	Train Accuracy	Test Accuracy	F-Score	Precision	Recall	Train/Test Size
IsrPoliceDB	0.97	0.96	0.96	0.96	0.96	1016/255
NIST-DB4	0.94	0.90	0.82	0.86	0.79	800/200
NIST-302	0.88	0.84	0.81	0.81	0.81	1583/396
SOCOfing	0.84	0.80	0.68	0.69	0.68	4800/1200
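
As a consistency check on Table 2, the F-score can be recomputed from the reported precision and recall. The sketch below assumes the standard F1 definition (the harmonic mean of precision and recall); the paper does not state which averaging scheme was used, so small differences may reflect rounding or per-class averaging. The function name `f_score` and the dictionary layout are hypothetical.

```python
def f_score(precision, recall):
    """F1 score: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# (precision, recall, reported F-score) copied from Table 2.
table2 = {
    "IsrPoliceDB": (0.96, 0.96, 0.96),
    "NIST-DB4": (0.86, 0.79, 0.82),
    "NIST-302": (0.81, 0.81, 0.81),
    "SOCOfing": (0.69, 0.68, 0.68),
}

if __name__ == "__main__":
    for name, (precision, recall, reported) in table2.items():
        computed = f_score(precision, recall)
        print(f"{name}: computed {computed:.3f}, reported {reported:.2f}")
```

Under this assumption, the recomputed values agree with the reported F-scores to within rounding of the two-decimal inputs.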

Table 3. Results of gender classification using fingerprint images employing the NIST-302 dataset across two popular CNN architectures (VGG and ResNet). VGG16, VGG19, ResNet18, ResNet50, and ResNet101 were evaluated. The results are presented in terms of accuracy, precision, recall, and F-score.

Network Name	Train Accuracy	Test Accuracy	F-Score	Precision	Recall
VGG16	0.92	0.83	0.80	0.80	0.81
VGG19	0.87	0.84	0.81	0.81	0.81
ResNet18	0.97	0.76	0.77	0.78	0.78
ResNet50	0.85	0.75	0.76	0.78	0.74
ResNet101	0.87	0.76	0.77	0.79	0.75

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

