Review

What Is Hidden in Clear Sight and How to Find It—A Survey of the Integration of Artificial Intelligence and Eye Tracking

by Maja Kędras and Janusz Sobecki *
Department of Informatics, Faculty of Computer Systems and Management, Wrocław University of Science and Technology, 50-370 Wrocław, Poland
* Author to whom correspondence should be addressed.
Information 2023, 14(11), 624; https://doi.org/10.3390/info14110624
Submission received: 14 October 2023 / Revised: 9 November 2023 / Accepted: 13 November 2023 / Published: 20 November 2023
(This article belongs to the Special Issue Recent Advances and Perspectives in Human-Computer Interaction)

Abstract:
This paper presents an overview of applications that combine eye tracking with artificial intelligence. Several aspects of both the eye tracking setups and the applied AI methods are analyzed: the eye tracking hardware and sampling frequency used, the number of test participants, additional parameters, the extraction of features, the artificial intelligence methods used and the methods of verification of the results. Finally, the paper compares and discusses the results reported in the analyzed literature.

1. Introduction

In the era of system personalization and with growing emphasis on the user experience, eye tracking technologies are becoming increasingly in demand. Since eye tracking generates an immense amount of data, it is very challenging, or in some cases practically impossible, to process it by hand. One of the solutions to this problem is using artificial intelligence, which is able to automatically identify problem areas or places of a particular user’s attention. The abundance of data is not the only issue which may be solved by artificial intelligence (AI).
Eye tracking in general is used to monitor the places of human sight concentration. Since the beginning of eye tracking studies in the nineteenth century, many different technologies and methods have been established and introduced into various disciplines [1]. Currently, the most common technology used in eye tracking is video recording of the eyes using natural or infrared light. For over thirty years, eye trackers have been widely used in different types of UX and psychology studies [2], such as the usefulness of web or desktop applications, perception of information in the form of graphics or texts, comparative studies of the effectiveness of system interfaces, correlation of eye tracking data with the strategy of searching for information in web systems, etc.
Most of the eye tracking research is conducted using specialized sensors or devices, which are often very expensive and may require specialized knowledge to operate. Regardless of the device used, they generate a large amount of raw data, which grows with the sampling rate. These data must be processed to find the patterns of the eye’s focus, which is usually carried out with the help of different AI tools, especially machine learning (ML) algorithms.
AI is a field of computer science that aims to create intelligent machines. The main problems addressed by AI include programming computers for capabilities such as knowledge, reasoning, problem-solving, perception, learning, planning and the ability to manipulate and move objects [3]. Machine learning (ML) is an application of AI based on the idea that machines are given access to data and learn from it for themselves [4]. ML algorithms use computational methods to “learn” information directly from data, without relying on a predetermined equation as a model, and become more accurate at predicting outcomes without being explicitly programmed to do so.
Currently, the application of AI methods allows researchers to conduct similar experiments using just a consumer-grade camera, which makes this kind of research more accessible. The combination of these two technologies can also be used in various eye movement recognition systems, for example, speech generation systems for paralyzed people, fatigue detection systems, or even in virtual reality games.
Combining eye tracking with artificial intelligence can bring many benefits to science. However, the number of publications on this subject remains relatively small. In this work, the existing applications of these technologies will be reviewed and their quality and usefulness assessed. Further possible directions for the development of this field will also be proposed.
The content of the paper is as follows. Section 2 introduces the research methodology of the review. Section 2.1 describes and categorizes the applications of the combination of eye tracking and artificial intelligence. Section 2.2 describes the eye trackers and the sampling frequencies used in the analyzed literature. Section 2.3 describes the number of people participating in the research and provides the available information about them. Section 2.4 contains information about the additional parameters used with eye tracking data. Section 2.5 categorizes the feature extraction types, whereas Section 2.6 analyzes the artificial intelligence methods used in the research and their number per study. Section 2.7 analyzes the methods of verifying the results and their number per study. Section 3 presents comparable results obtained in the analyzed literature. The last sections provide a summary and discussion of the collected data.

2. Materials and Methods

This survey is based on the Systematic Literature Review (SLR) methodology [5]. This methodology allows the work of other researchers in the field to be summarized in an orderly and reproducible manner. It was used here to investigate the use of artificial intelligence in the field of eye tracking. For this purpose, the following research questions were asked:
  • Which eye trackers were used when collecting data for AI?
  • Which sampling frequencies were used when collecting data for AI?
  • What kind of non-eye tracking parameters were used when AI was used?
  • How many people participated in the experiments collecting eye tracking data to be used with AI?
  • What is the gender distribution of the participants and what age range are they in?
  • How were the features extracted?
  • How many artificial intelligence methods were used in one eye tracking study?
  • Which methods of artificial intelligence were used with eye tracking data?
  • How were the results of using AI with eye tracking data verified?
For this survey, papers were collected using the Scopus database, covering the period from 2015 to 2020. Only papers available in English were taken into account. Five queries were used:
  • eye AND tracking AND artificial AND intelligence;
  • eye AND movement AND artificial AND intelligence;
  • gaze AND estimation AND artificial AND intelligence;
  • smartphone AND eye AND tracking;
  • webcam AND eye AND tracking.
The above queries relate to words present in the article keywords, titles and abstracts. The search results overlapped, so duplicates were removed before proceeding to abstract analysis. Then, the criteria for rejecting a publication were selected. They are as follows:
  • The research analyzed only static images.
  • The research detected only the eye position.
  • The eye tracking data were collected using Electroencephalography (EEG).
  • Artificial intelligence was not used on eye tracking data nor to calculate eye tracking data.
  • The paper is not accessible.
After application of the procedure described above, 93 papers were selected.
The methodology employed in this systematic literature review is a robust approach for summarizing the work of other researchers in the field of eye tracking and artificial intelligence. However, in conjunction with the established selection criteria, there are several potential limitations to consider. To start with, this review relies on papers available in the Scopus database, which may not include all relevant research in the field. Additionally, the use of works exclusively in English leads to language bias, which may have caused the omission of valuable studies. An important limitation is also the period from which the works for analysis were selected. The field of artificial intelligence and eye tracking is rapidly evolving, and this time frame may omit recent developments and studies that were not yet published at the time of conducting the survey. Lastly, while the search queries used are comprehensive, it is possible that some relevant studies were missed due to the limitations of the keyword search. Different terminology or less common keywords might not have been included in the search.

2.1. Applications of Artificial Intelligence Enhanced Eye Tracking

Artificial intelligence-enhanced eye tracking has applications in many different areas. Teaching and learning applications are the largest group observed. Several of those studies use eye tracking data to predict student performance [6,7,8,9]. Another interest is reading, in terms of predicting reading ability [10], recognizing reading behavior [11,12], detecting readability [13] and detecting words which are difficult for the readers [14]. In the case of word analysis, word comprehension was also predicted [15]. Another use of AI and eye tracking was in identifying levels of comprehension [16]. It was also used to predict SAT scores [17], cognitive abilities [18] and learning curves [19] and to detect the speed of learning [20]. The next usage of AI and eye tracking concentrated on predicting the type of prior disclosure [21]. The last application from this group was in predicting the social plane of interaction of a teacher conducting their classes [22].
The second group of applications of eye tracking enhanced with artificial intelligence is in emotion recognition. The first of the considered studies focused on predicting which of several emotions was being felt by the participants [23], and the second predicted the aesthetic impression of a website [24]. Others were predicting reactions to advertising [25], predicting perceived face attractiveness [26], recognizing affect [27] and recommending paintings which the participants would like [28]. The rest of the papers focused on a single emotion: excitement [29], enjoyment [30], interest [31], confidence [32], confusion [33], stress [34] and satisfaction [35]. All of the listed publications focused on the participants’ own emotions, but two additional studies were considered which predicted the emotions of other people using the eye tracking data of their observers [36,37].
The third group distinguished in this paper is that of medical applications. AI-enhanced eye tracking is mostly used to detect neurological disorders such as autism spectrum disorder [38,39], schizophrenia [40], Parkinson’s disease [41] or dyslexia [42]. Three of the considered publications used eye tracking data to predict whether the patient has any neurological disease [43,44,45] and one detected organs in fetal ultrasound images [46].
Another group which could be identified is that of human behavior. One example from that group is detecting the type of the participant’s activity, both when using a computer [47,48] and in everyday life [49,50,51,52]. Another usage is in predicting decision strategy [53] as well as in ethical decision making [54]. Eye tracking data were also used to teach AI to play games [55], predict dwell time in museums [56], automatically assess surgery skills [57] and detect eye contact [58]. In terms of human behavior, research about intention was also found [59,60,61,62].
The fifth group distinguished in this survey is research on predicting tiredness, attention [63,64] and distraction [65,66,67,68]. In contrast to attention and distraction, tiredness was not detected directly [69]. It was indicated by task demand [70], take-over time [71], mental workload [72], operator overload [68] and reduced alertness [73].
The sixth identified group is research using eye tracking data as a way to interact with software. It was used to authenticate the user, both by entering a password [74,75,76,77] and by using sclera biometrics [78], to detect defined gestures [79,80,81,82], to detect the desired direction of movement [83] or to choose an answer in a questionnaire form [84].
A separate category is that of gaze estimation, which was carried out in 13 studies [85,86,87,88,89,90,91,92,93,94,95,96,97].
The last two studies, which do not fit any of the described categories, used eye tracking data to recognize and classify objects [98] and to distinguish Chinese ethnic groups [99].

2.2. Eye Trackers

Researchers used the following eye trackers (Figure 1). It is worth noting that in two studies [12,24] two different eye tracking devices were used. One used SMI and Tobii and the other used Tobii and EyeTribe eye trackers [24]. In both cases they were used in parallel and were not compared.
Almost a quarter of the studies were conducted using Tobii hardware, and more than one tenth used a simple web camera. Sadly over 8% of papers did not specify the type of eye tracker used, but even without that data we can say that there are a lot of options for eye tracking. In total, 12.9% of studies used hardware which was used only by them in the scope of the analyzed research. On the one hand, it prevents monopolization of the market by a single manufacturer, which also means that eye tracking research is more accessible to perform, but on the other hand it may make studies harder to compare.
Since Tobii eye trackers have been used the most times, it is possible to observe the use of many different models. The most common was the Tobii T120 (5 papers), next was the Tobii EyeX (4 papers) and lastly the Tobii T60, Tobii X1 and Tobii X2-30 (two papers each). The Tobii models used in only one paper each were the Tobii 175, Tobii 4C, Tobii Steelseries Sentry, Tobii TX300, Tobii X2-60, Tobii X3-120 and Tobii X300. One paper did not specify the exact model used. Regarding the SMI hardware, the most popular model was the SMI RED 250 (4 papers). One paper used the SMI RED 4 and two did not specify the model they used.
In terms of sampling rate, 4.3% of studies used external data and 38.7% of studies did not specify it. The remaining 57% of papers used the sampling rates shown in Figure 2; where an interval was given, the minimum value was included (Figure 2).
The frequencies used only by one paper were (in Hz) 4.5, 5, 10, 15, 17, 28, 50, 150, 176, 240, 256, 3000 and 8000.
There is no clear consensus between the researchers about the proper eye tracking sampling frequency, but there is a tendency to use higher frequencies (above 200 Hz) when using velocity-based event detection algorithms [100]. In terms of detecting saccades and fixations, a change of the sampling frequency from 60 Hz to 120 Hz does not seem to provide a significant improvement in the fixation detection rate [101], but it is important when evaluating saccades [96]. For this reason, frequencies lower than 200 Hz are discouraged in studies of saccade speed [102]. Overall, since fixations last longer, they can be detected with lower frequencies than saccades and microsaccades [103]. That is why low-level research connected with visual cognition usually requires frequencies of 1000 Hz to 2000 Hz [104].
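To illustrate why sampling frequency matters for velocity-based event detection, the following is a minimal sketch of a velocity-threshold (I-VT) classifier in Python. The 30 deg/s threshold, the synthetic data and the assumption of gaze positions in degrees of visual angle are illustrative choices, not a method taken from any of the surveyed papers.

```python
import numpy as np

def ivt_classify(x_deg, y_deg, sampling_hz, velocity_threshold=30.0):
    """Label each gaze sample as fixation (0) or saccade (1) using a simple
    velocity threshold (I-VT). Positions are assumed to be in degrees of
    visual angle; the threshold (deg/s) is an illustrative assumption."""
    x = np.asarray(x_deg, dtype=float)
    y = np.asarray(y_deg, dtype=float)
    dt = 1.0 / sampling_hz
    # Point-to-point angular velocity in deg/s.
    velocity = np.hypot(np.diff(x), np.diff(y)) / dt
    # Pad so that the labels align with the original samples.
    velocity = np.concatenate(([0.0], velocity))
    return (velocity > velocity_threshold).astype(int)

# Illustrative use with synthetic 60 Hz data: a fixation, a jump, a fixation.
x = np.concatenate([np.full(30, 1.0), np.linspace(1.0, 10.0, 5), np.full(30, 10.0)])
y = np.zeros_like(x)
labels = ivt_classify(x, y, sampling_hz=60)
print(labels)  # 1s mark the high-velocity (saccadic) samples
```

At lower sampling rates the per-sample velocity estimate becomes coarser, which is one reason higher frequencies are preferred for saccade-oriented analyses.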
In terms of this survey, we can observe a tendency to use frequencies of 30 Hz and 60 Hz. The 30 Hz frequency gained its popularity because it was used in the American NTSC television standard, whereas 60 Hz was commonly used in cameras. The third sampling rate, in terms of the number of studies which used it, is 120 Hz, and apart from that we can clearly see that, as in the case of eye trackers, there is no tendency to use one particular frequency. There is also no justification given for the selected frequency; at most, researchers note that the frequency was sufficient for the conducted research. This suggests that scientists simply use the highest sampling rate of the eye tracker at their disposal. The use of lower frequencies occurs mainly when there is a need to synchronize an eye tracker with another sensor which has a lower sampling rate. There is no clear reason for the highest sampling frequencies. They were used for detecting the speed of reading (8000 Hz), detecting people with dyslexia (3000 Hz), predicting web user click intention (1000 Hz), using eye tracking data as an input for teaching AI to play computer games in a similar way to humans (1000 Hz), detecting cognitive health (500 Hz), predicting intention (500 Hz) and detecting reading abilities (500 Hz). We can say that they were used for research connected with cognition and mental health, that is, psychological studies.
In terms of the lowest sampling frequencies, most of those under 30 Hz were used when a web or mobile camera was chosen as the eye tracker (4), some used Tobii hardware (3) and one used the HTC Vive. They were usually used for tasks related to detecting predefined types of behavior: predicting targets (17 Hz), detecting eye contact (25 Hz), task recognition (25 Hz) and behavior identification (28 Hz). They were also used for attention estimation (5 Hz) and identifying levels of user comprehension (15 Hz). The study with the lowest sampling frequency (4.5 Hz) also used higher sampling frequencies; since it was conducted by the participants themselves using their own hardware, 4.5 Hz was merely the lowest frequency that occurred, not the only one. The aim of that study was gaze estimation.
In terms of illumination during an eye tracking experiment, 25.8% of the papers included some information about lighting conditions, but almost none of them provide enough detail to reproduce the experiment under similar conditions. They only mention that the illumination was kept constant or similar throughout the experiment. Only one paper included results for different illuminations. When considering only experiments conducted using cameras rather than eye trackers, the percentage of papers including information on lighting conditions is 57.89%, which is considerably higher than the overall percentage. This is understandable, since cameras are more sensitive to changes in lighting than eye trackers; nevertheless, it would be advisable for such data to be reported more precisely and more frequently.

2.3. Participants

Overall, 49.25% of the participants of the described studies were men and 50.75% were women. However, when looking at the average proportions per study, there were usually 55.91% men and 44.09% women. Sadly, not all researchers specified the gender ratio of their participants, so the figures given are based only on the studies that did so.
The size of the research group varies greatly, with the smallest consisting of only one person and the largest having 2334 participants. As with eye trackers and sampling rates, there is no clear trend here. Researchers usually chose a group of 17 to 33 people, and it can be theorized that, again, this is simply the smallest group which allows for obtaining statistically significant results.
Since gathering participants may be the one of the biggest challenges of eye tracking studies, it may be surprising that only 5.38% of the analyzed papers used databases created by other researchers, but this may be explained by the fact that such databases are few and they may not be sufficient for very specific applications (Figure 3).
As many as 31.18% of the papers did not give additional information about the participants. A total of 22.58% clearly defined the participants of their study as students and 8.6% stated an age range which strongly suggests that their participants were also students. Finally, 4.3% of the papers described their participants as children. Clearly, the biggest issue is papers not giving proper information about their participants. However, based on the available data, we can infer that most of the research is carried out on young people, in particular students, while adults and the elderly are not adequately represented in eye tracking research.
Other information usually lacking from the papers concerned the participants’ vision. Only 22.58% included such information, but in most cases (42.86% of the papers with information about the participants’ vision) it only stated that participants had normal or corrected-to-normal vision, without giving accurate data on the proportion and method of correction. A total of 23.81% of papers included information about the number of participants wearing glasses, and the same proportion of papers included only people with correct vision. Only one paper conducted an experiment with participants both wearing and not wearing glasses, and one stated that participants were not asked to remove their glasses during the experiment, which suggests that there were participants wearing glasses in that study.
Additionally, it is worth noting that only 17.2% of papers contained information about the approval of the research by an ethics committee. One paper took Google’s AI Principles into consideration when designing its experiment but has not included any information about the approval of an ethics committee.

2.4. Additional Data for Artificial Intelligence

Additional parameters were used for artificial intelligence by 23.65% of the analyzed studies. The most commonly used data were obtained with electroencephalography (EEG), a non-invasive method of recording electrical activity on the scalp which is used to determine brain activity. Since many experiments which have used eye tracking data are connected with cognition, this is understandable. Equally popular are movement and position data. Position data might be especially important since they may influence eye tracking data. Another parameter which can be used is a video of the face, which allows researchers to estimate the emotions of study participants. An equally common parameter is the time which the participant needed to perform the task under consideration. Lastly, data about the study subjects were used: four studies used their age and three used their gender.
Since the area of eye tracking research is quite wide, the parameters that may be considered are also quite diverse and sometimes highly study-specific, such as data describing the studied texts, images or videos. All of the additional parameters used in the remaining papers are included in Table 1.

2.5. Features Extraction

Every application of AI methods usually begins with feature selection. The choice of a proper feature selection method has a very large impact on the results obtained by the AI algorithms, no matter the area of application. This is of even greater importance when the AI algorithms have to deal with large amounts of data, as is usually the case in image processing applications. This is also the case in eye tracking. One of the main features of all eye trackers is the frequency of gathering data on the participant’s eye focus coordinates, as well as eye blinks and pupil size, which can vary with changes in lighting and the mental state of the participant [105]. The eye tracker frequency may vary from 15 Hz to 1000 Hz. With increasing frequency the amount of data increases, so in some cases we need to select features which aggregate the raw data, such as dwell times on AOIs or heat maps [12], as illustrated by the sketch below.
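The following is a minimal sketch of such aggregation: it collapses raw (x, y) gaze samples into total dwell time per area of interest. The rectangular AOIs, pixel coordinates and sampling rate are hypothetical values for illustration only.

```python
from typing import Dict, List, Tuple

# Each AOI is a rectangle: (x_min, y_min, x_max, y_max) in screen pixels.
AOI = Tuple[float, float, float, float]

def dwell_times(samples: List[Tuple[float, float]],
                aois: Dict[str, AOI],
                sampling_hz: float) -> Dict[str, float]:
    """Aggregate raw (x, y) gaze samples into total dwell time (seconds)
    per area of interest. Samples outside every AOI are ignored."""
    dt = 1.0 / sampling_hz
    totals = {name: 0.0 for name in aois}
    for x, y in samples:
        for name, (x0, y0, x1, y1) in aois.items():
            if x0 <= x <= x1 and y0 <= y <= y1:
                totals[name] += dt
    return totals

# Illustrative AOIs for a web page: navigation bar and main content area.
aois = {"nav": (0, 0, 1920, 150), "content": (0, 150, 1920, 1080)}
samples = [(400, 80), (410, 85), (900, 600), (905, 610), (910, 620)]
print(dwell_times(samples, aois, sampling_hz=60))
```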
Feature extraction is also crucial when working with visual imagery. Simplification may include scaling down images, converting them to grey-scale and using Principal Component Analysis. Such an approach can transform sets with thousands of features to several dozen components ready for further analysis [106]. This method may be further improved by the autoencoding technique, which efficiently reduces dimensionality and extracts meaningful features from eye-tracking data [107].
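A minimal sketch of this kind of reduction, assuming scikit-learn is available and using random arrays as stand-ins for real grayscale eye-region crops, might look as follows; the image size and number of components are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA

# Stand-in data: 200 grayscale crops of 32x32 pixels with values in [0, 1].
rng = np.random.default_rng(0)
images = rng.random((200, 32, 32))

# Flatten each image into a 1024-dimensional pixel feature vector.
X = images.reshape(len(images), -1)

# Reduce the thousands of pixel features to a few dozen components.
pca = PCA(n_components=30)
X_reduced = pca.fit_transform(X)

print(X_reduced.shape)                      # (200, 30)
print(pca.explained_variance_ratio_.sum())  # share of variance retained
```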
Below we present a categorization of the feature extraction types present in the analyzed literature. We propose the following categories:
A. Typical eye tracking data which are gathered via typical eye tracker software, such as Tobii Studio or Tobii Pro Lab [104];
B. Eye tracking data after some additional processing not present in typical eye tracker software, such as more sophisticated statistics or transformations (i.e., DWT);
C. The application of some basic ML algorithms to eye tracker data, such as k-means or decision trees;
D. The application of neural networks or deep learning;
N. Not specified in the article.
The figure below presents the percentages of each of the feature extraction types used in the analyzed literature (Figure 4).

2.6. Artificial Intelligence Methods Used with Eye Tracking

Most of the surveyed studies used only one artificial intelligence method, but as many as 37.6 percent decided to compare at least two different AI methods. It is worth noting, however, that in many works using only one AI method, the researchers compared results obtained with different parameters (Figure 5).
Testing different methods of artificial intelligence is very beneficial because often there is no clear rationale for using one particular solution instead of another.
This is especially true when we look at the types of methods used. As many as 40.9 percent of the works used AI methods that did not appear in any of the other considered works, which clearly shows the variety of available solutions. It is also difficult to observe tendencies to use specific algorithms in specific areas of research. The most popular method overall was the support-vector machine (SVM), which is commonly used for data classification, especially in the field of image recognition. The second choice for researchers was Random Forest, which creates a multitude of decision trees and calculates the result based on the predictions of the individual trees. The third method was the convolutional neural network (CNN), which is commonly used in image recognition. It is worth noting that the multilayer perceptron (MLP) and CNN are both types of artificial neural networks (ANN), which, combined, were used in 34.5% of the considered papers, but since the researchers listed them separately, this separation was kept in this survey.
By analyzing the use of SVMs and neural networks (all types), changes in their popularity over the years can be noticed. In 2015, the SVM was used in 65% of the surveyed publications, whereas 29% had chosen neural networks. From that moment on, SVMs began to be used less and less, and neural networks more and more often. In 2020, SVMs were used in 22% of the analyzed studies, and neural networks in 56%. Neural networks are becoming an increasingly popular method of artificial intelligence in many fields, and it is not surprising that this is the case for eye tracking applications. There are more and more ready-made solutions that allow scientists to use this method even without detailed knowledge of it and, what is more, they allow the use of incomplete data. Much smaller changes in use occurred in the case of Random Forest, which was used in 30% of publications in 2016 and then decreased its share to 22% in 2020. The continuing popularity of Random Forest is probably due to the ease of using this method while it provides fairly good results (Figure 6).
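As a sketch of how two of the popular methods could be compared on the same eye tracking feature set, the example below cross-validates an SVM and a Random Forest with scikit-learn; the synthetic feature matrix, binary labels and hyperparameters are placeholders rather than settings taken from any surveyed study.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Placeholder feature matrix: one row per trial, columns such as mean
# fixation duration, fixation count, mean saccade amplitude, etc.
rng = np.random.default_rng(42)
X = rng.normal(size=(120, 6))
y = rng.integers(0, 2, size=120)  # hypothetical binary label, e.g. "distracted" or not

models = {
    "SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=0),
}

for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    print(f"{name}: {scores.mean():.2f} +/- {scores.std():.2f}")
```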
The artificial intelligence methods which were used by only one paper include a bag of visual words, Bayes net, Bayesian classifier, Bayesian lasso regression, boosted logistic regression, canopy, CNN + long short-term memory, decomposition tree, Deep Bayesian Network, discriminant analysis, DNN, double q-learning, extremely randomized trees, farthest first, generalized additive models, generative model base method, gradient boost, hidden-state conditional random fields, hierarchical clustering, lasso regression, least-squares regression, low-rank constraint, Mahalanobis distance-based classifier, mixed group ranks, multi-layer combinatorial fusion, multinomial logistic regression, radial basis function, random sample consensus, recurrent neural network, recurrent neural networks with long short-term memory, repeated incremental pruning to produce error reduction, semi-supervised extreme learning machine, sequential minimal optimization, Static Bayesian Network with supervised clustering, a strengthened deep belief network, Tabu search, transfer learning and a Viola-Jones algorithm with haar cascade classifiers.
In terms of the best results, we can see a similar tendency. Most often, the best results were achieved by AI methods which were used in only one paper, then by SVMs, followed by Random Forest and convolutional neural networks. If a study used only one AI method, that method was counted as its best (Figure 7).
The artificial intelligence methods which achieved the best result in only one paper include Gaussian process regression, Bayesian lasso regression, a long short-term memory network, a generative model base method, DNN, transfer learning, a Deep Bayesian Network, Tabu search, a decision tree, low-rank constraint, linear discriminant analysis, an ensemble, lasso regression, naive Bayes, a strengthened deep belief network, a semi-supervised extreme learning machine, support vector regression, a decomposition tree, a recurrent neural network, CNN + long short-term memory, a Viola-Jones algorithm with Haar cascade classifiers, recurrent neural networks with long short-term memory, random sample consensus and linear regression.

2.7. Methods for Verification of the Results

Almost three quarters of the researchers used only one method of verification for their results, while the rest used from two to five methods (Figure 8).
Most of the studies reported accuracy as the result of their study, but, secondarily, there are methods used by only one paper which are often study-specific. This makes it extremely difficult to compare different results, so a comparison will be made only between the results which were verified using the accuracy value. Apart from that, the next three most popular indicators were precision, recall and f-score (also called F1-score). Including accuracy, they are all based on the components of the confusion matrix: true positives (TP), false positives (FP), false negatives (FN) and true negatives (TN). Their formulas are as follows [108]:
accuracy = (TP + TN) / (TP + FP + FN + TN)
precision = TP / (TP + FP)
recall = TP / (TP + FN)
f-score = (2 · precision · recall) / (precision + recall)
All papers used those formulas or did not specify which ones they had used, possibly considering them universally accepted (Figure 9).
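For completeness, the short sketch below computes the four metrics listed above directly from confusion matrix counts; the counts themselves are hypothetical.

```python
def confusion_metrics(tp: int, fp: int, fn: int, tn: int):
    """Compute accuracy, precision, recall and f-score from confusion matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_score = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f_score

# Illustrative counts from a hypothetical binary classifier.
acc, prec, rec, f1 = confusion_metrics(tp=40, fp=10, fn=5, tn=45)
print(f"accuracy={acc:.2f} precision={prec:.2f} recall={rec:.2f} f-score={f1:.2f}")
```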
The verification methods used by only one paper include accuracy in degrees, angular error, average distance error, average error, average hit ratio, average success rate, average visual angle error, confidence, cross-validation error, D error, discounted cumulative gain, equal error rate, error, error in centimeters, false positive rate, G-mean, gaze estimation bias (degrees), improvement in game score, Mann–Whitney u-value, mean absolute residual, mean and standard deviation of fp and fn, mean angular error, mean squared error, overall average error, R-squared, relative difference to baseline, reliability, root mean squared error, percent of screen size error, sensitivity index, specificity, support and visual angle.

3. Results

Where no single result was given, the best result obtained in a given publication is reported. Some of the papers did not specify the number of participants (ns) and some used external data sources (ext.) (Table 2).
When looking at the results, we can see that the SVM gives better results than Random Forest, although this may simply be because the SVM was used more often. Neural networks, however, seem to give consistently higher results than Random Forest, even though Random Forest produced two results higher than any achieved by neural networks in this survey.
There is no clear correlation between the number of participants and the resulting accuracy of the AI. A larger number of subjects might lead to having too diverse a dataset, which may make the prediction more challenging, but on the other hand, having a smaller number of participants may lead to a dataset not diverse enough to properly predict the desired parameters.
The papers which used more than one artificial intelligence method produced slightly better results. It may also be worth noting that the half of the papers with higher accuracies did not use sampling frequencies higher than 256 Hz or lower than 30 Hz. Sadly, some of the frequencies were not specified. Furthermore, of the top 20 results, only one used an additional parameter, which may suggest that studies using information clearly associated with eyeball movement give the best results, and that combining eye tracking data with other kinds of data is not always the best choice.
There seems to be no correlation between the result and the type of eye tracker, especially since the researchers used many different devices.

4. Discussion

Eye tracking and artificial intelligence appeared in a variety of applications connected to measuring academic performance, emotion recognition, medical studies, human behavior and tiredness detection. This combination also made it possible to use eye movement as an input and to track it using digital, web and mobile cameras.
The choice of the eye tracker, sampling frequency, artificial intelligence algorithm and verification method is characterized by a huge variety, which may result from the various fields of application. Many researchers decide to use unique solutions that do not appear in other works, which may indicate that this remains a new field of research, which is just developing research standards.
There are some noticeable trends, though. The most popular eye trackers were made by Tobii, and the most common sampling frequencies were 30 and 60 Hz. The most popular artificial intelligence methods used for eye tracking data analysis were the SVM and Random Forest, while the results were most often judged on the basis of their accuracy. Unfortunately, information about the illumination during the experiments is usually lacking.
The research was carried out on groups of various sizes, most often from 17 to 33 people, with a relatively equal gender division. On the other hand, almost one quarter of the experiments were conducted on students, while adults and the elderly were usually underrepresented. Sadly, information about the participants’ eyesight was usually not included, and even when it was, it was not detailed enough to replicate the study. Another issue was the lack of clear information about the approval of an ethics committee.
No single parameter clearly leading to the higher accuracy of artificial intelligence using eye tracking data has been found. Recommendations could include the use of a sampling rate between 30 and 256 Hz and the use of more than one artificial intelligence method.
Clearly, eye tracking data analysis, especially with the support of artificial intelligence, can teach us a lot about human nature, and there is still a lot to discover. The fields in which this technology can be used are very diverse, which on the one hand makes it difficult to compare the results, but on the other hand shows the great possibilities of its use. The accuracy of some solutions still leaves room for improvement, but they still show various correlations between human behavior and emotions, due to which they can act as a clue for researchers in the fields of psychology, medicine, didactics, etc., when choosing the subjects of their research.
Future research directions offer a wealth of opportunities, including expanding this technology’s applications across diverse demographic groups, enhancing multimodal integrations, and exploring novel clinical, educational, and cross-cultural domains. Exploring education and the detection of psychological disorders holds great promise as a starting point for future research. In education, the integration of AI-enhanced eye tracking can transform teaching and learning methods, while in the realm of psychology, it offers potential for early diagnosis and interventions, including interventions in real time during tasks like driving. These research directions are poised to yield practical, impactful solutions. To further advance the field, interdisciplinary collaborations could foster the development of holistic solutions that draw from various domains and have broader societal impacts. Additionally, cross-cultural validation is essential, particularly in emotion recognition and behavior prediction, to ensure the cultural sensitivity and accuracy of AI models. Real-time interventions based on gaze behavior in educational settings hold potential for enhancing learning outcomes.
There are, however, some areas which should be improved in future studies. In many papers, vital information about the study participants is missing, leaving a significant gap in our understanding. This lack of information spans from the basic number of participants to more detailed aspects like gender distribution, age ranges and eyesight parameters (such as the use of glasses). Such information should be included by future researchers to ensure that their studies are more comprehensive and ultimately more impactful. Perhaps the most crucial improvement in terms of participant information is the need for the inclusion of documented approval from ethics committees.
A promising avenue for collaborative progress could revolve around the establishment of comprehensive eye tracking databases, allowing researchers from diverse backgrounds to access and analyze this valuable resource. It is noteworthy that a mere 5.33% of papers in our study drew upon external data sources, signifying a missed opportunity for the broader scientific community to benefit from shared datasets and foster collective advancements in eye tracking research. Incorporating additional parameters, such as electroencephalography (EEG), participant position, and movement data, as well as experiment-related factors like duration and stimulus parameters, holds the potential to enrich the depth and breadth of eye tracking studies. These additional dimensions offer a holistic perspective on the cognitive and contextual aspects influencing visual attention, paving the way for more comprehensive and nuanced findings in the field of eye tracking research. Researchers should consider these multifaceted variables as valuable assets in their quest to unravel the complexities of visual perception and cognition.
Furthermore, fostering the use and comparison of multiple AI methods within research endeavors is poised to substantially elevate the quality and rigor of eye tracking studies. The absence of a clear rationale for choosing one AI solution over another underscores the need for comprehensive comparisons. By exploring and evaluating various AI techniques, researchers can identify the most effective solutions tailored to the specific challenges they aim to address.
Lastly, a critical advancement required in the field of eye tracking research lies in the inclusion of standardized and diverse methods for analyzing result verifications. While accuracy emerges as the most frequently employed parameter, it is crucial to consider its future integration with additional performance metrics like precision, recall, and F-score. These established metrics provide a more comprehensive understanding of the true accuracy of AI models. Furthermore, the research community may find it advantageous to explore or develop novel verification methods tailored specifically to the nuances of AI and eye tracking. This approach not only enhances the quality and precision of eye tracking research but also facilitates the comparability of results across different papers. By establishing standardized verification methods and broadening the spectrum of the metrics employed, researchers can effectively compare and benchmark their findings with those from other studies.

5. Conclusions

Eye tracking data analysis, particularly when combined with artificial intelligence, offers valuable insights into human behavior and emotions. Its versatile applications make result comparison challenging but highlight its immense potential. While their accuracy can still be improved, these solutions provide valuable insights for researchers in psychology, medicine, education, and other fields, when selecting their research subjects.

Author Contributions

Conceptualization, M.K. and J.S.; methodology, M.K. and J.S.; software, M.K. and J.S.; validation, M.K. and J.S.; formal analysis, M.K. and J.S.; investigation, M.K. and J.S.; resources, M.K. and J.S.; data curation, M.K. and J.S.; writing—original draft preparation, M.K. and J.S.; writing—review and editing, M.K. and J.S.; visualization, M.K. and J.S.; supervision, J.S.; project administration, M.K.; funding acquisition, J.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data is contained in the manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Holmqvist, K.; Nyström, M.; Andersson, R.; Dewhurst, R.; Jarodzka, H.; Van de Weijer, J. Eye Tracking: A Comprehensive Guide to Methods and Measures; OUP: Oxford, UK, 2011. [Google Scholar]
  2. Bojko, A. Eye Tracking the User Experience: A Practical Guide to Research; Rosenfeld Media: Brooklyn, NY, USA, 2013. [Google Scholar]
  3. Russell, S.J.; Norvig, P. Artificial Intelligence: A Modern Approach, 3rd ed.; Prentice Hall: Upper Saddle River, NJ, USA, 2009. [Google Scholar]
  4. Bishop, C.M. Pattern Recognition and Machine Learning; Springer: Berlin/Heidelberg, Germany, 2006. [Google Scholar]
  5. Kitchenham, B.; Charters, S. Guidelines for Performing Systematic Literature Reviews in Software Engineering; Version 2.3; Technical Report, EBSE Technical Report EBSE-2007-01; Keele University and Durham University Joint Report; Keele: Staffs, UK, 2007. [Google Scholar]
  6. Sharma, K.; Giannakos, M.; Dillenbourg, P. Eye-tracking and artificial intelligence to enhance motivation and learning. Smart Learn Environ. 2020, 7, 13. [Google Scholar] [CrossRef]
  7. Sharma, K.; Papamitsiou, Z.; Giannakos, M. Building pipelines for educational data using AI and multimodal analytics: A “grey-box” approach. Br. J. Educ. Technol. 2019, 50, 3004–3031. [Google Scholar] [CrossRef]
  8. Peterson, J.; Pardos, Z.; Rau, M.; Swigart, A.; Gerber, C.; McKinsey, J. Understanding Student Success in Chemistry Using Gaze Tracking and Pupillometry. In Artificial Intelligence in Education; Conati, C., Heffernan, N., Mitrovic, A., Verdejo, M.F., Eds.; Springer International Publishing: Cham, Switzerland, 2015; pp. 358–366. [Google Scholar]
  9. Zhan, Z.; Zhang, L.; Mei, H.; Fong, P.S.W. Online Learners’ Reading Ability Detection Based on Eye-Tracking Sensors. Sensors 2016, 16, 1457. [Google Scholar] [CrossRef] [PubMed]
  10. Yi, J.; Sheng, B.; Shen, R.; Lin, W. Real Time Learning Evaluation Based on Gaze Tracking. In Proceedings of the 2015 14th International Conference on Computer-Aided Design and Computer Graphics (CAD/Graphics), Xi’an, China, 26–28 August 2015; pp. 157–164. [Google Scholar]
  11. Liao, W.-H.; Chang, C.-W.; Wu, Y.-C. Classification of Reading Patterns Based on Gaze Information. In Proceedings of the 2017 IEEE International Symposium on Multimedia (ISM), Taichung, Taiwan, 11–13 December 2017; pp. 595–600. [Google Scholar]
  12. González-Garduño, A.; Søgaard, A. Learning to Predict Readability Using Eye-Movement Data From Natives and Learners. In Proceedings of the AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018. [Google Scholar] [CrossRef]
  13. Garain, U.; Pandit, O.; Augereau, O.; Okoso, A.; Kise, K. Identification of Reader Specific Difficult Words by Analyzing Eye Gaze and Document Content. In Proceedings of the 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), Kyoto, Japan, 9–15 November 2017; pp. 1346–1351. [Google Scholar]
  14. Orlosky, J.; Huynh, B.; Hollerer, T. Using Eye Tracked Virtual Reality to Classify Understanding of Vocabulary in Recall Tasks. In Proceedings of the 2019 IEEE International Conference on Artificial Intelligence and Virtual Reality (AIVR), San Diego, CA, USA, 9–11 December 2019; pp. 66–667. [Google Scholar]
  15. Li, J.; Ngai, G.; Leong, H.V.; Chan, S.C.F. Your Eye Tells How Well You Comprehend: 2016 IEEE 40th Annual Computer Software and Applications Conference, COMPSAC 2016. In Proceedings of the 2016 IEEE 40th Annu Comput Softw Appl Conf Workshop COMPSAC 2016, Atlanta, GA, USA, 10–14 June 2016; Volume 2, pp. 503–508. [Google Scholar]
  16. Howe, A.; Nguyen, P. SAT Reading Analysis Using Eye-Gaze Tracking Technology and Machine Learning. In Intelligent Tutoring Systems; Nkambou, R., Azevedo, R., Vassileva, J., Eds.; Springer International Publishing: Cham, Switzerland, 2018; pp. 332–338. [Google Scholar]
  17. Conati, C.; Lallé, S.; Rahman, M.A.; Toker, D. Further Results on Predicting Cognitive Abilities for Adaptive Visualizations. In Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, Melbourne, Australia, 19–25 August 2017; pp. 1568–1574. [Google Scholar]
  18. Lallé, S.; Toker, D.; Conati, C.; Carenini, G. Prediction of Users’ Learning Curves for Adaptation while Using an Information Visualization. In Proceedings of the 20th International Conference on Intelligent User Interfaces, Atlanta, GA, USA, 29 March–1 April 2015; pp. 357–368. [Google Scholar]
  19. Lallé, S.; Conati, C.; Carenini, G. Prediction of individual learning curves across information visualizations. User Model. User-Adapt. Interact. 2016, 26, 307–345. [Google Scholar] [CrossRef]
  20. Król, M.; Król, M.E. Eye movement anomalies as a source of diagnostic information in decision process analysis. J. Exp. Psychol. Learn Mem. Cogn. 2021, 47, 1012–1026. [Google Scholar] [CrossRef]
  21. Prieto, L.; Sharma, K.; Dillenbourg, P.; Jesús, M. Teaching Analytics: Towards Automatic Extraction of Orchestration Graphs Using Wearable Sensors. In Proceedings of the LAK‘16: Proceedings of the Sixth International Conference on Learning Analytics & Knowledge, Edinburgh, UK, 25–29 April 2016. [Google Scholar] [CrossRef]
  22. Matsuda, Y.; Fedotov, D.; Takahashi, Y.; Arakawa, Y.; Yasumoto, K.; Minker, M. EmoTour: Estimating Emotion and Satisfaction of Users Based on Behavioral Cues and Audiovisual Data. Sensors 2018, 18, 3978. [Google Scholar] [CrossRef] [PubMed]
  23. Pappas, I.O.; Sharma, K.; Mikalef, P.; Giannakos, M.N. How Quickly Can We Predict Users’ Ratings on Aesthetic Evaluations of Websites? Employing Machine Learning on Eye-Tracking Data. Responsible Des Implement Use Inf. Commun. Technol. 2020, 12067, 429–440. [Google Scholar]
  24. Sun, W.; Li, Y.; Sheopuri, A.; Teixeira, T. Computational Creative Advertisements. In Proceedings of the WWW ‘18: Companion Proceedings of the The Web Conference 2018, Lyon, France, 23–27 April 2018. [Google Scholar] [CrossRef]
  25. Schweikert, C.; Gobin, L.; Xie, S.; Hsu, D.F. Preference Prediction Based on Eye Movement Using Multi-layer Combinatorial Fusion. In Brain Informatics; Wang, S., Yamamoto, V., Su, J., Yang, Y., Jones, E., Iasemidis, L., Mitchell, T., Eds.; Springer International Publishing: Cham, Switzerland, 2018; pp. 282–293. [Google Scholar]
  26. Emsawas, T.; Fukui, K.; Numao, M. Feasible Affect Recognition in Advertising Based on Physiological Responses from Wearable Sensors. In Advances in Artificial Intelligence; Ohsawa, Y., Yada, K., Ito, T., Takama, Y., Sato-Shimokawara, E., Abe, A., Mori, J., Matsumura, N., Eds.; Springer International Publishing: Cham, Switzerland, 2020; pp. 27–36. [Google Scholar]
  27. Felício, C.Z.; De Almeida, C.M.M.; Alves, G.; Pereira, F.S.F.; Paixão, K.V.R.; De Amo, S.; Barcelos, C.A.Z. VP-Rec: A Hybrid Image Recommender Using Visual Perception Network. In Proceedings of the 2016 IEEE 28th International Conference on Tools with Artificial Intelligence (ICTAI), San Jose, CA, USA, 6–8 November 2016; pp. 70–77. [Google Scholar]
  28. Abdessalem, H.B.; Chaouachi, M.; Boukadida, M.; Frasson, C. Toward Real-Time System Adaptation Using Excitement Detection from Eye Tracking. In Intelligent Tutoring Systems; Coy, A., Hayashi, Y., Chang, M., Eds.; Springer International Publishing: Cham, Switzerland, 2019; pp. 214–223. [Google Scholar]
  29. Gonzalez Viejo, C.; Fuentes, S.; Howell, K.; Dunshea, F.R. Robotics and computer vision techniques combined with non-invasive consumer biometrics to assess quality traits from beer foamability using machine learning: A potential for artificial intelligence applications. Food Control 2018, 92, 72–79. [Google Scholar] [CrossRef]
  30. Healy, G.; Smeaton, A. Eye fixation related potentials in a target search task. In Proceedings of the 2011 Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Boston, MA, USA, 30 August–3 September 2011; pp. 4203–4206. [Google Scholar]
  31. Smith, J.; Legg, P.; Matovic, M.; Kinsey, K. Predicting User Confidence During Visual Decision Making. ACM Trans. Interact. Intell. Syst. 2018, 8, 10:1–10:30. [Google Scholar] [CrossRef]
  32. Lallé, S.; Conati, C.; Carenini, G. Predicting Confusion in Information Visualization from Eye Tracking and Interaction Data. In Proceedings of the IJCAI’16: Twenty-Fifth International Joint Conference on Artificial Intelligence, New York, NY, USA, 9–15 July 2016. [Google Scholar]
  33. Ciupe, A.; Florea, C.; Orza, B.; Vlaicu, A.; Petrovan, B. A Bag of Words Model for Improving Automatic Stress Classification. In Proceedings of the Second International Afro-European Conference for Industrial Advancement AECIA 2015, Villejuif, France, 9–11 September 2015; pp. 339–349. [Google Scholar]
  34. Lu, W.; Jia, Y. Inferring User Preference in Good Abandonment from Eye Movements. In Web-Age Information Management; Dong, X.L., Yu, X., Li, J., Sun, Y., Eds.; Springer International Publishing: Cham, Switzerland, 2015; pp. 457–460. [Google Scholar]
  35. López-Gil, J.-M.; Virgili-Gomá, J.; Gil, R.; Guilera, T.; Batalla, I.; Soler-González, J.; García, R. Method for Improving EEG Based Emotion Recognition by Combining It with Synchronized Biometric and Eye Tracking Technologies in a Non-invasive and Low Cost Way. Front. Comput. Neurosci. 2016, 10, 85. [Google Scholar]
  36. Lu, B.; Duan, X. Facial Expression Recognition Based on Strengthened Deep Belief Network with Eye Movements Information. In Artificial Intelligence in China; Liang, Q., Wang, W., Mu, J., Liu, X., Na, Z., Chen, B., Eds.; Springer: Singapore, 2020; pp. 645–652. [Google Scholar]
  37. Nag, A.; Haber, N.; Voss, C.; Tamura, S.; Daniels, J.; Ma, J.; Chiang, B.; Ramachandran, S.; Schwartz, S.; Winograd, T.; et al. Toward Continuous Social Phenotyping: Analyzing Gaze Patterns in an Emotion Recognition Task for Children With Autism Through Wearable Smart Glasses. J. Med. Internet Res. 2020, 22, e13810. [Google Scholar] [CrossRef]
  38. Liu, W.; Yu, X.; Raj, B.; Yi, L.; Zou, X.; Li, M. Efficient autism spectrum disorder prediction with eye movement: A machine learning framework. In Proceedings of the 2015 International Conference on Affective Computing and Intelligent Interaction (ACII), Xi’an, China, 21–24 September 2015; pp. 649–655. [Google Scholar]
  39. Kacur, J.; Polec, J.; Csoka, F.; Smolejova, E. GMM Based Detection of Schizophrenia Using Eye Tracking. In Proceedings of the 2019 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB), Tuscany, Italy, 9–11 July 2019; pp. 1–4. [Google Scholar]
  40. Przybyszewski, A.W.; Szlufik, S.; Dutkiewicz, J.; Habela, P.; Koziorowski, D.M. Machine Learning on the Video Basis of Slow Pursuit Eye Movements Can Predict Symptom Development in Parkinson’s Patients. In Intelligent Information and Database Systems; Nguyen, N.T., Trawiński, B., Kosala, R., Eds.; Springer International Publishing: Cham, Switzerland, 2015; pp. 268–276. [Google Scholar]
  41. Rello, L.; Ballesteros, M. Detecting readers with dyslexia using machine learning with eye tracking measures. In Proceedings of the 12th International Web for All Conference, Florence, Italy, 18–20 May 2015; Association for Computing Machinery: New York, NY, USA, 2015; pp. 1–8. [Google Scholar]
  42. Kupas, D.; Harangi, B.; Czifra, G.; Andrassy, G. Decision support system for the diagnosis of neurological disorders based on gaze tracking. In Proceedings of the 10th International Symposium on Image and Signal Processing and Analysis, Ljubljana, Slovenia, 18–20 September 2017; pp. 37–40. [Google Scholar]
  43. Zhang, Y.; Wilcockson, T.; Kim, K.I.; Crawford, T.; Gellersen, H.; Sawyer, P. Monitoring dementia with automatic eye movements analysis. In Intelligent Decision Technologies 2016; Springer: Berlin/Heidelberg, Germany, 2016; pp. 299–309. [Google Scholar]
  44. Mao, Y.; He, Y.; Liu, L.; Chen, X. Disease Classification Based on Eye Movement Features With Decision Tree and Random Forest. Front. Neurosci. 2020, 14, 798. [Google Scholar] [CrossRef] [PubMed]
  45. Ahmed, M.; Noble, J.A. An eye-tracking inspired method for standardised plane extraction from fetal abdominal ultrasound volumes. In Proceedings of the 2016 IEEE 13th International Symposium on Biomedical Imaging (ISBI), Prague, Czech Republic, 13–16 April 2016; pp. 1084–1087. [Google Scholar]
  46. de Lope, J.; Graña, M. Comparison of Labeling Methods for Behavioral Activity Classification Based on Gaze Ethograms. In Hybrid Artificial Intelligent Systems; de la Cal, E.A., Villar Flecha, J.R., Quintián, H., Corchado, E., Eds.; Springer International Publishing: Cham, Switzerland, 2020; pp. 132–144. [Google Scholar]
  47. Destyanto, T.Y.R.; Lin, R.F. Detecting computer activities using eye-movement features. J. Ambient. Intell. Humaniz. Comput. 2020. [Google Scholar] [CrossRef]
  48. Kit, D.; Sullivan, B. Classifying mobile eye tracking data with hidden Markov models. In Proceedings of the 18th International Conference on Human-Computer Interaction with Mobile Devices and Services Adjunct, Florence, Italy, 6–9 September 2016; Association for Computing Machinery: New York, NY, USA, 2016; pp. 1037–1040. [Google Scholar]
  49. Frutos-Pascual, M.; Garcia-Zapirain, B. Assessing Visual Attention Using Eye Tracking Sensors in Intelligent Cognitive Therapies Based on Serious Games. Sensors 2015, 15, 11092–11117. [Google Scholar] [CrossRef] [PubMed]
  50. Fan, X.; Wang, F.; Song, D.; Lu, Y.; Liu, J. GazMon: Eye Gazing Enabled Driving Behavior Monitoring and Prediction. IEEE Trans. Mob. Comput. 2021, 20, 1420–1433. [Google Scholar] [CrossRef]
  51. Meng, C.; Zhao, X. Webcam-Based Eye Movement Analysis Using CNN. IEEE Access 2017, 5, 19581–19587. [Google Scholar] [CrossRef]
  52. Yin, P.-Y.; Day, R.-F.; Wang, Y.-C. Tabu search-based classification for eye-movement behavioral decisions. Neural Comput. Appl. 2018, 29, 1433–1443. [Google Scholar] [CrossRef]
  53. Fernandes, D.L.; Siqueira-Batista, R.; Gomes, A.P.; Souza, C.R.; da Costa, I.T.; Cardoso, F.d.S.L.; de Assis, J.V.; Caetano, G.H.L.; Cerqueira, F.L. Investigation of the visual attention role in clinical bioethics decision-making using machine learning algorithms. Procedia Comput. Sci. 2017, 108, 1165–1174. [Google Scholar] [CrossRef]
  54. Zhang, R.; Walshe, C.; Liu, Z.; Guan, L.; Muller, K.S.; Whritner, J.A.; Zhang, L.; Hayhoe, M.M.; Ballard, D.H. Atari-HEAD: Atari Human Eye-Tracking and Demonstration Dataset. Proc. AAAI Conf. Artif. Intell. 2020, 34, 6811–6820. [Google Scholar] [CrossRef]
  55. Emerson, A.; Henderson, N.; Rowe, J.; Min, W.; Lee, S.; Minogue, J.; Lester, J. Investigating Visitor Engagement in Interactive Science Museum Exhibits with Multimodal Bayesian Hierarchical Models. In Artificial Intelligence in Education; Bittencourt, I.I., Cukurova, M., Muldner, K., Luckin, R., Millán, E., Eds.; Springer International Publishing: Cham, Switzerland, 2020; pp. 165–176. [Google Scholar]
  56. Eivazi, S.; Slupina, M.; Fuhl, W.; Afkari, H.; Hafez, A.; Kasneci, E. Towards Automatic Skill Evaluation in Microsurgery. In Proceedings of the 22nd International Conference on Intelligent User Interfaces Companion, Limassol, Cyprus, 13–16 March 2017; Association for Computing Machinery: New York, NY, USA, 2017; pp. 73–76. [Google Scholar]
  57. Ye, N.; Tao, X.; Dong, L.; Li, Y.; Ge, N. Indicating eye contacts in one-to-many video teleconference with one web camera. In Proceedings of the 2015 Asia Pacific Conference on Multimedia and Broadcasting, Bali, Indonesia, 23–25 April 2015; pp. 1–5. [Google Scholar]
  58. Pettersson, J.; Falkman, P. Human Movement Direction Classification using Virtual Reality and Eye Tracking. Procedia Manuf. 2020, 51, 95–102. [Google Scholar] [CrossRef]
  59. Hu, B.; Liu, X.; Wang, W.; Cai, R.; Li, F.; Yuan, S. Prediction of interaction intention based on eye movement gaze feature. In Proceedings of the 2019 IEEE 8th Joint International Information Technology and Artificial Intelligence Conference (ITAIC), Chongqing, China, 24–26 May 2019; pp. 378–383. [Google Scholar]
  60. Castellanos, J.L.; Gomez, M.F.; Adams, K.D. Using machine learning based on eye gaze to predict targets: An exploratory study. In Proceedings of the 2017 IEEE Symposium Series on Computational Intelligence (SSCI), Honolulu, HI, USA, 27 November–1 December 2017; pp. 1–7. [Google Scholar]
  61. Jadue, J.; Slanzi, G.; Salas, L.; Velásquez, J.D. Web User Click Intention Prediction by Using Pupil Dilation Analysis. In Proceedings of the 2015 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT), Singapore, 2–9 December 2015; pp. 433–436. [Google Scholar]
  62. Chen, O.T.-C.; Chen, P.-C.; Tsai, Y.-T. Attention estimation system via smart glasses. In Proceedings of the 2017 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB), Manchester, UK, 23–25 August 2017; pp. 1–5. [Google Scholar]
  63. Delvigne, V.; Wannous, H.; Vandeborre, J.-P.; Ris, L.; Dutoit, T. Attention Estimation in Virtual Reality with EEG based Image Regression. In Proceedings of the 2020 IEEE International Conference on Artificial Intelligence and Virtual Reality (AIVR), Utrecht, The Netherlands, 14–18 December 2020; IEEE Computer Society: Washington, DC, USA, 2020; pp. 10–16. [Google Scholar]
  64. Yoshizawa, A.; Nishiyama, H.; Iwasaki, H.; Mizoguchi, F. Machine-learning approach to analysis of driving simulation data. In Proceedings of the 2016 IEEE 15th International Conference on Cognitive Informatics & Cognitive Computing (ICCI*CC), Palo Alto, CA, USA, 22–23 August 2016; pp. 398–402. [Google Scholar]
  65. Koma, H.; Harada, T.; Yoshizawa, A.; Iwasaki, H. Considering eye movement type when applying random forest to detect cognitive distraction. In Proceedings of the 2016 IEEE 15th International Conference on Cognitive Informatics & Cognitive Computing (ICCI*CC), Palo Alto, CA, USA, 22–23 August 2016; pp. 377–382. [Google Scholar]
  66. Liu, T.; Yang, Y.; Huang, G.-B.; Yeo, Y.K.; Lin, Z. Driver Distraction Detection Using Semi-Supervised Machine Learning. IEEE Trans. Intell. Transp. Syst. 2016, 17, 1108–1120. [Google Scholar] [CrossRef]
  67. Bixler, R.; D’Mello, S. Automatic Gaze-Based Detection of Mind Wandering with Metacognitive Awareness. In User Modeling, Adaptation and Personalization; Ricci, F., Bontcheva, K., Conlan, O., Lawless, S., Eds.; Springer International Publishing: Cham, Switzerland, 2015; pp. 31–43. [Google Scholar]
  68. Yamada, Y.; Kobayashi, M. Detecting mental fatigue from eye-tracking data gathered while watching video: Evaluation in younger and older adults. Artif. Intell. Med. 2018, 91, 39–48. [Google Scholar] [CrossRef] [PubMed]
  69. Shojaeizadeh, M.; Djamasbi, S.; Paffenroth, R.C.; Trapp, A.C. Detecting task demand via an eye tracking machine learning system. Decis. Support Syst. 2019, 116, 91–101. [Google Scholar] [CrossRef]
  70. Lotz, A.; Weissenberger, S. Predicting Take-Over Times of Truck Drivers in Conditional Autonomous Driving. Adv. Intell. Syst. Comput. 2019, 786, 329–338. [Google Scholar]
  71. Monfort, S.S.; Sibley, C.M.; Coyne, J.T. Using machine learning and real-time workload assessment in a high-fidelity UAV simulation environment. In Next-Generation Analyst IV; SPIE: Bellingham, WA, USA, 2016; pp. 93–102. [Google Scholar]
  72. Mannaru, P.; Balasingam, B.; Pattipati, K.; Sibley, C.; Coyne, J. Cognitive Context Detection for Adaptive Automation. Proc. Hum. Factors Ergon. Soc. Annu. Meet. 2016, 60, 223–227. [Google Scholar] [CrossRef]
  73. Larue, G.S.; Rakotonirainy, A.; Pettitt, A.N. Predicting Reduced Driver Alertness on Monotonous Highways. IEEE Pervasive Comput. 2015, 14, 78–85. [Google Scholar] [CrossRef]
  74. Liu, D.; Dong, B.; Gao, X.; Sibley, C.; Coyne, J. Exploiting Eye Tracking for Smartphone Authentication. In Applied Cryptography and Network Security; Malkin, T., Kolesnikov, V., Lewko, A.B., Polychronakis, M., Eds.; Springer International Publishing: Cham, Switzerland, 2015; pp. 457–477. [Google Scholar]
  75. Tiwari, A.; Pal, R. Gaze-Based Graphical Password Using Webcam. In Information Systems Security; Ganapathy, V., Jaeger, T., Shyamasundar, R., Eds.; ICISS 2018. Lecture Notes in Computer Science; Springer: Cham, Switzerland, 2018; Volume 11281. [Google Scholar] [CrossRef]
  76. Li, N.; Wu, Q.; Liu, J.; Hu, W.; Qin, B.; Wu, W. EyeSec: A Practical Shoulder-Surfing Resistant Gaze-Based Authentication System. In Information Security Practice and Experience; Liu, J.K., Samarati, P., Eds.; Springer International Publishing: Cham, Switzerland, 2017; pp. 435–453. [Google Scholar]
  77. Das, A.; Pal, U.; Ferrer Ballester, M.A.; Blumenstein, M. Multi-angle based lively sclera biometrics at a distance. In Proceedings of the 2014 IEEE Symposium on Computational Intelligence in Biometrics and Identity Management (CIBIM), Orlando, FL, USA, 9–12 December 2014; pp. 22–29. [Google Scholar]
  78. Qiao, Y.; Wang, J.; Chen, J.; Ren, J. Design and Realization of Gaze Gesture Control System for Flight Simulation. J. Phys. Conf. Ser. 2020, 1693, 012213. [Google Scholar] [CrossRef]
  79. Kabir, A.; Shahin, F.B.; Islam, M. Design and Implementation of an EOG-based Mouse Cursor Control for Application in Human-Computer Interaction. J. Phys. Conf. Ser. 2020, 1487, 012043. [Google Scholar] [CrossRef]
  80. Reda, R.; Tantawi, M.; Shedeed, H.; Tolba, M.F. Eye Movements Recognition Using Electrooculography Signals. In Proceedings of the International Conference on Artificial Intelligence and Computer Vision (AICV2020); Hassanien, A.-E., Azar, A.T., Gaber, T., Oliva, D., Tolba, F.M., Eds.; Springer International Publishing: Cham, Switzerland, 2020; pp. 490–500. [Google Scholar]
  81. Pai, S.; Bhardwaj, A. Eye Gesture Based Communication for People with Motor Disabilities in Developing Nations. In Proceedings of the 2019 International Joint Conference on Neural Networks (IJCNN), Budapest, Hungary, 14–19 July 2019; pp. 1–8. [Google Scholar]
  82. Taban, R.A.; Croock, M.S. Eye Tracking Based Directional Control System using Mobile Applications. Int. J. Comput. Digit. Syst. 2018, 7, 365–374. [Google Scholar]
  83. López, A.; Fernández, D.; Ferrero, F.J.; Valledor, M.; Postolache, O. EOG signal processing module for medical assistive systems. In Proceedings of the 2016 IEEE International Symposium on Medical Measurements and Applications (MeMeA), Benevento, Italy, 15–18 May 2016; pp. 1–5. [Google Scholar]
  84. Jigang, L.; Francis, B.S.L.; Rajan, D. Free-Head Appearance-Based Eye Gaze Estimation on Mobile Devices. In Proceedings of the 2019 International Conference on Artificial Intelligence in Information and Communication (ICAIIC), Okinawa, Japan, 11–13 February 2019; pp. 232–237. [Google Scholar]
  85. Semmelmann, K.; Weigelt, S. Online webcam-based eye tracking in cognitive science: A first look. Behav. Res. Methods 2018, 50, 451–465. [Google Scholar] [CrossRef]
  86. Papoutsaki, A.; Sangkloy, P.; Laskey, J.; Daskalova, N.; Huang, J.; Hays, J. Webgazer: Scalable webcam eye tracking using user interactions. In Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, New York, NY, USA, 9–15 July 2016; AAAI Press: New York, NY, USA, 2016; pp. 3839–3845. [Google Scholar]
  87. Saikh, T.; Bangalore, S.; Carl, M.; Bandyopadhyay, S. Predicting source gaze fixation duration: A machine learning approach. In Proceedings of the 2015 International Conference on Cognitive Computing and Information Processing (CCIP), Noida, India, 3–4 March 2015. [Google Scholar] [CrossRef]
  88. Valliappan, N.; Dai, N.; Steinberg, E.; He, J.; Rogers, K.; Ramachandran, V.; Xu, P.; Shojaeizadeh, M.; Guo, L.; Kohlhoff, K.; et al. Accelerating eye movement research via accurate and affordable smartphone eye tracking. Nat. Commun. 2020, 11, 4553. [Google Scholar] [CrossRef] [PubMed]
  89. Tősér, Z.; Rill, R.A.; Faragó, K.; Jeni, L.A.; Lőrincz, A. Personalization of Gaze Direction Estimation with Deep Learning. In Proceedings of the KI 2016: Advances in Artificial Intelligence, Klagenfurt, Austria, 26–30 September 2016; pp. 200–207. [Google Scholar]
  90. Dechterenko, F.; Lukavsky, J. Predicting eye movements in multiple object tracking using neural networks. In Proceedings of the Ninth Biennial ACM Symposium on Eye Tracking Research & Applications; Association for Computing Machinery: New York, NY, USA, 2016; pp. 271–274. [Google Scholar]
  91. Lai, H.-Y.; Saavedra-Peña, G.; Sodini, C.G.; Sze, V.; Heldt, T. Measuring Saccade Latency Using Smartphone Cameras. IEEE J. Biomed. Health Inform. 2020, 24, 885–897. [Google Scholar] [CrossRef]
  92. Brousseau, B.; Rose, J.; Eizenman, M. Hybrid Eye-Tracking on a Smartphone with CNN Feature Extraction and an Infrared 3D Model. Sensors 2020, 20, 543. [Google Scholar] [CrossRef] [PubMed]
  93. Rakhmatulin, I.; Duchowski, A.T. Deep Neural Networks for Low-Cost Eye Tracking. Procedia Comput. Sci. 2020, 176, 685–694. [Google Scholar] [CrossRef]
  94. Al-Btoush, A.I.; Abbadi, M.A.; Hassanat, A.B.; Tarawneh, A.S.; Hasanat, A.; Prasath, V.B.S. New Features for Eye-Tracking Systems: Preliminary Results. In Proceedings of the 2019 10th International Conference on Information and Communication Systems (ICICS), Irbid, Jordan, 11–13 June 2019; pp. 179–184. [Google Scholar]
  95. Hossain, M.S.; Ali, A.A.; Amin, M.A. Eye-Gaze to Screen Location Mapping for UI Evaluation of Webpages. In Proceedings of the 2019 3rd International Conference on Graphics and Signal Processing, Hong Kong, China, 1–3 June 2019; Association for Computing Machinery: New York, NY, USA, 2019; pp. 100–104. [Google Scholar]
  96. Krafka, K.; Khosla, A.; Kellnhofer, P.; Kannan, H.; Bhandarkar, S.; Matusik, W.; Torralba, A. Eye Tracking for Everyone. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; IEEE Computer Society: Washington, DC, USA, 2016; pp. 2176–2184. [Google Scholar]
  97. Wan, Q.; Kaszowska, A.; Samani, A.; Panetta, K.; Taylor, H.A.; Agaian, S. Aerial Border Surveillance for Search and Rescue Missions Using Eye Tracking Techniques. In Proceedings of the 2018 IEEE International Symposium on Technologies for Homeland Security (HST), Woburn, MA, USA, 23–24 October 2018; pp. 1–5. [Google Scholar]
  98. Xiaodong, D.; Bo, L.; Peng, L.; Chunhong, G. Study of Eye Movement Behavior Pattern Diversity between Chinese Ethnic Groups. In Proceedings of the 2015 IEEE International Conference on Computational Intelligence & Communication Technology, London, UK, 8–12 June 2015; pp. 767–770. [Google Scholar]
  99. Holmqvist, K.; Andersson, R. Eye-Tracking: A Comprehensive Guide to Methods, Paradigms and Measures; OUP Oxford: Oxford, UK, 2017. [Google Scholar]
  100. Leube, A.; Rifai, K. Sampling rate influences saccade detection in mobile eye tracking of a reading task. J. Eye Mov. Res. 2017, 10. [Google Scholar] [CrossRef] [PubMed]
  101. Juhola, M.; Jäntti, V.; Pyykkö, I. Effect of sampling frequencies on computation of the maximum velocity of saccadic eye movements. Biol. Cybern. 1985, 53, 67–72. [Google Scholar] [CrossRef]
  102. Andersson, R.; Nyström, M.; Holmqvist, K. Sampling frequency and eye-tracking measures: How speed affects durations, latencies, and more. J. Eye Mov. Res. 2010, 3, 1–12. [Google Scholar] [CrossRef]
  103. Rolfs, M. Microsaccades: Small steps on a long way. Vision Res. 2009, 49, 2415–2441. [Google Scholar] [CrossRef]
  104. Tobii Pro Lab User Manual; Tobii AB: Danderyd, Sweden, 2021.
  105. Mathôt, S. Pupillometry: Psychology, Physiology, and Function. J. Cogn. 2018, 1, 1–23. [Google Scholar] [CrossRef]
  106. Carette, R.; Elbattah, M.; Cilia, F.; Dequen, G.; Guérin, J.L.; Bosche, J. Learning to Predict Autism Spectrum Disorder based on the Visual Patterns of Eye-tracking Scanpaths. In Proceedings of the 12th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2019), Prague, Czech Republic, 22–24 February 2019; pp. 103–112. [Google Scholar]
  107. Elbattah, M.; Carette, R.; Dequen, G.; Guérin, J.L.; Cilia, F. Learning clusters in autism spectrum disorder: Image-based clustering of eye-tracking scanpaths with deep autoencoder. In Proceedings of the 2019 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Berlin, Germany, 23–27 July 2019. [Google Scholar]
  108. Fawcett, T. An introduction to ROC analysis. Pattern Recognit. Lett. 2006, 27, 861–874. [Google Scholar] [CrossRef]
Figure 1. Percentage distribution of the eye trackers used in the analyzed works.
Figure 2. Percentage distribution of the frequencies of the eye trackers used in the analyzed papers.
Figure 3. Number of participants in the studies.
Figure 4. Types of feature extraction.
Figure 5. Number of artificial intelligence methods used in each paper.
Figure 6. Types of artificial intelligence methods used.
Figure 7. Artificial intelligence methods revealed to give the best results.
Figure 8. Number of result verification methods used in each paper.
Figure 9. Result verification methods used by the researchers.
Table 1. Additional data used for training the artificial intelligence algorithms.

Ref. | Task | Additional Parameters
[64] | attention estimation | EEG, head movement
[38] | identifying children with ASD | questionnaire, age, gender
[56] | predicting dwell time in a museum | facial expression, body movement, interaction trace logs
[27] | affect recognition | EEG, ECG
[8] | predicting students' performance and effort | EEG, face videos, arousal data from wristband
[71] | predicting take-over time | head position, body posture, simulation data
[30] | predicting liking a video | infrared thermal image, heart rate, facial expression
[32] | predicting user confidence | time
[25] | predicting reaction to ads | gender, age, survey, time, ad parameters, behavior connected with an ad (e.g., sharing)
[13] | predicting readability | text features
[17] | predicting SAT score | time
[36] | predicting the emotion of an observed person | EEG, Empatica bracelet
[32] | predicting social plane of interaction | EEG, accelerometer, audio, video
[33] | detecting user confusion | mouse actions, distance of the user's head from the screen
[72] | predicting mental workload | reaction time
[42] | detecting people with dyslexia | age, text characteristics
[74] | predicting reduced driver alertness | EEG
[19] | predicting learning curve | perceptual speed, verbal working memory, visual working memory, locus of control
[37] | classifying emotions in pictures | image
[89] | predicting eye movement | distance between the object and the distractor
[41] | predicting Parkinson symptoms' development | age, sex, duration of the disease
[23] | emotion estimation | head movement, body movement, audio, video of the face
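Most of the entries in Table 1 combine the gaze-derived features with the additional parameters at the feature level before a model is trained. The snippet below is a minimal sketch of such early fusion on synthetic data; the feature names, values, and labels are illustrative assumptions and are not taken from any of the surveyed studies.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_samples = 40  # hypothetical number of participants

# Hypothetical gaze features: mean fixation duration [ms], fixation count, mean saccade amplitude [deg]
gaze = rng.normal(loc=[250.0, 120.0, 4.0], scale=[40.0, 25.0, 1.0], size=(n_samples, 3))

# Hypothetical additional parameters of the kind listed in Table 1: mean heart rate [bpm] and age [years]
extra = rng.normal(loc=[75.0, 30.0], scale=[8.0, 6.0], size=(n_samples, 2))

# Early fusion: concatenate both feature groups into a single input matrix
X = np.hstack([gaze, extra])
y = rng.integers(0, 2, size=n_samples)  # placeholder binary labels (e.g., high vs. low performance)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
scores = cross_val_score(clf, X, y, cv=5, scoring="accuracy")
print(f"Mean cross-validated accuracy: {scores.mean():.2f}")
```

With random labels the reported accuracy stays near chance level; the point of the sketch is only the shape of the fused input, not the score.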
Table 2. Comparison of the accuracy of artificial intelligence algorithms.

Ref. | Task | AI Method | N | Accuracy
[47] | detecting the type of behavior when using a laptop | SVM | ns | 99.77%
[67] | detecting driver distraction | Semi-Supervised Extreme Learning Machine | 34 | 97.2%
[95] | gaze estimation | Random Forest | 10 | 97.2%
[37] | classifying emotions in pictures | Strengthened Deep Belief Network | 40 | 97.1%
[45] | predicting neurological diseases | Random Forest | 96 | 96.88%
[12] | predicting type of reading | SVM | 30 | 96.69%
[81] | detecting eye gestures | naive Bayes | ext | 95.0%
[11] | reading behavior recognition | Hidden Markov Model | 4 | 95.0%
[92] | gaze estimation | ANN | 29 | 94.1%
[61] | predicting targets | MLP | 5 | 94.0%
[48] | detecting computer activity type | CNN | 150 | 93.15%
[76] | authentication | CNN | 26 | 93.14%
[63] | detecting attention | SVM | 10 | 93.1%
[80] | detecting eye gestures | SVM | 5 | 93.0%
[46] | detecting organs | AdaBoost | 10 | 92.5%
[29] | predicting excitement | DNN | 20 | 92.0%
[75] | smartphone authentication | Random Sample Consensus | 21 | 91.6%
[82] | eye gestures for patients | Recurrent Neural Network | 270 | 91.4%
[69] | detecting mental fatigue | SVM | 18 | 91.0%
[54] | predicting ethical decision-making | MLP | 75 | 90.7%
[22] | predicting social plane of interaction | Gradient Boosted Decision Tree | 1 | 90.6%
[59] | predicting movement intention | CNN | 24 | 88.37%
[32] | predicting user confidence | Random Forest | 23 | 88.0%
[73] | detecting operator overload | Linear Discriminant Analysis | 20 | 87.91%
[39] | detecting people with ASD | SVM | 130 | 86.89%
[13] | predicting readability | MLP | ext | 86.62%
[50] | identifying children's behavior | Random Forest | 32 | 84.0%
[38] | distinguishing children with ASD | Logistic Regression | 33 | 83.9%
[62] | predicting web user click intention | ANN | 25 | 82.0%
[83] | choosing direction | Viola-Jones Algorithm with HAAR Cascade Classifiers | ns | 82.0%
[30] | predicting liking a video | ANN | 30 | 81.8%
[35] | detecting satisfaction | SVM | 30 | 80.53%
[42] | detecting people with dyslexia | SVM | 97 | 80.18%
[20] | detecting speed of learning | Random Forest | 161 | 80.0%
[99] | distinguishing Chinese ethnic groups | SVM | 35 | 80.0%
[58] | detecting eye contact | SVM | ns | 80.0%
[41] | predicting Parkinson symptoms' development | Decomposition Tree | 10 | 79.5%
[40] | detection of schizophrenia | Generative Model-Based Method | 44 | 79.2%
[70] | detecting task demand | Random Forest | 48 | 79.0%
[72] | predicting mental workload | Ensemble | 20 | 78.0%
[19] | predicting learning curve | Random Forest | 95 | 77.0%
[17] | predicting SAT score | Decision Tree | 30 | 76.67%
[15] | predicting word understanding | SVM | 16 | 75.6%
[68] | detecting mind wandering | SVM | 178 | 74.0%
[65] | detecting cognitive distraction | SVM | 18 | 73.0%
[27] | affect recognition | Long Short-Term Memory Network | 130 | 72.8%
[57] | automatic surgery skills assessment | Random Forest | 9 | 69.0%
[96] | gaze estimation | ANN | 10 | 68.31%
[18] | predicting user's cognitive abilities | Random Forest | 166 | 66.1%
[60] | predicting intention | SVM | 20 | 64.0%
[9] | predicting student's performance | Logistic Regression | 95 | 63.0%
[32] | detecting user confusion | Random Forest | 136 | 61.0%
[49] | predicting type of task | Hidden Markov Model | 8 | 57.0%
[21] | predicting the prior disclosure type from eye data | Gradient Boosted Decision Tree | 20 | 53.5%
[88] | predicting the duration of gaze fixation | SVM | ext | 49.1%
[36] | predicting the emotion of an observed person | MLP | 44 | 42.7%
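The accuracy figures collected in Table 2 are typically obtained by scoring the trained model on data held out from training, most often via a train/test split or cross-validation. The sketch below is an illustration only, using synthetic gaze features and placeholder labels rather than any surveyed dataset; it evaluates an SVM on a held-out split and additionally reports the ROC AUC, a complementary measure rooted in the ROC analysis cited above [108].

```python
import numpy as np
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(1)

# Synthetic per-trial gaze features: fixation duration [ms], saccade amplitude [deg], pupil diameter [mm]
X = rng.normal(loc=[260.0, 4.5, 3.2], scale=[50.0, 1.2, 0.4], size=(200, 3))
y = rng.integers(0, 2, size=200)  # placeholder labels (e.g., distracted vs. attentive)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y
)

# Standardize the features, then fit an RBF-kernel SVM on the training split only
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", probability=True, random_state=1))
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
y_score = model.predict_proba(X_test)[:, 1]
print(f"Held-out accuracy: {accuracy_score(y_test, y_pred):.2%}")
print(f"ROC AUC:           {roc_auc_score(y_test, y_score):.2f}")
```

Because the labels here are random, both scores stay near chance; in the surveyed studies the same procedure applied to real recordings yields the values listed in the table.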