A Review of Local Outlier Factor Algorithms for Outlier Detection in Big Data Streams

Omar Alghushairy, Raed Alsini, Terence Soule and Xiaogang Ma

Abstract

Outlier detection is a statistical procedure that aims to find suspicious events or items that deviate from the normal pattern of a dataset. It has drawn considerable interest in the fields of data mining and machine learning. Outlier detection is important in many applications, including fraud detection in credit card transactions and network intrusion detection. There are two general types of outlier detection: global and local. Global outliers fall outside the normal range for an entire dataset, whereas local outliers may fall within the normal range for the entire dataset but outside the normal range for the surrounding data points. This paper addresses local outlier detection. The best-known technique for local outlier detection is the Local Outlier Factor (LOF), a density-based technique. There are many LOF algorithms for a static data environment; however, these algorithms cannot be applied directly to data streams, which are an important type of big data. In general, local outlier detection algorithms for data streams remain inadequate, and better algorithms are needed that can keep pace with high-velocity data streams while detecting local outliers. This paper presents a literature review of local outlier detection algorithms in static and stream environments, with an emphasis on LOF algorithms. It collects and categorizes existing local outlier detection algorithms and analyzes their characteristics. Furthermore, the paper discusses the advantages and limitations of those algorithms and proposes several promising directions for developing improved local outlier detection methods for data streams.
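The distinction between global and local outliers, and the density-based idea behind LOF, can be illustrated with a small example. The following is a minimal sketch for a static dataset, assuming the commonly used definition of LOF (k-distance, reachability distance, local reachability density). It is a brute-force illustration, not any of the algorithms reviewed in the paper; the function name `lof_scores`, the toy data, and the choice `k=3` are assumptions made for this example.

```python
import math

def lof_scores(points, k):
    """Return the Local Outlier Factor of every point (brute force, O(n^2))."""
    n = len(points)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < len(points)")

    # Pairwise Euclidean distances.
    dist = [[math.dist(p, q) for q in points] for p in points]

    k_distance = []   # distance from each point to its k-th nearest neighbor
    neighbors = []    # indices within that distance (ties are kept)
    for i in range(n):
        others = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        kd = dist[i][others[k - 1]]
        k_distance.append(kd)
        neighbors.append([j for j in others if dist[i][j] <= kd])

    # Local reachability density: inverse of the mean reachability distance,
    # where reach_dist(i, j) = max(k_distance[j], dist[i][j]).
    # Assumes no duplicate points, so the mean is never zero.
    lrd = []
    for i in range(n):
        reach = [max(k_distance[j], dist[i][j]) for j in neighbors[i]]
        lrd.append(len(reach) / sum(reach))

    # LOF: mean density of the neighbors relative to the point's own density.
    return [
        sum(lrd[j] for j in neighbors[i]) / (len(neighbors[i]) * lrd[i])
        for i in range(n)
    ]

# Hypothetical toy data: a dense 3x3 grid (spacing 1), a sparse 3x3 grid
# (spacing 20), and one extra point 6 units from the dense grid.
dense = [(x, y) for x in range(3) for y in range(3)]
sparse = [(50 + 20 * x, 50 + 20 * y) for x in range(3) for y in range(3)]
candidate = (8, 1)
points = dense + sparse + [candidate]

scores = lof_scores(points, k=3)
for point, score in zip(points, scores):
    print(point, round(score, 2))
```

In this toy data the extra point lies 6 units from its nearest neighbor, which is closer than the 20-unit spacing inside the sparse grid, so a single global distance threshold would not single it out. Its LOF is nevertheless well above 1, because its neighbors in the dense grid are much more tightly packed than it is, while the grid points in both clusters score close to 1. The sketch recomputes everything from the full dataset, which is the reason such static formulations do not carry over directly to data streams.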
