Information, Vol: 12 Par: 1 (2021)
ARTICLE
TITLE

A Frequent Pattern Conjunction Heuristic for Rule Generation in Data Streams

Frederic Stahl    
Thien Le    
Atta Badii and Mohamed Medhat Gaber    

Abstract

This paper introduces a new and expressive algorithm for inducing descriptive rule-sets from streaming data in real-time, in order to describe frequent patterns explicitly encoded in the stream. Data Stream Mining (DSM) is concerned with the automatic analysis of data streams in real-time. Rapid flows of data challenge state-of-the-art processing and communication infrastructure, which motivates research and innovation into real-time algorithms that analyse data streams on-the-fly and adapt automatically to concept drift. To date, DSM techniques have largely focused on predictive data mining applications that aim to forecast the value of a particular target feature of unseen data instances, answering questions such as whether a credit card transaction is fraudulent or not. A real-time, expressive and descriptive Data Mining technique for streaming data has not previously been established as part of the DSM toolkit. This motivated the work reported in this paper, which resulted in the development and validation of a Generalised Rule Induction (GRI) tool that produces expressive rules as explanations that can be easily understood by human analysts. The expressiveness of decision models in data streams serves the objective of transparency, underpinning the vision of "explainable AI", yet it is an area of research that has attracted less attention despite being of high practical importance. The algorithm introduced and described in this paper is termed Fast Generalised Rule Induction (FGRI). FGRI is able to induce descriptive rules incrementally from raw data with both categorical and numerical features. FGRI is able to adapt rule-sets to changes in the pattern encoded in the data stream (concept drift) on the fly as new data arrives, and can thus be applied continuously in real-time. The paper also provides a theoretical, qualitative and empirical evaluation of FGRI.
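To make the idea of descriptive rules over a drifting stream more concrete, the following is a minimal illustrative sketch and not the FGRI algorithm itself, whose details are not given in this abstract. It assumes a fixed-size sliding window, purely categorical (or pre-discretised numerical) features, and rules limited to one condition and one consequent; the class name `SlidingWindowRuleMiner` and its parameters are hypothetical.

```python
from collections import Counter, deque
from itertools import combinations


class SlidingWindowRuleMiner:
    """Toy descriptive-rule miner over a sliding window (an assumption, not FGRI)."""

    def __init__(self, window_size=100, min_support=0.3, min_confidence=0.8):
        self.window = deque()
        self.window_size = window_size
        self.min_support = min_support
        self.min_confidence = min_confidence
        self.counts = Counter()  # counts of single items and of item pairs

    @staticmethod
    def _items(instance):
        # An instance is a dict; each (feature, value) pair is one item.
        # Sorting by feature name gives every pair a canonical order.
        return tuple(sorted(instance.items()))

    def _update(self, items, delta):
        for item in items:
            self.counts[(item,)] += delta
        for pair in combinations(items, 2):
            self.counts[pair] += delta

    def add(self, instance):
        """Add one stream instance and forget the oldest one if the window is full."""
        items = self._items(instance)
        self.window.append(items)
        self._update(items, +1)
        if len(self.window) > self.window_size:
            self._update(self.window.popleft(), -1)

    def rules(self):
        """Return (condition, consequent, support, confidence) tuples."""
        n = len(self.window)
        found = []
        for key, count in self.counts.items():
            if len(key) != 2 or count == 0 or count / n < self.min_support:
                continue
            a, b = key
            # Any feature may appear as the consequent, not just a fixed target.
            for body, head in ((a, b), (b, a)):
                confidence = count / self.counts[(body,)]
                if confidence >= self.min_confidence:
                    found.append((body, head, count / n, confidence))
        return found
```

Because old instances leave the window, rules that described an earlier pattern stop being reported once the stream changes, which is one simple way to react to concept drift. Unlike a classifier, the sketch places no restriction on which feature may appear on the right-hand side of a rule.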
