PRIVAFRAME: A Frame-Based Knowledge Graph for Sensitive Personal Data

Gaia Gambarelli and Aldo Gangemi    

Abstract

The pervasiveness of dialogue systems and virtual conversation applications raises an important issue: the potential for sharing sensitive information, and the consequent need to protect it. To guarantee the subject's right to privacy and avoid the leakage of private content, sensitive information must be treated appropriately. Any such treatment, however, first requires identifying sensitive text, and appropriate techniques for doing so automatically. The Sensitive Information Detection (SID) task has been explored in the literature across different domains and languages, but there is no common benchmark. Current approaches are mostly based on artificial neural networks (ANNs) or on transformers built on them. Our research focuses on identifying categories of personal data in informal English sentences by adopting a new logical-symbolic approach, and eventually hybridising it with ANN models. We present a frame-based knowledge graph built for the personal data categories defined in the Data Privacy Vocabulary (DPV). The knowledge graph is designed through the logical composition of already existing frames, and has been evaluated as background knowledge for a SID system against a labeled sensitive information dataset. The accuracy of PRIVAFRAME reached 78%. By comparison, a transformer-based model achieved 12% lower performance on the same dataset. The top-down logical-symbolic frame-based model allows a granular analysis and does not require a training dataset. These advantages lead us to use it as a layer in a hybrid model, where the logical SID is combined with an ANN-based SID tested in a previous study by the authors.
