Algorithms, Vol. 15, No. 12 (2022)
Title

Do Neural Transformers Learn Human-Defined Concepts? An Extensive Study in Source Code Processing Domain

Claudio Ferretti and Martina Saletta    

Abstract

State-of-the-art neural networks build an internal model of the training data, tailored to a given classification task. Studying such a model is of interest in itself, and research on explainable artificial intelligence (XAI) therefore investigates whether the internal states of a network reveal rules that associate data with their corresponding classification. This work contributes to XAI research on neural networks trained to classify source code snippets, in the specific domain of cybersecurity. In this context, textual instances typically must first be encoded into numerical vectors through a non-invertible transformation before being fed to the models, and this limits the applicability of known XAI methods based on differentiating neural signals with respect to real-valued instances. We start from the known TCAV method, designed to study the human-understandable concepts that emerge in the internal layers of a neural network, and we adapt it to transformer architectures trained to solve source code classification problems. We first determine domain-specific concepts (e.g., the presence of given patterns in the source code), and for each concept we train support vector classifiers to separate, in the activation vector spaces, the points that represent input instances with the concept from those without it. Then, we study whether the presence (or absence) of such concepts affects the decision process of the neural network. Finally, we discuss how our approach contributes to general XAI goals, and we suggest specific applications in the field of source code analysis.
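
The two steps named in the abstract (a linear classifier separating activations with and without a concept, then a check of whether the concept direction influences the decision) can be illustrated with a minimal sketch of the generic TCAV idea. This is not the authors' adaptation for transformers: the activations are synthetic, a simple perceptron stands in for the support vector classifier, and the "network head" is a hypothetical toy function whose gradient is known in closed form. All function and variable names are assumptions made for illustration.

```python
import math
import random


def train_linear_separator(pos, neg, epochs=50, lr=0.1):
    """Perceptron used here as a stand-in for a linear support vector classifier."""
    dim = len(pos[0])
    w, b = [0.0] * dim, 0.0
    data = [(x, 1) for x in pos] + [(x, -1) for x in neg]
    for _ in range(epochs):
        for x, y in data:
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin <= 0:  # misclassified (or on the boundary): update
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def unit(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def head_gradient(activation, head_weights):
    """Gradient of the toy logit sum_i u_i * tanh(a_i) with respect to a."""
    return [u * (1.0 - math.tanh(a) ** 2) for u, a in zip(head_weights, activation)]


def tcav_score(activations, cav, head_weights):
    """Fraction of inputs whose logit increases along the concept direction."""
    positive = 0
    for a in activations:
        grad = head_gradient(a, head_weights)
        if sum(g * c for g, c in zip(grad, cav)) > 0:
            positive += 1
    return positive / len(activations)


rng = random.Random(0)
# Synthetic layer activations: the concept shifts dimension 0 (an assumption).
with_concept = [[rng.uniform(1.0, 3.0), rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
                for _ in range(40)]
without_concept = [[rng.uniform(-3.0, -1.0), rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
                   for _ in range(40)]

# Concept activation vector: unit normal of the separating hyperplane.
w, b = train_linear_separator(with_concept, without_concept)
cav = unit(w)

# Toy head whose logit depends only on dimension 0 (hypothetical).
head_weights = [1.0, 0.0, 0.0]
score = tcav_score(with_concept + without_concept, cav, head_weights)
print(score)
```

In this toy setup the score is 1.0, because the toy logit always increases along dimension 0 and the learned concept direction has a positive component there; flipping the sign of the head weight drives the score to 0.0. With a real model, the gradient would come from automatic differentiation of the class logit with respect to the chosen layer's activations.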
