Future Internet, Vol. 15, Issue 9 (2023)
ARTICLE
TITLE

Analysis of Program Representations Based on Abstract Syntax Trees and Higher-Order Markov Chains for Source Code Classification Task

Artyom V. Gorchakov, Liliya A. Demidova and Peter N. Sovietov

Abstract

In this paper, we consider the research and development of classifiers trained to predict the task solved by source code. Possible applications of such task detection algorithms include method name prediction, hardware-software partitioning, programming standard violation detection, and semantic code duplication search. We provide a comparative analysis of modern approaches to transforming source code into vector-based representations, which extend the variety of classification and clustering algorithms that can be used for intelligent source code analysis. These approaches include word2vec, code2vec, first-order and second-order Markov chains constructed from abstract syntax trees (AST), histograms of assembly language instruction opcodes, and histograms of AST node types. The vectors obtained with the aforementioned approaches are then used to train classification algorithms such as k-nearest neighbor (KNN), support vector machine (SVM), random forest (RF), and multilayer perceptron (MLP). The results show that program vectors based on first-order AST-based Markov chains, combined with an RF-based classifier, lead to the highest accuracy, precision, recall, and F1 score. Increasing the order of the Markov chains considerably increases the dimensionality of a vector without any improvement in classifier quality, so we assume that first-order Markov chains are best suited for real-world applications. Additionally, the experimental study shows that first-order AST-based Markov chains are the least sensitive to the choice of classification algorithm.
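To make two of the representations named in the abstract more concrete, the following is a minimal sketch of an AST node-type histogram and a first-order AST-based Markov chain for Python source code. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify how transitions are defined, so the sketch assumes that a transition is a parent-to-child edge in the tree and that each row is normalized to probabilities. The function names `node_type_histogram`, `first_order_markov_chain`, and `to_vector` are hypothetical.

```python
import ast
from collections import Counter, defaultdict


def node_type_histogram(source: str) -> Counter:
    """Count how often each AST node type occurs in the program."""
    tree = ast.parse(source)
    return Counter(type(node).__name__ for node in ast.walk(tree))


def first_order_markov_chain(source: str) -> dict:
    """Estimate parent-type -> child-type transition probabilities.

    Assumption (not taken from the paper): a transition is a
    parent/child edge in the AST, and each row is normalized to sum to 1.
    """
    tree = ast.parse(source)
    counts = defaultdict(Counter)
    for parent in ast.walk(tree):
        for child in ast.iter_child_nodes(parent):
            counts[type(parent).__name__][type(child).__name__] += 1

    chain = {}
    for src, row in counts.items():
        total = sum(row.values())
        for dst, n in row.items():
            chain[(src, dst)] = n / total
    return chain


def to_vector(chain: dict, vocabulary: list) -> list:
    """Flatten a chain into a fixed-length vector over a shared vocabulary."""
    return [chain.get(key, 0.0) for key in vocabulary]


example = "def add(a, b):\n    return a + b\n"
hist = node_type_histogram(example)
chain = first_order_markov_chain(example)
vocabulary = sorted(chain)  # in practice, built from the whole training set
vector = to_vector(chain, vocabulary)
```

With a vocabulary of N node types, a first-order chain of this kind has at most N^2 entries, while a second-order chain conditioned on two preceding types has at most N^3, which is consistent with the dimensionality growth noted in the abstract. Vectors like `vector` could then be passed to any standard classifier, such as a random forest.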
