Future Internet, Vol. 15, No. 9 (2023)

Analysis of Program Representations Based on Abstract Syntax Trees and Higher-Order Markov Chains for Source Code Classification Task

Artyom V. Gorchakov, Liliya A. Demidova, and Peter N. Sovietov

Abstract

In this paper, we consider the research and development of classifiers that are trained to predict the task solved by source code. Possible applications of such task detection algorithms include method name prediction, hardware-software partitioning, programming standard violation detection, and semantic code duplication search. We provide a comparative analysis of modern approaches to transforming source code into vector-based representations, which extend the variety of classification and clustering algorithms that can be used for intelligent source code analysis. These approaches include word2vec, code2vec, first-order and second-order Markov chains constructed from abstract syntax trees (AST), histograms of assembly language instruction opcodes, and histograms of AST node types. The vectors obtained with the aforementioned approaches are then used to train classification algorithms such as k-nearest neighbor (KNN), support vector machine (SVM), random forest (RF), and multilayer perceptron (MLP). The obtained results show that program vectors based on first-order AST-based Markov chains, combined with an RF-based classifier, lead to the highest accuracy, precision, recall, and F1 score. Increasing the order of the Markov chains considerably increases the dimensionality of a vector without any improvement in classifier quality, so we assume that first-order Markov chains are the most suitable for real-world applications. Additionally, the experimental study shows that first-order AST-based Markov chains are the least sensitive to the choice of classification algorithm.
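To make the first-order AST-based Markov chain representation more concrete, the following is a minimal sketch, not the authors' implementation. It assumes that a transition is counted for every parent-to-child edge of the tree and that the input is Python source parsed with the standard `ast` module; the function names `ast_markov_chain`, `to_vector`, and `node_type_histogram` are hypothetical and introduced only for illustration.

```python
import ast
from collections import Counter, defaultdict


def ast_markov_chain(source):
    """Estimate first-order transition probabilities between AST node types.

    Assumption of this sketch: one transition is counted for every
    parent -> child edge in the tree.
    """
    tree = ast.parse(source)
    counts = defaultdict(Counter)
    for parent in ast.walk(tree):
        for child in ast.iter_child_nodes(parent):
            counts[type(parent).__name__][type(child).__name__] += 1

    # Normalize each row so that outgoing probabilities sum to 1.
    chain = {}
    for src, row in counts.items():
        total = sum(row.values())
        chain[src] = {dst: n / total for dst, n in row.items()}
    return chain


def to_vector(chain, vocabulary):
    """Flatten the transition matrix into a fixed-length vector.

    With V node types in the vocabulary, the vector has V * V entries.
    """
    return [chain.get(a, {}).get(b, 0.0) for a in vocabulary for b in vocabulary]


def node_type_histogram(source):
    """Count how often each AST node type occurs (the simpler baseline)."""
    return Counter(type(node).__name__ for node in ast.walk(ast.parse(source)))


if __name__ == "__main__":
    program = "x = 1\ny = x + 2\n"
    chain = ast_markov_chain(program)
    for src, row in sorted(chain.items()):
        print(src, row)
    print(node_type_histogram(program))
```

In this dense layout, a vocabulary of V node types yields V² entries for a first-order chain, and conditioning on two preceding node types would yield V³, which illustrates why raising the order inflates the vector dimensionality. The resulting vectors could then be passed to any of the classifiers named above.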
