Applied Sciences, Vol. 10, Issue 8 (2020)

Source Code Assessment and Classification Based on Estimated Error Probability Using Attentive LSTM Language Model and Its Application in Programming Education

Md. Mostafizer Rahman, Yutaka Watanobe and Keita Nakamura

Abstract

The rate of software development has increased dramatically. Conventional compilers cannot detect all source code errors, and logic errors in particular are difficult for them to identify, so software may ship with errors that negatively affect end-users. A method that uses artificial intelligence to assess and detect errors and to classify source code as correct (error-free) or incorrect is thus required. Here, we propose a sequential language model that uses an attention-mechanism-based long short-term memory (LSTM) neural network to assess and classify source code based on the estimated error probability. Combining the attention mechanism with the LSTM improves the accuracy of the language model for error assessment, error detection, and source code classification. We trained the proposed model using correct source code and then evaluated its performance. The experimental results show that the proposed model has logic and syntax error detection accuracies of 92.2% and 94.8%, respectively, outperforming state-of-the-art models. We also applied the proposed model to the classification of source code with logic and syntax errors. The average precision, recall, and F-measure values for this classification are substantially higher than those of the benchmark models. Finally, the proposed model can be effective in programming education and software engineering by improving code writing, debugging, error correction, and reasoning.
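The idea of classifying source code by estimated error probability can be illustrated with a minimal sketch. It is not the authors' model: the attentive LSTM is replaced here by a toy bigram count model so the example stays self-contained. The function names (`train_bigram`, `token_probs`, `classify`), the tiny training corpus, and the threshold of 0.1 are all assumptions made for illustration. As in the abstract, the model is trained only on correct code; any token it considers unlikely is flagged as a candidate error location.

```python
from collections import Counter, defaultdict

def train_bigram(sequences):
    """Count next-token frequencies over correct token sequences.

    Stand-in for the paper's attentive LSTM (assumption for illustration).
    """
    counts = defaultdict(Counter)
    for seq in sequences:
        for prev, nxt in zip(["<s>"] + seq, seq + ["</s>"]):
            counts[prev][nxt] += 1
    return counts

def token_probs(counts, seq):
    """Return (token, P(token | previous token)) for each position in seq."""
    out = []
    for prev, nxt in zip(["<s>"] + seq, seq + ["</s>"]):
        total = sum(counts[prev].values())
        p = counts[prev][nxt] / total if total else 0.0
        out.append((nxt, p))
    return out

def classify(counts, seq, threshold=0.1):
    """Label a token sequence and list the tokens the model finds unlikely.

    threshold is a hypothetical cut-off, not a value from the paper.
    """
    suspects = [
        (i, tok, p)
        for i, (tok, p) in enumerate(token_probs(counts, seq))
        if p < threshold
    ]
    label = "incorrect" if suspects else "correct"
    return label, suspects

# Train on correct code only (toy corpus, whitespace-tokenized).
correct_code = [
    "int a = 0 ;".split(),
    "int b = 1 ;".split(),
    "a = b ;".split(),
]
model = train_bigram(correct_code)

good_label, good_suspects = classify(model, "int a = 0 ;".split())
bad_label, bad_suspects = classify(model, "int a 0 ;".split())  # missing "="
print(good_label, good_suspects)
print(bad_label, bad_suspects)
```

A bigram model only sees one token of context, which is why a sequence model with attention is a more plausible choice for real source code; the thresholding and classification step, however, would look the same regardless of which model supplies the probabilities.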
