Algorithms, Vol. 17, Issue 2 (2024)
ARTICLE
TITLE

Program Code Generation with Generative AIs

Baskhad Idrisov and Tim Schlippe    

Abstract

Our paper compares the correctness, efficiency, and maintainability of human-generated and AI-generated program code. To this end, we analyzed the computational resources required by AI- and human-generated program code using metrics such as time and space complexity as well as runtime and memory usage. Additionally, we evaluated maintainability using metrics such as lines of code, cyclomatic complexity, Halstead complexity, and maintainability index. For our experiments, we had generative AIs produce program code in Java, Python, and C++ that solves problems defined on the competitive coding website leetcode.com. We selected six LeetCode problems of varying difficulty, resulting in 18 program codes generated by each generative AI. GitHub Copilot, powered by Codex (GPT-3.0), performed best, generating correct program code for 9 of the 18 problems (50.0%), whereas CodeWhisperer did not solve a single problem. BingAI Chat (GPT-4.0) generated correct program code for seven problems (38.9%), ChatGPT (GPT-3.5) and Code Llama (Llama 2) for four problems each (22.2%), and StarCoder and InstructCodeT5+ for only one problem each (5.6%). Surprisingly, although ChatGPT generated only four correct program codes, it was the only generative AI capable of providing a correct solution to a coding problem of difficulty level hard. In summary, 26 AI-generated codes (20.6%) solve the respective problem. For 11 incorrect AI-generated codes (8.7%), only minimal modifications are necessary to solve the problem, which saves between 8.9% and 71.3% of the time compared to writing the program code from scratch.
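The percentages in the abstract can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the paper: it assumes that each of the seven generative AIs named above produced 18 program codes (126 in total), and the variable and function names are illustrative only.

```python
# Correct program codes per generative AI, as reported in the abstract
# (each tool produced 18 codes: 6 problems x 3 languages).
CODES_PER_TOOL = 18
correct = {
    "GitHub Copilot": 9,
    "BingAI Chat": 7,
    "ChatGPT": 4,
    "Code Llama": 4,
    "StarCoder": 1,
    "InstructCodeT5+": 1,
    "CodeWhisperer": 0,
}


def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Success rate of each tool over its 18 generated codes.
per_tool_pct = {tool: pct(n, CODES_PER_TOOL) for tool, n in correct.items()}

# Totals across all tools (assumption: 7 tools x 18 codes = 126 codes).
total_codes = CODES_PER_TOOL * len(correct)
total_correct = sum(correct.values())
overall_pct = pct(total_correct, total_codes)

# Share of codes that were incorrect but needed only minimal fixes (11 codes).
nearly_correct_pct = pct(11, total_codes)

for tool, share in per_tool_pct.items():
    print(f"{tool}: {correct[tool]}/{CODES_PER_TOOL} = {share}%")
print(f"Overall: {total_correct}/{total_codes} = {overall_pct}%")
print(f"Minimal fixes needed: 11/{total_codes} = {nearly_correct_pct}%")
```

Under that assumption, the per-tool counts sum to the 26 correct codes stated in the abstract, and the overall and near-correct shares are computed over the same 126-code denominator.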
