Article

Predicting Students’ Outcome in an Introductory Programming Course: Leveraging the Student Background

by Jacqueline Köhler 1,2, Luciano Hidalgo 2,3 and José Luis Jara 1,2,*

1 Centro de Investigación en Creatividad y Educación Superior (CICES), Facultad de Ingeniería, Universidad de Santiago de Chile, Santiago 9170022, Chile
2 Departamento de Ingeniería Informática, Facultad de Ingeniería, Universidad de Santiago de Chile, Santiago 9170022, Chile
3 Department of Computer Science, Pontificia Universidad Católica de Chile, Santiago 8331150, Chile
* Author to whom correspondence should be addressed.
Appl. Sci. 2023, 13(21), 11994; https://doi.org/10.3390/app132111994
Submission received: 20 September 2023 / Revised: 28 October 2023 / Accepted: 29 October 2023 / Published: 3 November 2023
(This article belongs to the Special Issue Artificial Intelligence in Online Higher Educational Data Mining)

Abstract

For many beginners, learning to program is challenging; likewise, it is difficult for teachers to draw on students' prior knowledge to support the process, because it is not obvious which abilities are significant for developing programming skills. This paper seeks to shed some light on the subject by identifying which previously recorded variables have the strongest correlation with passing an introductory programming course. To this end, a data set was collected comprising four cohorts of students who attended an introductory programming course common to all Engineering programmes at a Chilean university. With this data set, several classifiers were built, using different Machine Learning methods, to determine whether students pass or fail the course. In addition, models were trained on subsets of students grouped by programme duration and engineering specialisation. An accuracy of 68% was achieved, but the analysis by specialisation shows that both the accuracy and the significant variables vary depending on the programme. The fact that classification methods select different predictors depending on the specialisation suggests that a variety of factors affect a student's ability to succeed in a programming course, such as overall academic performance, language proficiency, and mathematical and scientific skills.

1. Introduction

The modern world has seen great advances in the automation of both processes and services. As a consequence, professionals in several fields need to be proficient at computer programming [1]. Thus, programming is now taught to children [2], teenagers [3,4], and professionals in diverse fields, including STEM, arts, social sciences, and health [5,6,7,8]. However, learning how to program poses particular challenges [9] that make it difficult for people with no previous training to develop this skill [10,11,12]. Therefore, determining the key factors for success and understanding how beginners approach programming remain active and evolving research topics [13,14,15,16,17].
While it is widely agreed that both mathematics and fundamental scientific skills are associated with programming aptitude, the question of which key factors determine students' outcomes in an initial programming course remains unanswered. Loksa and Ko [18] address this challenge by investigating the learning process itself and suggesting a methodical approach to tackling programming problems. Prather et al. [19], in turn, scrutinise these issues through a meta-cognitive lens. Meanwhile, Lopez et al. [20] adopt an alternative stance, categorising programming proficiency by its complexity, ranging from reading code to articulating the conceptual intent behind a specific code segment.
Special interest has been placed on finding out which prior skills make programming easier. Qian and Lehman [21] relate programming to language and mathematical skills. Hinckle et al. [22], among other authors, study how gender, psychological, and experiential factors are linked to programming. From another perspective, Álvarez et al. [23] explore possible connections between attitudes, commitment, and autonomous learning ability and success in an introductory programming course imparted for various STEM programmes. The same authors found a link between how students perceive their programming skills and their achievement in a programming course [24]. Other works highlight the importance of motivation for student success [25]. However, there is still no consensus on a definitive set of characteristics to predict the outcome of students in introductory programming courses [26].
The objective of this study is to determine which variables best predict students' outcomes in an introductory programming course offered to 20 engineering programmes at a Chilean university. To accomplish this, only variables that were recorded before the student enrolled in the course were used, including variables recorded upon the student's admission to the university, the student's performance during the first academic semester, and the theory and laboratory lecturers with whom each student registered for the course. The underlying hypothesis is that the current course design and implementation may not adequately address the diversity of student profiles, resulting in predetermined outcomes for many students. As a result, across the whole set of variables and across subsets of different students, classifiers were trained using Random Forests (RFs), Multivariate Logistic Regression (MLR), Support Vector Machines (SVMs), and Extreme Gradient Boosting (XGB). The course used for this study, “Fundamentos de Computación y Programación” (FCYP), is mandatory for 20 programmes imparted by the Faculty of Engineering at the Universidad de Santiago de Chile, 12 of which are 6 years long, while the remaining 8 are 4 years long. For all programmes, the course is provided in the second semester of the first year, and it is preceded by the first-semester common courses: Algebra I, Calculus I, Physics I, Introduction to Engineering, and Study Methods. The structure of the research presented in this paper is summarised in Figure 1 and further described in the following sections.
This work is an extension of a preliminary conference paper presented in [27], which attempted to identify those characteristics with the highest correlation with a passing grade in an introductory programming course using the same variable sets. The extensions in this work include (1) the use of statistical methods to remove correlated variables; (2) the exclusion of Machine Learning algorithms that did not yield satisfactory results in the previous study (namely CART Trees and Extreme Learning Machines); (3) training models with subsets of data based on programme duration and the Department of the Faculty of Engineering in charge of each programme; and (4) based on the aforementioned extensions, a new analysis of the results and discussion.
The remaining sections of this article are structured as follows: Section 2 describes related work in the field of predicting academic achievement. Section 3 covers the data set and techniques used in this study. Section 4 shows the results from different Machine Learning approaches, whereas Section 5 discusses these results. Finally, Section 6 presents the conclusions of the study as well as ideas for further research.

2. Related Work

Several works in the literature attempt to predict academic achievement in higher education, understood as either final course, semester, or programme grades. In a systematic literature review, Alturki et al. [28] state that the most common predictors are previous academic achievement and demographic variables, and that Decision Trees (DTs) are the most often used approach for prediction. These findings are consistent with those presented by Alsariera et al. [29], who also include internal assessment variables (such as quizzes and other coursework) and family variables (such as parent status or family size) among the most common predictors after academic and demographic variables. Alsariera et al. [29] also mention Artificial Neural Networks (ANNs), DTs, and SVMs as the most commonly utilised methods. This also agrees with the review conducted by Abu Saa et al. [30], who added online learning activities (that is, students' activity logs obtained from e-Learning systems) and students' social information as relevant predictors and incorporated Naive Bayes (NB) and ANNs as frequently used methods. Likewise, a review specifically focused on STEM by Ismail and Yusof [31] shows demographic, socioeconomic, and academic variables to be relevant in the addressed studies. This work also identifies descriptive analysis and regression as the statistical methods most frequently used for analysing results.
Despite the prevalence of the aforementioned approaches, other researchers have engaged with the phenomenon using different methods. Al-Fairouz and Al-Hagery [32] use several Machine Learning methods to predict academic success for students in six different majors at a College of Business and Economics in Saudi Arabia. They report that RFs outperformed other models, with a precision of 71.5%. Similarly, RFs also outperformed LR and Robust Linear Regression in the study presented by Sandoval et al. [33], who used academic records and behaviour in the institutional learning management system to predict the final grade of students attending large classes. Falát and Piscová [34] also used several supervised Machine Learning methods to predict students' grade point average (GPA), namely LR, DTs, and RFs, with the latter providing the best predictive ability. Beaulac and Rosenthal [35] also use RFs to predict whether students will complete a study programme, as well as the major they will pick, based on the courses they take and the grades they receive in their first year of study. In contrast, Gil et al. [36] aimed at predicting the success of first-year university students and found the best results using SVMs, closely followed by RFs. This is consistent with Aluko et al. [37], where an SVM surpasses an LR in predicting students' academic achievement in a four-year undergraduate programme. This illustrates how the most effective methods and the outcomes they yield can differ greatly depending on the phenomenon to be predicted, the time frame considered, and the quantity and quality of data, despite the enormous potential of prediction using Machine Learning methods [31].
The notion of specifically predicting performance in introductory programming courses is not particularly novel; in fact, researchers were already proposing approaches in the 1980s. In order to make the most of the limited resources available at the time to teach programming, Leeper and Silver [16] provide one of the earliest approaches for estimating success in a programming course, based on linear regression, with which they are able to do so in 26% of cases. The following year, Barker and Unger [38] present a predictor based on a questionnaire that evaluates the development of abstract reasoning, with which they are able to distinguish between above-average and below-average students, with the aim of separating students into fast- and slow-paced sections of an introductory Computer Science course. Among more recent studies, Costa et al. [39] evaluate four Machine Learning algorithms, namely NB, DTs, ANNs, and SVMs, with the aim of producing an early prediction on two data sets of beginning programming courses, one offered via distance learning and the other on campus. Using internal course data (e.g., participation) and student characterisation variables such as age and gender, they achieve an accuracy of 83%, with SVMs being the most effective method. Similarly, Sivasakthi and Padmanabhan [40] used NB and achieved an accuracy of 91% employing a variety of student features, including problem-solving skills, prior programming experience, and programming aptitude, among others. In order to predict a student's performance on the final exam of a programming course, Shen et al. [41] propose a different prediction approach: they model each student along three dimensions (programming skills, personal information, and study log) and use various Machine Learning techniques, such as SVMs, Bayesian Ridge, RFs, Extra Trees, Gradient Boosting, DTs, and Deep Neural Networks, the latter being the best predictor in this case. Furthermore, Van Petegem et al. [42] present a method to predict success in two introductory programming courses based on students' submission data for both graded and formative assessments. They achieve an accuracy of nearly 65% at the start of the semester, increasing to roughly 80% by the end of the period.
In the local context, Alvarez et al. [24] predict students' success in a modular introductory programming course using psychometric variables related to implicit conceptions of intelligence, error orientation, and student attitude. Their findings show that features related to self-efficacy and the importance students place on programming skills have the greatest predictive potential. In a different vein, the preliminary study on which this work is based [27] attempted to identify those previously registered variables with the highest correlation with a passing grade in an introductory programming course imparted in the second semester of 20 engineering programmes. For this purpose, three different data sets of student features were considered. The results indicate that the best model is a radial kernel SVM, which achieves an accuracy of 68.6% when utilising the data set with the most variables.

3. Data and Methods

FCYP is preceded by five first-semester common courses: Algebra I (which students are required to pass before taking FCYP), Calculus I, Physics I, Introduction to Engineering, and Study Methods. The last two courses may require some explanation: Introduction to Engineering focuses on the development of group-working skills while students develop a semester-long project to provide innovative solutions to an open problem. Study Methods is intended to guide students in their adaptation to University life and to level up key skills needed for academic success, such as time management, reading comprehension, and report writing. Students should, by design, take FCYP in addition to Calculus II, Physics II, Algebra II, and, depending on the programme, Chemistry and the introductory course to their engineering specialisation.
Although a single course, FCYP has two separate parts that students must pass independently. The first component, theory, teaches the core concepts regarding programming in Python: syntax and semantic structures, simple and compound data types, and functions. To pass this part of the course, students have to apply these tools to solve bounded real-world and engineering problems. In the second component, laboratory, students work in groups to develop a semester-long project. Students must create a functional and creative computational solution to an open problem within a given context in order to pass. This work is focused on the theory part of the course, given that the laboratory grade represents the group’s overall performance rather than a single student’s level of success.
Common to all 20 engineering programmes at the Universidad de Santiago de Chile, FCYP is offered in the second semester of each programme. The course is managed by two coordinators who are responsible for ensuring that all lecturers adhere to the specified schedule for each unit of study, as well as to the university's official syllabus. Additionally, they provide the minimum resources and materials required for the course. Likewise, all students enrolled in the course during a given semester take the same assessments simultaneously. These assessments, unique for each semester, are designed by a team of course instructors. During the observation period, assessments were handwritten, without the use of computers to test the code. Each assessment consisted of two to three questions requiring students to create a Python program for a given problem. Course lecturers graded each assessment according to a rubric developed by the coordination team, with a focus on the algorithm, syntax aspects, and code readability. Araya Sánchez et al. [43] provide a comprehensive, in-depth account of the assessment methodology used in FCYP.
The goal of this study is to establish which student features best predict students' outcomes in FCYP's theory part. For this purpose, the data set considers the variables described in Table 1, identified as relevant in previous work: in [44] as the variables that best predict student dropout by the end of the first semester, and in [27] for predicting whether students pass or fail the theory component of the course using the same data set. It should be noted that the Chilean grading scale applies to final grades for the following variables: CAL, PHY, ALG, MET, INT, and GPA. These grades range from 1.0 to 7.0, with a minimum passing grade of 4.0.
The data set includes four cohorts of students who took FCYP between the second semester of 2015 and the first semester of 2019 (n = 6516). The first semester of 2019 was established as the cut-off date because the second semester of that year was irregular due to a social outburst in Chile, and academic activities were carried out in an emergency remote teaching (ERT) format during 2020 and 2021 because of the COVID-19 pandemic lockdown. From 2022 onward, the course underwent major modifications as the institution shortened the 6-year programmes to 5 years and, as a consequence, courses (including FCYP) were adjusted accordingly. It is also worth noting that observations corresponding to students who met any of the following criteria were excluded:
  • Passed the course in extraordinary instances, such as intensive courses (theory courses conducted between semesters that replace the final theory grade) or special exams (additional sufficiency exams that allow students to improve their final theory grade).
  • Were repeating the course after a previous failure.
  • Entered their study programme before the observation period.
  • Were not studying an engineering programme (e.g., the College programme).
  • Had an incomplete record.
The response variable (CLASS) is the final status (namely, pass or fail) for the theory part of the course, which students are required to pass independently from the laboratory component. After applying the exclusion criteria, the data set comprises 2372 complete records, with 1191 belonging to the “Pass” outcome and 1181 to the “Fail” outcome. It is important to note that the “Pass” outcome was defined as the positive class.
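As a minimal sketch (not the authors' actual preprocessing code), the response could be coded in R as follows; the column name theory_grade is hypothetical, while the 4.0 passing threshold, the class labels, and the choice of positive class follow the text above.

```r
# Hedged sketch: coding the CLASS response from the final theory grade.
# `theory_grade` is a hypothetical column name; the 4.0 passing threshold
# on the 1.0-7.0 Chilean scale is taken from the text.
theory_grade <- c(5.2, 3.8, 4.0, 2.9)            # illustrative final theory grades
CLASS <- factor(ifelse(theory_grade >= 4.0, "Pass", "Fail"),
                levels = c("Pass", "Fail"))       # "Pass" defined as the positive class
table(CLASS)
```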
The data preparation considered the elimination of variables that are linear combinations of other variables, as well as of strongly associated variables. Association was tested using different statistical tests, depending on the variable types, with a significance level of 0.05 (a minimal sketch of this test dispatch follows the list):
  • Pearson correlation test for two numeric variables.
  • Independent samples t-test (or Wilcoxon rank-sum if its conditions are not met) for a dichotomous and a numeric variable.
  • Independent samples ANOVA (or Kruskal–Wallis if its conditions are not met) for a numeric and a categorical variable with more than two levels.
  • χ² association test for two categorical variables.
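The following R sketch shows one way such a dispatch could be implemented. It is an illustration under the assumptions above, not the authors' code, and for brevity it applies the rank-based alternatives directly rather than checking the parametric conditions first:

```r
# Hedged sketch: choose an association test based on the variable types.
assoc_test <- function(x, y, alpha = 0.05) {
  if (is.numeric(x) && is.numeric(y)) {
    res <- cor.test(x, y)                    # Pearson correlation test
  } else if (is.numeric(x) && nlevels(factor(y)) == 2) {
    res <- wilcox.test(x ~ factor(y))        # rank-based alternative to the t-test
  } else if (is.numeric(x)) {
    res <- kruskal.test(x ~ factor(y))       # rank-based alternative to one-way ANOVA
  } else {
    res <- chisq.test(table(x, y))           # chi-squared association test
  }
  c(statistic = unname(res$statistic), p.value = res$p.value,
    associated = res$p.value < alpha)        # flag pairs significant at the 0.05 level
}
assoc_test(mtcars$mpg, mtcars$am)            # numeric vs. dichotomous example
```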
Next, Recursive Feature Elimination (RFE) [45] was used on the remaining variables to select the best predictors, with MLR, RFs, and XGB (without hyper-parameter tuning) as base methods. The final feature selection was performed by combining all three results and considering the principle of parsimony.
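A minimal caret-based sketch of this step is shown below, assuming the cleaned data live in a data frame with a class factor column; synthetic data stand in for the student records, and the random-forest ranking functions illustrate just one of the three base methods:

```r
# Hedged sketch: Recursive Feature Elimination (RFE) with caret.
library(caret)
set.seed(42)
d <- twoClassSim(300)                          # synthetic stand-in for the student data
ctrl <- rfeControl(functions = rfFuncs,        # RF-based ranking; swap for other base methods
                   method = "cv", number = 10) # 10-fold cross-validation inside RFE
rfe_fit <- rfe(x = d[, setdiff(names(d), "Class")],
               y = d$Class,
               sizes = 1:10,                   # candidate feature-subset sizes
               rfeControl = ctrl)
predictors(rfe_fit)                            # features retained by this base method
```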
The following Machine Learning techniques were selected to build classifiers on the selected features: radial kernel SVMs [46], MLR [47], RFs [48], and XGB [49]. These methods were chosen because of their performance in previous work [27]. Each model was tuned considering 10 repeats of a 10-fold cross-validation (for each repeat, observations are separated into 10 groups, or folds; then, for each fold, a model is tuned using the remaining 9 folds as a training set and the current one to assess the model's performance). Additionally, a grid search was used for hyper-parameter tuning when needed (a minimal tuning sketch follows the list):
  • Two hyper-parameters must be tuned for radial kernel SVMs: cost (C), which penalises wrong classifications, and sigma, which regulates the curvature of the decision boundary. Powers of two ranging from 2⁻¹⁰ to 2¹⁰ were considered for both parameters.
  • The selected implementation of a Random Forest only allows the mtry hyper-parameter to be tuned, which regulates how many of the input features are to be considered when building a decision tree, ranging from one to the number of available features.
  • For XGB, the nrounds parameter determines the number of trees in the final model, for which the range 100 to 2000 was considered, in steps of 50. eta prevents over-fitting by shrinking feature weights; the values considered were 0.025 and 0.05 to 0.5, in steps of 0.05. max_depth regulates the maximum depth of a tree, ranging from one to the number of available features. min_child_weight regulates the minimum number of observations in each node, ranging from 1 to 10. colsample_bytree is the sub-sample ratio of columns when building each tree, ranging from 0.1 to 1 in steps of 0.1. gamma regulates the minimum loss reduction needed to further partition a node, considering the same values listed for eta. subsample regulates the fraction of observations sampled before growing the trees; this parameter was set to one since cross-validation already separates instances for model assessment.
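As a minimal sketch of the tuning scheme, the caret call below tunes a radial-kernel SVM with repeated cross-validation and an explicit grid. The grid is coarser than the full 21 × 21 powers-of-two grid described above to keep the example fast, and the synthetic data are again only a stand-in:

```r
# Hedged sketch: grid-search tuning with 10 repeats of 10-fold CV (caret).
library(caret)
set.seed(42)
d <- twoClassSim(300)                                    # synthetic stand-in
tc <- trainControl(method = "repeatedcv",
                   number = 10, repeats = 10)            # 10 x 10-fold cross-validation
grid <- expand.grid(C     = 2^seq(-10, 10, by = 4),      # coarsened; the paper spans 2^-10..2^10
                    sigma = 2^seq(-10, 10, by = 4))
svm_fit <- train(Class ~ ., data = d,
                 method    = "svmRadial",                # radial-kernel SVM (kernlab backend)
                 trControl = tc,
                 tuneGrid  = grid)
svm_fit$bestTune                                         # selected C and sigma
```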
Models were built for (1) all students, (2) student subsets by programme duration, and (3) student subsets by the department that manages the programme (i.e., by engineering specialisation). All the models for a single subset considered the same data split (i.e., the same folds for each repetition of the 10-fold cross-validation). Moreover, oversampling was used when there was an important class imbalance (above 15 percentage points). Building individual models for the 20 distinct engineering programmes was ruled out in this work due to the shortage of data for some programmes and a large increase in training costs.
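The text does not specify the exact oversampling technique, so one plausible reading is naive random up-sampling of the minority class before training, as sketched below with caret's upSample:

```r
# Hedged sketch: random oversampling of the minority class with caret.
library(caret)
set.seed(42)
d <- twoClassSim(300, intercept = -8)            # deliberately imbalanced synthetic data
round(prop.table(table(d$Class)), 2)             # class proportions before balancing
up <- upSample(x = d[, setdiff(names(d), "Class")],
               y = d$Class, yname = "Class")     # duplicates minority rows until balanced
round(prop.table(table(up$Class)), 2)            # 50/50 after up-sampling
```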
To examine the effect of the type of model or group on prediction accuracy, a semi-parametric repeated measures ANOVA was conducted using the RM() function from the R package MANOVA.RM. This function implements a non-parametric bootstrapped procedure for an ANOVA-type statistic (ATS), as proposed by Friedrich et al. [50]. These procedures remain robust even when the assumptions of normality and homogeneity of covariance are violated. Subsequently, a post hoc analysis was performed to calculate the corresponding bootstrapped means of pairwise differences and their confidence intervals (CIs). In all cases, 4999 bootstrap iterations and 95% confidence levels were considered.
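A minimal sketch of this analysis is given below. The long-format layout (one accuracy value per model and group, with the cross-validation fold as the repeated-measures subject) is our assumption about how the resampled accuracies were arranged, synthetic values stand in for the real ones, and the argument names follow the MANOVA.RM documentation rather than the authors' scripts:

```r
# Hedged sketch: bootstrapped ANOVA-type statistic with MANOVA.RM's RM().
library(MANOVA.RM)
set.seed(42)
df <- expand.grid(fold  = 1:100,                          # 10 repeats x 10 folds
                  model = c("RF", "MLR", "SVM", "XGB"),   # within-subject factor
                  group = c("All", "6-year", "4-year"))   # between-subject factor
df$fold_id  <- interaction(df$group, df$fold)             # unique subject per group
df$accuracy <- 0.68 + rnorm(nrow(df), sd = 0.03)          # synthetic accuracies
fit <- RM(accuracy ~ group * model, data = df,
          subject    = "fold_id",                         # repeated-measures identifier
          no.subf    = 1,                                 # one within-subject factor (model)
          iter       = 4999,                              # bootstrap iterations, as in the paper
          resampling = "paramBS")                         # parametric bootstrap for the ATS
summary(fit)
```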

4. Results

The statistical tests showed multiple associations between a numerical and a categorical variable. In these cases, the categorical variable was discarded. Table 2 lists some of those associations to support the elimination of predictive variables (those under the Variable 2 column of Table 2).
Additionally, PSU_RAN was discarded because it is a linear transformation of PSU_GPA.
It is worth noting that the associations between course grades and lecturers may result from the course enrolment process. Typically, students within a programme have limited schedule options for a course (usually 2 or 3), and lecturers often maintain consistent schedules over multiple semesters, leading to a connection between the lecturer and the students' programme. Whilst all engineering programmes share common first-year courses, each programme has different entry requirements, resulting in differences in student characteristics and aptitudes. This explains why a lecturer who has taught in a schedule open to students with high entry requirements is likely to have students who performed well in high school and during the first semester.
After the data preparation process, the final data set comprised 14 variables, as listed in Table 3, where P_TYPE and DPT were kept only for data filtering purposes. Table 4 summarises the frequencies for each class (“Pass” and “Fail”) for each student subset and indicates if there is a class imbalance.
Table 5 lists the final feature selection for each of the student subsets (the results obtained by each method are not detailed due to space concerns), whereas Table 6 shows the performance for each of the models. Additionally, Figure 2 shows the confusion matrices for each of the classifiers built, considering all students. Appendix A provides a selection of confusion matrices for additional classifiers and student subsets.
A significant interaction between the groups and types of models is detected (bootstrapped ATS yields p < 0.001). Post hoc contrasts show inconsistencies in model performance differences across groups, and thus no clear patterns can be identified. Even though a wide range of accuracies between the groups is observed (from confidence interval (CI) = [0.607, 0.680] up to CI = [0.661, 0.919], estimated for Metallurgy and Informatics, respectively), a simple main effects analysis shows that these differences are not significant (bootstrapped ATS yields p = 0.251). Similarly, apparent differences between the performance of MLR models (CI = [0.695, 0.732]) and the other types of models (such as XGB, CI = [0.731, 0.765]) are observed, but this main effect is also borderline non-significant (bootstrapped ATS yields p = 0.063).

5. Discussion

The results of the feature selection process show that relevant predictors vary depending on the considered subset of students. Thus, a one-size-fits-all approach does not answer the question of which factors are key to succeeding in learning to program. When considering the entire set of students, only the first-semester GPA and the language admission test score appear as relevant predictors. Both variables are reasonable: the first-semester GPA summarises various factors, including the student's overall academic performance, performance in the initial mathematics and science courses, and how well the student has adapted to the demands of higher education. Similarly, the language admission test score makes sense because it reflects various factors, such as the ability to effectively understand a programming problem specification and the basic syntax rules particular to language use. Furthermore, because there is no required standardised English test in Chile, it is impossible to assess whether previous English ability is a meaningful factor in determining success in a programming course. In addition, both variables taken together suggest the importance of good mathematical and language skills, which is consistent with the findings of Ivanova et al. [51] and Prat et al. [52]. However, a larger number of variables is selected when separating students by programme duration. It is also interesting that the selected predictors vary significantly between Engineering specialisations (determined by the department imparting a given programme), suggesting different student profiles.
It is also interesting to note that the best model for all students (an SVM) achieved an accuracy of 67.71% with only two predictors, slightly below the best result obtained in previous work, where the best accuracy was 68.6% considering 21 predictors [27]. In this case, the same model exhibits better specificity (72.24%) than sensitivity (63.21%). This means it produces slightly fewer false positives (students who failed but were classified as passing) than false negatives (passing students classified as failing), indicating that the model is more adept at detecting students who will not pass the course. The relatively lower accuracy in this particular instance, compared to the different subsets, may be attributed to the fact that the data set comprising all students presents a more challenging classification problem: it includes a significantly more diverse sample of students than any of the other subsets, reflecting the varied student profiles associated with each programme.
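For reference, the short sketch below shows how sensitivity and specificity are derived from a 2 × 2 confusion matrix with "Pass" as the positive class; the counts are illustrative, not taken from the paper:

```r
# Hedged sketch: sensitivity and specificity from a 2x2 confusion matrix.
cm <- matrix(c(63, 37,   # true Pass: predicted Pass / Fail (illustrative counts)
               28, 72),  # true Fail: predicted Pass / Fail
             nrow = 2, byrow = TRUE,
             dimnames = list(truth = c("Pass", "Fail"),
                             predicted = c("Pass", "Fail")))
sens <- cm["Pass", "Pass"] / sum(cm["Pass", ])   # TP / (TP + FN): passing students caught
spec <- cm["Fail", "Fail"] / sum(cm["Fail", ])   # TN / (TN + FP): failing students caught
round(c(sensitivity = sens, specificity = spec), 4)
```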
Despite the fact that the best model performances by programme duration are also slightly below those reported in previous work, the trained models involve fewer predictors. This emphasises the significance of the prior statistical analyses, since reducing the number of predictors not only produces models that are easier to understand but also decreases the computing cost of training them.
Additionally, the achieved accuracy is comparable to that of Van Petegem et al. [42] in the first weeks of the semester, with the difference that, by using previously recorded student variables, it is possible to take measures to level students' skills before the course begins. Similarly, the chosen variables are not ad hoc indicators specific to the university or the instructional design of the courses, allowing the model to be extended to other institutions with introductory courses at similar points in their programmes.
With the exception of the 6-year, Industry, Informatics, and Mining subsets, the best model in each subset exhibits higher specificity than sensitivity. This consistent pattern suggests that identifying students prone to failure is generally more manageable than identifying those likely to pass. Notably, the SVM classifier for the complete data set presents a 9.03-percentage-point gap between specificity and sensitivity. Additionally, the best models for Electricity and Geography display disparities of 22.72 and 15.18 percentage points between positive- and negative-class classifications. These models achieve high specificity even though they rely on only one or two features for classification, showcasing their robustness.
When looking at the models built for each specialisation, it is worth noting that classification accuracy is above 70% for all departments except Geography and Metallurgy, and above 75% for Industry, Informatics, Mechanics, and Civil. Interestingly, though not statistically significant, the most accurate predictions (87.14%) are achieved for Informatics. This might suggest the relevance of hidden factors, such as motivation and the perceived value of programming skills, for student outcomes in an introductory programming course, which agrees with the findings of authors such as Bellino et al. [25] and Alvarez et al. [24]. Moreover, in most cases, but again without statistical significance, SVM models exhibited the highest accuracy, which is consistent with Aluko et al. [37]. It is interesting to note that, although an RF is frequently reported to be a good method [34,35,36], it performs poorly when considering all students or subsets by programme duration, but improves significantly when students are separated by specialisation. This might be partially explained by the restrictions of the selected implementation, which limits the hyper-parameters that can be tuned, but it also suggests that the specialisation would be a relevant variable when building each tree.

6. Conclusions and Future Work

The results of using four supervised Machine Learning methods (RF, MLR, SVM, and XGB) to predict whether students will pass or fail an introductory programming course are presented in this study. The contributions of this work can be summarised as follows:
  • By removing correlated variables, it was possible to obtain results that were closely comparable to those achieved in prior work [27] with simpler and more explainable models.
  • Models were built using all the students in the sample, as well as subsets based on programme duration and the department responsible for the programme. Except for MLR, each of the methods outperforms the rest in certain scenarios, with SVMs surpassing the others in the majority of cases.
  • With only two features—the students’ GPA and the result on the entrance language test—a model was created that can accurately predict whether students pass or fail the theory part of FCYP with an accuracy of 67.71%.
  • Specific models were built for each subset of students. With the exception of Geography and Metallurgy specialisations, all departmental models improved the accuracy of the base model, with Informatics, Industry, Civil Engineering, and Electricity showing the best results.
Although it may appear paradoxical, the optimal scenario would be for the model, assuming it is correctly trained, to yield results that are virtually random. This would imply that students have the capability to overcome the factors that predestine their outcomes. A model that can reasonably predict a student’s course performance a semester in advance suggests the presence of environmental barriers that a student may not surmount within a single semester.
The results open interesting possibilities for future work. Selected predictors vary greatly between specialisations, implying the existence of diverse student profiles, which should be further investigated. One approach to doing so would be the use of unsupervised Machine Learning techniques, such as clustering or association rules, to see if there are distinguishable student profiles and to determine if (and how) these overlap with programme and academic department choices.
Applying the same process to a more recent data set is relatively straightforward. However, as these models were calibrated using pre-COVID-19 data, the prospect of evaluating their performance with students currently enrolled in FCYP presents a more compelling inquiry. This would facilitate an assessment of the models’ relevance over time and in varying settings, just as it would allow us to determine how various student profiles match a given model. It would also be interesting to apply the same process to courses that are taught simultaneously in order to rule out variables that are general predictors of academic achievement and differentiate domain-specific variables for common STEM courses such as Physics, Calculus, Algebra, Statistics, and Programming.

Author Contributions

Conceptualisation, J.L.J., J.K. and L.H.; data curation, J.K. and L.H.; formal analysis, J.L.J. and J.K.; funding acquisition, J.L.J.; methodology, J.L.J.; project administration, J.L.J.; software, J.K.; supervision, J.L.J.; validation, J.K. and L.H.; visualisation, J.K. and L.H.; writing—original draft, L.H. and J.K.; writing—review and editing, J.K., L.H. and J.L.J. All authors have read and agreed to the published version of the manuscript.

Funding

This work was partially supported by the Facultad de Ingeniería of Universidad de Santiago de Chile (FING-USACH); the Dirección de Pregrado of Universidad de Santiago de Chile, PID 032-2019; the National Agency for Research and Development (ANID) Scholarship Programme DOCTORADO BECAS CHILE 7608/2020; and ANID-Subdirección de Capital Humano/Doctorado Nacional/2022-21220979.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data used in this work are available from the corresponding author upon reasonable request. The data are not publicly available due to Universidad de Santiago de Chile’s internal data protection policy.

Acknowledgments

The authors would like to express their gratitude to the reviewers for their thoughtful comments and suggestions, which helped to improve the clarity of this work, and to the Facultad de Ingeniería of Universidad de Santiago de Chile for making the data available and for their support of the project.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
ANN    Artificial Neural Networks
ATS    ANOVA-type statistic
CI     Confidence interval
DT     Decision Tree
FCYP   Fundamentos de Computación y Programación, an initial programming course
FING   Facultad de Ingeniería, Engineering Faculty
GPA    Grade Point Average
LR     Linear Regression
MLR    Multivariate Logistic Regression
NB     Naive Bayes
RF     Random Forest
RFE    Recursive Feature Elimination
STEM   Science, Technology, Engineering and Mathematics
SVM    Support Vector Machine
XGB    Extreme Gradient Boosting

Appendix A. Supplementary Confusion Matrices

The appendix includes the confusion matrices for some of the classifiers that the authors found significant. Particular emphasis is placed on subsets such as the 6-year programmes (Figure A1) and 4-year programmes (Figure A2), along with classifiers for noteworthy departments such as Electricity (Figure A3), Geography (Figure A4), Industry (Figure A5), Informatics (Figure A6) and Civil (Figure A7). All of the matrices display the number of normalised classified cases per row for the ten repeats of the ten-fold cross-validation process, using the same procedure as depicted in Figure 2.
Figure A1. Confusion matrices for classifiers built using 6-year programmes subset.
Figure A2. Confusion matrices for classifiers built using 4-year programmes subset.
Figure A3. Confusion matrices for classifiers trained on the subset of Electricity department students.
Figure A4. Confusion matrices for classifiers trained on the subset of Geography department students.
Figure A5. Confusion matrices for classifiers trained on the subset of Industry department students.
Figure A6. Confusion matrices for classifiers trained on the subset of Informatics department students.
Figure A7. Confusion matrices for classifiers trained on the subset of Civil department students.

References

  1. World Economic Forum. The Future of Jobs: Employment, Skills and Workforce Strategy for the Fourth Industrial Revolution; World Economic Forum: Geneva, Switzerland, 2016. [Google Scholar]
  2. Leidl, K.D.; Bers, M.U.; Mihm, C. Programming with ScratchJr: A review of the first year of user analytics. In Proceedings of the International Conference on Computational Thinking Education, Wanchai, Hong Kong, 13–15 July 2017; pp. 116–121. [Google Scholar]
  3. De Kereki, I.F.; Manataki, A. “Code Yourself” and “A Programar”: A bilingual MOOC for teaching computer science to teenagers. In Proceedings of the 2016 IEEE Frontiers in Education Conference (FIE), Erie, PA, USA, 12–15 October 2016; IEEE: New York, NY, USA, 2016; pp. 1–9. [Google Scholar]
  4. Kalelioğlu, F. A new way of teaching programming skills to K-12 students: Code. org. Comput. Hum. Behav. 2015, 52, 200–210. [Google Scholar] [CrossRef]
  5. Chen, X.; Liu, W. The Value of Python Programming in General Education and Comprehensive Quality Improvement of Medical Students Based on a Retrospective Cohort Study. J. Healthc. Eng. 2022, 2022, 4043992. [Google Scholar] [CrossRef]
  6. dos Santos, M.T.; Vianna, A.S., Jr.; Le Roux, G.A. Programming skills in the industry 4.0: Are chemical engineering students able to face new problems? Educ. Chem. Eng. 2018, 22, 69–76. [Google Scholar] [CrossRef]
  7. Hansen, S.M. Deconstruction/Reconstruction: A pedagogic method for teaching programming to graphic designers. In Proceedings of the Generative Arts Conference 2017, Ravenna, Italy, 13–15 December 2017; Generative Art Conference: Milan, Italy, 2017; pp. 419–431. [Google Scholar]
  8. Lee, Y.J.; Lien, K.W. Reconstruct Programming 101 for Social Science Preference Students. In Proceedings of the 2019 IEEE International Conference on Consumer Electronics-Taiwan (ICCE-TW), Yilan, Taiwan, 20–22 May 2019; IEEE: New York, NY, USA, 2019; pp. 1–2. [Google Scholar]
  9. Piteira, M.; Costa, C. Computer programming and novice programmers. In Proceedings of the Workshop on Information Systems and Design of Communication, Lisbon, Portugal, 11 June 2012; pp. 51–53. [Google Scholar]
  10. Cheah, C.S. Factors contributing to the difficulties in teaching and learning of computer programming: A literature review. Contemp. Educ. Technol. 2020, 12, ep272. [Google Scholar] [CrossRef] [PubMed]
  11. Medeiros, R.P.; Ramalho, G.L.; Falcão, T.P. A Systematic Literature Review on Teaching and Learning Introductory Programming in Higher Education. IEEE Trans. Educ. 2019, 62, 77–90. [Google Scholar] [CrossRef]
  12. Tsai, C.Y. Improving students’ understanding of basic programming concepts through visual programming language: The role of self-efficacy. Comput. Hum. Behav. 2019, 95, 224–232. [Google Scholar] [CrossRef]
  13. Emerson, A.; Rodríguez, F.J.; Mott, B.; Smith, A.; Min, W.; Boyer, K.E.; Smith, C.; Wiebe, E.; Lester, J. Predicting Early and Often: Predictive Student Modeling for Block-Based Programming Environments. In Proceedings of the 12th International Conference on Educational Data Mining, Montreal, QC, Canada, 2–5 July 2019. [Google Scholar]
  14. Sobral, R.; Oliveira, F. Predicting students performance in introductory programming courses: A literature review. In INTED2021 Proceedings, Proceedings of the 15th International Technology, Education and Development Conference, Online, 8–9 March 2021; IATED: Valencia, Spain, 2021; pp. 7402–7412. [Google Scholar] [CrossRef]
  15. Biamonte, A.J. Predicting success in programmer training. In Proceedings of the Second SIGCPR Conference on Computer Personnel Research, New York, NY, USA, 20–21 July 1964; pp. 9–12. [Google Scholar]
  16. Leeper, R.R.; Silver, J.L. Predicting success in a first programming course. ACM SIGCSE Bull. 1982, 14, 147–150. [Google Scholar] [CrossRef]
  17. Bergin, S.; Reilly, R. Programming: Factors that influence success. In Proceedings of the 36th SIGCSE technical Symposium on Computer Science Education, St. Louis, MO, USA, 23–27 February 2005; pp. 411–415. [Google Scholar]
  18. Loksa, D.; Ko, A.J. The role of self-regulation in programming problem solving process and success. In Proceedings of the 2016 ACM Conference on International Computing Education Research, Melbourne, VIC, Australia, 8–12 September 2016; pp. 83–91. [Google Scholar]
  19. Prather, J.; Pettit, R.; McMurry, K.; Peters, A.; Homer, J.; Cohen, M. Metacognitive difficulties faced by novice programmers in automated assessment tools. In Proceedings of the 2018 ACM Conference on International Computing Education Research, Espoo, Finland, 13–15 August 2018; pp. 41–50. [Google Scholar]
  20. Lopez, M.; Whalley, J.; Robbins, P.; Lister, R. Relationships between reading, tracing and writing skills in introductory programming. In Proceedings of the Fourth International Workshop on Computing Education Research, Sydney, Australia, 6–7 September 2008; pp. 101–112. [Google Scholar]
  21. Qian, Y.; Lehman, J.D. Correlates of success in introductory programming: A study with middle school students. J. Educ. Learn. 2016, 5, 73–83. [Google Scholar] [CrossRef]
  22. Hinckle, M.; Rachmatullah, A.; Mott, B.; Boyer, K.E.; Lester, J.; Wiebe, E. The relationship of gender, experiential, and psychological factors to achievement in computer science. In Proceedings of the 2020 ACM Conference on Innovation and Technology in Computer Science Education, Trondheim, Norway, 15–19 June 2020; pp. 225–231. [Google Scholar]
  23. Álvarez, C.; Fajardo, C.; Meza, F.; Vásquez, A. An exploration of STEM freshmen’s attitudes, engagement and autonomous learning in introductory computer programming. In Proceedings of the 2019 38th International Conference of the Chilean Computer Science Society (SCCC), Concepcion, Chile, 4–9 November 2019; IEEE: New York, NY, USA, 2019; pp. 1–8. [Google Scholar]
  24. Alvarez, C.; Wise, A.; Altermatt, S.; Aranguiz, I. Predicting academic results in a modular computer programming course. In Proceedings of the 2nd Latin American Conference on Learning Analytics, LALA, Valdivia, Chile, 18–19 March 2019; Volume 2425, pp. 21–30. [Google Scholar]
  25. Bellino, A.; Herskovic, V.; Hund, M.; Munoz-Gama, J. A real-world approach to motivate students on the first class of a computer science course. ACM Trans. Comput. Educ. (TOCE) 2021, 21, 1–23. [Google Scholar] [CrossRef]
  26. Moonsamy, D.; Naicker, N.; Adeliyi, T.T.; Ogunsakin, R.E. A meta-analysis of educational data mining for predicting students performance in programming. Int. J. Adv. Comput. Sci. Appl. 2021, 12, 97–104. [Google Scholar] [CrossRef]
  27. Köhler, J.; Hidalgo, L.; Jara, J.L. Using machine learning techniques to predict academic success in an introductory programming course. In Proceedings of the 2022 41st International Conference of the Chilean Computer Science Society (SCCC), Santiago, Chile, 21–25 November 2022; IEEE: New York, NY, USA, 2022; pp. 1–8. [Google Scholar]
  28. Alturki, S.; Hulpuș, I.; Stuckenschmidt, H. Predicting Academic Outcomes: A Survey from 2007 Till 2018. Technol. Knowl. Learn. 2022, 27, 275–307. [Google Scholar] [CrossRef]
  29. Alsariera, Y.A.; Baashar, Y.; Alkawsi, G.; Mustafa, A.; Alkahtani, A.A.; Ali, N. Assessment and Evaluation of Different Machine Learning Algorithms for Predicting Student Performance. Comput. Intell. Neurosci. 2022, 2022, 4151487. [Google Scholar] [CrossRef] [PubMed]
  30. Abu Saa, A.; Al-Emran, M.; Shaalan, K. Factors Affecting Students’ Performance in Higher Education: A Systematic Review of Predictive Data Mining Techniques. Technol. Knowl. Learn. 2019, 24, 567–598. [Google Scholar] [CrossRef]
  31. Ismail, N.; Yusof, U.K. A systematic literature review: Recent techniques of predicting STEM stream students. Comput. Educ. Artif. Intell. 2023, 5, 100141. [Google Scholar] [CrossRef]
  32. Al-Fairouz, E.; Al-Hagery, M. Students performance: From detection of failures and anomaly cases to the solutions-based mining algorithms. Int. J. Eng. Res. Technol. 2020, 13, 2895–2908. [Google Scholar] [CrossRef]
  33. Sandoval, A.; Gonzalez, C.; Alarcon, R.; Pichara, K.; Montenegro, M. Centralized student performance prediction in large courses based on low-cost variables in an institutional context. Internet High. Educ. 2018, 37, 76–89. [Google Scholar] [CrossRef]
  34. Falát, L.; Piscová, T. Predicting GPA of University Students with Supervised Regression Machine Learning Models. Appl. Sci. 2022, 12, 8403. [Google Scholar] [CrossRef]
  35. Beaulac, C.; Rosenthal, J.S. Predicting University Students’ Academic Success and Major Using Random Forests. Res. High. Educ. 2019, 60, 1048–1064. [Google Scholar] [CrossRef]
  36. Gil, P.D.; da Cruz Martins, S.; Moro, S.; Costa, J.M. A data-driven approach to predict first-year students’ academic success in higher education institutions. Educ. Inf. Technol. 2021, 26, 2165–2190. [Google Scholar] [CrossRef]
  37. Aluko, R.O.; Daniel, E.I.; Shamsideen Oshodi, O.; Aigbavboa, C.O.; Abisuga, A.O. Towards reliable prediction of academic performance of architecture students using data mining techniques. J. Eng. Des. Technol. 2018, 16, 385–397. [Google Scholar] [CrossRef]
  38. Barker, R.J.; Unger, E.A. A Predictor for Success in an Introductory Programming Class Based upon Abstract Reasoning Development. SIGCSE Bull. 1983, 15, 154–158. [Google Scholar] [CrossRef]
  39. Costa, E.B.; Fonseca, B.; Santana, M.A.; de Araújo, F.F.; Rego, J. Evaluating the effectiveness of educational data mining techniques for early prediction of students’ academic failure in introductory programming courses. Comput. Hum. Behav. 2017, 73, 247–256. [Google Scholar] [CrossRef]
  40. Sivasakthi, M.; Padmanabhan, K.R.A. Prediction of Students Programming Performance Using Naïve Bayesian and Decision Tree. In Soft Computing for Security Applications; Ranganathan, G., Fernando, X., Piramuthu, S., Eds.; Springer: Singapore, 2023; pp. 97–106. [Google Scholar]
  41. Shen, G.; Yang, S.; Huang, Z.; Yu, Y.; Li, X. The prediction of programming performance using student profiles. Educ. Inf. Technol. 2023, 28, 725–740. [Google Scholar] [CrossRef]
  42. Van Petegem, C.; Deconinck, L.; Mourisse, D.; Maertens, R.; Strijbol, N.; Dhoedt, B.; De Wever, B.; Dawyndt, P.; Mesuere, B. Pass/Fail Prediction in Programming Courses. J. Educ. Comput. Res. 2023, 61, 68–95. [Google Scholar] [CrossRef]
  43. Araya Sánchez, V.; Fuentes Bravo, F.; Salazar Loyola, J.; Melo Fuenzalida, P.; Rickmers Blamey, B. Characterization of Assessments on a First Programming Course in Higher Education. In Proceedings of the 2022 41st International Conference of the Chilean Computer Science Society (SCCC), Santiago, Chile, 21–25 November 2022; pp. 1–8. [Google Scholar] [CrossRef]
  44. Bello, F.A.; Köhler, J.; Hinrechsen, K.; Araya, V.; Hidalgo, L.; Jara, J.L. Using machine learning methods to identify significant variables for the prediction of first-year Informatics Engineering students dropout. In Proceedings of the 2020 39th International Conference of the Chilean Computer Science Society (SCCC), Coquimbo, Chile, 16–20 November 2020; IEEE: New York, NY, USA, 2020; pp. 1–5. [Google Scholar]
  45. Guyon, I.; Weston, J.; Barnhill, S.; Vapnik, V. Gene Selection for Cancer Classification using Support Vector Machines. Mach. Learn. 2002, 46, 389–422. [Google Scholar] [CrossRef]
  46. Noble, W.S. What is a support vector machine? Nat. Biotechnol. 2006, 24, 1565–1567. [Google Scholar] [CrossRef] [PubMed]
  47. Menard, S. Coefficients of determination for multiple logistic regression analysis. Am. Stat. 2000, 54, 17–24. [Google Scholar]
  48. Cutler, A.; Cutler, D.R.; Stevens, J.R. Random forests. In Ensemble Machine Learning; Springer: Berlin/Heidelberg, Germany, 2012; pp. 157–175. [Google Scholar]
  49. Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794. [Google Scholar] [CrossRef]
  50. Friedrich, S.; Konietschke, F.; Pauly, M. The R Journal: Resampling-Based Analysis of Multivariate Data and Repeated Measures Designs with the R Package MANOVA.RM. R J. 2019, 11, 380–400. [Google Scholar] [CrossRef]
  51. Ivanova, A.A.; Srikant, S.; Sueoka, Y.; Kean, H.H.; Dhamala, R.; O’Reilly, U.M.; Bers, M.U.; Fedorenko, E. Comprehension of computer code relies primarily on domain-general executive brain regions. eLife 2020, 9, e58906. [Google Scholar] [CrossRef] [PubMed]
  52. Prat, C.S.; Madhyastha, T.M.; Mottarella, M.J.; Kuo, C.H. Relating natural language aptitude to individual differences in learning programming languages. Sci. Rep. 2020, 10, 3817. [Google Scholar] [CrossRef]
Figure 1. Summary of the research process followed to conduct this research.
Figure 2. Confusion matrices for classifiers built using all the observations. Values have been calculated considering the 10 repeats of the tuning process. Matrices have been normalised by row.
Table 1. Variables in the complete data set.

Variable | Type | Description
PROG | Categorical | Programme code (20 levels).
P_TYPE | Dichotomous | Indicates the programme duration.
DPT | Categorical | Indicates which department manages the programme.
PREF | Categorical | Preference order of the programme in application process.
PSU_SCI | Integer | Score in Science admission test.
PSU_LAN | Integer | Score in Language admission test.
PSU_MAT | Integer | Score in Maths admission test.
PSU_GPA | Integer | Normalised high-school grade point average.
PSU_RAN | Integer | Normalised ranking score.
PSU_AVG | Integer | Weighted average admission score.
FEE_EX | Dichotomous | Indicates if the student has been granted the fee-exemption benefit.
SCHOOL | Categorical | Type of high school of origin (municipal, private, or subsidised).
QUINTILE | Categorical | Family income quintile at entrance.
PREC_HDI | Real | Human development index of the student's municipality of residence.
CAL | Real | Final grade after taking Calculus I for the first time [1.0, 7.0].
PHY | Real | Final grade after taking Physics I for the first time [1.0, 7.0].
ALG | Real | Final grade after taking Algebra I for the first time [1.0, 7.0].
MET | Real | Final grade after taking Study Methods for the first time [1.0, 7.0].
INT | Real | Final grade after taking Introduction to Engineering for the first time [1.0, 7.0].
GPA | Real | Grade point average after the first semester [1.0, 7.0].
L_THEO | Categorical | Lecturer with whom the student took the theory part of FCYP.
L_LAB | Categorical | Lecturer with whom the student took the laboratory part of FCYP.
CLASS | Dichotomous | Indicates if a student passed the theory component of FCYP.
Table 2. Associations between variables.

Variable 1 | Variable 2 | Test | Statistic | p
ALG | PROG | Kruskal–Wallis | H = 70.663 | p < 0.001
ALG | P_TYPE | Wilcoxon | W = 620,035.500 | p < 0.001
ALG | DPT | Kruskal–Wallis | H = 25.887 | p = 0.001
MET | PREF | Kruskal–Wallis | H = 39.730 | p < 0.001
GPA | FEE_EX | Wilcoxon | W = 529,103.500 | p < 0.001
ALG | SCHOOL | Kruskal–Wallis | H = 19.481 | p < 0.001
GPA | QUINTILE | Kruskal–Wallis | H = 21.502 | p < 0.001
CAL | L_THEO | Kruskal–Wallis | H = 68.523 | p < 0.001
ALG | L_LAB | Kruskal–Wallis | H = 84.681 | p < 0.001
GPA | PREC_HDI | Correlation | t = 2.738 | p = 0.006
Table 3. Variables in the processed data set.

Variable | Type | Description
P_TYPE | Dichotomous | Indicates the programme duration.
DPT | Categorical | Indicates which department manages the programme.
PSU_SCI | Integer | Score in Science admission test.
PSU_LAN | Integer | Score in Language admission test.
PSU_MAT | Integer | Score in Maths admission test.
PSU_GPA | Integer | Normalised high-school grade point average.
PSU_AVG | Integer | Weighted average admission score.
CAL | Real | Final grade after taking Calculus I for the first time [1.0, 7.0].
PHY | Real | Final grade after taking Physics I for the first time [1.0, 7.0].
ALG | Real | Final grade after taking Algebra I for the first time [1.0, 7.0].
MET | Real | Final grade after taking Study Methods for the first time [1.0, 7.0].
INT | Real | Final grade after taking Introduction to Engineering for the first time [1.0, 7.0].
GPA | Real | Grade point average after the first semester [1.0, 7.0].
CLASS | Dichotomous | Indicates if a student passed the theory component of FCYP.
Table 4. Observed per-class frequencies and proportions for each student subsample.

Subset | n | n Pass | n Fail | % Pass | % Fail | Imbalance
All | 2372 | 1191 | 1181 | 50.21% | 49.79% | No
6-year programmes | 1726 | 893 | 833 | 51.74% | 48.26% | No
4-year programmes | 646 | 298 | 348 | 46.13% | 53.87% | No
Electricity | 372 | 149 | 223 | 40.05% | 59.95% | Yes
Geography | 172 | 67 | 105 | 38.95% | 61.05% | Yes
Industry | 444 | 267 | 177 | 60.14% | 39.86% | Yes
Informatics | 248 | 163 | 85 | 65.73% | 34.27% | Yes
Mechanics | 330 | 190 | 140 | 57.58% | 42.42% | Yes
Metallurgy | 126 | 56 | 70 | 44.44% | 55.56% | No
Mining | 260 | 103 | 157 | 39.62% | 60.38% | Yes
Civil | 180 | 71 | 109 | 39.44% | 60.56% | Yes
Chemistry | 240 | 125 | 115 | 52.08% | 47.92% | No
Table 5. Feature selection for each student subset.

Subset | Selected Features
All | GPA, PSU_LAN
6-year programmes | PSU_LAN, INT, ALG, GPA, PSU_MAT, CAL, PHY, PSU_SCI
4-year programmes | GPA, PHY, PSU_MAT, ALG, PSU_SCI, CAL, MET, PSU_AVG, PSU_GPA
Electricity | GPA, CAL
Geography | CAL
Industry | GPA, CAL, PSU_SCI, PSU_LAN, PHY
Informatics | PSU_LAN, GPA, INT
Mechanics | GPA, PSU_MAT, PSU_AVG, CAL
Metallurgy | PHY, GPA, PSU_LAN, CAL
Mining | GPA
Civil | GPA, ALG, PHY, CAL, INT
Chemistry | GPA, CAL, PSU_AVG, PHY, PSU_GPA, INT, ALG, PSU_SCI, PSU_LAN
Table 6. Model performances. The most accurate model for each subset is marked with an asterisk (*).

Subset | Model | Acc | Sens | Spec | ROC
All | RF | 62.24% | 60.84% | 63.65% | 66.98%
All | MLR | 67.60% | 70.34% | 64.84% | 73.23%
All | SVM * | 67.71% | 63.21% | 72.24% | 73.08%
All | XGB | 66.37% | 64.36% | 68.41% | 72.21%
6-year programmes | RF | 66.29% | 66.08% | 66.51% | 71.56%
6-year programmes | MLR | 68.30% | 72.24% | 64.07% | 73.56%
6-year programmes | SVM * | 68.66% | 70.29% | 66.90% | 73.72%
6-year programmes | XGB | 67.24% | 67.83% | 66.61% | 73.20%
4-year programmes | RF | 64.92% | 57.53% | 71.25% | 69.43%
4-year programmes | MLR | 66.79% | 60.44% | 72.22% | 71.57%
4-year programmes | SVM * | 67.50% | 62.00% | 72.22% | 71.43%
4-year programmes | XGB | 66.08% | 56.16% | 74.58% | 70.27%
Electricity | RF | 71.66% | 78.00% | 65.33% | 76.70%
Electricity | MLR | 67.37% | 72.15% | 62.58% | 73.52%
Electricity | SVM * | 73.98% | 62.62% | 85.34% | 76.18%
Electricity | XGB | 70.32% | 77.28% | 63.35% | 71.22%
Geography | RF | 67.23% | 64.13% | 70.42% | 71.61%
Geography | MLR | 66.35% | 66.44% | 66.39% | 72.58%
Geography | SVM * | 68.60% | 61.06% | 76.25% | 72.74%
Geography | XGB | 66.02% | 53.17% | 78.94% | 71.44%
Industry | RF | 73.80% | 69.65% | 77.98% | 83.25%
Industry | MLR | 64.39% | 67.19% | 61.63% | 70.33%
Industry | SVM * | 78.28% | 100.00% | 56.57% | 78.29%
Industry | XGB | 69.78% | 64.74% | 74.84% | 74.04%
Informatics | RF | 82.32% | 77.82% | 86.78% | 90.18%
Informatics | MLR | 65.62% | 67.31% | 63.91% | 75.01%
Informatics | SVM * | 87.14% | 98.11% | 76.13% | 88.33%
Informatics | XGB | 80.74% | 75.85% | 85.61% | 85.85%
Mechanics | RF * | 75.05% | 70.11% | 80.00% | 82.85%
Mechanics | MLR | 67.03% | 70.68% | 63.37% | 75.01%
Mechanics | SVM | 72.95% | 74.74% | 71.16% | 77.84%
Mechanics | XGB | 73.39% | 68.11% | 78.68% | 79.21%
Metallurgy | RF | 66.15% | 53.23% | 76.57% | 70.19%
Metallurgy | MLR | 61.06% | 44.23% | 74.43% | 63.65%
Metallurgy | SVM | 63.46% | 41.57% | 81.14% | 66.10%
Metallurgy | XGB * | 66.71% | 56.03% | 75.14% | 68.38%
Mining | RF * | 73.27% | 78.91% | 67.69% | 77.22%
Mining | MLR | 69.68% | 71.23% | 68.19% | 76.70%
Mining | SVM | 72.31% | 65.58% | 79.07% | 71.76%
Mining | XGB | 72.95% | 68.19% | 77.74% | 75.68%
Civil | RF | 75.87% | 79.68% | 72.06% | 82.89%
Civil | MLR | 66.25% | 64.66% | 67.80% | 71.22%
Civil | SVM * | 78.30% | 56.61% | 100.00% | 77.77%
Civil | XGB | 70.82% | 76.09% | 65.62% | 71.47%
Chemistry | RF | 71.50% | 72.75% | 70.16% | 75.43%
Chemistry | MLR | 71.33% | 73.38% | 69.12% | 75.91%
Chemistry | SVM | 73.64% | 76.37% | 70.67% | 76.93%
Chemistry | XGB * | 74.84% | 74.58% | 75.14% | 77.21%
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
