International Journal of Advanced Technology and Engineering Exploration (IJATEE)

ISSN (Print):2394-5443    ISSN (Online):2394-7454
Volume-9 Issue-89 April-2022
Paper Title : Comparative analysis of classification algorithm evaluations to predict secondary school students’ achievement in core and elective subjects
Author Name : Hasnah Nawang, Mokhairi Makhtar and Wan Mohd Amir Fazamin Wan Hamzah
Abstract :

Many researchers in educational data mining (EDM) have explored various machine learning techniques to predict students’ performance. However, the most daunting challenge in classification modelling is selecting the most effective algorithm, the one with the highest accuracy. This study used datasets from two premier Malaysian secondary schools, Maktab Rendah Sains Mara (MRSM) Kuala Berang and Kuala Terengganu. It addresses two key questions: first, which algorithm best predicts secondary students’ achievement in core and elective subjects; and second, whether the same features and algorithms can predict academic performance based on students’ first-semester achievement. To this end, the study analysed the effectiveness of six classification algorithms: naïve Bayes (NB), random forest (RF), k-nearest neighbour (kNN), support vector machine (SVM), sequential minimal optimization (SMO), and logistic regression (LGR). Each model’s prediction accuracy was evaluated using 10-fold cross-validation in order to identify the best model. The results showed that the RF model outperformed the other models in terms of accuracy, precision, recall, and F1-measure, with most algorithms achieving significant accuracy levels on both the core and elective subjects’ datasets. It is concluded that the prediction of secondary school students’ achievement can begin as early as the first semester, using RF for the core subjects dataset and the elective subjects with biology dataset. The accuracies obtained were 96.7% and 97.5% for the core and elective subjects, respectively.
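The evaluation protocol described in the abstract (10-fold cross-validation scored by accuracy, precision, recall, and F1-measure) can be illustrated with a minimal sketch. The abstract does not state which software or features the authors used, so everything below is an assumption made for illustration: the data are synthetic and deliberately well separated, only a simple kNN classifier is shown rather than all six algorithms, and the function names (make_demo_data, k_fold_indices, knn_predict, binary_metrics, cross_validate_knn) are hypothetical.

```python
import random
from collections import Counter


def make_demo_data(n_per_class=50, seed=42):
    """Synthetic two-class data (NOT the study's dataset): two tight clusters."""
    rng = random.Random(seed)
    X, y = [], []
    for label, centre in ((0, (40.0, 45.0)), (1, (80.0, 85.0))):
        for _ in range(n_per_class):
            X.append([c + rng.uniform(-5, 5) for c in centre])
            y.append(label)
    return X, y


def k_fold_indices(n, k=10, seed=0):
    """Shuffle indices 0..n-1 and deal them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k training points closest to x."""
    nearest = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def cross_validate_knn(X, y, n_folds=10, seed=0):
    """Predict every sample once, using a model fit on the other folds."""
    y_pred = [None] * len(X)
    for fold in k_fold_indices(len(X), n_folds, seed):
        held_out = set(fold)
        train = [i for i in range(len(X)) if i not in held_out]
        train_X = [X[i] for i in train]
        train_y = [y[i] for i in train]
        for i in fold:
            y_pred[i] = knn_predict(train_X, train_y, X[i])
    return binary_metrics(y, y_pred)


if __name__ == "__main__":
    X, y = make_demo_data()
    for name, value in cross_validate_knn(X, y).items():
        print(f"{name}: {value:.3f}")
```

Comparing several algorithms under this protocol would mean running the same folds for each classifier and ranking them on the four metrics; reusing one fold assignment keeps the comparison like for like.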

Keywords : Classification, Prediction, Educational data mining, Students’ performance predictions.
Cite this article : Nawang H, Makhtar M, Hamzah WM. Comparative analysis of classification algorithm evaluations to predict secondary school students’ achievement in core and elective subjects. International Journal of Advanced Technology and Engineering Exploration. 2022; 9(89):430-445. DOI:10.19101/IJATEE.2021.875311.