WSEAS Transactions on Information Science and Applications
Print ISSN: 1790-0832, E-ISSN: 2224-3402
Volume 22, 2025
Early Identification of Vulnerable Students with Machine Learning Algorithms
Abstract: Education is a key factor in a country's overall development and an important tool for achieving success in life. One of the main factors in any educational institution's success is its students' academic achievement. Student dropout is a complex problem for educational institutions, and educational managers consider it vital to predict a student's risk of dropping out as early as possible. Accurate prediction in advance, however, remains difficult. The major problems in current research include overfitting of predictive models, complex relationships among variables, insufficient feature extraction, and the complexity of data pre-processing. The key goal of this study is to improve student achievement, decrease the number of dropouts, create support plans, and continually adjust these plans based on ongoing progress monitoring. Specifically, this research aims to identify at-risk students early using machine learning algorithms, allowing educational institutions to make timely and targeted interventions. Identifying students' needs early in their enrollment ensures that vulnerable students get the support they need, helps prevent dropout rates from increasing, and significantly benefits their general academic performance. In this work, the King Abdulaziz University database was used. Exploratory Data Analysis (EDA) is used to understand the characteristics of the data, identify anomalies, recognize trends, and guide further data pre-processing. Genetic Algorithm-optimized Latent Dirichlet Allocation (GA-LDA) is used for feature extraction. Canopy clustering with a Gaussian Flow Optimizer (GFO) is used for accurate student grouping. Finally, a hybrid Logistic Regression-K-Nearest Neighbour (LR-KNN) technique is used for data classification. The proposed model was evaluated using accuracy, precision, recall, F1-score, sensitivity, and specificity.
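As an illustration of the evaluation metrics named in the abstract, the following is a minimal sketch of how they can be derived from a binary confusion matrix. It is not the authors' implementation: the function name `binary_metrics`, the label convention (1 = at-risk, 0 = not at-risk), and the toy labels are assumptions made here for illustration only. Note that sensitivity is the same quantity as recall.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute common binary classification metrics from two label lists.

    Hypothetical helper for illustration; assumes `positive` marks the
    at-risk class and every other label is treated as negative.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    def safe_div(num, den):
        # Return 0.0 instead of raising when a denominator is zero.
        return num / den if den else 0.0

    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)  # identical to sensitivity
    return {
        "accuracy": safe_div(tp + tn, len(pairs)),
        "precision": precision,
        "recall": recall,
        "sensitivity": recall,
        "specificity": safe_div(tn, tn + fp),
        "f1": safe_div(2 * precision * recall, precision + recall),
    }


# Toy labels (made up for this sketch): 1 = at-risk, 0 = not at-risk.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
```

For dropout prediction, recall (sensitivity) reflects how many truly at-risk students are flagged, while specificity reflects how many students who are not at risk are correctly left unflagged.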
Keywords: Machine learning, Data mining, Feature extraction, Data classification, Gaussian Flow Optimizer, Regression-K-Nearest Neighbour, Exploratory Data Analysis, Education
Pages: 166-188
DOI: 10.37394/23209.2025.22.16