Predicting Students’ Mobility using Different Statistical Tools:

Basis for Students’ Success

PAULO V. CENAS, JENNIFER M. PARRONE, DANIEL BEZALEL A. GARCIA,

FREDERICK F. PATACSIL

College of Computing,

Pangasinan State University,

Urdaneta City, Pangasinan,

PHILIPPINES

Abstract:- This paper investigates students’ success at Pangasinan State University by identifying patterns and models that might be used to correctly classify and predict whether a student will transfer or finish their studies. In this study, three categorical variables or attributes and one continuous variable were considered independent variables due to the availability of the data. A binary logistic regression model with the high school general average and course as independent variables (Model 3) and a decision tree model with information gain as the splitting criterion were fitted to the dataset to generate a model that possibly best describes the students’ mobility in Pangasinan State University Urdaneta City Campus. Although Model 3 has a slightly higher accuracy, the decision tree model is better than the binary logistic regression model based on AUC and sensitivity values. This implies that the decision tree model is better at correctly classifying observations as "transferred" than Model 3. Thus, it was concluded that the decision tree model with information gain as the splitting criterion best describes the mobility of PSU students. The results of this paper can be used for school administration involving students’ mobility/success, particularly in classifying whether a student will transfer based on other relevant attributes aside from biological sex, course, hometown, and high school general average.

Key-Words: - Mobility, Success, Statistical Tools, Decision Tree, Logistic Analysis

Received: March 11, 2022. Revised: September 27, 2022. Accepted: October 19, 2022. Published: November 24, 2022.

1 Introduction

Every university is responsible for preparing

students for good jobs and personal growth, as well

as assisting them in contributing to the betterment of

society. For this, universities should improve their

programs, implement updated and relevant

curricula, and build personal and cultural resources.

For the past five years, the Pangasinan State

University has consistently fulfilled its mandates to

improve the teaching and learning experience, equip

the faculty members through training, seminars, and

academic scholarships, provide facilities such as

laboratories, and build and renovate existing

edifices, all for the success of the students. The

students' success plays a crucial role in every

university, as it is commonly used as a performance

metric for every academic institution, [1]. Students'

success can be explained by the number of board or

licensure passers and topnotchers, or the highest

employability rates of its graduates, and of course,

the number of successful graduates. Completing an

academic degree is, in fact, one of the most regarded

achievements of any student. Many students will

enrol and start their first year of college at the

Pangasinan State University Urdaneta City Campus,

yet it is a constant observation that few students will

finish their degree within the maximum year of

residency. The reason for this is either the student

transferred to another university or dropped out of

the university. This implies that the university is

losing a significant percentage of its enrolees every

semester. Also, low completion rates among

students have been an immense threat to the key

performance indicators of the university system, [2].

This kind of student mobility is a perpetual

predicament, not exclusively at PSU Urdaneta but,

also in other schools and universities, and has

deeper consequences for those students involved.

Though transferring from one school to another may

not necessarily have any significant effect on the

overall academic achievement of the students, [3],

changing schools is almost certain to create discord

in their overall learning experience, [4]. Also,

transferring from one school to another may benefit

some students, but it has an overall negative impact

on the possibility of getting a degree, [5]. Further,

transferring may have an intense emotional effect,

and social and academic problems for the students,

[6]. These imply that there is a greater need for

adjustment in social, emotional, and academic

WSEAS TRANSACTIONS on INFORMATION SCIENCE and APPLICATIONS

DOI: 10.37394/23209.2022.19.29

Paulo V. Cenas, Jennifer M. Parrone,

Daniel Bezalel A. Garcia, Frederick F. Patacsil

E-ISSN: 2224-3402

277

Volume 19, 2022

aspects on the part of the students, which may

contribute to any difficulties when it comes to

learning when maladaptation occurs.

Reasons for transferring may be linked to various

factors. Academic performance has a huge impact

on students’ retention, and transfer, [7], [8], [9].

Transferring schools or institutions can affect access

to most degree programs. This is common among

students who are unsure of what they want to study

in college and choose any degree program on a

whim. Another reason is financial difficulties or the

availability of scholarships. Students look for a

college or university where they can transfer, that

offers free tuition, scholarships, or student loans.

Earning a degree requires a large amount of money;

free tuition at state or local colleges and

universities reduces the expenses, but the

remaining costs are still significant.

Many other factors significantly contribute

to the decision of students to transfer, and it is a

challenge for every academic institution to know

these in order to propose and create solutions for

student mobility.

The university loses its accomplishments as

students transfer, and the students who transfer or

drop out sacrifice the benefit of the continuity of the

services offered by the university. Low completion

rates due to transferring and dropping out affect not

only the student, but also the systematic changes

projected by university or school reform policies

[10]. Hence, early detection of student risk is

necessary, and should be used for policy making,

particularly in the admission of students to ensure

higher completion rates within the university. Using

the data from the Pangasinan State University

Urdaneta City Campus, this study aims to provide a

model that will describe the students' completion

based on their profiles and provide

recommendations based on the result of the model.

The main objective of this paper is to find a

model that possibly best describes student mobility

at the Pangasinan State University Urdaneta City

Campus as a basis for predicting students' success.

In particular, this study sought:

1. to describe the nature and characteristics of the

collected data;

2. to generate a model that possibly best describes

the students’ mobility in Pangasinan State

University Urdaneta City Campus using:

a. Decision Tree Model; and

b. Binary Logistic Regression Model;

3. to compare the generated models using Decision

Tree and Binary Logistic Regression based on the

following criteria:

a. Accuracy;

b. Area Under the Curve (AUC); and

c. Sensitivity.

2 Methodology

The classification algorithms were implemented

using RapidMiner and RStudio, both of which are

open-source software primarily used for data

science. The decision tree model is applied to

further understand patterns in students’ mobility.

This is named a "decision tree" because the result

after using this model is a collection of nodes

intended to create a decision that is akin to a tree

when represented as a graph. The process of

creating decision tree models depends on the

purpose, whether for classification or regression. In

this study, the decision tree model for classification

was applied because the target attribute assigned as

a label, which is the student's status (whether the

student will transfer or graduate), is not numerical.

Thus, the decision tree rule is utilized to separate the

values belonging to different categories or classes.

The criterion used for splitting in this study is the

information gain criterion. The Gini index criterion

was also considered, but it is more applicable to

larger distributions. The accuracy and gain ratio

criteria were also tried, but based on the accuracy,

precision in predicting the transferred class, and area

under the curve (AUC) values in Table 1, the information

gain method for splitting is more applicable. Also, the

information gain method is well suited to smaller

partitions with a variety of mixed and diverse

values. Evaluating the resulting model requires

splitting the dataset into

training and testing datasets. The rule of thumb in

assigning percentages for the training and testing

datasets was implemented, that is, 70% for training

and 30% for testing. Stratified random sampling is

used to preserve the distribution of the label (status)

in both training and testing datasets,[11].
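The stratified 70/30 split described above can be sketched in a few lines of Python. This is only an illustration and not the RapidMiner process used in the study: the function name `stratified_split`, the seed, and the use of rounding to set the cut point are assumptions, while the class sizes (2,442 transferred and 1,304 graduated) are those reported in Section 3.

```python
import random

def stratified_split(labels, train_fraction=0.70, seed=0):
    """Return (train_idx, test_idx) so that each class keeps its share."""
    rng = random.Random(seed)  # hypothetical seed, only for reproducibility
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)                         # sample within each class
        cut = round(len(indices) * train_fraction)   # assumed rounding rule
        train_idx.extend(indices[:cut])
        test_idx.extend(indices[cut:])
    return train_idx, test_idx

# Class sizes reported in Section 3 of the paper.
labels = ["Transferred"] * 2442 + ["Graduated"] * 1304
train_idx, test_idx = stratified_split(labels)

train_share = sum(labels[i] == "Transferred" for i in train_idx) / len(train_idx)
test_share = sum(labels[i] == "Transferred" for i in test_idx) / len(test_idx)
print(len(train_idx), len(test_idx))
print(round(train_share, 3), round(test_share, 3))
```

Under these assumptions the two partitions contain 2,622 and 1,124 entries, which agrees with the totals that can be summed from the leaves of the decision tree and from the confusion matrix reported later in the paper.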

Table 1. Comparison of the Four Criteria for Splitting

Method             Accuracy   Precision (Transferred Class)   AUC
Information Gain   70.73%     72.70%                          0.714
Gain Ratio         69.40%     71.07%                          0.674
Gini Index         70.55%     72.64%                          0.708
Accuracy           70.55%     72.64%                          0.670

Note: Values are derived based on the actual dataset. The highest value in each column belongs to the Information Gain criterion. In AUC, a value closest to 1.00 is the best.
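To make the splitting criteria concrete, the following sketch computes the information gain and the Gini decrease of a split on Course at the root of the tree. It is a textbook formulation and may differ in detail from RapidMiner's implementation. The per-course counts are obtained by summing the leaf counts in the text form of the tree in Section 3.1; the function names are hypothetical.

```python
from math import log2

def entropy(counts):
    """Shannon entropy (in bits) of a list of class counts."""
    total = sum(counts)
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)

def gini(counts):
    """Gini impurity of a list of class counts."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

def split_gain(children, impurity):
    """Parent impurity minus the size-weighted impurity of the children."""
    parent = [sum(col) for col in zip(*children)]
    n = sum(parent)
    weighted = sum(sum(child) / n * impurity(child) for child in children)
    return impurity(parent) - weighted

# (graduated, transferred) per course in the training data,
# summed from the leaves of the tree printed in Section 3.1.
course_counts = {
    "Architecture": (19, 163),
    "Civil Engineering": (185, 420),
    "Computer Engineering": (49, 65),
    "Education": (142, 70),
    "Electrical Engineering": (98, 185),
    "English Language": (71, 51),
    "Information Technology": (204, 562),
    "Mathematics": (58, 53),
    "Mechanical Engineering": (87, 140),
}
children = list(course_counts.values())

info_gain = split_gain(children, entropy)
gini_decrease = split_gain(children, gini)
print(f"information gain = {info_gain:.4f} bits")
print(f"Gini decrease    = {gini_decrease:.4f}")
```

Both quantities are positive whenever the class proportions differ across the branches, so either criterion would regard Course as an informative first split; they differ only in how strongly they reward very pure branches.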

The training dataset is used to generate a decision

tree model based on a maximal depth of 4 after

splitting. The maximal depth parameter is used to

restrict the depth of the decision tree model and

depends on the size and characteristics of the

dataset. This is one of the stopping criteria for

decision tree models. Pruning was also enabled in

this model; thus, some branches in the tree model

will be replaced by leaves based on the set value for

the confidence parameter. The confidence parameter

prescribes the confidence level used for the

pessimistic error calculation of pruning. The default

confidence level value is 0.1, but in this study, the

confidence level value is set to 0.3 to provide a

decent and less complicated decision tree graph. All

other parameters are set to default values. The

model generated from the training dataset is then

applied to the test dataset to predict the label. The

decision tree model's accuracy and prediction

performance are based on the class precision and

class recall values.

Fig. 1: Methodological Process Applied in

Generating Decision Tree Model

After that, the binary logistic regression model, a classification algorithm used to predict a dichotomous variable from a set of independent variables, is employed, since this study is concerned with whether a student will transfer or graduate based on their biological sex, course enrolled in at PSU, hometown municipality/province, and high school GWA. The binary logistic regression model was fitted in RStudio. Various model fitting and visualization packages in R were used in this study. The list of packages and their uses can be seen in Table 2.

Table 2. R Packages Used for Binary Logistic Regression Modelling

R Package         Use
caret             For fitting and evaluation of the binary logistic regression model
ggplot2, visreg   For visualization of data and regression models
plotROC           For constructing the ROC curves

The first step in the binary logistic regression

modelling is to plot the data to determine if the

independent variables are related to the binary

outcome of academic survival at PSU (graduate or

transfer). Take note that the plot results are just

rough estimations in determining the relationship.

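The next step is to formulate the binary

logistic regression model to be implemented. Take

note that the model is additive on the log-odds scale

and has the general form

$$\log\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k,$$

where $p$ is the probability that a student transfers, $x_1, \ldots, x_k$ are the independent variables, and $\beta_0, \ldots, \beta_k$ are the regression coefficients.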
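As a rough illustration of how a binary logistic regression model turns a linear predictor into a probability, the sketch below evaluates the log odds and applies the logistic function. The intercept, coefficient, and input value are made-up numbers chosen only so that the arithmetic is easy to follow; they are not estimates from this study, and the function names are hypothetical.

```python
from math import exp, log

def log_odds(intercept, coefficients, values):
    """Linear predictor: intercept plus the sum of coefficient * value."""
    return intercept + sum(b * x for b, x in zip(coefficients, values))

def probability(z):
    """Logistic function: maps log odds z to a probability in (0, 1)."""
    return 1.0 / (1.0 + exp(-z))

# Hypothetical numbers, for illustration only.
z = log_odds(intercept=1.0, coefficients=[-0.5], values=[2.0])
p = probability(z)
print(z, p)

# Going back from the probability to the log odds recovers z.
recovered = log(p / (1.0 - p))
```

A log odds of zero corresponds to a probability of one half, and each unit change in a predictor shifts the log odds by its coefficient regardless of the values of the other predictors.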

Table 3. Binary Logistic Regression Models Considered

Model     Equation
Model 1   (equation not shown)
Model 2   (equation not shown)
Model 3   (equation not shown)
Model 4   (equation not shown)

It can be observed from the models considered in

the study and presented in the table above that they

consist of some combinations of independent

variables, specifically the combination of high

school GWA and any of the other independent

variables. The independent variables of any binary

logistic regression model can be continuous or

categorical. In this study, the researchers chose

to always include high school GWA

since it is sensible to think that this variable

contributes to whether the student will transfer or

graduate. The best model relative to the other models

presented in Table 3 was chosen based on the

in-sample and out-of-sample accuracy, Cohen’s kappa,

and AUC.

Fig. 2: Methodological Process in the Binary

Logistic Regression Model

Finally, the generated models from the decision

tree and binary logistic model were compared based

on the values of accuracy, AUC, and

sensitivity.

3 Results and Discussion

The total number of transferred students from 2010

to 2021 according to the records of the campus

registrar’s office, is 3,002. Any entry from the data

with vague or no information was removed,

resulting in 2,442 entries left after the data cleaning.

A random sample of 1,304 entries from the

graduated class since 2000 was chosen so that it will

comprise 35% of the total combined transfer-

graduated data. Take note that the data considered in

this study is based on the availability of data from

the records of the campus registrar’s office. The

graph below depicts the proportion of transferred

and graduated classes.

Fig. 3: The proportion of Transferred and Graduated

Classes Considered in this Study

Figures 4-6 depict a visual inspection of

independent variables based on the models

presented in Table 3 in terms of status (transferred

or graduated).

Fig. 4: Visual Inspection of High School GWA

across Biological Sex in terms of Status

Figure 4 suggests that the variables high school

GWA and biological sex are related to the binary

outcome of status, that is, whether the student

transferred or graduated. It can be seen that the

range of values for the high school average for those

who transferred, both male and female, is wider,

particularly at the lower bound, implying that there

are more transferred students whose high school

average is below 80 than those who graduated.

There are evidently extreme values, that is, an

average greater than 97.5, under the transferred

class for both males and females.

Fig. 5: Visual Inspection of High School GWA

across Courses in terms of Status

Figure 5 also suggests that the variables High

School GWA and Course are related to the binary

outcome of Status. The graph above depicts the

range of high school averages for those who

transferred and graduated from the nine college

majors available at the PSU Urdaneta City Campus.

For instance, the range of values of the high school

average in the information technology program is

longer at the lower bound for those who transferred,

implying that there are more transferred students

whose high school average is lower than 80 than

those who graduated. The same pattern can be seen

in the other eight courses. Further, the

extreme values evident in Figure 4 are the same

extreme values that appear in the mechanical

engineering program under the transferred class.

Fig. 6: Visual Inspection of High School GWA

across Hometown in terms of Status

It can be observed based on Figure 6 that there

are municipalities or cities within Pangasinan that

have more transferred data than graduated, such as

Alaminos City, Anda, Basista, Dagupan City, Dasol,

Infanta, Mabini, Urbiztondo, and San Jacinto. The

same is true for La Union. The possible reasons for

this are either because these areas are

geographically distant from Urdaneta City or they

transferred to a university or college closer to their

hometowns, such as Dagupan City and La Union.

Some areas are geographically distant from

Urdaneta City but have almost the same proportion

of transfers and graduates, such as Bani and Sual in

Pangasinan, Tarlac, and Nueva Ecija. There are also

areas in Pangasinan with no recorded graduates,

namely Labrador and Bugallon. Figure 6

suggests that the variables High School GWA and

Hometown may or may not be related to the binary

outcome of Status due to the complexity or lack of

information in some areas.

3.1 Decision Tree Model

The generated decision tree model using

information gain as a splitting criterion is shown in

the figure below.

Fig. 7: The Decision Tree Model of the Student

Mobility Dataset with Information Gain as the

Splitting Criterion (maximal depth = 4)

It can be observed from Figure 7 that some of the evident rules are that those enrolled in the Information Technology program with a high school general average of 86.537 or below later transferred, and that those enrolled in the Civil Engineering program with a high school average of at most 88.434, or above 88.434 but at most 92.035, later transferred. Based on the graph, the most striking case is the Architecture branch, where no splitting occurred. The model showed that when a student is enrolled in the Architecture program, there is a high probability that he/she will transfer, regardless of biological sex, high school general average, or hometown. Another notable result is that biological sex is used for splitting only in the English Language program. This means that in the English Language program, a student with a high school general average of at most 84.779 is more likely to transfer if female than if male. Overall, it is sensible to think that the high school general average is the biggest factor that might affect students’ mobility. The branches of the decision tree graph are shown in text form below.

Course = Architecture: Transferred {Graduated=19, Transferred=163}
Course = Civil Engineering
| HSAverage > 88.434
| | HSAverage > 92.035: Graduated {Graduated=37, Transferred=17}
| | HSAverage ≤ 92.035: Transferred {Graduated=80, Transferred=129}
| HSAverage ≤ 88.434: Transferred {Graduated=68, Transferred=274}
Course = Computer Engineering
| HSAverage > 87.404: Graduated {Graduated=36, Transferred=17}
| HSAverage ≤ 87.404: Transferred {Graduated=13, Transferred=48}
Course = Education
| HSAverage > 89.585: Graduated {Graduated=44, Transferred=4}
| HSAverage ≤ 89.585
| | HSAverage > 84.800: Graduated {Graduated=90, Transferred=47}
| | HSAverage ≤ 84.800: Transferred {Graduated=8, Transferred=19}
Course = Electrical Engineering
| HSAverage > 87.370
| | HSAverage > 91.393: Graduated {Graduated=16, Transferred=9}
| | HSAverage ≤ 91.393: Transferred {Graduated=47, Transferred=61}
| HSAverage ≤ 87.370: Transferred {Graduated=35, Transferred=115}
Course = English Language
| HSAverage > 84.779
| | HSAverage > 93.565: Transferred {Graduated=0, Transferred=3}
| | HSAverage ≤ 93.565: Graduated {Graduated=61, Transferred=25}
| HSAverage ≤ 84.779
| | Sex = Female: Transferred {Graduated=5, Transferred=19}
| | Sex = Male: Graduated {Graduated=5, Transferred=4}
Course = Information Technology
| HSAverage > 86.537
| | HSAverage > 90.517: Graduated {Graduated=45, Transferred=26}
| | HSAverage ≤ 90.517: Transferred {Graduated=114, Transferred=161}
| HSAverage ≤ 86.537: Transferred {Graduated=45, Transferred=375}
Course = Mathematics
| HSAverage > 85.705: Graduated {Graduated=51, Transferred=25}
| HSAverage ≤ 85.705: Transferred {Graduated=7, Transferred=28}
Course = Mechanical Engineering
| HSAverage > 90.082
| | HSAverage > 96.423: Transferred {Graduated=0, Transferred=2}
| | HSAverage ≤ 96.423: Graduated {Graduated=33, Transferred=7}
| HSAverage ≤ 90.082: Transferred {Graduated=54, Transferred=131}
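The branches listed above can be read as a set of classification rules. The following sketch encodes them as a plain Python function; it is a hand transcription for illustration, not an export from RapidMiner, and the names `predict_status` and `GRADUATION_CUTOFF` are hypothetical. Branches whose two leaves carry the same label (for example, the two "Transferred" leaves under Civil Engineering) are merged into a single cutoff.

```python
# Courses whose branches reduce to one cutoff: above it the leaf is
# "Graduated", at or below it the leaf is "Transferred".
GRADUATION_CUTOFF = {
    "Civil Engineering": 92.035,
    "Computer Engineering": 87.404,
    "Education": 84.800,
    "Electrical Engineering": 91.393,
    "Information Technology": 90.517,
    "Mathematics": 85.705,
}

def predict_status(course, hs_average, sex=None):
    """Follow the printed branches of the tree and return the leaf label."""
    if course == "Architecture":
        return "Transferred"            # no split under this course
    if course == "English Language":
        if hs_average > 93.565:
            return "Transferred"
        if hs_average > 84.779:
            return "Graduated"
        if sex not in ("Female", "Male"):
            raise ValueError("sex is needed for this branch")
        return "Transferred" if sex == "Female" else "Graduated"
    if course == "Mechanical Engineering":
        if hs_average > 96.423:
            return "Transferred"
        return "Graduated" if hs_average > 90.082 else "Transferred"
    if course not in GRADUATION_CUTOFF:
        raise ValueError(f"unknown course: {course}")
    cutoff = GRADUATION_CUTOFF[course]
    return "Graduated" if hs_average > cutoff else "Transferred"

print(predict_status("Information Technology", 86.0))
print(predict_status("English Language", 80.0, sex="Male"))
```

Some leaves rest on very few training cases (for instance, the three English Language students above 93.565), so the corresponding rules should be read with caution.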

Table 5. Parameter Estimates for the Binary Logistic

Regression Model 3

3.2 Binary Logistic Regression Model Fitting

The results of performance measures of the binary

logistic regression model fitting of the four models

considered in this study are in Table 6.

Table 6. Performance Measures of the Four Binary Logistic Regression Models

Model     In-Sample Accuracy   AUC       Out-of-Sample Accuracy   Kappa       Sensitivity   Specificity
Model 1   0.6839295            0.69242   0.6839554                0.2139176   0.3013109     0.8882875
Model 2   0.6855312            0.69240   0.6827809                0.2108405   0.2992361     0.8876336
Model 3   0.7178324            0.74875   0.7156181                0.3187784   0.4209419     0.8736256
Model 4   0.6922               0.71119   0.6781635                0.2147846   0.3302894     0.8661349

Note: Values are derived based on the actual dataset. The highest value in each column belongs to Model 3, except for specificity, where it belongs to Model 1.

It is important to look at the following

performance measures to identify the best binary

logistic regression model among the models

considered. The simplest among these

performance indicators is the in-sample accuracy,

which is defined as the proportion of correct

classifications a model makes. Based on the results,

Model 3 exhibits the largest in-sample accuracy

value of 71.78%. Take note that this is called "in-

sample accuracy" because the interpretation of this

is limited to our dataset. The generalized version of

this is the out-of-sample accuracy. Based on Table 6, Model 3 also

achieves the highest out-of-sample accuracy among all

models considered in this study. Another

performance metric is the area under the receiver

operating characteristic (ROC) curve, also known as

"area under the curve." An AUC value close to 0.5

indicates that the classification is based on random

guessing, while an AUC value equal to 1.0 indicates

perfect classification. Thus, a value closer to 1.0 is

better.
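One way to see what the AUC measures is to compute it directly as the proportion of (transferred, graduated) pairs in which the transferred case receives the higher predicted score, with ties counted as one half. The sketch below does this on four made-up observations; the scores and the function name `auc` are hypothetical and are not taken from the study's data.

```python
def auc(labels, scores, positive="Transferred"):
    """AUC as the share of positive/negative pairs ranked correctly."""
    pos = [s for l, s in zip(labels, scores) if l == positive]
    neg = [s for l, s in zip(labels, scores) if l != positive]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive case ranked above negative case
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(pos) * len(neg))

# Made-up predicted probabilities of transferring, for illustration only.
labels = ["Transferred", "Transferred", "Graduated", "Graduated"]
scores = [0.9, 0.4, 0.6, 0.2]
toy_auc = auc(labels, scores)
print(toy_auc)
```

In this toy case three of the four pairs are ordered correctly, so the AUC is 0.75. A model that assigns the same score to everyone would obtain 0.5, in line with the random-guessing interpretation.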

Based on Table 6, Model 3 exhibits the highest AUC value. Another performance measure is Cohen’s kappa, which measures inter-rater reliability and is commonly used to measure the level of agreement between the model’s predictions and the actual data. A kappa value of at least 0.60 is considered substantial. Based on the results, all kappa values are only fair; nevertheless, Model 3 exhibits the largest kappa value among the models in consideration. Lastly, sensitivity, which measures the capability of the model to correctly classify an observation as "transferred," and specificity, which measures the capability of the model to correctly classify an observation as "graduated," are other performance measures that might help to further identify the best model relative to the other models in this study. All models in comparison have high specificity values, which indicates that these models are good at correctly classifying observations as "graduated." Since our concern is the ability of the model to correctly classify those who transferred, Model 3 is probably the best model based on the sensitivity value. Overall, Model 3 is the best model compared to the other binary logistic regression models considered in this study. The estimated coefficient values for Model 3 are shown in Table 5.

In the interpretation of the estimates, it is important to remember that each coefficient represents an additive linear contribution on the log-odds scale. For a categorical variable such as Course in Model 3, each course is represented by an indicator whose value is 1 if the observation belongs to that particular course and 0 otherwise, with one course serving as the baseline of the model in terms of the Course variable. In this model, each 1-unit increase in the high school general average decreases the log odds of transferring by 0.23313, and if the student is enrolled in the English Language program, the log odds of transferring further decrease by 2.73283. Another way to interpret this is by exponentiating the coefficients. For instance, $e^{-0.23313} \approx 0.7921$ suggests that the odds of transferring change by a factor of approximately 0.7921 for each 1-unit increase in the high school GWA. Moreover, observe that the decrease in the log odds of transferring depends on the per-unit increase in the high school average and the enrolled degree program, except for Architecture. This means that if a student is enrolled in the Architecture program, the decrease in the log odds of transferring depends solely on the per-unit increase in his/her high school general average. The Wald test was used to test the significance of the individual regression coefficients in Model 3. All regression coefficients, aside from that for architecture as a course, were found to be significant based on the results.
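The conversion from log-odds coefficients to odds ratios can be checked with a short calculation. The two coefficients below are the ones quoted in the text; the 5-point increase is an extra illustration that does not appear in the paper, and the variable names are hypothetical.

```python
from math import exp

beta_gwa = -0.23313       # per 1-unit increase in high school GWA (from the text)
beta_english = -2.73283   # English Language program (from the text)

odds_ratio_gwa = exp(beta_gwa)            # multiplicative change in the odds
odds_ratio_english = exp(beta_english)

# Illustration only: effects add on the log-odds scale, so a 5-point
# increase in GWA multiplies the odds by the 1-point ratio five times.
odds_ratio_5_points = exp(5 * beta_gwa)

print(round(odds_ratio_gwa, 4))
print(round(odds_ratio_english, 4))
print(round(odds_ratio_5_points, 4))
```

An odds ratio below 1 means that the odds of transferring shrink; for example, a ratio of about 0.79 corresponds to a reduction of roughly 21% in the odds for each additional point of high school GWA, holding the course fixed.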

Fig. 8: The Visualization of Model 3

Figure 8 shows the visualization of Model 3. This

was done by plotting the relationship between the

independent variable high school GWA and the

probability of transferring on the y-axis per course.

3.3 Model Comparison

The decision tree model with information gain

splitting criterion and the binary logistic regression

model with high school GWA and course as

predictors are compared based on the performance

measures presented in Table 7.

Table 7. Comparison between the Decision Tree and Binary Logistic Regression Models based on Accuracy, AUC, and Sensitivity

Model                                 Accuracy   AUC      Sensitivity
Decision Tree Model                   0.7073     0.7490   0.8827
Binary Logistic Regression: Model 3   0.7178     0.7488   0.4209

Note: Values are derived based on the actual dataset. The highest accuracy belongs to Model 3; the highest AUC and sensitivity belong to the decision tree model.

It can be seen that, though Model 3 of the binary logistic regression model has a higher in-sample accuracy than the decision tree model, the latter exhibits a slightly higher AUC and a substantially higher sensitivity value. This means that the decision tree model is better at correctly classifying observations as "transferred" than Model 3. This is consistent with studies suggesting that decision tree models can outperform binary logistic models. In this study, the decision tree model with information gain as the splitting criterion best describes the mobility of PSU students.

Table 8. Confusion Matrix of the Decision Tree Model

                           true Graduated   true Transferred   class precision
pred. Graduated            148              86                 63.25%
pred. Transferred          243              647                72.70%
class recall/sensitivity   37.85%           88.27%

Table 8 above shows the performance of the

Decision Tree Model with information gain splitting

criterion using the testing dataset.
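The class precision, class recall, and accuracy values can be reproduced from the four cell counts of the confusion matrix. In the sketch below, the count of transferred students predicted as graduated (86) is an assumption: it is not legible in the extracted table and is inferred here from the reported 63.25% precision and 88.27% recall. Cohen's kappa for the decision tree is computed only as an illustration, since the paper reports kappa for the logistic regression models only.

```python
classes = ["Graduated", "Transferred"]

# Keys are (predicted, true); counts are from the test set (n = 1,124).
matrix = {
    ("Graduated", "Graduated"): 148,
    ("Graduated", "Transferred"): 86,    # inferred value, see the note above
    ("Transferred", "Graduated"): 243,
    ("Transferred", "Transferred"): 647,
}
total = sum(matrix.values())

def precision(cls):
    """Correct predictions of cls divided by all predictions of cls."""
    predicted = sum(matrix[(cls, true)] for true in classes)
    return matrix[(cls, cls)] / predicted

def recall(cls):
    """Correct predictions of cls divided by all true members of cls."""
    actual = sum(matrix[(pred, cls)] for pred in classes)
    return matrix[(cls, cls)] / actual

accuracy = sum(matrix[(c, c)] for c in classes) / total

# Cohen's kappa: observed agreement corrected for chance agreement.
expected_agreement = sum(
    sum(matrix[(c, t)] for t in classes) * sum(matrix[(p, c)] for p in classes)
    for c in classes
) / total ** 2
kappa = (accuracy - expected_agreement) / (1 - expected_agreement)

print(f"accuracy = {accuracy:.4f}")
for c in classes:
    print(f"{c}: precision = {precision(c):.4f}, recall = {recall(c):.4f}")
print(f"kappa = {kappa:.4f}")
```

The recall of the transferred class is the sensitivity reported in Table 7, while the recall of the graduated class plays the role of specificity; the large gap between the two shows that the tree favours the majority class.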

Fig. 9: The AUC (optimistic) graph of the Decision

Tree Model

4 Conclusion

This paper investigates students’ success at

Pangasinan State University by identifying patterns

and models that might be used to correctly classify

and predict if a student will transfer or finish their

studies. In this study, three categorical variables or

attributes and one continuous variable were

considered independent variables due to the

availability of the data. These are the biological sex,

hometown, course, and the high school general

average. The objectives of this study were

accomplished by applying decision tree models, and

binary logistic regression models. A

binary logistic regression model with the high

school general average and course as independent

variables (Model 3) and a decision tree model

with information gain as the splitting criterion were

fitted to the dataset to generate a model that possibly

best describes the students’ mobility in Pangasinan

State University Urdaneta City Campus. Although

Model 3 has a slightly higher accuracy, the

decision tree model is better than the binary logistic

regression model based on AUC and

sensitivity values. This implies that the decision tree

model is better at correctly classifying observations

as "Transferred" than Model 3. Thus, it was

concluded that the decision tree model best

described the mobility of the students using

information gain as the splitting criterion. The

decision tree model shows some significant

findings. First, it can be concluded that there is a

high chance that when a student is enrolled in an

architecture program, the student will transfer

regardless of biological sex, high school general

average, or hometown. There is no splitting present

under the architecture program when compared to

other degree programs present in the model. One

possible reason for this is that it might be the case

that the independent variables considered as factors

in modeling the students’ mobility in this study are

not applicable under the Architecture program. This

means that other factors aside from those considered

in this study, such as the number of absences or

college GWA might be used instead. Another

possible reason is that the degree program itself is

very challenging because it requires strong

analytical skills while focusing on detail with a huge

academic workload and pressure. A testament to its

difficulty is that the degree program requires 5 years

of regular residency to finish and 2 years of

apprenticeship in architectural firms or industry

experience before taking the licensure. Second, the

results from the decision tree model can be used as

the basis for admission. It can be seen that for a

student to have a high chance of survival when enrolled in

any of the following majors: civil engineering,

computer engineering, education, electrical

engineering, English language, information

technology, mathematics, or mechanical

engineering, they must have at least a high school

general average of 92.035, 87.404, 84.800, 91.393,

84.779, 90.517, 85.705, or 90.082, respectively.

Generally, a student must have at least a high school

general average of 85.000 for non-engineering

courses and at least 87.000 for engineering

courses. Lastly, the results of this paper can be used

for future research involving students’ mobility,

particularly in classifying whether a student will

transfer based on other relevant attributes aside from

biological sex, course, hometown, and high school

general average.

References:

[1] Alyahyan, E., & Düştegör, D. (2020).

Predicting academic success in higher

education: literature review and best practices.

International Journal of Educational

Technology in Higher Education, 17(1).

doi:10.1186/s41239-020-0177-7.

[2] Priyadarshini, M., Gurnam, K., Sian Hoon, T.,

Geethanjali, N., & Yuen Fook, C. (2022). Key

Factors Influencing Graduation on Time

Among Postgraduate Students: A PLS-SEM

Approach. Asian Journal Of University

Education, 18(1), 51-64.

doi:10.24191/ajue.v18i1.17169.

[3] Wing, Michael D. (2008). Student Transfer:

The Effect of Timing on Academic

Achievement. Electronic Theses and

Dissertations. 415.

https://digitalcommons.library.umaine.edu/etd/415.

[4] Alexander, K.L., Entwisle, D.R., & Dauber,

S.L. (1994). Children in motion: Schools

transfers and elementary school performance.

Paper presented at the American Sociological

Association Annual Meeting, Los Angeles,

CA.

[5] Pascarella, E. T., & Terenzini, P. T. (2005).

How college affects students: A third decade

of research. San Francisco: Jossey-Bass.

[6] Grais, Benjamin M. (2011). High School

Transfer Student Transitions and Changes:

Risk, Success, Failure, and the Vital Role of

the Counseling Curriculum. Dissertations. 66.

https://ecommons.luc.edu/luc_diss/66

[7] Allen, J., Robbins, S. B., Casillas, A., & Oh,

I.-S. (2008). Third-year College Retention and

Transfer: Effects of Academic Performance,

Motivation, and Social Connectedness.

Research in Higher Education, 49(7), 647–

664. doi:10.1007/s11162-008-9098-3

[8] Colobong, R. & Cenas, P. (2011). Student

Mobility in Pangasinan State University.

Pangasinan State University Urdaneta City

Campus Research Journal s. 2011-2012. pp.

13-20

[9] Dela Cruz, R. O. (2015). Persistence and

retention towards degree completion of BS

agriculture students in selected State

Universities in Region IV-A, Philippines.

African Journal of Agricultural Research,

10(13), pp. 1543–1556. Retrieved from:

https://academicjournals.org/journal/AJAR/article-full-text-pdf/6AA4CD351984.

Retrieved on: June 7, 2022.

[10] Kerbow, D. (1996). Patterns of Urban Student

Mobility and Local School Reform. Journal of

Education for Students Placed at Risk

(JESPAR), 1(2), 147–

169. doi:10.1207/s15327671espr0102_5

[11] Hazra, A. (2019). Top 7 Cross-Validation

Techniques with Python Code, Analytics

Vidhya. Retrieved from:

https://www.analyticsvidhya.com/blog/2021/11/top-7-cross-validation-techniques-with-python-code/

Creative Commons Attribution License 4.0

(Attribution 4.0 International, CC BY 4.0)

This article is published under the terms of the

Creative Commons Attribution License 4.0

https://creativecommons.org/licenses/by/4.0/deed.en_US
