Abstract: The fourth industrial revolution creates a pressing need for reskilling and upskilling every active working person. Furthermore, the European Commission included key policy instruments for resilience, social fairness, and sustainable competitiveness in the European Skills Agenda. Distance training and education programs are key factors in meeting these targets. Distance education, which already accounted for almost 30% of total education in European countries, has expanded further due to the COVID-19 pandemic. As a result, online evaluation approaches are more necessary than ever. Various methodologies have been applied to evaluate online training sessions, from traditional statistics to context analysis and the more recently introduced text mining and sentiment analysis. This work used conventional descriptive statistical methods and advanced text mining methods to analyse data collected from private sector online training seminars: a total of 50 trainees in 5 seminars conducted during the COVID-19 pandemic. A typical text mining analysis was performed on a small number of open questions and a small number of texts.

Key-Words: training, evaluation, E-Learning, text mining, AI

Received: June 7, 2022. Revised: December 22, 2022. Accepted: January 24, 2023. Published: February 22, 2023.

1. Introduction

The fourth industrial revolution has emerged, demanding frequent reskilling and upskilling of workers [1]. The European Commission included in the European Skills Agenda key policy instruments for resilience, social fairness, and sustainable competitiveness [2]. This is a crucial factor in the implementation of distance training and education programs. Nowadays, distance education accounts for almost 30% of the total education in European countries [3]. The COVID-19 pandemic has further expanded online learning activities within all sectors (private, public, non-profit, and engineering), in both rural and urban places [4-5].

In addition, the identification of robust and reliable online evaluation approaches is highly important [6-7]. Educational and training evaluation procedures have been extensively examined in the literature during the last decade, leading to various methods and revealing their different strengths and weaknesses with respect to the educational process being evaluated [8-10].

During the last decades, data mining was

introduced as a set of new data analysis methods for

general applications and applied to training and

learning evaluation methods [11-13].

More specifically, newer text-mining methods have been used to evaluate healthcare training sessions [14]. Text mining aims to analyse data and find sentiment within the text. This approach is rapidly expanding to explore sensations, attitudes, moods, affection, sentiments, opinions, and appeals within any electronically written document [15]. The first applications were found in the behavioural sciences [16] and then expanded to other scientific areas, including education [17-18]. Further sentiment analysis work in education worth mentioning includes [19], which enables educators to understand their students' needs and preferences, and [20], which helps to minimise the distance between e-learners and trainers in distance learning seminars.

The present work applied both traditional descriptive analysis and text-mining methods to investigate the opinions of trainees. In addition, it tests the capabilities of text mining when only a small amount of text data is available.

Evaluation of Vocational E-Learning Seminars

DIMITRIOS TSIMARAS
Aegean University
School of Humanities
GREECE

EMMANOUIL ZOULIAS
Health Informatics Laboratory,
Faculty of Nursing,
National and Kapodistrian University of Athens, Athens
GREECE

CHRYSSI VITSILAKI
Aegean University
Faculty of Humanities
GREECE

WSEAS TRANSACTIONS on ADVANCES in ENGINEERING EDUCATION

DOI: 10.37394/232010.2023.20.5

Dimitrios Tsimaras,

Emmanouil Zoulias, Chryssi Vitsilaki

E-ISSN: 2224-3410

Volume 20, 2023

2. Data Description

The data are from 5 seminars during 2020-2022,

performed as fully online synchronous and

asynchronous seminars. A small number of about

ten trainees took part in each seminar. For each

seminar, the same questionnaire was provided to

trainees as the evaluation method of the training

program. This questionnaire is composed of a set of

questions, both open and closed.

The questionnaire comprised 56 questions, of which only 4 were open questions. For the closed questions, a descriptive analysis was performed using simple percentage and counting methods. The four open questions, “If it's your choice, do you prefer to have the camera on or off during modern telelearning and why?”, “What do you think went well in modern distance learning? What did you like?”, “What do you think did not go well in modern distance learning? What did you not like?” and “Please provide a brief, additional comment or observation about distance education that you consider important”, were further analysed with text mining. The answers to these open-text questions were processed with the RapidMiner tool. The text mining analysis aimed to identify patterns in phrases, words, and sentences that express a positive or negative position towards online training.
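
The descriptive analysis of the closed questions amounts to counting answers and expressing them as percentages. A minimal sketch in Python follows, assuming the answers to one closed question are available as a list of strings. The function name `percentage_table` and the sample answers are hypothetical and are not taken from the study's data.

```python
from collections import Counter


def percentage_table(answers):
    """Return {answer: (count, percentage)} for one closed question."""
    counts = Counter(answers)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {
        answer: (count, round(100 * count / total, 1))
        for answer, count in counts.items()
    }


# Hypothetical answers of ten trainees to one closed question.
answers = ["laptop"] * 5 + ["desktop"] * 3 + ["tablet"] * 2
table = percentage_table(answers)
for answer, (count, pct) in table.items():
    print(f"{answer}: {count} ({pct}%)")
```

With about ten trainees per seminar, each answer shifts a percentage by roughly ten points, so the counts are worth reporting next to the percentages.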

3. Classification Procedure

In this study, the initial data (sentences) were taken from the open-ended questions of the questionnaire mentioned above and retrieved as an Excel file. The free community version of the RapidMiner software was used to apply the methodology.

The initial step was to load the data into the software (Figure 1 – “Read Excel”). The second step was the conversion of all nominal attributes to string attributes, since the tools used later process only string attributes (Figure 1 – “Nominal to Text”). This step has only one parameter (attribute filter), which is set to “all” in this work, so that all available attributes of the example set are selected.

The third step has various sub-steps. In Figure 1 it is the operator named “Process Documents from Data.” This operator can take a word list as input and outputs the resulting attributes together with a processed word list. Within this step, data from files are converted to texts ready to be processed. The sub-steps of the third step appear in Figure 2.

The first internal step is “Tokenization” [21], for which various options exist. Tokenizing a document means splitting its text into separate elements or items, such as words. In this work, the “Tokenize” operator of RapidMiner is used (Figure 2), configured to split the text into single words. For example, applying tokenization to the sentence “I already used Knowledge from the course in my Job” results in the following series of words: “I,” “already,” “used,” “Knowledge,” “from,” “the,” “course,” “in,” “my,” “Job.”

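The tokenization step can be sketched in a few lines of Python. This is an illustration only, not RapidMiner's implementation: it assumes that tokens are maximal runs of letters, and the function name `tokenize` is hypothetical. The pattern also matches Greek letters, since Python regular expressions on strings are Unicode-aware.

```python
import re


def tokenize(text):
    """Split text into word tokens, treating any non-letter as a separator."""
    # [^\W\d_] matches a single Unicode letter (word character minus digits and "_").
    return re.findall(r"[^\W\d_]+", text)


sentence = "I already used Knowledge from the course in my Job"
tokens = tokenize(sentence)
print(tokens)
```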
The second internal step, “Transform Cases” [22] (Figure 2), ensures that the same word written in lowercase, uppercase, or mixed case is counted as one common word. All characters within each document are converted to a single case using this operator, and in this work lower case was chosen. As a result, words such as “Like” and “like” are treated as equal and handled the same; in this specific example, a student who states “Like” or “like” is counted as liking something. This step also prepares for the following internal step of filtering stop words.
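
The effect of the case transformation can be illustrated with a short Python sketch. The function name `transform_cases` and its `to_lower` flag are hypothetical and only mirror the choice between lower and upper case described above.

```python
def transform_cases(tokens, to_lower=True):
    """Convert every token to a single case so that variants are merged."""
    return [t.lower() if to_lower else t.upper() for t in tokens]


tokens = ["Like", "like", "LIKE"]
normalised = transform_cases(tokens)
distinct = set(normalised)
print(normalised, distinct)
```

After the transformation, the three spellings collapse into a single distinct token, so their occurrences are counted together.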

The third internal step is “Filter Stop Words” [23] (Figure 2). In a text mining analysis, this operator removes common words, in this case Greek words, that do not add anything to the interpretation of the text. Tokens are matched against a Greek stop word dictionary, and the matching can be either case sensitive or case insensitive. This work used a set of 847 Greek stop words as the dictionary. As an indicative example, if the word “Like” is in the dictionary, this operator will remove the word “Like” from the analysis of the texts.
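
A minimal Python sketch of stop word filtering is given below, assuming case-insensitive matching. The three-word English dictionary is a hypothetical stand-in for the 847-word Greek dictionary used in the study, and the function name `filter_stop_words` is likewise an assumption.

```python
def filter_stop_words(tokens, stop_words):
    """Drop every token that appears in the stop word dictionary."""
    stop = {w.lower() for w in stop_words}
    return [t for t in tokens if t.lower() not in stop]


# Hypothetical dictionary; the study used 847 Greek stop words.
stop_words = {"like", "the", "a"}
tokens = ["i", "like", "the", "course", "a", "lot"]
kept = filter_stop_words(tokens, stop_words)
print(kept)
```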

The fourth internal step is the generation of “n-Grams.” An n-Gram [24] is a series of consecutive tokens of length n in a document. In this work, the n-Grams were generated using the “Generate n-Grams” operator of RapidMiner (Figure 2). An indicative example clarifies the role of this operator. Suppose that a document includes the phrase “like a lot,” composed of three different words, “like,” “a,” and “lot.” If the maximum length parameter is set to n=3, the operator outputs all series of consecutive tokens of length one, two, and three. The result is six different n-Grams: “like,” “a,” “lot,” “like a,” “a lot,” and “like a lot.” In this work, the maximum length was set to 5 (5-Grams).
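
The n-Gram generation can be sketched in Python as follows. This is an illustrative re-implementation, not the RapidMiner operator: the function name `generate_ngrams` is hypothetical, and tokens are joined with a space to match the phrase example in the text.

```python
def generate_ngrams(tokens, max_length):
    """Return all runs of consecutive tokens of length 1 up to max_length."""
    grams = []
    for n in range(1, max_length + 1):
        for start in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[start:start + n]))
    return grams


grams = generate_ngrams(["like", "a", "lot"], 3)
print(grams)
```

A maximum length larger than the number of tokens adds nothing: with three tokens, a setting of 5 still yields the same six n-Grams.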

The fifth and final internal step, “Filter Tokens” (Figure 2) [22], deals with the length of the words, that is, the number of characters in each word. It further filters out common words such as “and” and “or,” as well as other short words that add no value to the analysis of the texts. In this work, only words with a minimum of 5 and a maximum of 9999 characters were included in the study.
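
The length filter is simple to express in Python. The sketch below uses the limits stated above (5 and 9999 characters) as defaults; the function name `filter_tokens_by_length` and the sample tokens are hypothetical.

```python
def filter_tokens_by_length(tokens, min_chars=5, max_chars=9999):
    """Keep only tokens whose character count lies within the given limits."""
    return [t for t in tokens if min_chars <= len(t) <= max_chars]


tokens = ["and", "or", "course", "seminar", "time", "online"]
kept = filter_tokens_by_length(tokens)
print(kept)
```

A minimum of 5 characters also discards short content words such as “time,” which is one reason the subsequent manual context analysis matters for small data sets.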

On the results of the text mining analysis, a further contextual analysis was performed: phrases and statements that are not identical but have a common or equivalent meaning are considered to express the same positive or negative information, and their occurrences are added together. This context analysis is all the more critical because the number of initial documents is limited.
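
The merging of equivalent phrases can be sketched as follows, assuming the analyst has manually defined a mapping from each extracted phrase to one canonical statement. The phrases, counts, and the function name `merge_occurrences` are hypothetical and serve only to illustrate the bookkeeping.

```python
from collections import Counter


def merge_occurrences(phrase_counts, canonical):
    """Sum the counts of phrases that map to the same canonical statement."""
    merged = Counter()
    for phrase, count in phrase_counts.items():
        # Phrases without a mapping are kept under their own name.
        merged[canonical.get(phrase, phrase)] += count
    return merged


# Hypothetical extracted phrases and a hand-made mapping.
phrase_counts = {"save time": 3, "less travel": 2, "saves money": 1, "camera off": 4}
canonical = {
    "save time": "Save time, money, and fatigue",
    "less travel": "Save time, money, and fatigue",
    "saves money": "Save time, money, and fatigue",
}
merged = merge_occurrences(phrase_counts, canonical)
print(merged)
```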

The applied process is illustrated in Figure 1 and

Figure 2.

Figure 1

Figure 2

4. Results

In this work, only the most important results of the questionnaire are presented. Regarding the demographic profile, 52% of the trainees were female and 48% male. The age categories of the trainees were 30% (18-30), 14% (31-40), 30% (41-50), 20% (51-60), and 6% (>60). Furthermore, the educational level was 68% (higher education), 14% (post-lyceum), and 18% (lyceum). The answer for the

trainees’ distance from the physical location of the seminar was 70% up to 50 kilometres, 4% between 51 and 100 kilometres, and 26% over 100 kilometres.

Some other questions investigated the situation and technical issues of the trainees. The systems used were desktops, laptops, tablets, and, to a small extent, smartphones. Furthermore, the age of the equipment was split between up to 3 years (50% in total) and more than three years, and the operating system used was mainly Microsoft Windows (74%). Regarding the internet connection speed, 24.5% had VDSL over 100 Mbps and 40.8% had 50 Mbps.

The method of attending the distance education courses was 55.1% blended, 38.8% synchronous, and the remainder asynchronous.

A further context analysis was performed to find terms with shared meaning, which is essential for reaching conclusions. Applying the text mining methodology described above to the replies of all 50 trainees initially resulted in 1340 different phrases/words. The expressions and words are separated into positive positions (Table 1) and negative positions (Table 2). In Tables 1 and 2, Occurrences refers to the number of times a statement appeared, possibly in various phrasings but always with the same meaning.

Table 1 – Positive Emotion

Phrases/Words | Occurrences
Save time, money, and fatigue by avoiding motion | 50
Exams sharing | 19
Avoid disturbance | 15
Follow the timeline | 10
More multimedia material | 5
Possibility to follow programs from far away Universities | 5
Total | 104

Table 2 – Negative Emotion

Phrases/Words | Occurrences
Need of proper equipment and connection, which costs |
Need more intervals due to the tedious process |
Live sessions are better |
Prefer to have camera off, personal data |
Teachers need to have the proper knowledge of technology |
Tire complete process, too many hours in front of the camera |
Total | 170


The results in Table 1 and Table 2 reveal that trainees hold both positive and negative positions toward online training activities during COVID-19. On the positive side (Table 1), the most important point for the trainees was that they “Save time, money and fatigue by avoiding motion” (50 occurrences), meaning travel to the learning infrastructures. Close to this point is the belief that they have the “Possibility to follow programs from far away Universities” (5 occurrences). Further positive points were “Exams sharing” (19 occurrences) and the quieter environment in their homes (“Avoid disturbance,” 15 occurrences). Another point stated as positive was that online training strictly follows the timeline (10 occurrences). Finally, an expected point is that they can access “More multimedia material” (5 occurrences).
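
The occurrence counts quoted in this paragraph can be cross-checked against the total reported in Table 1 (104). The short Python sketch below does this and also derives each statement's share of the positive occurrences; the shares are simple arithmetic on the reported counts and are not figures reported by the study.

```python
# Occurrence counts as quoted in the text for the positive statements.
positive = {
    "Save time, money, and fatigue by avoiding motion": 50,
    "Exams sharing": 19,
    "Avoid disturbance": 15,
    "Follow the timeline": 10,
    "More multimedia material": 5,
    "Possibility to follow programs from far away Universities": 5,
}

total = sum(positive.values())
# Share of each statement among all positive occurrences, in percent.
shares = {phrase: round(100 * n / total, 1) for phrase, n in positive.items()}

print(total)
for phrase, share in shares.items():
    print(f"{phrase}: {share}%")
```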

This was extracted after applying the text mining

method, followed by a thorough context analysis of

all final occurrences of positive and negative

phrases. Phrases and words that seem similar or

have similar meanings are combined to extract a

better result.

5. Conclusions and Future work

Within this work, a former method [14] was applied to evaluate online seminars within the private sector. The current methodology and tools help administrators and trainers to locate strengths and weaknesses that have not yet been identified. An innovation of this work is that it applies these methodologies to private sector online training sessions during the COVID-19 pandemic.

This work can be extended in several directions. The main one is to apply the methodology to a larger number of texts from the private online training sector and across countries. Finally, an interesting extension would be to apply the method to “big data,” resulting in a learning analytics tool for the private sector.

References

[1] Teo, T., Unwin, S., Scherer, R., Gardiner, V.,

Initial teacher training for twenty-first century

skills in the Fourth Industrial Revolution (IR

4.0): A scoping review. Computers &

Education, Vol. 170, 2021, pp.104223.

[2] European Commission. Communication on a

European Skills Agenda for Sustainable

Competitiveness, Social Fairness, and

Resilience; European Commission: Brussels,

Belgium, 2020.

[3] Schneller, C., Holmberg, C., Distance Education

in European Higher Education: The Offer.

International Council for Open and Distance

Education, 2014.

[4] LeCavalier, J., E-Learning Success Stories in

the Not-for-Profit Sector, 2003.

[5] Pimenidis, E., Iliadis, L., Jahankhani, H., E-

Learning in the work-places in the Rural Sector

of northeastern Greece. Operational Research,

Vol.5, 2005, pp.35-47.

[6] Firmansyah, R.; Putri, DM.; Wicaksono, MGS.,

Putri, SF., Widianto, AA., Palil, MR,

Educational Transformation: An Evaluation of

Online Learning Due To COVID-19, Int. J.

Emerg. Technol. Learn. (iJET), No. 16, 2021,

pp. 61-76.

[7] Umair, M., Hakim, A., Hussain, A., Naseem, S.,

Sentiment Analysis of Students' Feedback

before and after COVID-19 Pandemic, Int. J.

Emerg. Technol., No. 12, 2021, pp.177-182.

[8] Horton, WK., Evaluating E-Learning, The Astd

E-Learning Series, American Society for

Training & Development, 2001.

[9] McCutcheon K., Lohan M., Traynor M., Martin D., A systematic review evaluating the impact of online or blended learning vs. face-to-face learning of clinical skills in undergraduate nurse education, Journal of Advanced Nursing, 2014.

[10] Barneche Naya, V., Hernández Ibáñez, LA.,

Evaluating user experience in joint activities

between schools and museums in virtual worlds.

Universal Access in the Information Society,

Vol.14, 2015, pp. 389-398.

[11] Bala, M., Ojha, DB., Study of applications of

data mining techniques in education,

International Journal of Research in Science

and Technology, Vol. 1, No. 4, 2012, pp. 1-10.

[12] Kumar, SA., Vijayalakshmi, MN., Discerning learner’s erudition using data mining techniques. International Journal on Integrating Technology in Education, Vol., No. 1, 2013, pp. 9-14.

[13] AlAjmi, MF., Khan, S., Sharma, A., Studying

data mining and data warehousing with different

e-learning system. International Journal of

Advanced Computer Science and Applications,

Vol.4, No.1, 2013.

[14] Alimisis, D., Zoulias, E. Aligning technology

with learning theories: A simulator-based

training curriculum in surgical robotics.

Interactive Technology and Smart Education

Vol.10, No.3, 2013, pp. 211-229.


[15] Karlgren, J., Sahlgren, M., Olsson, F.,

Espinoza, F., Hamfors, O., Usefulness of

sentiment analysis. In Advances in Information

Retrieval: 34th European Conference on IR

Research, ECIR 2012, Barcelona, Spain, April

1-5, 2012. Proceedings, Vol.34, 2012, pp. 426-

435.

[16] Panksepp, J., Toward a general

psychobiological theory of emotions.

Behavioral and Brain sciences, Vol.5, No.3,

1982, pp. 407-422.

[17] Lundqvist, K., Liyanagunawardena, T.,

Starkey, L, Evaluation of student feedback

within a MOOC using sentiment analysis and

target groups. International Review of Research

in Open and Distributed Learning, Vol.21,

No.3, 2020, pp. 140-156.

[18] Bulusu, A., Rao, KR., Sentiment Analysis of

Learner Reviews to Improve Efficacy of

Massive Open Online Courses (MOOC's) - A

Survey. In Proceedings of the 2021 Fifth

International Conference on I-SMAC (IoT in

Social, Mobile, Analytics, and Cloud) (I-

SMAC), Palladam, India, 2021, pp. 933-941.

[19] Berardinelli, N., Gaber, M., Haig, E., Sentiment

Analysis for Education, IOS Press, Vol.255,

2013.

[20] Zhou, J., Ye, JM., Sentiment analysis in

education research: a review of journal

publications. Interactive learning environments,

Vol.1, No.13, 2020.

[21] Grefenstette, G., Tapanainen, P., What Is a

Word, What Is a Sentence? Problems of

Tokenisation. In Proceedings of the

International Conference on Computational

Lexicography, COMPLEX-94, Budapest,

Hungary, 1994, pp. 79-87.

[22] Lazarinis, F., Engineering and Utilizing a

Stopword List in Greek Web Retrieval. J. Am.

Soc. Inf. Sci. Technol. Vol.58, 2007, pp. 1645-

1652.

[23] Baeza-Yates, RA., Ribeiro-Neto, B., Modern

Information Retrieval, Addison-Wesley

Longman Publishing Co., Inc.: Boston, MA,

USA, 1999.

[24] Zamora, EM.; Pollock, JJ.; Zamora, A. The Use

of Trigram Analysis for Spelling Error

Detection. Inf. Process. Manag., Vol.17, 1981,

pp. 305-316.


Contribution of Individual Authors to the

Creation of a Scientific Article (Ghostwriting

Policy)

The authors equally contributed in the present

research, at all stages from the formulation of the

problem to the final findings and solution.

Sources of Funding for Research Presented in a

Scientific Article or Scientific Article Itself

No funding was received for conducting this study.

Conflict of Interest

The authors have no conflicts of interest to declare

that are relevant to the content of this article.

Creative Commons Attribution License 4.0

(Attribution 4.0 International, CC BY 4.0)

This article is published under the terms of the

Creative Commons Attribution License 4.0

https://creativecommons.org/licenses/by/4.0/deed.en_US