ones. The main reason that SMEs are becoming a
target for cybercriminals is the assumption that their
security systems are weaker than those of big
companies. In SMEs, vulnerabilities often arise
because adequate cyber security measures are not
taken, mainly due to a lack of financial and human
resources. As a result, the confidentiality and
integrity of their clients' data are put at risk.
For example, in 2019 around 58% of SMEs were victims
of a cyberattack, with an average downtime of more
than 8 hours per breach[4]. In terms of money, these
attacks are estimated to cost around $3 million,
resulting in lost profits and, more importantly, lost
clients because of damaged trust. Big enterprises,
unlike SMEs, have the human resources, technical
expertise, and finances to protect their information
assets, as explained by Kshetri[5]. The solution,
therefore, is to increase investment in cyber security.
Machine learning techniques are widely used in IDSs,
and they are most effective on datasets that do not
suffer from irrelevant and redundant features.
The aim is to analyze the impact and consequences
of cyber-attacks on an information system, with a
focus on SMEs, and to show the effectiveness of
applying machine learning techniques in intrusion
detection systems. For example, when an attacker
tries to gain access to or interrupt the normal
operations of an information system, the goal is
almost always to cause damage and malfunctions.
Different supervised and unsupervised machine learning
techniques, such as the Decision Tree algorithm (DTA)
and Support Vector Machine (SVM), are used to address
the major challenges faced by IDSs, as shown in the
research of Ektefa[6]. Some methods outperform others
in terms of classification accuracy, but less interest
is shown in computational time, which is an important
factor in choosing the right algorithm and is
addressed in this work.
With the General Data Protection Regulation
(GDPR)[7], which came into force in May 2018,
enterprises must follow new rules when a data breach
occurs. If a company's systems suffer a data breach,
it must be reported to the supervisory authority no
later than 72 hours after the company has become
aware of it. In these circumstances, implementing a
strong IDS allows enterprises to monitor the network
or the systems for malicious activity and policy
violations, and to document such events, for example
through logs.
In this paper, the focus is to investigate, through
experimental exploration, the different machine
learning techniques used in the context of IDS in
order to identify any technique that can be used in
SME scenarios, and to show the power of feature
selection methods in improving the classification of
different attacks into classes. The purpose is to
show the effectiveness of using the right machine
learning techniques for the IDS to solve the most
significant challenges faced, such as high
computational time and low accuracy. To evaluate these
two parameters on the IDSs, several experiments were
conducted with real data, the Aegean Wi-Fi Intrusion
Dataset (AWID)[8]. Initially, the data were
pre-processed, and then the relevant features were
extracted to reduce the dimensionality of the dataset.
These two steps were important for improving the
classification accuracy and reducing the computation
time. In the end, different machine learning methods
were applied, and the results were compared using
the metrics of accuracy, false positive rate (FPR),
and total time to build the classification model.
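The three comparison metrics named above (accuracy, FPR, and
the time to build the model) can be illustrated with a minimal
Python sketch. It is not the experimental code of this paper:
the single-feature threshold classifier is a toy stand-in for a
real algorithm such as a decision tree or an SVM, and the
function names, the "attack"/"normal" labels, and the sample
values are assumptions made only for illustration.

```python
import time

ATTACK, NORMAL = "attack", "normal"  # hypothetical class labels


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn), treating ATTACK as the positive class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == ATTACK:
            if truth == ATTACK:
                tp += 1
            else:
                fp += 1
        else:
            if truth == ATTACK:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def accuracy(tp, fp, tn, fn):
    """Share of all records that were classified correctly."""
    total = tp + fp + tn + fn
    return (tp + tn) / total if total else 0.0


def false_positive_rate(tp, fp, tn, fn):
    """Share of normal records that were wrongly flagged as attacks."""
    negatives = fp + tn
    return fp / negatives if negatives else 0.0


def fit_threshold(values, labels):
    """Toy model: pick the cut-off with the best training accuracy."""
    best_threshold, best_correct = None, -1
    for threshold in sorted(set(values)):
        predicted = predict_threshold(values, threshold)
        correct = sum(p == t for p, t in zip(predicted, labels))
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold


def predict_threshold(values, threshold):
    """Flag a record as an attack when its value reaches the cut-off."""
    return [ATTACK if v >= threshold else NORMAL for v in values]


def evaluate(train_values, train_labels, test_values, test_labels):
    """Time the model build, then score the model on held-out records."""
    start = time.perf_counter()
    threshold = fit_threshold(train_values, train_labels)
    build_seconds = time.perf_counter() - start

    predictions = predict_threshold(test_values, threshold)
    counts = confusion_counts(test_labels, predictions)
    return {
        "accuracy": accuracy(*counts),
        "fpr": false_positive_rate(*counts),
        "build_seconds": build_seconds,
    }


if __name__ == "__main__":
    # Made-up values of one numeric feature, not taken from AWID.
    train_x = [1, 2, 3, 10, 11, 12]
    train_y = [NORMAL, NORMAL, NORMAL, ATTACK, ATTACK, ATTACK]
    test_x = [2, 9, 11, 13, 10]
    test_y = [NORMAL, NORMAL, ATTACK, ATTACK, NORMAL]
    print(evaluate(train_x, train_y, test_x, test_y))
```

Only the model-building step is timed, so the reported time
matches the "total time to build the classification model"
rather than the prediction time. Running the same evaluation
for each candidate algorithm yields comparable rows of
accuracy, FPR, and build time.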
2 Materials and Methods
2.1 Intrusion Detection Systems
Cyber security experts implement different mechanisms,
such as firewalls, Intrusion Prevention Systems (IPS),
or IDS, to defend against malicious attacks. The latter
is one of the most essential components of computer
security, used to detect attacks before they spread
widely. An intrusion is defined as a set of actions
aimed at compromising the security goals of integrity,
confidentiality, and availability of computer
resources[9]. An IDS is a device or software that
detects any malicious activity or attack on protected
assets. It can analyze the data collected in a given
network to identify malicious behavior or policy
violations and then prepare a report for the system
administrator to handle the intrusion. The functions
of an IDS can be summarized as follows:
• to monitor user and system activity;
• to detect attacks as soon as possible;
• to enforce the network traffic;
• to analyse statistical patterns;
• to audit the operating system.
IDSs are also classified by type as Network-based IDS
(NIDS) or Host-based IDS (HIDS), depending on whether
the system monitors a network or a single host[10].
2.1.1 HIDS
A HIDS relies heavily on audit trails, which limits
its ability to find new attacks. It monitors and
analyses the input/output packets of a single device,
performing log analysis, file integrity checking,
policy monitoring, etc. Nevertheless, a HIDS tends to
be desirable for several reasons. For example, because
it can
WSEAS TRANSACTIONS on BUSINESS and ECONOMICS
DOI: 10.37394/23207.2022.19.43
Nevila Baci, Kreshnik Vukatana, Marius Baci