malware, [2], in the field of cybersecurity. It works
by locating patterns or signatures linked to
malicious software. However, as malware has
become far more prevalent, this once-dominant
approach has run into problems and has become
increasingly ineffective. The main problem is that
it can only identify malware whose signatures have
already been observed, which leaves it unable to
protect against new threats that have never been
seen before.
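The limitation described above can be illustrated
with a minimal sketch of hash-based signature
matching. The sample byte strings and the names
KNOWN_SAMPLES, SIGNATURES, and is_known_malware are
hypothetical and chosen only for illustration; real
signature engines use richer patterns than a single
digest.

```python
import hashlib

# Hypothetical signature database: SHA-256 digests of samples
# that have already been observed (illustrative data only).
KNOWN_SAMPLES = [b"malicious-payload-v1", b"malicious-payload-v2"]
SIGNATURES = {hashlib.sha256(s).hexdigest() for s in KNOWN_SAMPLES}


def is_known_malware(content: bytes) -> bool:
    """Return True only if the digest of `content` matches a stored signature."""
    return hashlib.sha256(content).hexdigest() in SIGNATURES


# A previously observed sample is flagged.
seen = is_known_malware(b"malicious-payload-v1")

# A slightly modified variant has a different digest and is missed.
unseen = is_known_malware(b"malicious-payload-v3")
```

Because any change to the content changes the
digest, a variant that was never catalogued passes
the check, which is the gap that learning-based
approaches aim to close.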
The weakness of signature-based detection
becomes more evident as the cybersecurity
landscape changes dynamically, which calls for
investigating and applying more intelligent and
proactive measures to counter the increasing
variety of cyber threats. Signature-based detection
was the standard way of detecting malware until the
1990s. Machine learning-based malware detection
techniques were then developed and improved,
[3]. Support Vector Machines (SVM), Random
Forests (RF), Logistic Regression (LR), Naïve
Bayes (NB), and AdaBoost, [4], are among the
machine learning methods proposed for malware
detection and classification, and they are reported
to achieve higher performance and accuracy.
The problem nowadays is not just understanding
how malware evolves but also understanding how to
process large and diverse datasets effectively and
extract useful information from them.
Understanding malware evolution alone does not
capture the significance of data processing in
malware detection. Thoroughly analyzing and
preparing a dataset is essential before
implementing machine learning models to mitigate,
detect, and prevent malware attacks. The complexity
of malware behavior and the variety of possible
attack vectors necessitate a careful approach to
data preparation. This entails correcting problems
that can greatly affect the effectiveness of
detection algorithms, such as imbalances, biases,
and missing or irrelevant data. Considering the
above issues, this paper proposes the Mitigating
Malware Threats on Emerging Technology (MMTET)
framework, which is intended to help mitigate the
risk of intrusion.
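As a rough illustration of the data preparation
issues named above (missing data, irrelevant
features, and class imbalance), the following
sketch uses only the Python standard library. It is
not the preprocessing pipeline of the proposed
framework; the toy rows and the helper names
drop_missing, drop_constant_features, and
oversample_minority are assumptions made for this
example, and random oversampling is just one of
several possible remedies for imbalance.

```python
import random
from collections import Counter


def drop_missing(rows):
    """Keep only rows in which no feature value is None."""
    return [(x, y) for x, y in rows if all(v is not None for v in x)]


def drop_constant_features(rows):
    """Remove feature columns that take a single value across all rows."""
    n_features = len(rows[0][0])
    keep = [j for j in range(n_features) if len({x[j] for x, _ in rows}) > 1]
    return [([x[j] for j in keep], y) for x, y in rows]


def oversample_minority(rows, seed=0):
    """Randomly duplicate rows of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    by_label = {}
    for x, y in rows:
        by_label.setdefault(y, []).append((x, y))
    target = max(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        balanced.extend(rng.choice(group) for _ in range(target - len(group)))
    return balanced


# Toy data: (features, label), where label 1 = malware and 0 = benign.
raw = [
    ([1, 0, 5], 0),
    ([1, 1, 3], 0),
    ([1, None, 2], 0),  # row with a missing value
    ([1, 0, 7], 0),
    ([1, 1, 9], 1),
]

clean = oversample_minority(drop_constant_features(drop_missing(raw)))
label_counts = Counter(y for _, y in clean)
```

In practice, oversampling would normally be applied
to the training split only, so that duplicated rows
do not leak into the evaluation data.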
The rest of this paper is organized as follows.
Section 2 reviews the existing methodologies
proposed by different authors. Section 3 discusses
the methodology, and Section 4 describes the
models that are applied in this work. The analysis
is presented in Section 5, and Section 6 summarizes
the findings and the importance of advancing
cybersecurity solutions for emerging technologies.
2 Related Work
DREBIN, a detection system presented in [5],
allows the identification of malicious
applications on smart devices. The authors
consider a dataset of 131,611 applications,
including malware. They apply a broad static
analysis to extract features from different sources
and analyze them in an expressive vector space.
DREBIN was evaluated on 123,453 applications and
5,560 malware samples, and the detection rate was
93.9%.
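The idea of representing extracted features in a
vector space, as described for DREBIN above, can
be sketched with a binary indicator embedding. This
is an illustration under stated assumptions, not
the implementation from [5]: the application names,
the feature strings, and the helpers
build_vocabulary and embed are hypothetical.

```python
def build_vocabulary(apps):
    """Map every distinct feature string in the corpus to a vector index."""
    features = sorted({f for feats in apps.values() for f in feats})
    return {f: i for i, f in enumerate(features)}


def embed(features, vocabulary):
    """Return a 0/1 vector with a 1 at the index of each feature that is present."""
    vector = [0] * len(vocabulary)
    for f in features:
        if f in vocabulary:  # features unseen during vocabulary building are ignored
            vector[vocabulary[f]] = 1
    return vector


# Hypothetical applications and their extracted feature strings.
apps = {
    "app_a": {"permission::SEND_SMS", "api_call::getDeviceId"},
    "app_b": {"permission::INTERNET"},
}

vocab = build_vocabulary(apps)
vec_a = embed(apps["app_a"], vocab)
vec_b = embed(apps["app_b"], vocab)
```

Each application becomes a fixed-length vector, so
features drawn from different sources can be fed to
a standard classifier in a uniform way.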
A method for the Internet of Medical Things (IoMT)
is proposed in [6], to categorize and identify
malware. The framework used multidimensional
Deep Learning (DL) approaches for an optimal
feature analysis to detect malware and classify it
into categories based on the byte representation
of the Executable and Linkable Format (ELF) file.
To obtain the best outcome, the framework used
different methods, including Convolutional
Neural Network (CNN), bidirectional Long Short-Term
Memory (LSTM), and other models for comparison of
IoMT malware classification. Two separate
datasets, Big-2015 and CDMC-2020-IoMT-Malware,
were used to evaluate the performance of the
framework. TensorFlow was used as the back end,
Keras as the front-end library, and scikit-learn
for Machine Learning (ML) algorithm
implementation. The IoMT framework obtains 95%
accuracy, which is better than other DL approaches
such as RNN, LSTM, GRU, CNN, and bidirectional
LSTM. It also gives better performance in terms of
precision (96%), recall (95%), and F1-score (95%).
The results demonstrate the effectiveness of the
framework for malware detection and
classification.
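For reference, the metrics quoted above follow
from confusion-matrix counts in the standard way.
The sketch below covers the binary case; the
function name classification_metrics and the
counts are illustrative assumptions and are not
taken from [6].

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from binary confusion counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / total if total else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Illustrative counts only: tp = malware correctly flagged, fp = benign
# flagged as malware, fn = malware missed, tn = benign correctly passed.
metrics = classification_metrics(tp=90, fp=10, fn=10, tn=90)
```

For multi-class malware categorization, these
quantities are usually computed per class and then
averaged.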
The Microsoft malware dataset is used in [7], for
training and testing a Light Gradient Boosted
Machine (LGBM) technique to detect malware
attacks on the Microsoft cloud as a framework. The
LGBM decision tree model is used for classification
and regression using the AutoML tool and another
model to enhance prediction accuracy. Based on that
study, LGBM was the most suitable model for
evaluating the framework in [7], on large data. An
F1-score of 67.78% and an accuracy of 66.18%
showed that the suggested methodology was more
accurate in predicting malware than AutoAI and
other models.
An innovative security framework, ‘MobiSentry’,
is proposed in [8], for mobile malware detection
and categorization, using a substantial dataset
comprising 184,486 benign and 21,306 malware
instances from Android devices.
WSEAS TRANSACTIONS on INFORMATION SCIENCE and APPLICATIONS
DOI: 10.37394/23209.2024.21.27
Aanmar Abdou Salam, Md. Abdul Based,
Mohamed Islam Houssam,
Mohammad Shorif Uddin