Several search mechanisms, such as exhaustive,
random, and greedy approaches, have been
proposed, but as the number of features increases,
feature selection becomes a computationally
expensive, time-consuming, and complex
optimization task, [4]. Recently, several
nature-inspired algorithms have been successfully
applied to solve complex non-linear optimization
tasks. By balancing exploration and exploitation,
these meta-heuristic approaches can escape local
optima and are hence less prone to premature
convergence. So,
considering the complexity of the feature selection
task, meta-heuristic methods are well suited to solve
it while maintaining the accuracy level of the model.
Recently, several nature-inspired algorithms
have been employed to solve the feature selection
task either through the wrapper approach or in the
hybrid form, along with filter techniques in the
machine learning domain. Researchers have
designed, and continue to develop, new
meta-heuristic methods to solve various
optimization problems, including the feature
selection problem. Genetic Algorithm (GA), [5],
Particle Swarm Optimization (PSO), [6], Ant
Colony Optimization (ACO), [7], Crow Search
Algorithm (CSA), [8], and Differential Evolution
(DE), [9], are some of the approaches which have
been successfully applied to feature selection tasks
in various problems in their original as well as
hybrid form.
The Arithmetic Optimization Algorithm (AOA) is a
recently proposed meta-heuristic search algorithm
that works on the principles of the basic arithmetic
operators: addition, subtraction, multiplication, and
division, [10]. The AOA has been applied to several
real-life optimization problems from various domains, [11].
Since feature selection can be treated as an
optimization problem, in the present work, the
AOA is modified to solve the binary feature
selection problem. To explore and exploit the
solution space, AOA utilizes only the best
solution obtained so far; hence, in specific scenarios,
it fails to explore the entire search space and becomes
stuck in a local optimum. The present
work proposes a Modified Binary Arithmetic
Optimization Algorithm (MB-AOA) by introducing
a variable search operator and a set of optimal
solutions to delve into the search space. The
performance of MB-AOA is evaluated on seven
real-life datasets using three criteria (average
accuracy, F-score, and feature subset size) and is
compared with that of standard AOA.
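The dependence of standard AOA on the single best solution can be illustrated with a minimal sketch of one position update, written after the commonly reported formulation of AOA. The function name aoa_step, the accelerator MOA, the probability MOP, and the default values of moa_min, moa_max, alpha, and mu are assumptions made for illustration; they are not taken from the present work and do not represent MB-AOA.

```python
import random

def aoa_step(position, best, lower, upper, t, max_iter,
             moa_min=0.2, moa_max=1.0, alpha=5.0, mu=0.499,
             eps=1e-12, rng=random):
    """One illustrative AOA-style update of a single candidate solution.

    All default parameter values are assumptions, not settings from the paper.
    """
    # MOA grows linearly with t, shifting the search from exploration
    # (division, multiplication) to exploitation (subtraction, addition).
    moa = moa_min + t * (moa_max - moa_min) / max_iter
    # MOP shrinks with t and scales the size of each move.
    mop = 1.0 - (t ** (1.0 / alpha)) / (max_iter ** (1.0 / alpha))

    new_position = []
    for j in range(len(position)):
        scale = (upper[j] - lower[j]) * mu + lower[j]
        r1, r2, r3 = rng.random(), rng.random(), rng.random()
        if r1 > moa:
            # Exploration phase: division or multiplication operator.
            if r2 < 0.5:
                value = best[j] / (mop + eps) * scale
            else:
                value = best[j] * mop * scale
        else:
            # Exploitation phase: subtraction or addition operator.
            if r3 < 0.5:
                value = best[j] - mop * scale
            else:
                value = best[j] + mop * scale
        # Keep the new coordinate inside the search bounds.
        new_position.append(min(max(value, lower[j]), upper[j]))
    return new_position
```

In this sketch every new coordinate is computed from best alone; the current position contributes only its length. Once all candidates are drawn toward the same best solution, little diversity remains in the population, which is the behaviour the proposed modification aims to address.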
The rest of the paper is structured as follows.
Section 2 presents a brief literature review.
Section 3 describes the overall methodology of
standard AOA, its drawbacks, MB-AOA, and the
application of MB-AOA as a wrapper method for
the feature selection task. Section 4 discusses the
experimental parameters, datasets, and the obtained
results. Finally, Section 5 concludes the work
and outlines its future prospects.
2 Literature Review
Feature selection has become a prominent
step in domains like bioinformatics, pattern
recognition, machine learning, and other
disciplines with large feature sets. Accordingly,
researchers have conducted multiple studies in the past
and are still proposing new approaches due to
the emergence of huge volumes of data. In the
past, several meta-heuristic techniques have been
applied as wrapper methods for feature selection
problems. In this section, we review some
modified implementations of AOA that have been
successfully applied to feature selection problems.
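As an illustration of the wrapper idea, a candidate feature subset can be scored by training and testing a classifier on the selected columns only. The sketch below is a minimal example under stated assumptions: the leave-one-out 1-nearest-neighbour classifier, the function names, and the weight of 0.99 between error rate and subset size are all hypothetical choices for illustration and are not the settings of any of the reviewed works.

```python
def loo_1nn_accuracy(X, y, mask):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier
    restricted to the features whose mask entry is 1."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty subset cannot classify anything
    correct = 0
    for i, xi in enumerate(X):
        best_dist, best_label = float("inf"), None
        for k, xk in enumerate(X):
            if k == i:
                continue  # leave the query sample out
            dist = sum((xi[j] - xk[j]) ** 2 for j in cols)
            if dist < best_dist:
                best_dist, best_label = dist, y[k]
        correct += best_label == y[i]
    return correct / len(X)

def wrapper_fitness(X, y, mask, weight=0.99):
    """Lower is better: weighted sum of the error rate and the
    fraction of features selected (the weight is an assumed value)."""
    error = 1.0 - loo_1nn_accuracy(X, y, mask)
    size_ratio = sum(mask) / len(mask)
    return weight * error + (1.0 - weight) * size_ratio
```

A meta-heuristic used as a wrapper would call wrapper_fitness once per candidate mask, which is why the cost of the search grows quickly with the number of features and candidates.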
In [12], the authors proposed two binary
variants of AOA, BAOA-V and BAOA-S, for
feature selection on high-resolution image data for
tumor detection. BAOA-V uses a hyperbolic tangent
function and BAOA-S a sigmoid function to transform
standard AOA into binary form for the feature
selection problem. Of the two, BAOA-S performs
better, selecting smaller and more relevant feature
subsets than BAOA-V.
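The role of such transfer functions can be sketched as follows. A continuous position value is mapped to a probability, which is then compared with a random draw to decide whether the corresponding feature is selected. The function names and the simple thresholding rule are assumptions for illustration; the exact binarization rule used in [12] may differ (for example, some V-shaped schemes flip the current bit instead of setting it).

```python
import math
import random

def s_shaped(x):
    """Sigmoid transfer function, written in a numerically stable form."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def v_shaped(x):
    """V-shaped transfer function based on the hyperbolic tangent."""
    return abs(math.tanh(x))

def binarize(position, transfer, rng=random):
    """Map a continuous position to a 0/1 feature mask.

    Assumed rule: feature j is selected when a uniform random draw
    falls below transfer(position[j]).
    """
    return [1 if rng.random() < transfer(x) else 0 for x in position]
```

For example, binarize([2.0, -1.5, 0.3], s_shaped) returns a mask of three bits in which the first feature is the most likely to be selected. Under the V-shaped function, the probability depends on the magnitude of the position value rather than its sign.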
In another recent work, [13], the authors hybridized AOA
with Simulated Annealing (SA) and combined the
hybrid approach with a filter method for feature
selection in high-dimensional cancer gene-expression
datasets. A crossover operator is further
applied to enhance the exploratory capability of the
hybrid approach. The proposed approach is evaluated
on ten gene-expression datasets.
In [14], the authors applied AOA
to optimize an SVM for detecting and categorizing
defects on chip surfaces. Here, AOA is used to
determine the optimal kernel function for the SVM,
which is then applied to detect and categorize
defects on the chips.
In [15], the authors proposed k-NN-AOA
for detecting fake news spread during the COVID-19
pandemic; it improves the accuracy of the k-NN
classifier by selecting relevant feature subsets.
The proposed approach is applied to the real-life
Koirala dataset. The proposed work is further
compared with other similar techniques for feature
selection using the k-NN classifier, and the obtained
WSEAS TRANSACTIONS on COMPUTER RESEARCH
DOI: 10.37394/232018.2023.11.18
Rajesh Ranjan, Jitender Kumar Chhabra