Proposed Test Case Generation Model using Fuzzy Logic
(TCGMFL)
AHMED ALTHUNIBAT1, MUSTAFA MAHMOOD1, HEND ALNUHAIT2,
SALLY ALMANASRA2, HANEEN A AL-KHAWAJA3
1Faculty of Science and Information Technology,
Al-Zaytoonah University of Jordan, Amman,
JORDAN
2Faculty of Computer Studies,
Arab Open University, Riyadh,
SAUDI ARABIA
3Faculty of Business,
Amman Arab University, Amman,
JORDAN
Abstract: - This research addresses the pressing need to enhance software testing, specifically focusing on
white-box testing and basis path generation. Software testing is a linchpin in the software development process,
ensuring software operates flawlessly and aligns with its intended objectives. However, this phase often incurs
substantial time and resource investments. The primary aim of this study is to introduce an efficient and
automated approach for basis path generation, a crucial component of white-box testing. The model
commences by transforming source code into a tailored control flow graph (CFG), streamlining the automated
generation of test paths. Central to this model is an algorithm for generating test paths (AGTP), meticulously
traversing CFG nodes from source to destination. The algorithm's design aims to comprehensively cover all test
paths within the CFG. To enhance testing process efficiency, the model employs k-means clustering to generate
and cluster inputs. Path coverage is rigorously assessed for each cluster, and fuzzy logic is used to determine
the optimal path. The overarching goals of this research are to reduce time and financial costs associated with
software testing while maintaining precision and efficiency. The model's effectiveness in generating test cases
is confirmed through the examination of multiple examples, underscoring its valuable contribution to software
testing. This study marks a significant advancement toward more effective and cost-efficient software testing
methodologies.
Key-Words: - Software Testing; Fuzzy logic (FL); Test Case Generation; k-mean clustering; Basis Path
Testing; Automatic test case generation; Automatic Testing
Received: June 17, 2022. Revised: September 19, 2023. Accepted: October 21, 2023. Available online: November 22, 2023.
1 Introduction
Software testing (ST) is an important component in
the software development process, and it is one of
the essential stages in the software development life
cycle (SDLC), [1]. The main goal of this process is
to create efficient software that is free of bugs,
faults, and failures by executing the software with
good test cases. Choosing test cases is an important
step in software testing, and successful testing is
supported by good test selection to make testing
more efficient. All features in the software that
should be checked must be covered by the test cases
in the software, [2].
The process of designing and validating a large
number of test cases that involve software testing is
considered a complicated process due to the effort,
time, and cost involved, [3]. One of the issues faced
by software testing is software accuracy, which
necessitates a high number of test cases and their
appropriateness, [4], [5]. As previously stated, [6],
this phase of the software development life cycle
(SDLC) is the most expensive since it necessitates a
significant amount of time and effort. The tester
requires a sufficient number of test cases to ensure
that the testing is thorough. When a product has a
complicated branching structure, the tester must
WSEAS TRANSACTIONS on COMPUTER RESEARCH
DOI: 10.37394/232018.2024.12.16
Ahmed Althunibat, Mustafa Mahmood,
Hend Alnuhait, Sally Almanasra,
Haneen A. Al-Khawaja
E-ISSN: 2415-1521
161
Volume 12, 2024
usually pick test cases manually. However,
manually determining test cases becomes incredibly
laborious, necessitating the use of automated test
case creation, [7].
As a result, automation of Software testing
approaches plays a critical role in the development
of test cases to save time, effort, and money, as well
as to provide more reliable results than manual
testing methods that are prone to human error,
particularly in the generation of test data.
Automating the generation of test data, in particular,
improves the efficiency of software testing as a whole,
because testers no longer create test data from scratch, [8].
Structure-oriented test methods are extensively
used and define test cases based on internal program
structures. During software development, one of the
most essential structure-oriented test methodologies
is the path-based test. How to construct linear
independent paths automatically and efficiently is a
critical challenge in path testing in software
engineering, [9].
Structural testing (sometimes known as white
box testing) is a type of program testing that is
commonly utilized. Many test criteria, such as
statement and branch coverage, necessitate the
discovery of a collection of executable paths that
match certain criteria. We must discover appropriate
values for the input variables to run each path. This
issue is known as test data generation. If the path
cannot be exercised with any set of input data, the
path is called an infeasible path, [10].
2 Problem Statement
In the software development process, software
testing is an essential phase. The test case design
process is an important phase in software testing.
Effective testing requires a good test suite: the test
cases in the set must cover all features that should
be tested, and the number of test cases should not be
too large. Path coverage becomes a time-complexity
issue because the high number of paths increases both
the time spent in the testing process and the size of
the test suite. Few approaches apply machine learning
(ML) related concepts such as fuzzy logic (FL) to this
problem, and FL-based outcomes tend to be more
complex than those of other rule-based mechanisms.
We developed an approach that automates test case
generation in software testing. The technique first
generates all test paths from the corresponding
control flow graph (CFG) automatically, using an
algorithm that takes the control flow graph as input
and generates a set of test paths. The algorithm
traverses each node from source to destination to
find these paths. The technique then generates test
data in a given domain and groups the data into
clusters using k-mean, which helps in determining the
basis path set with path coverage as the test
adequacy criterion. Finally, the optimal path is
determined with the help of fuzzy logic, [11].
This new approach will help to complete path
testing and determine the optimal path to prioritize
the process of test case selection. Also, the new
approach will reduce the time and cost in the testing
phase. The general objectives of this research are: to
investigate the terms test case, and fuzzy logic in the
software testing phase, to develop the model to
reduce the time complexity and improve the testing
process, and to evaluate the model using statistical
analysis measures, [12].
3 Literature Review
Software testing is the process of running software
against a set of test cases to find flaws. The various
testing approaches are defined by the artifact that is
used to generate test cases. In functional or
black-box testing, test cases are derived from a
program's specification or description. In structural
or white-box testing, test cases are derived from the
implementation. In fault-based testing, test cases are
derived from fault models based on typical
programming errors. In model-based testing, models
of system specifications are used to create test
cases, [13].
To determine whether a software system is
correct, one might examine every element of the
system's input domain and compare the result to the
intended outcome, [14].
Aside from testing methodology and criteria,
there are other parts to the testing process. For
example, manual testing of programs is
prohibitively expensive; as a result, software testing
typically relies on tools to automate the generation,
execution, and collection of test results. When flaws
are discovered during program testing, they must be
located and corrected, [15], [16].
Software Testing (ST): Estimating testing
efforts, finding a competent test team, building test
cases, running the program with those test cases,
and reviewing the results produced by those
executions are all part of the software testing
process. According to studies, testing accounts for
40-50 percent of the cost of software development,
[17], with the amount for testing important software
being much higher, [14]. As a result, software
testing (ST) is defined as the process of validating
and verifying a software product.
Software testing is the practice of putting a
program through its paces with well-designed input
data to find flaws, [18], [19].
Software testing techniques assist in the
construction of better test cases as follows:
*Structural Testing: The major purpose of these
tests is to trigger specific areas in the software, such
as specific statements, program branches, or paths to
be run. White Box testing is a sort of testing that is
used to test the structure of code. The expected
performance is determined using coverage metrics
such as path coverage, branch coverage, and data-
flow coverage, [20], [21].
*Functional Testing: A sort of software testing
that validates a software system's functional
requirements/specifications. Black box testing is
another name for it. Functional tests are used to
evaluate each function of a software application by
giving appropriate input and comparing the output
to the functional requirements. In other words,
functional testing is not concerned with the
application's source code, [21], [22].
Computer programs have now infiltrated every
part of life, enabling the manipulation of a wide
range of complex applications. Many of these
applications are large, complex, and life-critical.
As a result, highly dependable software, or software
with a high level of reliability, is required. Software
testing is an important and commonly used
methodology, in addition to the many other
techniques for boosting reliability, [23].
The major goal of this process is to generate
efficient software free of bugs, faults, and failures,
[1]. Although there are several resources for
assuring software quality through testing, the
majority of software products are not sufficiently
tested. Insufficient testing leaves a large number of
unhandled failures, resulting in increased software
development time and expenses. The software
development life cycle entails more than just
developing code. Testing occurs late in the process,
but it is just as important. Testing uncovers defects
and verifies software, allowing for bug fixes and
modifications.
The phrase "fuzzy logic" has been applied in two
ways. In a strict sense, fuzzy logic is a logical
system for reasoning under uncertainty that
generalizes conventional two-valued logic. Fuzzy
logic, in a wide sense, refers to any theories and
methods that use fuzzy sets, or classes having
unsharp bounds, [24].
[25], [26], proposed fuzzy set theory, which
gave rise to the name fuzzy logic, [25], [26], [27].
Fuzzy Logic (FL) is a multivalued logic that allows
for the definition of intermediate values between
conventional evaluations such as true/false, yes/no,
high/low, and so on. To apply a more human-like
way of thinking in computer programming, concepts
like "quite tall" or "very fast" can be mathematically
expressed and processed by computers, [26].
The truth value of variables in fuzzy logic can
be any real number between 0 and 1, both inclusive.
It is used to deal with the concept of partial truth,
where the truth value can be anywhere between 100
percent true and 100 percent false, [28]. The truth
values of variables in Boolean logic, on the other
hand, can only be the integer values 0 or 1. [27],
established the theory of fuzzy sets, which gave rise
to fuzzy logic. A fuzzy set assigns a degree of
membership to elements of a universe, which is
commonly a real number from the interval [0, 1].
By assigning degrees of truth to propositions,
fuzzy logic emerges. The standard set of truth
values (degrees) is [0, 1], with 0 being
"completely false," 1 denoting "completely true,"
and the remaining values denoting "partial truth,"
i.e., intermediate degrees of truth.
Not only does fuzzy logic provide a meaningful
and powerful representation for measuring
uncertainties, but it also provides a meaningful
representation of muddled concepts stated in
everyday language. Fuzzy logic is a mathematical
theory that deals with the idea of ambiguity in
defining concepts and meanings. Expressions like
"low" and "high," for example, include ambiguity or
fuzziness since they are imprecise and relative. As a
result, the variables analyzed are referred to as
"fuzzy" rather than "crisp." Fuzziness is merely one
way of expressing uncertainty; FL employs a
non-binary logic, i.e., one that is not restricted to
true or false.
Fuzzy set theory has been proved to be a
valuable tool for describing scenarios involving
imprecise or ambiguous data. Fuzzy sets deal with
these problems by assigning a degree of belonging
to a set to each object, [25], [26], [27]. The fuzzy set
is a variation of the traditional set. The membership
of items in classical crisp set theory follows a binary
logic: either the element belongs to the crisp set or it
does not.
Because a fuzzy set does not have a crisp, clearly
defined boundary, it can contain elements with
degrees of membership ranging from 0 to 1, [29].
Its fuzzy boundary is described by membership
functions, which assign each element a degree of
membership between 0 and 1. [26], presented
membership functions in the first study on fuzzy
sets. A membership function (MF) is
a curve that specifies the feature of a fuzzy set by
assigning a membership value, or degree of
membership, to each element. It converts each point
in the input space into a membership value in the [0,
1] closed unit interval.
Fig. 1: Fuzzy logical operations
A general membership function curve is shown
in Figure 1. The horizontal axis represents an input
variable x, while the vertical axis shows the
corresponding membership value μ(x). The support
of the membership function curve is the range in
which the input variable has a nonzero membership
value: when x is any point between point a and
point d in this diagram, μ(x) > 0. The core of the
membership function curve is the range in which the
input variable x has a full degree of membership
(μ(x) = 1); in other words, any point within the
interval [b, c] fully belongs to the fuzzy set
described by this membership function.
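The support and core described above can be illustrated with a short sketch. It assumes a trapezoidal curve with break points a < b <= c < d and linear edges; the exact shape in Figure 1 is not recoverable here, and the function name `trapezoidal_membership` is hypothetical.

```python
def trapezoidal_membership(x, a, b, c, d):
    """Membership degree of x in a trapezoidal fuzzy set (assumed shape).

    Support is the open interval (a, d); core is the closed interval [b, c].
    Requires a < b <= c < d.
    """
    if x <= a or x >= d:
        return 0.0                      # outside the support
    if b <= x <= c:
        return 1.0                      # inside the core: full membership
    if x < b:
        return (x - a) / (b - a)        # rising edge between a and b
    return (d - x) / (d - c)            # falling edge between c and d


# Example with the hypothetical break points a=1, b=2, c=3, d=4.
degree = trapezoidal_membership(1.5, 1, 2, 3, 4)
```

Any other monotone rise and fall between the same break points would give the same support and core; the linear edges are only the simplest choice.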
Because standard binary logic is a specific
example of fuzzy logic in which the membership
values are always 1 (totally true) or 0 (entirely
false), fuzzy logic must have the same consistency
as standard logic. AND, OR, and NOT are the most
fundamental logical operations. The operands A and
B, unlike conventional logical operations, are
membership values in the interval [0, 1]. The logical
AND is expressed by the function min in fuzzy
logical operations, therefore the phrase A AND B is
equal to min (A, B). Because the function max
defines logical OR, A OR B becomes identical to
max (A, B). The logical NOT transforms operation
NOT A into operation 1 – A.
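The three operations above can be written directly in Python. This is a minimal sketch of the min/max/complement definitions given in the text; the function names are illustrative only.

```python
def fuzzy_and(a, b):
    """Fuzzy AND: the smaller of the two membership values."""
    return min(a, b)


def fuzzy_or(a, b):
    """Fuzzy OR: the larger of the two membership values."""
    return max(a, b)


def fuzzy_not(a):
    """Fuzzy NOT: the complement of the membership value."""
    return 1.0 - a


# With crisp operands (0 or 1) the operators reduce to Boolean logic.
crisp_and = fuzzy_and(1, 0)
partial_and = fuzzy_and(0.3, 0.8)
```

Restricting the operands to 0 and 1 reproduces the ordinary truth tables, which is the consistency with standard logic that the paragraph refers to.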
A test case is a set of inputs, execution
circumstances, and expected outputs that are
designed to accomplish a specific goal, such as
exercising a specific scenario, a specific scenario
sequence, or validating compliance with a set of
arguments, [30].
A test case, according to, [31], is a set of
conditions that a tester must check to determine
whether the System under Test (SUT) produces the
expected outcome.
The problem of developing test cases that meet
arbitrary test criteria is difficult, and many
researchers have focused on automating this
activity, using a variety of approaches for the
automatic development of test cases, [32].
Hand-written tests entail a significant amount of
effort, a long testing period, and a large number of
flaws. Automatically generating test scenarios is
therefore a prominent research topic.
According to, [33], identifying appropriate test
cases is dependent not only on the faults found but
also on how errors have been defined using test
adequacy criteria to evaluate a test case. When all of
the stated criteria have been met, the testing
procedure comes to a close, [34].
A test adequacy criterion assists in identifying
test objectives or goals to be accomplished during
software testing. For example, path coverage
requires that each path be executed at least once by
the test cases.
Path testing is a white-box test-case design
method, [35], that is commonly used during the
software unit testing stage. In fact, among all
white-box testing methods, the path testing method
has the best error detection capacity during the unit
testing stage, [35]. Although the path testing method
can detect most software errors during software
testing, the number of program paths grows quickly
when a program contains branch or loop statements,
[36].
Control flow graphs, also known as program
graphs, are commonly used to describe the control
flow of programs in software analysis, [37].
Program statements are the nodes of a control flow
graph, and the edges reflect the control flow
between the statements. Apart from the entry and
exit nodes, each node of the control flow graph
corresponds to one statement in the program code.
The measure of cyclomatic complexity refers to
the number of fundamental cycles that appear in
connected and undirected graphs, [38]. Figure 2
shows a pseudocode fragment and the corresponding
control flow graph.
Fig. 2: Example of pseudo-codes and corresponding
control flow graph.
According to, [39], path coverage is one of the
most accurate code coverage metrics because it
focuses on a sequence of branch decisions rather
than a single branch decision. Path coverage is
stronger than statement and branch coverage
because, even if a particular branch decision or
statement has already been exercised, its
combinations with other branch decisions (or
statements) may not have been tested.
Unfortunately, because the number of paths can be
unbounded, path coverage is difficult, if not
impossible, to measure.
A path is essentially a finite sequence of nodes
connected by edges: it begins at the first node,
travels via some nodes between the first and last
nodes, and ends at the last node, [40]. Path testing
faces numerous obstacles:
1. Test cases are incomplete: It is frequently
impractical to create test cases that cover all
possible paths, especially for real-world
applications. This is because (i) the total number
of possible paths in a program is enormous:
at least exponential in the code size, and
possibly infinite due to loops and recursive
functions; and (ii) producing an input to cover
any specific path is potentially incomputable,
[39]. Although symbolic unit test generation,
[41], and dynamic test generation, [41], have
made significant progress recently, they are still
constrained by the above-mentioned unbounded
computation complexity problem and have
many difficult-to-handle cases, such as loops.
2. Dependency on other states: When the control
path also depends on system states (such as
the time of day), hardware states (such as disk
condition), resource states (such as how much
virtual memory has been allocated), and
application states (e.g., the total number of
outstanding requests), the situation becomes much
more difficult. While fault injection-based testing
can help with this issue to some extent, it is
usually quite expensive and does not cover all
conceivable state combinations, [7].
3. Efficiency of testing and human effort: In
real-world programs, executing each test case
can take a significant amount of human labor
and time. As a result, the number of test cases
that can be evaluated in practice is restricted;
it is crucial to reduce the number of test cases
without reducing overall test coverage, [41], [42].
[6], explain how to automatically generate test
cases using an evolutionary structural testing
technique. The genetic algorithm (GA) is an
evolutionary method that is used to automatically
generate test cases that cover a program's def-use
associations. The GA conducts its search by creating
new test data from previously generated test data
that has been determined to be effective. The
technique generates a set of random test cases
within a domain, which are then grouped into
clusters using the k-mean clustering algorithm. Each
cluster identifies a collection of covered paths; the
clusters are refined based on the covered paths, and
a genetic algorithm generates new test cases in each
cluster to increase path coverage. Compared with
random generation, the genetic algorithm achieved
100 percent path coverage.
[8], proposes an approach for discovering the
error-prone path in software that combines Genetic
Algorithms and Binary Search (BSGA). The BSGA
improves software testing by combining the Genetic
Algorithm (GA) with the Binary Search (BS)
algorithm, which employs Big Data (BD) as input
values for program path coverage. The BSGA is a
robust nonlinear search approach that provides a
higher-quality answer, resulting in cost savings in
software testing. The experimental results show that
using BS to improve the performance of the GA has
a positive influence on finding appropriate test cases
and test data for the input Big Data domain values.
These results, in turn, reduce the cost of testing.
[2], proposes a genetic algorithm-based method for
creating test cases. A flow graph is built from the
program source code, and the paths of the graph are
described by following path coverage criteria. The
genetic algorithm is then used to create test cases.
Finally, mutation testing is employed to assess the
effectiveness of the test cases. The experiments show
that the generated tests are quite effective at
detecting errors. In addition, based on the proposed
strategy, the research has developed a test case
generation tool.
By combining the evolutionary algorithm
technique with the static technique for test case
creation, it will be possible to eliminate all faults
and increase the quality of software in the future.
Clustering is the division of a data set into
subsets (clusters) so that the data in each subset
share certain common qualities, such as proximity
according to some defined distance metric.
Clustering is vital in our lives since we live in a
data-driven society where we encounter a
tremendous volume of data.
Classifying or grouping these data into a set of
categories or clusters is an important part of
working with them. Clustering is used in a variety of
fields. Data clustering, for example, is a common
statistical data analysis technique utilized in a
variety of domains such as machine learning, data
mining, pattern recognition, image analysis, and
bioinformatics. Clustering is also used to find
relevant knowledge in data, [43].
Clustering is an unsupervised classification
strategy that seeks to arrange things into clusters so
that objects in the same cluster are extremely similar
and objects in other clusters are substantially
different. Cluster analysis is a well-known concept
in the field of data mining, [44]. It's the first step
toward an amazing new world of information.
Data is separated into different clusters in hard
clustering, with each data element belonging to only
one cluster.
3.1 The k-mean Clustering Algorithm
For prediction analysis, the k-mean clustering
technique is used to group similar types of data. The
likelihood of the most relevant function is estimated
in the k-mean clustering procedure, and the
functions are grouped using the Euclidean distance
formula.
The fundamental k-means clustering algorithm,
[45], is based on the partitioning method and is
utilized for various clustering applications, notably
with low-dimension datasets. It divides n objects
into k clusters using k as a parameter so that objects
in the same cluster are similar to one another but
distinct from objects in other clusters.
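The partitioning procedure described above can be sketched in plain Python. This is a minimal illustration of the standard k-means iteration (assign each point to the nearest centroid by Euclidean distance, then recompute centroids), not the implementation used in the study; the function name, the random initialization, and the sample points are assumptions.

```python
import math
import random


def k_means(points, k, iterations=100, seed=0):
    """Partition points (tuples of numbers) into k clusters.

    Returns (assignments, centroids), where assignments[i] is the
    cluster index of points[i].
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)   # assumed initialization: k random points
    assignments = []
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        new_assignments = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_assignments == assignments:
            break                        # converged: no point changed cluster
        assignments = new_assignments
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return assignments, centroids


# Hypothetical 3-dimensional inputs, e.g. candidate triangle side lengths.
sample = [(1, 1, 1), (2, 2, 2), (1, 2, 1), (50, 50, 50), (51, 49, 50), (52, 50, 51)]
labels, centers = k_means(sample, k=2)
```

The value of k is a parameter chosen by the user, as the text notes; the sketch does not attempt to select it automatically.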
3.2 The c-mean Clustering Algorithm
[46], introduced the Fuzzy C-Means clustering
approach, which is an extension of the Hard C-
Mean clustering method. FCM is an unsupervised
clustering approach that can be used to solve a wide
range of feature analysis, clustering, and classifier
design challenges, [46].
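Unlike hard clustering, FCM gives every data point a degree of membership in each cluster. The sketch below shows only the membership computation for one point, using the commonly cited FCM formula with fuzzifier m (m = 2 assumed by default); the function name and the example centers are hypothetical, and the centroid update step of the full algorithm is omitted.

```python
import math


def fcm_memberships(point, centers, m=2.0):
    """Membership degrees of one point in each cluster (they sum to 1).

    Uses u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)), where d_i is the
    Euclidean distance from the point to center i and m > 1.
    """
    distances = [math.dist(point, c) for c in centers]
    if 0.0 in distances:
        # The point coincides with one or more centers: share full membership.
        hits = [1.0 if d == 0.0 else 0.0 for d in distances]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_k) ** exponent for d_k in distances)
        for d_i in distances
    ]


# A 1-dimensional point closer to the first of two hypothetical centers.
degrees = fcm_memberships((2.0,), [(0.0,), (10.0,)])
```

A point close to one center receives a membership near 1 for that cluster and small but nonzero memberships elsewhere, which is what distinguishes the fuzzy variant from hard clustering.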
Agricultural engineering, astronomy, chemistry,
geology, image analysis, medical diagnostics, and
form analysis and target recognition are just a few
of the fields where it is used, [47]. The FCM
clustering algorithm, which is based on Ruspini's
fuzzy clustering theory, was proposed in the 1980s
with the growth of fuzzy theory. This algorithm is
used to perform analysis based on the distance
between different data points in the input. The
clusters are constructed based on the distance
between data points, and each cluster has its cluster
center. The clustering algorithms used in test case
generation are presented in Table 1.
Table 1. Clustering algorithms used in test case generation
| Clustering algorithm | Research usage | Result |
|---|---|---|
| k-mean | Divide the test cases into clusters, each with a group of paths covered, and use a genetic algorithm to generate new test cases in each cluster to increase path coverage. | The experiment yielded a promising outcome. |
| k-mean | Eliminating redundant test cases to reduce the size of the test suite. | The experimental results show improved clustering accuracy and a large reduction in redundant test cases. |
| k-mean | Target path selection is accomplished by separating paths into groups using the k-means algorithm so that paths with a high degree of similarity are grouped. The cluster centers are then chosen as targets, making sure that the target paths chosen have a greater degree of differentiation. | The experimental findings show that the proposed method is effective. |
| k-mean | Effective test cases are identified by separating test cases into two groups: effective and non-effective test cases. | The clustering-based test case classification can discover effective test cases with a high recall ratio and a significant accuracy percentage, according to an empirical study. |
| c-mean | Reduction in the number of tests required: the main goal is to reduce the number of test cases and thus the amount of time spent testing. | The methodology's final test suite will produce good results in terms of conditions and path coverage. |
4 Methodology
The Test Case Generation Model using Fuzzy Logic
(TCGFL) is designed to automate and improve the
test case generation process in the context of
software testing. The primary objective is to reduce
the time complexity and enhance the effectiveness
of the software testing phase within the software
development life cycle (SDLC). Here is a detailed
description of how the TCGFL model operates:
1. Control Flow Graph (CFG) Construction: The
process begins by automatically constructing a
Control Flow Graph (CFG) from the source code.
A CFG is a graphical representation of the
program's control flow, illustrating how the
program's execution moves through various
branches and decisions.
2. Variable Extraction: The TCGFL model extracts
variables from the source code. These variables
are crucial for creating meaningful test cases that
assess the behavior of the software with different
input values.
3. Input Generation: Within a specified domain, the
model generates inputs. These inputs are used as
test data to evaluate the software. It's important
to cover a broad range of input values to test
various scenarios and uncover potential issues.
4. Basis Path Generation: The model proceeds to
generate basis paths from the constructed control
flow graph. Basis paths represent fundamental
paths through the program, and they are essential
for thorough testing.
5. K-Means Clustering: The TCGFL model
employs a k-means clustering algorithm on the
generated inputs. Clustering groups similar
inputs together, making it easier to handle and
assess the data. This step contributes to more
organized and efficient testing.
6. Path Coverage Assessment: The clustered data is
used to assess path coverage for each basis path.
Path coverage is a critical metric in software
testing, ensuring that all possible paths through
the program are tested. The clustered data allows
for a more systematic and comprehensive path
coverage analysis.
7. Fuzzy Logic Optimization: Finally, fuzzy logic is
used to determine the optimal path. Fuzzy logic
is employed to make nuanced decisions and
select the most appropriate path based on various
criteria, improving the overall quality of test case
selection.
The process and workflow of the TCGFL model
are summarized in Figure 3, providing a visual
representation of how the various components
interact to automate the test case generation process.
This research aims to make software testing more
efficient, reduce testing time and costs, and ensure
thorough testing, contributing to more effective
software development within the SDLC.
Fig. 3: Approach of Proposed Test Case Generation
Model (TCGMFL)
The triangle problem, [48], [49], [50], [51], which
serves as benchmark data in previous studies, has
been chosen as a running case study, together with a
program that checks prime numbers, to demonstrate
the application of our proposed methodology.
The triangle program accepts three variables as
input and returns the type of triangle formed as
output. The prime program, on the other hand,
accepts one variable as input and returns whether
the input is a prime number. To execute the
programs, we used Visual Studio Code version
1.57.1 as the development environment for Python.
It is a widely used editor that provides a base
workspace and can be easily customized due to its
extensible plug-in system.
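For orientation, the two programs under test might look like the sketch below. The paper's actual source code is not shown, so the function names, return labels, and the handling of invalid sides are assumptions based on the usual formulation of the triangle classification benchmark and a simple trial-division primality check.

```python
def classify_triangle(a, b, c):
    """Return the type of triangle formed by three side lengths."""
    if a <= 0 or b <= 0 or c <= 0:
        return "not a triangle"            # sides must be positive
    if a + b <= c or a + c <= b or b + c <= a:
        return "not a triangle"            # triangle inequality violated
    if a == b == c:
        return "equilateral"
    if a == b or b == c or a == c:
        return "isosceles"
    return "scalene"


def is_prime(n):
    """Return True if n is a prime number, using trial division."""
    if n < 2:
        return False
    divisor = 2
    while divisor * divisor <= n:          # loop creates a back edge in the CFG
        if n % divisor == 0:
            return False
        divisor += 1
    return True


# One input per triangle outcome, plus two prime checks.
outcomes = [classify_triangle(3, 3, 3), classify_triangle(3, 3, 5),
            classify_triangle(3, 4, 5), classify_triangle(1, 2, 3)]
```

Each return statement in `classify_triangle` corresponds to a distinct branch outcome, which is why the program is a common target for path-based test generation.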
The analysis of a program under test (PUT)
written in Python is the first step in the TCGFL
model. Its purpose is to understand the flow inside
the program by transforming the code into a set of
nodes and edges. A control flow graph (CFG) is a
directed graph consisting of a set of nodes and a set
of edges, in which each node represents a collection
of sequential statements that form a basic block and
each edge represents a transfer of control between
nodes. Because CFGs can precisely reflect the flow
inside a program, they are commonly used in static
analysis and compiler applications. Every node in
the CFG represents a statement (or block of
statements) in the program under test, allowing
every path of statements to be sequenced. We
introduced an algorithm that takes as input a list
generated from the CFG and produces as output a
list containing the different paths of the given
program, in order to find the possible test paths of
the CFG starting from the first node and ending at
the exit node.
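The traversal described above can be sketched as a depth-first enumeration of paths from the entry node to the exit node. This is a minimal sketch, not the AGTP algorithm itself: the adjacency-list encoding of the CFG, the function name `find_all_paths`, and the example graph are assumptions, and loops are handled here simply by visiting each node at most once per path.

```python
def find_all_paths(cfg, source, sink):
    """Enumerate all simple paths from source to sink in a CFG.

    cfg maps each node to the list of its successor nodes.
    """
    paths = []
    stack = [(source, [source])]  # (current node, path so far)
    while stack:
        node, path = stack.pop()
        if node == sink:
            paths.append(path)
            continue
        # Push successors in reverse so they are explored in listed order.
        for nxt in reversed(cfg.get(node, [])):
            if nxt not in path:  # avoid revisiting a node on this path
                stack.append((nxt, path + [nxt]))
    return paths

# Hypothetical CFG with two decision nodes (1 and 3).
example_cfg = {1: [2, 3], 2: [4], 3: [4, 5], 4: [6], 5: [6], 6: []}
for p in find_all_paths(example_cfg, 1, 6):
    print(p)
```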
With the help of the cyclomatic complexity,
which is calculated by the algorithm, the number of
independent paths is determined. A number of paths
can be determined for any graph, but this alone does
not ease our work, as there are several infeasible
paths in the control flow graph. The cyclomatic
complexity is V(G) = E - N + 2, where N is the
number of nodes and E is the number of edges. The
algorithm traverses each node from source to
destination to find all possible paths in the
corresponding CFG. This algorithm is used for the
Python programming language.
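As a worked illustration of V(G) = E - N + 2, the sketch below counts the nodes and edges of a CFG given as an adjacency list. The function name `cyclomatic_complexity` and the example graph are hypothetical, and the formula is applied under the usual assumption of a single connected graph with one entry and one exit node.

```python
def cyclomatic_complexity(cfg):
    """Compute V(G) = E - N + 2 for a CFG given as an adjacency list."""
    nodes = set(cfg)
    for successors in cfg.values():
        nodes.update(successors)  # include nodes that only appear as targets
    edges = sum(len(successors) for successors in cfg.values())
    return edges - len(nodes) + 2

# Hypothetical CFG: 7 edges and 6 nodes, so V(G) = 7 - 6 + 2 = 3.
example_cfg = {1: [2, 3], 2: [4], 3: [4, 5], 4: [6], 5: [6], 6: []}
print(cyclomatic_complexity(example_cfg))
```

For this graph the value 3 matches the two decision nodes plus one, which is the number of basis paths to be covered.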
In this phase, the prepared datasets are imported
into Python for clustering. We employed the
k-means clustering algorithm, previously explained
in Section 2.2.1 (Clustering Algorithms), because it
is suitable for the scope of this work.
We applied k-means to the datasets to group the
test data into different clusters. Clustering is
applied in a 1-dimensional space for the Prime
program and in a 3-dimensional space for the
triangle program. After all datasets are transformed
into numbers, the data can be grouped using the
k-means clustering method. Several procedures
must be completed to group these data into different
clusters.
Test case generation for each path is
accomplished by symbolically executing that path
through the program. The algorithm traverses a test
path across the program's flow graph from the
source node to the sink node, visiting each node
along the path. When a node is visited, the
algorithm symbolically executes the program's
corresponding statement. As inputs to the test case,
the algorithm employs the clustered data created in
the previous phase. The expected outputs along the
execution path are expressed in terms of the test
case's inputs, and the test case's symbolic inputs can
be turned into a set of actual inputs by using such
expressions to evaluate their intended results, [7].
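The clustering and per-cluster coverage steps can be sketched as follows for the 3-dimensional triangle data. This is a simplified illustration under several assumptions: a plain Lloyd-style k-means written with the standard library stands in for whatever implementation was used, concrete execution replaces symbolic execution, each program outcome is treated as one path, and all names (`kmeans`, `triangle_path`, `cluster_coverage`, `ALL_PATHS`) are hypothetical.

```python
import random

def triangle_path(a, b, c):
    """Return an identifier of the path taken by the triangle program."""
    if min(a, b, c) <= 0 or a + b <= c or a + c <= b or b + c <= a:
        return "not a triangle"
    if a == b == c:
        return "equilateral"
    if a == b or b == c or a == c:
        return "isosceles"
    return "scalene"

ALL_PATHS = {"not a triangle", "equilateral", "isosceles", "scalene"}

def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda j: sum((p - c) ** 2 for p, c in zip(point, centroids[j])))

def kmeans(points, k, iterations=50, seed=0):
    """Minimal k-means: alternate assignment and centroid update steps."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids drawn from the data
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [nearest(pt, centroids) for pt in points]
        new_centroids = []
        for j in range(k):
            members = [pt for pt, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids
    return labels, centroids

def cluster_coverage(points, labels, k):
    """Fraction of all paths exercised by the inputs of each cluster."""
    coverage = {}
    for j in range(k):
        hit = {triangle_path(*pt) for pt, lab in zip(points, labels) if lab == j}
        coverage[j] = len(hit) / len(ALL_PATHS)
    return coverage

# Random test data in an assumed domain of 1..20 for each side.
data_rng = random.Random(1)
points = [tuple(data_rng.randint(1, 20) for _ in range(3)) for _ in range(60)]
labels, centroids = kmeans(points, k=3)
print(cluster_coverage(points, labels, 3))
```

A cluster whose coverage is below 1.0 lacks inputs for at least one path, so its test data alone cannot exercise the full set of paths.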
The current study introduces a novel approach
for test case generation in the context of software
testing using the Test Case Generation Model using
Fuzzy Logic (TCGFL). While the topic of software
testing and test case generation has been explored in
existing literature, the current study distinguishes
itself in several ways:
Use of Fuzzy Logic: The primary differentiator is
the incorporation of fuzzy logic into the test case
generation process. Fuzzy logic is employed to
make nuanced decisions, particularly in
determining the optimal test path. This approach
is not commonly found in conventional test case
generation methods and offers a more
sophisticated way of selecting the most suitable
test cases.
Application to Specific Programs: The study
focuses on two specific programs, the Triangle
Problem and a Prime Number checker, which
serve as ongoing case studies. This targeted
approach allows for practical and real-world
application of the proposed methodology, making
it more relevant and actionable.
Integration of Clustering: The study utilizes k-
means clustering to group test data, thereby
enhancing the organization of test cases and
making the process more efficient. Clustering is
applied to different datasets for the two programs,
demonstrating adaptability and customization.
Comprehensive Path Coverage: The TCGFL
model aims to cover all possible test paths within
the control flow graph. It addresses the issue of
infeasible paths and ensures thorough testing. This
level of comprehensive path coverage is a notable
feature of the proposed methodology.
Python Programming Language: The study
focuses on the Python programming language, and
the algorithm and approach are tailored
specifically for Python. This specificity allows for
a more in-depth exploration of test case generation
within this language, which may have unique
characteristics and challenges.
Detailed Algorithm Description: The study
provides a detailed explanation of the algorithm
used for generating test paths and symbolically
executing test cases. This transparency in the
methodology helps readers understand the process
clearly.
In summary, the current study distinguishes
itself from existing literature by incorporating fuzzy
logic, applying the approach to specific programs,
using clustering, addressing infeasible paths,
focusing on the Python programming language, and
providing a comprehensive description of the
algorithm. These unique aspects make the TCGFL
model a valuable contribution to the field of
software testing and test case generation.
5 Conclusion
Many researchers have recently become interested
in the problem of automating the software testing
process. The majority of currently available tools do
not cover all aspects of the testing phase.
Furthermore, these techniques are not fully
automated, implying that human work, whether
from the customer or the developer, is required to
complete the testing or design test cases. The
approach used in this study aims to assist
developers or testers in automatically generating test
cases.
In this study, a control flow graph is
automatically created and each edge is weighted for
use in optimal path determination. The source code
is then converted into XML to extract the variables,
and random data are generated in the given domain
for these variables. All possible paths are extracted
by traversing each node in the flow graph from
source to destination. The test data of the variables
are then clustered to check the feasibility and
coverage of the paths. Finally, fuzzy logic
determines the optimal path.
The Triangle and Prime programs are used as
case studies to show how our test case generation
tool (TCGFL) was implemented with a completely
automated approach, from the first phase through
producing test cases and evaluating paths.
Furthermore, we use an existing approach to analyze
the model using the "cyclomatic complexity" metric.
The results show that the proposed automated
approach, TCGFL, is more efficient in terms of
testing time, cost, and effort. The findings indicate
that the basis paths generated by the suggested
approach are compatible with the complexity of the
software in the case studies (the Prime and Triangle
programs), that the cost and effort necessary for
testing are reduced, and that test cases are generated
efficiently. In addition, test case selection is
prioritized by determining the optimal path.
6 Future work and Limitations
Our methodology can be improved in several
respects. In future work, we aim to improve the
model by providing a solution for generating new
test data for clusters that do not reach 100 percent
path coverage.
Furthermore, more complex programs are
required for assessing our model, and the model
should be applied to other programming languages
such as Java, C#, and C++. More research is also
needed into test case generation optimization
methods such as hybrid GA and PSO. In addition,
the approach assumes that the program unit under
test is not extremely complex
and thus lacks control dependencies. However, if
the program being tested has control dependencies,
the approach will not be able to reveal them;
therefore, the test cases developed will not be able to
completely test the program.
Acknowledgment:
The authors would like to thank the Arab Open
University and Al-Zaytoonah University for
providing the necessary scientific research supplies
to implement the research.
References:
[1] G. Kumar and P. Bhatia, "Software testing
optimization through test suite reduction using
fuzzy clustering," CSI Transactions On ICT,
vol. 1, no. 3, pp. 253-260, 2013. doi:
10.1007/s40012-013-00233.
[2] N. Chuaychoo and S. Kansomkeat, "Path
Coverage Test Case Generation Using
Genetic Algorithms", [Online].
Journal.utem.edu.my (Accessed Date: March
15, 2021).
[3] T. A. Alrawashed, A. Almomani, A.
Althunibat, and A. Tamimi, "An automated
approach to generate test cases from use case
description model," Computer Modeling in
Engineering & Sciences, vol.119, no. 3,
pp.409-425, 2019.
[4] I. Septian, R. S. Alianto, and F. L. Gaol,
"Automated Test Case Generation from UML
Activity Diagram and Sequence Diagram
Using Depth First Search Algorithm,"
Procedia Computer Science, vol.116, no. 1,
pp.629-637, 2021.
[5] A. Kalaee and V. Rafe, "An Optimal Solution
for Test Case Generation Using ROBDD
Graph and PSO Algorithm," Quality and
Reliability Engineering International, vol.32,
no. 7, pp.2263-2279, 2016.
[6] S. R. Khan, A. Nadeem, and A. Awais,
"TestFilter: a statement coverage-based test
case reduction technique," in Multitopic
Conference, Dec 2006, pp.275-280, 2006.
[7] T. K. Wijayasiriwardhane, P. G. Wijayarathna
and D. D. Karunarathna, "An automated tool
to generate test cases for performing basis
path testing," 2011 International Conference
on Advances in ICT for Emerging Regions
(ICTer), Colombo, Sri Lanka, 2011, pp.95-
101, doi: 10.1109/ICTer.2011.6075032.
[8] A. Alhroob, W. Alzyadat, A. Imam, and G.
Jaradat, "The Genetic Algorithm and Binary
Search Technique in the Program Path
Coverage for Improving Software Testing
Using Big Data," Intelligent Automation and
Soft Computing, 2020. doi:
10.31209/2020.100000168.
[9] L. Barqawi, "The Impact of Using Artificial
Intelligence in Pharmaceutical Companies,"
Al-Zaytoonah University of Jordan Journal
for Legal studies, vol.4, no. 1, pp.217-236,
2023. doi: 10.15849/ZUJJLS.230330.11.
[10] R. Omeish, "Messing with the Blockchain
Technology to Commit a Crime," Al-
Zaytoonah University of Jordan Journal for
Legal Studies, vol.3, no. 1, pp.91-108, 2022.
doi: 10.15849/ZUJJLS.220330.06.
[11] A. Alshehadeh, G. Elrefae, A. Belarbi, A.
Qasim, and H. Al-Khawaja, "The impact of
business intelligence tools on sustaining
financial report quality in Jordanian
commercial banks," Uncertain Supply Chain
Management, vol.11, no. 4, pp.1667-1676,
2023.
[12] G. Y. Quba, H. Al Qaisi, A. Althunibat, and
S. AlZu’bi, "Software requirements
classification using machine learning
algorithms," in 2021 International Conference
on Information Technology (ICIT), pp.685-
690, IEEE, July 2021.
[13] M. Soda, M. Makhlouf, Y. Oroud, A.
Alshehadeh, R. Omari, and H. Al-Khawaja,
"Does the audit quality have any moderating
impact on the relationship between ownership
structure and dividends? Evidence from
Jordan," Uncertain Supply Chain
Management, vol.11, no. 4, pp.1789-1800,
2023.
[14] O. A. L. Lemos, F. C. Ferrari, M. M. Eler,
J. C. Maldonado, and P. C. Masiero,
"Evaluation studies of software testing
research in Brazil and in the world: A survey
of two premier software engineering
conferences," The Journal of Systems and
Software, vol. 86, pp. 951–969, 2012.
[Online]. Available via: The Journal of
Systems and Software. [Accessed 6 April
2020].
[15] H. Al-Shafei, "Computerization of Programs
for Teaching Arabic to Non-Native Speakers:
Android Applications as a Model," Al-
Zaytoonah University of Jordan Journal for
Human and Social Studies, vol. 3, special
issue, pp.301-323, 2022. doi:
10.15849/ZJJHSS.220508.15.
[16] A. Kinani, "The Arabic Language in Social
Networking Sites, Between the Arabic and
Latin Letters," Al-Zaytoonah University of
Jordan Journal for Human and Social
Studies, vol.4, no. 1, pp.89-106, 2023. doi:
10.15849/ZJJHSS.230330.05.
[17] X. Wang, Y. Jiang, and W. Tian, "An
Efficient Method for Automatic Generation of
Linearly Independent Paths in White-box
Testing," International Journal of Engineering
and Technology Innovation, vol.5, pp.108-
120, 2015.
[18] R. Mall, "Fundamentals of software
engineering," 2nd ed. New Delhi: Prentice-
Hall of India Ltd, 2008.
[19] B. Alsawareah, A. Althunibat, and B.
Hawashin, "Classification of Arabic Software
Requirements Using Machine Learning
Techniques," in 2023 International
Conference on Information Technology
(ICIT), Amman, Jordan, 2023, pp. 631-636,
doi: 10.1109/ICIT58056.2023.10225789.
[20] A. P. Mathur, "Foundations of Software
Testing," Addison-Wesley Professional, USA,
2008.
[21] B. Beizer, "Software Testing Techniques,"
2nd edn. Van Nostrand Reinhold Co., New
York, 1990.
[22] ISO/IEC/IEEE International Standard -
Systems and software engineering.
ISO/IEC/IEEE 24765:2010(E). Dec. 2010,
pp.1–418.
[23] R. Amro, A. Althunibat, and B. Hawashin,
"Arabic User Requirements Classification
Using Machine Learning," in 2023
International Conference on Information
Technology (ICIT), Amman, Jordan, 2023, pp.
483-488, doi:
10.1109/ICIT58056.2023.10225936.
[24] Kumar M., Lata Misra and Gyan Shekhar. “A
Survey in Fuzzy Logic : An Introduction.”
(2015).
[25] L. A. Zadeh, "Outline of a New Approach to
the Analysis of Complex Systems and
Decision Processes," 1973.
[26] L. A. Zadeh, "Making computers think like
people," IEEE Spectrum, vol.8, pp.26-32,
1984.
[27] L. A. Zadeh, "Fuzzy sets," Information and
Control, vol.8, no. 3, pp.338-353, 1965.
[28] V. Novák, I. Perfilieva, and J. Močkoř,
"Mathematical principles of fuzzy logic,"
Dordrecht: Kluwer Academic, 1999.
[29] D. Dubois and H. Prade, "Fuzzy Sets and
Systems," Academic Press, New York, 1988.
[30] R. V. Binder, "Testing Object-Oriented
System Models, Patterns, and Tools," NY:
Addison Wesley, 1999.
[31] R. Ibrahim, A. Bani Amin, S. Jamel, and J.
Wahab, "EPiT: A Software Testing Tool for
Generation of Test Cases Automatically,"
International Journal of Engineering Trends
and Technology, vol.68, no. 7, pp.8-12, 2020.
[32] P. Wang, X. Hu, N. Qiu, and H. Yang,
"White-Box Test Case Generation Based on
Improved Genetic Algorithm," in Recent
Advances in Computer Science and
Information Engineering, 2012, pp.489-495.
[33] J. B. Goodenough and S. L. Gerhart, "Toward
a theory of test data selection," in Proceedings
of the international conference on Reliable
software, 1975, pp.493–510.
[34] A. Andrews, R. France, S. Ghosh, and G.
Craig, "Test adequacy criteria for UML
design models," Software Testing,
Verification and Reliability, vol.13, pp.95–
127, 2003.
[35] B. Beiter, "Software system testing", Van
Nostrand Reinhold Company, 1990.
[36] P. G. Frankl and E. J. Weyuker, "An
Applicable Family of Data Flow Testing
Criteria," IEEE Transactions on Software
Engineering, vol.14, no. 10, pp.1483-1498,
1988.
[37] N. Vijay K and V. K. S. Kumar, "Automated
Test Path Generation using Genetic
Algorithm," International Journal of
Engineering Research, vol.6, Issue 7, 2017.
[38] A. Watson and T. McCabe, "Structured
Testing: A Testing Methodology Using the
Cyclomatic Complexity Metric," Special
Publication, National Institute of Standards
and Technology, August 1996.
[39] M. Roper, "Software Testing,"
McGraw-Hill, 1994.
[40] M. R. Girgis, "Automatic Test Data
Generation for Data Flow Testing Using a
Genetic Algorithm," Journal of Universal
Computer Science, vol.11, no. 6, pp.898-915,
2005.
[41] N. Gupta, A. Mathur, and M. Soffa,
"Automated test data generation using an
iterative relaxation method," ACM SIGSOFT
Software Engineering Notes, vol.23, no. 6, pp.
231-244, 1998. doi: 10.1145/291252.288321.
[42] Y. Al-Kasabera, W. Alzyadat, A. Alhroob, S.
Al Showarah, and A. Thunibat, "An
Automated Approach to Validate
Requirements Specification," Compusoft,
vol.9, no. 2, pp.3578-3585, 2020.
[43] Shaheda Akthar and SkMd Rafi, "Improving
the software architecture through fuzzy
clustering technique," Indian Journal of
Computer Science and Engineering, vol.1, no.
1, pp.54-57, 2010.
[44] Daljit Kaur and Kiran Jyot, "Enhancement in
the Performance of K-means Algorithm,"
International Journal of Computer Science
and Communication Engineering, vol.2, no. 1,
2013.
[45] K. A. Abdul Nazeer and M. P. Sebastian,
"Improving the Accuracy and Efficiency of
the k-means Clustering Algorithm," in
Proceedings of the International Workshop on
Computer Education, July 1-3, 2009, London,
U.K.
[46] J. C. Bezdek, "Pattern Recognition with Fuzzy
Objective Function Algorithms," New York:
Plenum Press, 1981.
[47] Y. Yong, Z. Chongxun, and L. Pan, "A Novel
Fuzzy C-Means Clustering Algorithm for
Image Thresholding," Measurement Science
Review, vol.4, no. 1, 2004.
[48] A. Pandey and A. Malviya, "Enhancing test
case reduction by k-means algorithm and
elbow method," International Journal of
Computer Sciences and Engineering, vol.6,
no. 6, pp.299-303, 2018.
[49] Z. Yan, Q. Li, W. Xingya, C. Jingying, and L.
Xuefei, "Automatic Software Testing Target
Path Selection using K-Means Clustering
Algorithm," International Journal of
Performability Engineering, vol.15, no. 10,
p.2667, 2019.
[50] Y. Pang, X. Xue, and A. Namin, "Identifying
Effective Test Cases through K-Means
Clustering for Enhancing Regression
Testing," 12th International Conference on
Machine Learning and Applications, 2013.
[51] L. Briand, Y. Labiche, and Z. Bawar, "Using
Machine Learning to Refine Black-Box Test
Specifications and Test Suites," The Eighth
International Conference on Quality
Software, 2008.
Contribution of Individual Authors to the
Creation of a Scientific Article (Ghostwriting
Policy)
- Ahmed Althunibat carried out the framework,
Research Methodology, and Conceptualization.
- Mustafa Shuker Mahmood has implemented the
Literature review, Previous Studies, Validation.
- Hend AlNuhait has organized and executed the
Discussion, Conclusion, and Supervision.
- Sally Almanasra was responsible for the Data
curation, Software, Results, and analysis.
- Haneen A. Al-Khawaja wrote the main
Hypothesis, Writing - original draft, Visualization.
Sources of Funding for Research Presented in a
Scientific Article or Scientific Article Itself
The authors extend their appreciation to the Arab
Open University for funding this work through AOU
research fund No. (AOURG-2023-021).
Conflict of Interest
The authors have no conflicts of interest to declare.
Creative Commons Attribution License 4.0
(Attribution 4.0 International, CC BY 4.0)
This article is published under the terms of the
Creative Commons Attribution License 4.0
https://creativecommons.org/licenses/by/4.0/deed.en_US