Development of Mathematical Algorithm for Detecting XSS Attacks on
Web Applications
KOMIL FIKRATOVICH KERIMOV1, ZARINA ILDAROVNA AZIZOVA2
1Department of System and Application Programming,
Tashkent University of Information Technologies named after Muhammad al-Khwarizmi,
108 Amir Temur Avenue, Yunusabad paradise, 100084, Tashkent,
REPUBLIC OF UZBEKISTAN
2Department of Information Security,
Tashkent University of Information Technologies named after Muhammad al-Khwarizmi,
108 Amir Temur Avenue, Yunusabad paradise, 100084, Tashkent,
REPUBLIC OF UZBEKISTAN
Abstract: - The widespread usage of web applications has led to an increase in security threats, with Cross-Site
Scripting (XSS) attacks being one of the most prevalent and damaging. Detecting and mitigating XSS attacks is
crucial to ensure the integrity and confidentiality of sensitive user data. This article presents the mathematical
algorithm and a way to identify XSS attacks using a bounded function from below, which depends on the input
string, and highlights its potential impact in bolstering web application security. To construct this function, we
used special characters and keywords that are frequently found in the construction of XSS attacks.
Key-Words: - XSS, vulnerability, signatures, information security, artifacts, testing.
Received: March 2, 2024. Revised: August 26, 2024. Accepted: October 2, 2024. Published: November 5, 2024.
1 Introduction
Nowadays the internet has become an integral part
of our daily lives, with web applications serving as
the backbone of various online services. Web
applications are increasingly targeted by malicious
actors, particularly through Cross-Site Scripting
(XSS) attacks. These attacks involve injecting
harmful scripts into web pages, enabling data theft,
session manipulation, and further exploitation.
Successful XSS attacks can result in financial
losses, reputational damage, and compromised trust.
However, this increased reliance on web
applications has also made them attractive targets
for malicious actors seeking to exploit
vulnerabilities and compromise user data. Among
the various attack vectors, Cross-Site Scripting
(XSS) attacks have emerged as a significant security
concern.
Traditional approaches to detecting XSS attacks
have relied on signature-based methods or pattern-
matching techniques. While these techniques can be
effective to a certain extent, they often struggle to
detect sophisticated and obfuscated XSS payloads.
Furthermore, the dynamic nature of modern web
applications poses challenges for static analysis-
based detection methods. To address these
limitations, we propose a novel mathematical
algorithm for detecting XSS attacks on web
applications. Our approach combines the principles
of XSS vulnerability detection based on the specific
weight of each attack symbol signature.
In this paper, we propose a mathematical
modeling and a method to identify XSS attacks
using a bounded function from below, which
depends on the input string. To construct this
function, we used special characters and keywords
that are often found in the construction of XSS
attacks. In the proposed method, it is possible to
detect XSS attacks using a single special character
or a single keyword. However, it can be shown
experimentally that the proposed detection method
using a set of multiple characters and words can
detect the vulnerability of a type of XSS attack more
accurately.
1.1 Related Research Papers on XSS Attack
Detection
Article [1] proposes a machine learning approach
for detecting XSS attacks, which relies on diverse
and representative training data for reliable results.
Static analysis methods [2] detect XSS
vulnerabilities during development but require
WSEAS TRANSACTIONS on INFORMATION SCIENCE and APPLICATIONS
DOI: 10.37394/23209.2024.21.46
Komil Fikratovich Kerimov,
Zarina Ildarovna Azizova
E-ISSN: 2224-3402
509
Volume 21, 2024
manual verification and adjustment of rules for
improved accuracy. Combining dynamic analysis
and machine learning [3] enhances detection
accuracy but poses challenges in scalability and
adapting to evolving attack techniques.
Applying NLP techniques to enhance XSS
detection shows promise but faces challenges with
obfuscated payloads, [4]. Symbolic execution
detects DOM-based XSS attacks by analyzing
JavaScript code but can be computationally
expensive and requires careful handling of complex
code, [5]. According to [6], signature-based
detection systems are based on predefined patterns
or signatures of known XSS attacks. Although
signature-based methods are optimal and, under
certain conditions, effective in detecting known
attacks when the signature database is regularly
updated, they are ineffective in detecting new or
evolving attack patterns. Regular updates and
customization of WAF rules based on known attack
patterns are essential [7]. XSS attacks inject
malicious code into HTTP requests. Existing
prevention techniques may not work against
unknown attacks, emphasizing the need for new
prevention methods, [8]. In [9], a mathematical
method for detecting XSS-type vulnerabilities is
proposed based on the specific weight of each attack
symbol signature.
These research works provide valuable insights
into the different approaches and techniques for
XSS attack detection. However, it is important to
consider the limitations, challenges, and potential
trade-offs associated with each approach.
Also noteworthy is the work [10], which
presents a method based on the subsequence
parliamentary algorithm. The key diagram of the
algorithm for compiling a sequence in a common
substring between the input parameters and the
generated data, with setting a threshold to limit the
length of the common substring. In addition, the
analysis of the advantages and disadvantages of the
static and algorithmic XSS detection methods used
in the article carries valuable information load.
The paper [11] described a web vulnerability
assessment methodology and a risk distribution
model of vulnerabilities, supplemented by a survey
of web developers. The results of the testing and the
survey allowed us to define a "triangular model" of
the causes and consequences of web vulnerabilities,
as well as the necessary measures to reduce their
number and frequency. In turn, in [12], the author's
attention is drawn to the use of methods for
detecting XSS attacks by applying various tools,
including the XSS Detector developed by them.
The difference between our proposed method is
that it is based on processing existing types of XSS
vulnerability types. A mathematical algorithm for
detecting attacks on a web application using
information related to the frequency and importance
factor of attack symbols does not require complex
mathematical techniques like optimization etc. Since
the proposed method is based on the processing of
available types of real-world vulnerabilities of XSS
kind, and is general in nature, we can expect that our
proposed algorithm can be used as an online cross-
site scripting attack detection platform.
The proposed method involves creating a
character set that distinguishes between attack and
normal strings using a predefined threshold.
Experiments with synthetic data showed that a
character set containing a space, a semicolon, and a
right parenthesis performed well over a range of
weights for attack and normal strings. However, it
should also be noted that there are problems in
collecting samples of real attacks and conventional
data, as well as in capturing the entire pattern of
attack methods.
2 Mathematical Algorithm for
Detecting XSS Attacks
To ensure the protection of web resources, XSS
attack detection is a relevant solution. XSS attacks
actively exploit vulnerabilities in web applications
to execute malicious scripts, which affects the
integrity of system and user data. This section
contains information about XSS attack detection
methods that are aimed at improving the security
level of web applications.
A comprehensive approach to protecting web
applications includes several methods for effectively
detecting XSS attacks. Static analysis allows you to
identify vulnerabilities in a web application at the
development stage, while dynamic analysis and
machine learning-based detection methods allow
you to observe changes in real time and regularly
monitor application performance. Signature-based
XSS attacks and WAF detection using strictly
defined rules also allow to monitor network traffic.
Combining these methods allows you to implement
in practice an effective and reliable strategy of
defence against XSS attacks.
In this paper, to validate the proposed algorithm,
we need to determine whether the input string is an
XSS attack or not, namely to model an information
object, i.e., a query that consists of a sequence of
special characters from Table 1 and keywords from
Table 2 respectively.
WSEAS TRANSACTIONS on INFORMATION SCIENCE and APPLICATIONS
DOI: 10.37394/23209.2024.21.46
Komil Fikratovich Kerimov,
Zarina Ildarovna Azizova
E-ISSN: 2224-3402
510
Volume 21, 2024
The following characters and keywords are
often used to construct XSS attacks:
Table 1. Special characters for defining XSS attacks
Variable
Special
symbols
Importance
ratio
Special
symbols
Importance
ratio
А1
0,561
}
0,043
А2
>
0,331
#
0,552
А3
/
0,511
+
0,033
А4
<
0,331
!
0,047
А5
Spacebar
0,485
,
0,067
А6
=
0,609
@
0,017
А7
0,318
?
0,047
А8
:
0,174
[
0,013
А9
.
0,997
]
0,013
А10
(
0,532
-
0,144
А11
)
0,532
~
0,003
А12
-
0,144
*
0,023
А13
;
0,622
|
0,003
А14
(yen
sign)
0,023
^
0,003
А15
&
0,622
%
0,037
А16
{
0,043
$
0,003
Table 2. Special keywords for defining XSS attacks
Variable
Key
-words
Importance
ratio
Variable
Key-words
Importance
ratio
А33
FScommand
0,003
А56
Jscript
0,171
А34
&lt;script&gt;
0,074
А57
Wscript
0,012
А35
&lt;/script&gt;
0,164
А58
Vbscript
0,003
А36
on\w*
0,003
А59
Vbs
0,003
А37
style
0,097
А60
Ilayer
0
А38
xmlns:xdp
0,003
А61
Iframe
0,015
А39
Formaction
0,010
А62
Applescript
0
А40
Form
0,020
А63
Jar
0
А41
xlink:href
0,003
А64
Eval
0,006
А42
seekSegment
Time
0,003
А65
Document
0,064
А43
FSCommand
0,003
А66
base64
0,018
А44
Applet
0,003
А67
</script>
0,164
А45
Audio
0,003
А68
<script
0,161
А46
Basefont
0,030
А69
Keygen
0
А47
Base
0,023
А70
1Object
0,012
А48
Behavior
0,003
А71
Plaintext
0
А49
Bgsound
0,003
А72
Mochaе
0
А50
Blink
0
А73
Style
0,097
А51
view-source
0
А74
Javascript
0,171
А52
Embed
0,020
А75
Xml
0,047
А53
Livescript
0
А76
Math
0
А54
Mocha
0,003
А77
Source
0,006
А55
Beсhavior
0,003
А78
Svg
0,017
Suppose some input string is observed
L
and let
3221 ,...,,xxx
the frequency of occurrence in
L
pecial characters from Table 1 and let
783433 ,...,,xxx
the frequency of occurrence of
special keywords from Table 2,
79
х
frequency of
occurrence of all other signs and numbers 0,1,2,...,9
in the line
L
. From the point of view of defining
XSS attacks, common characters
zba ...,,,
and
numbers 0, 1, , 9 do not play an important role.
So in this paper, we always assume that the
frequency of occurrence of all these symbols and
numbers in the observed string
L
is equal to1, i.e.
1
79 х
. In this way, any string
L
can be defined
with certain characteristics as follows:
),,...,,,...,,( 7978333221 xxxxxxL
, as an element of
some phase space
X
. From the definition
L
it is
seen that any element
L
from the constructed space
X
lies on the hyperplane
1:,...,,,....,,797978333221 xxxxxxxLG
.
Using this hyperplane equation, we can assume
that the greater the frequency of occurrence of
special characters and keywords in the input string,
the more obvious is the proximity of the input string
L
to XSS attacks. Therefore, it is natural to assume
that the attack definition function should be
increasing in the variables
78333221 ,...,,,...,,xxxxx
,
and decreasing in the variable
х79
. Based on these
considerations, the following function is proposed to
define XSS attacks, which is an increasing function
on the variables
78333221 ,...,,,...,,xxxxx
and
decreasing in
х79
:
79
78
1
78
1
7978333221 ),,...,,,...,,()(
xx
x
xxxxxxfLf
ii
ii
As in this paper, we always assume that the
frequency of occurrence of all other characters and
numbers 0,1, 2,..., 9 in the line
L
is equal to
1
, then
from the last equality we obtain
1
),,...,,,...,,()( 78
1
78
1
7978333221
ii
ii
x
x
xxxxxxfLf
(1)
If the input string is
L
is an XSS attack, then
this string must at least contain one special character
from Table 1 or one keyword from Table 2.
Therefore
1
78
1
ii
x
and because the function
)(Lf
is increasing for each of the variables
i
x
its
minimum at
1
78
1
ii
x
is reached at the point
0
L
WSEAS TRANSACTIONS on INFORMATION SCIENCE and APPLICATIONS
DOI: 10.37394/23209.2024.21.46
Komil Fikratovich Kerimov,
Zarina Ildarovna Azizova
E-ISSN: 2224-3402
511
Volume 21, 2024
for which
1
78
1
ii
x
. Thus, if
L
random string and
2/1)( Lf
, then
L
is probably an XSS attack, in
which case it is built using a minimum of either 1 a
special character from Table 1 and 1 keyword from
Table 2 are used to build it, or 1 keyword from
Table 2. If, on the other hand
2/1)( Lf
then the
input string is probably normal. Therefore, function
(1) can be used to recognize XSS attacks and
normal strings constructed with special characters
and keywords.
Further, let us refine the algorithm by using the
importance coefficients of special characters in the
construction of the recognition function. To do this,
let us construct the function:
1
),,...,,,...,,()( 78
1
78
1
797833322111
iii
iii
xk
xk
xxxxxxfLf
(2)
where
,10, ii kk
– sign importance coefficients from
Table 1. The importance coefficients are calculated
by examining 299 pieces of real XSS attacks. Using
the values of importance coefficients and the form
of function (2), it is easy to determine the minimum
of the new function
)(
1Lf
under the condition
003,0
78
1
ii
x
. The minimum of this function is reached
at the point
0
L
, for the coordinates of which the
equality
003,0
78
1
ii
x
. Thus, from the new function
)(
1Lf
we have that if
L
arbitrary string and
003,0)(
1Lf
then
L
is probably an XSS attack, and in this case at
least 1 special character from Table 1 and 1
keyword from Table 2 are used to build it, or 1
keyword from Table 2. If, however
003,0)( Lf
then
the input string is normal. Thus, function (2) can be
used to detect XSS attacks and regular strings
formed with special characters and keywords.
The advantage of our proposed mathematical
algorithm is its adaptability. It can be subjected to
training and tuning based on both previously known
data and real-time data, which leads to its
continuous improvement and enhancement of XSS
attack detection capabilities. To improve the
accuracy of attack detection and reduce the number
of false positives and false negatives, a complete
dataset containing different scenarios and focus of
XSS attacks is required.
This algorithm can be used in conjunction with
other web application security methods, including
input validation, content content security policy, etc.
The proposed method does not require high
computing power as the mathematical calculations
are not difficult. Based on the application of static
real-time processing of input data, the importance
coefficients of the available special characters and
keywords used in the construction of XSS attacks
were obtained. The results are presented in Tables 1
and 2, respectively. The obtained results, taking into
account the importance coefficients of special
characters, were necessary to demonstrate the
effectiveness of the new XSS attack recognition
algorithm. Based on the available characteristics of
XSS attacks using special characters and special
keywords, an information object in the form of
queries was modeled in the process. Thus this
method can be used as an online tool to detect XSS
attacks.
3 Conclusion
The mathematical algorithm for detecting XSS
attacks on web applications, taking into account the
frequency of occurrence and importance coefficient
of characters involved in the construction of
incoming requests presented in this research
leverages various mathematical techniques and data
analysis methodologies to identify potential XSS
vulnerabilities. In this case, from Table 1 and Table
2 we can see that the importance coefficients for
special characters (Table 1) have a significant
difference, and the importance coefficients for
keywords are very close to zero. Therefore, if we do
not take into account some parameters with zero
coefficients when constructing functions (1), and
(2), the recognition of incoming queries is possible
with some small error.
In conclusion, a mathematical algorithm for
detecting XSS attacks on web applications is a
promising approach to improving the security of
web applications. Using mathematical principles
and data analysis techniques, it provides an
adaptable and systematic method for identifying
potential XSS vulnerabilities. However, it must be
used as part of a comprehensive security framework
to provide a robust defense against XSS attacks.
Our further research will focus on addressing
the challenges of collecting real-world attack
samples and conventional data and the development
of a software tool based on the proposed algorithm
and its evaluation. Further research and evaluation is
needed to confirm the validity of the chosen
method.
WSEAS TRANSACTIONS on INFORMATION SCIENCE and APPLICATIONS
DOI: 10.37394/23209.2024.21.46
Komil Fikratovich Kerimov,
Zarina Ildarovna Azizova
E-ISSN: 2224-3402
512
Volume 21, 2024
References:
[1] Banerjee Raima, Baksi Aritra, Singh Nidhi,
Bishnu Sohan Kanti, Detection of XSS in web
applications using Machine Learning
Classifiers, International Conference on
Electronics, Materials Engineering & Nano-
Technology (IEMENTech) 2020 4th, Kolkata,
2020, pp. 1-5, doi:
10.1109/IEMENTech51367.2020.9270052.
[2] Monali Shetty, Chirantar Nalawade, Hybrid
approach for Detection and Analysis of SQL
and XSS vulnerabilities, International Journal
of Engineering Trends and Technology,
Vol.59, 2018, pp. 37-41, doi:
10.14445/22315381/IJETT-V59P206.
[3] Wassermann Gary, Su Zhendong, Static
detection of cross-site scripting
vulnerabilities, 2008 ACM/IEEE 30th
International Conference on Software
Engineering, Leipzig, Germany, 2008, pp.
171-180, doi: 10.1145/1368088.1368112.
[4] Fawaz Mokbal, Dan Wang, Xiaoxi Wang,
(2022). Detect Cross-Site Scripting Attacks
Using Average Word Embedding and Support
Vector Machine, International Journal of
Network Security, Vol.24, No.20-28, doi:
10.6633/IJNS.202201_24(1).03.
[5] Sanjukta Mohanty, Arup Abhinna Acharya,
Detection of XSS Vulnerabilities of Web
Application Using Security Testing
Approaches, Intelligent and Cloud
Computing, 2021, pp.267-275, doi:
10.1007/978-981-15-6202-0_27.
[6] Jasleen Kaur, Urvashi Garg, Gourav Bathla,
Detection of cross-site scripting (XSS) attacks
using machine learning techniques: a review.
Artificial Intelligence Review, Vol. 56, 2023,
pp. 1-45, doi: 10.1007/s10462-023-10433-3.
[7] Garn Bernhard, Sebastian Lang Daniel,
Leithner Manuel, Richard Kuhn D., Kacker
Raghu, Simos Dimitris, Combinatorially
XSSing Web Application Firewalls, 2021
IEEE International Conference on Software
Testing, Verification and Validation
Workshops (ICSTW), Porto de Galinhas,
Brazil, 2021, pp. 85-94, doi:
10.1109/ICSTW52544.2021.00026.
[8] Domarev V.V., Security of information
technologies. Methodology of creation of
protection systems, DiaSoft, 2002. - 688 p.,
ISBN 966-7992-02-0.
[9] Sonoda Michio, Matsuda Takeshi, Koizumi
Daiki, Hirasawa Shigeichi, On Automatic
Detection of SQL Injection Attacks by the
Feature Extraction of the Single Character,
Proceeding in the 4th International
Conference on Security of Information and
Networks, SIN 2011, Sydney, NSW,
Australia, , 2011, pp.81-86,
https://doi.org/10.1145/2070425.2070440.
[10] Zhang Jingyu, Hu Hongchao, Huo Shumin an
Li Huanruo, A XSS Attack Detection Method
Based on Subsequence Matching Algorithm,
Proceeding in 2021 IEEE International
Conference on Artificial Intelligence and
Industrial Design (AIID), Guangzhou, 2021,
pp. 83-86, doi:
https://doi.org/10.1109/AIID51893.2021.9456
515.
[11] Blerim Rexha, Arbnor Halili, Korab Rrmoku
and Dren Imeraj, Impact of secure
programming on web application
vulnerabilities, Proceeding in 2015 IEEE
International Conference on Computer
Graphics, Vision and Information Security
(CGVIS), Bhubaneswar, India, 2015, pp. 61-
66, doi:
https://doi.org/10.1109/CGVIS.2015.7449894
.
[12] D. Azshwanth, G. Sujatha, A novel automated
method to detect XSS vulnerability in
webpages, Proceeding in 2022 International
Conference on Computer Communication and
Informatics (ICCCI), Coimbatore, India,
2022, pp. 1-4, doi:
https://doi.org/10.1109/ICCCI54379.2022.974
0937.
Contribution of Individual Authors to the
Creation of a Scientific Article (Ghostwriting
Policy)
The authors equally contributed in the present
research, at all stages from the formulation of the
problem to the final findings and solution
Sources of Funding for Research Presented in a
Scientific Article or Scientific Article Itself
No funding was received for conducting this study.
Conflict of Interest
The authors have no conflicts of interest to declare.
Creative Commons Attribution License 4.0
(Attribution 4.0 International, CC BY 4.0)
This article is published under the terms of the
Creative Commons Attribution License 4.0
https://creativecommons.org/licenses/by/4.0/deed.en
_US
WSEAS TRANSACTIONS on INFORMATION SCIENCE and APPLICATIONS
DOI: 10.37394/23209.2024.21.46
Komil Fikratovich Kerimov,
Zarina Ildarovna Azizova
E-ISSN: 2224-3402
513
Volume 21, 2024