Efficient Top-K Continuous Query Processing Over Sliding Window
Model (SWM) Method on Uncertain Data Stream
RAJA AZHAN SYAH RAJA WAHAB, SITI NURULAIN MOHD RUM, HAMIDAH IBRAHIM,
ISKANDAR ISHAK
Faculty of Computer Sciences & Information Technology,
Universiti Putra Malaysia,
43400 UPM Serdang, Selangor Darul Ehsan,
MALAYSIA
Abstract: - Query processing using the Uncertain Data Stream (UDS) can be complex in many technological
scenarios due to inconsistencies, unclear information, and interpretation latency. As a result of both the sheer
amount of data generated and the rate of change, traditional processing methods are in dire need of an upgrade.
UDS consists of a finite set of states known as possible worlds (PW), and enhancing data organization can lead
to more accurate extraction of user preferences. The number of possible world instances in UDS grows
exponentially, making achieving Top-k query processing quickly a significant challenge. Different methods are
available to handle Top-k queries in various types of UDS, and their key concerns include reducing duplicate
scans of the entire dataset, enhancing uncertainty computation, and focusing on processing the latest tuple item
entry. It appears that there have been limited studies conducted on the issue of UDS using the Sliding Window
Model (SWM). The current approach for handling continuous queries on UDS within the SWM has proven to be
ineffective, resulting in complex trade-offs between maximizing probability and generating high-scoring result
sets. The challenge is to find the correct result list that satisfies a Top-k query predicate with scoring and
probability. This study proposes a framework for processing Top-k queries for UDS using the sliding window
model to improve efficiency. The study also discusses an improved optimization method for reducing
computational redundancy in the context of the sliding window model and Top-k query processing. Overall,
this research will significantly contribute to the Top-k computational query processing field.
Key-Words: - Top-k, Uncertain Data Stream (UDS), Sliding Window Model (SWM), Tuple Items, Possible
Worlds (PW), Query Processing, Bucket Set, Computation, Segmentation, Optimization.
1 Introduction
Conventional query processing methods utilizing in-
memory algorithms need help to handle the extensive
data stream volumes. Therefore, creating efficient
techniques for processing Top-k queries and
integrating a reliable Data Stream Management
System (DSMS) to facilitate query processing from
various data sources is crucial. The transition from a
centralized to a distributed data environment is
critical in ranked retrieval, [1]. A user-defined
scoring function and a specific query generate k-
tuple items with the highest scores, [2]. This method
is highly important in a range of emerging
applications. These include object tracking, RFID
technology, sensor networks, information extraction,
and data integration, [3]. Probability distributions are
frequently used to describe situations with
uncertainty in data values rather than predictability,
[4].
Processing UDS can be quite challenging due to
a few factors, including the real-time generation of
tuple items, the lack of control over their arrival
order, the unlimited scale of data streams, and the
discarding of processed data stream objects, [5].
Overestimating the required window size could
cause unexpected and undesired tuple item returns,
[6]. UDS is considered uncertain when it contains an
uncertain object model, a possible world semantic
model, or both. Several studies have been conducted
to create models that describe UDS in semantics.
These models can represent relationships in the form
of sequences of events [7], including semi-structured
data models [8], [9], stream data models [10],
relational data models [11], [12], and
multidimensional data models [12], among others.
The main focus of UDS research is possible
instances in the world, which are represented by
possible worlds, [13], [14]. The Sliding Window
Model (SWM) represents all possible world settings
Received: April 13, 2024. Revised: August 16, 2024. Accepted: September 9, 2024. Published: October 29, 2024.
WSEAS TRANSACTIONS on SYSTEMS and CONTROL
DOI: 10.37394/23203.2024.19.31
Raja Azhan Syah Raja Wahab, Siti Nurulain Mohd Rum,
Hamidah Ibrahim, Iskandar Ishak
E-ISSN: 2224-2856
283
Volume 19, 2024
by generating tuples in combinations. Within this
model, multiple potential worlds can be generated at
a specific timestamp, which leads to a significant
increase in the number of tuples as the size of the
sliding window expands, [15].
Processing continuous Top-k queries over a
SWM is a complex problem when dealing with
uncertain data streams, [16], [17], [18]. To maintain
an up-to-date SWM, it is crucial to continuously
process sliding window queries and alert users of any
changes in query results, [19]. Therefore, this study
focuses on exploring the best combination of tuple
items that satisfy the scoring requirements,
probabilities, or both for the Top-k query.
After this introduction, the paper is organized as
follows. Section 2 reviews related work that serves
as a basis for identifying gaps in the research.
Section 3 highlights the study's contribution,
followed by Section 4 presents a preliminary
problem statement for better understanding. The
methodology is detailed in Section 5, including
algorithms and examples. Section 6 presents the
results of the experiment. Finally, in Section 7, we
will provide concluding remarks for this study.
2 Related Work
This study delves into the essential principles of
continuous query preferences in a centralized
environment. It focuses on various types of
continuous queries such as top-k, skyline, and top-k
dominating that DSMS executes with stream inputs
limited by sliding windows. In the literature, many
techniques have been proposed based on top-k
queries, [20], [21], [22], [23], [24], [25], [26], [27],
[28] that researchers have explored in various areas.
These areas include both centralized environments
with SWM [29], [30], [31], [31], [32], [33], [34],
Dominant Relationship Analysis (DRA) [35],
without SWM [36], [37], [38], UFIM with
UFIMTopK [39], and Top-k query over an
Incomplete Data Stream (Topk-iDS) [40], [41], [42],
[43] approach. The complexity arises from the need
to aggregate scores of candidate items and their
probabilities in UDS, [43].
2.1 Top-k Queries
To define various top-k ranking query semantics,
different parameters can be adapted, such as UTopk
[44], PT-k [44], PTk-S [44], eScore Rank [44],
Global Topk [44], UTR [44], PTD [45], among
others. UTopk focuses on probabilistic threshold
top-k queries. It utilizes user-provided probability
thresholds to filter and rank data items. PT-k
analyzes large datasets with significant uncertainty,
making it easier to derive top-k results based on
user-defined scores. Significant findings were made
regarding top-k best probability queries. This study
emphasizes the selection of probabilistic tuples with
the highest top-k scores and probabilities. The
algorithm for selecting the top-k best probabilities
demonstrates superior speed and efficiency
compared to the Probability Threshold Technique
(PT-k). In study proposed by DRA, it minimizes the
number of tuple elements needed for query
processing. By adopting this approach, they were
able to decrease the number of tuple items that
require processing and limit the generation of
potential world instances.
2.2 Sliding Window Model (SWM)
In response to continuous and uncertain user
requests, this study delves into the topic of UDS and
proposes methods for computing top-k on tuple
items. A variety of tuple item combinations are
generated by the SWM, which characterizes the
probable world context, [46]. Tuple items can be
generated in the SWM using a timestamp and
several possible worlds. However, the number of
tuple elements grows exponentially with the size of
the sliding window frame. It takes a lot of time and
energy to handle incoming and outgoing tuple
elements in rapid streams effectively. Handling
queries consisting of the most probable top-k tuple
query sets is the most difficult attempt to handle,
[46]. The method uses probabilities and scores to
select the top k tuples within each sliding window,
regardless of the number of tuple items.
2.3 Tuple-Level Uncertainty (TLU)
In 2018, a novel method was introduced for
addressing preference queries. They developed two
algorithms, UFIM and UFIMTopK, to efficiently
detect frequent item sets from uncertain data streams
based on thresholds and ranks. To facilitate efficient
top-k queries, the SAP method employs a
partitioning mechanism. Finding and keeping a
selected set of tuple items in the window frame is
essential for getting answers when the window
changes. To decrease the re-scanning interval even
further, one could reduce the re-scanning frequency
in the sliding window, mainly when high-scoring
tuple items are located within the window frame.
Topk-iDS algorithm that can determine the top-k
tuple items with the highest-ranking scores from an
incomplete data stream. Their algorithm uses a
sliding window framework that combines count-
based and time-based SWM to monitor the highest-k
tuple elements. To address issues like insufficient
information and uncertainty, their work suggests
WSEAS TRANSACTIONS on SYSTEMS and CONTROL
DOI: 10.37394/23203.2024.19.31
Raja Azhan Syah Raja Wahab, Siti Nurulain Mohd Rum,
Hamidah Ibrahim, Iskandar Ishak
E-ISSN: 2224-2856
284
Volume 19, 2024
using cost modeling that can involve creating
dynamic data summaries and pruning techniques to
effectively reduce the search space for Topk-iDS.
2.4 Comparative Analysis
Table 1 provides a comparative analysis of various
methodologies in continuous Top-k query
processing over uncertain data streams. It highlights
the key features, advantages, and limitations of each
approach.
Table 1. An Analysis and Comparison of
Algorithms
Method
Exac
t-k
Contai
nment
Stabilit
y
Invari
ance
Faithf
ulness
Global
Topk
Wea
k
Fail
Satisfie
d
Satisfie
d
Fail
U-Topk
Fail
Fail
Satisfie
d
Satisfie
d
Fail
K-best
ranking
with PT-k
Wea
k
Weak
Satisfie
d
Satisfie
d
Weak
UFIM
and
UFIMTo
pK
Wea
k
Satisfie
d
Weak
Satisfie
d
Weak
DRA with
Pk-Topk
Wea
k
Satisfie
d
Satisfie
d
Weak
Fail
Topk-iDS
Wea
k
Satisfie
d
Satisfie
d
Satisfie
d
Weak
Proposed
SWMTop
-kDelta
Satis
fied
Satisfie
d
Satisfie
d
Satisfie
d
Satisfie
d
Most academics have concentrated on top-k
query processing for specific data types and their
modifications, but when UDS is involved, processing
with the sliding window model approach is not
enabled. This study utilized the following categories
to structure the prior research: (i) Operates on the
concept that probability and top-k score methods are
frequently employed to characterize UDS tuple
items. (ii) Processing streaming data poses numerous
challenges, and this study seeks to address issues
arising from UDS by creating alternative scenarios.
Representing UDS using a possible world model can
be a challenging task.
3 Contributions
This study employs a SWM to investigate and
compute the top-k query on uncertain data streams.
To achieve the intended contribution, it is vital to
thoroughly analyze various considerations before
implementing the proposed framework:
The UDS model requires further examination and
analysis of its categories.
The proposed SWM needs to be evaluated for its
efficacy in handling continuous queries.
To obtain the appropriate Top-k results based on
scoring and probability, query processing is
utilized.
There's a need for optimization to decrease the
computation time and complexity of Top-k
results efficiently.
To improve the computational efficiency of
managing tuple item scores and probabilities,
examining the critical components of a suitable top-k
query method for uncertain data streams is essential.
Our proposed approach provides a systematic
process for achieving high scores and maximum
probabilities across all possible world situations.
This study introduces a new problem in processing
continuous queries that aim to find the top-k tuple
items with various fundamental characteristics. It is
crucial to optimize strategies for top-k over SWM
processing, considering the contributions highlighted
before. An empirical study was conducted on both
real-world and synthetic datasets to validate the
effectiveness of these techniques.
Therefore, this paper is crucial in top-k query
processing as it provides comprehensive knowledge,
ensuring efficient and accurate access to information
and leading to overall user satisfaction. With the
exponential growth of data, effective query
processing becomes even more critical. By
improving the retrieval process, managing complex
queries, optimizing resources, enhancing user
experience, supporting advanced features, and
ensuring data integration, query processing plays a
significant role in the success and effectiveness of
information retrieval systems. As data volumes
grow, robust top-k query processing mechanisms
will only increase, making it a critical area of focus
for researchers and practitioners.
4 Preliminary and Problem Statement
4.1 Preliminary
[47], analyzed three models for managing uncertain
data: fuzzy, evidence-oriented, and probabilistic
methods. Our study utilizes the completed model to
depict the specific example and the Uncertain Data
Model (UDM) derived from Definition 1. This
model will be called the Sliding Uncertain Data
Model (SUDM). The model is characterized by its
inclusion of Uncertainty (ALU) and Tuple Level
Uncertainty (TLU).
Definition 1 (Uncertain Data Stream, UDS): A
subscript denotes the point in time at which the tuple
item arrives. A tuple item si is a point with actual
values in d dimensions that are not overly specific.
A suitable vector S[q] = {sq-L+1, ... sq-1, sq} for a
WSEAS TRANSACTIONS on SYSTEMS and CONTROL
DOI: 10.37394/23203.2024.19.31
Raja Azhan Syah Raja Wahab, Siti Nurulain Mohd Rum,
Hamidah Ibrahim, Iskandar Ishak
E-ISSN: 2224-2856
285
Volume 19, 2024
subsequence of stream S defined as S = {s1, s2, ...,
si...} and the Top-k query seeks for a record vector
of size L. The starting point for indexing a tuple on
the uncertain data stream and subsequent application
of the sliding window model is denoted by N, and it
is given a series of tuples si, where 0 < i < n.
Table 2 presents a sample of UDS that contains
10 tuple items. Each item has 3 attributes, which
include a probability value. In this work, we adopt
the baseline sliding window semantics, it is defined
as follows:
Table 2. Example of an Uncertain Data Stream Set
Timestamp
(secs)
tid
Score
Probability
5
t1
80
0.3
10
t2
65
0.4
10
t3
45
0.5
15
t4
30
1
20
t5
50
0.8
20
t6
25
0.2
25
t7
40
0.5
26
t8
55
0.6
26
t9
78
0.4
30
t10
90
0.8
Definition 2 (Sliding Window Model Semantics):
For a given S = {s1, s2, ..., si...}, a windowed stream
operator operates on incoming tuples using the
sliding window, SW(S, Win), where Win represents
the window size of |SW|. Due to the existential
uncertainty of specific tuples in S, the borders of the
SWM may also appear uncertain. Subsequently, the
most recent objects generated will become invalid
after |SW| time instances. The set of UDS tuple items
within the current sliding window at time t is
represented as SW [t-|SW|+1, t]. This is indicated by
Prob(|SW(S, win)|= win) < 1. This approach follows
delta's proposed sliding window semantics
(attribute, delta) and the uncertainty-independent
model for possible world semantics.
It is necessary to group tuple items and adjust
the synopsis based on density probability with a
running timestamp. Candidates are retrieved using a
selection technique outlined below:
Definition 3 (Group Membership, GM): The tuple
items within a window are arranged into groups
based on their attribute values in the grouping list L
= {Ai, Aj, ..., An}, similar to the function of the
corresponding operation in extended relational
algebra (such as COUNT, SUM, MAX, or AVG).
Once the group membership strategy is
implemented and all tuple items are correctly
projected, the partitioning process on SWM can be
defined as follows:
Definition 4 (Sliding Window Partition, SWPa):
Top-k Segmental Set Queue and Buffer (TSQB)
notation is now activated in the partition window of
the slicing panes, where SWPa = {SWPa=v| v ϵ {t.a | t
is a tuple in SWPa}} is located. The proposed
partition-by-delta attribute [SW(S) t time - SW(S, win) t
time2 >0; where delta >= query time allocated] is used
to specify the timestamps that trigger the sliding
process. The equation is defined as SWPa=v = {tia=v | I
[0…|SWPa=v|)}. The most recent tuple item in the
SWPa=v subwindow is represented by t0a=v, while the
oldest is by tla=v. Table 3 demonstrates the
progression of the sliding window from t = 1 to t =
10.
Table 3. The results of the Bucket Instance, BTk
(with sliding window partition)
Timestamp
(secs)
Initial
Bucket, Bk
SW
Bucket Top-k, BTk
5
B2 = {R1}
1
BT1 = {R1}
10
B2 = {R2,
R3}
1
BT1 = {R2, R3}
15
B2 = {R4}
1 &
2
BT1 = {R4} BT2 = {R4,
R5}
20
B1 = {R5}; B2
= {R6};
1 &
2
BT1 = {R5, R6} BT2=
{R5, R6} *{R1, R2, R3}
expired *{R6} pruned
25
B2 = {R7};
2
BT2 = {R7}
26
B1 = {R8}; B2
= {R9};
2
BT2 = {R8, R9}
30
B1 = {R10};
2
BT2 = {R10}
To identify the top-k potential candidates, the
system needs to determine the top-k ranked scores
and the aggregated probability distribution among
various UDS within the segmentation window
frame. In this work, we adopt the Delta-based
sliding window and with Definitions 2 and 4,
SWPTop-kDelta can be defined as:
Definition 5 (Sliding Window Model Top-k Delta,
SWMTop-kDelta): Given a UDS, a ranking function
can identify the top-k tuple vectors based on a
designated threshold. This threshold generates the
score Probmax(hsi) and probability Probtoplatest(n)
across the SWM Delta-based mechanism. The
scheme continuously monitors the potential
candidates P(s[q]i) Wt, which have the highest
rankings with probability ProbSWMTop-kDelta ,
exceeding a combined threshold of Scorethresh and
Probthresh, in the following manner:
ProbSWMTop-kDelta( )=
WSEAS TRANSACTIONS on SYSTEMS and CONTROL
DOI: 10.37394/23203.2024.19.31
Raja Azhan Syah Raja Wahab, Siti Nurulain Mohd Rum,
Hamidah Ibrahim, Iskandar Ishak
E-ISSN: 2224-2856
286
Volume 19, 2024
> Scorethresh and
Probthresh
The SWPTop-kDelta method is an extension of
the segmentation window approach used by the
TSBQ algorithm (as defined in Definition 4). The
algorithm lists the top-k potential candidates during
the initial phase. For instance, SWPa(si) can be
created by P(s[R1-3]), P(s[R4-6]), and P(s[R7-10]).
The SWPTop-kDelta method uses the top-k vectors
and a bucket set that includes specific possible
world rules to compute the final top-k result list.
Hence, to obtain results affected by the probability
distribution p(w), a deterministic SWMTop-kDelta
query is run on all possible worlds. The
segmentation window SWPa(si) achieves the
necessary hs before recognizing each tuple item in
the sequence. To compute the probability of event
PW (W1) = {R1, R2, R4, R5} occurring, we need to
multiply the individual probabilities of each event.
To determine the highest probability of a tuple item,
we need to apply the extended generation rules of
possible world semantics to each potential candidate
within SW. The SWMTop-kDelta algorithm
computes the probabilities of the top-2 and top-3
tuple items. The process proceeds as follows:
Definition 6 (Possible World, p(w)): The possible
world rules in UDS can be restated as a group of
tuples represented by W, defined as {w1, w2, ...., wn}.
PW [0,1] represents a probability distribution
where , p(wi) > 0. The probability
of existence of W is calculated as p(wi) =
= The expression
represents the set of all possible worlds
where the sliding window SW(S, Win) is applicable,
assuming that all tuples are independent of one
another: Prob(W1) = p1*p2*p4*p5 = 0.096
Prob(W2) = p1*p2*p4*p6 = 0.024
[ = ;
]
In the context of UDS that contains semantic
possible worlds, there are 2n possible worlds in the
SW(S), where S is a set of n tuple items. For a
positive integer k and a possible world p(w), k is a
set of k tuples with the highest scores in w, called
the top-k tuple of w, denoted as Hk (w). T* is the
solution to an ongoing top-k query processing on
UDS is denoted as:
T* = arg
UDS has two sorting indices that should be
considered: the query result's score and probability.
The phenomenon is referred to as the transitivity of
supremacy and is defined as follows:
Definition 7 (Threshold Score Ranking): When
dealing with a top-k query ,, where where P, f,
and k are all greater than 0, a complete top-k vector
contains the highest aggregated probability. This
vector's total score is equivalent to Scorethresh, the
designated threshold score. Instead of the standard
ranking order, the top-k vector is utilized as an
alternative, serving as the select function Score. The
function Prob. Score(t) represents a score ranking
function.
Definition 8 (Threshold Probability Ranking): The
threshold probability, Probthresh, is the sum of
probabilities for the top-k vectors with the highest
total scores. Let S denote the possible world space
Ω, where k is a positive integer. Prob(t) is a
probability function, and Top-k(W) is a collection of
k tuple items generated from the possible world W
based on the scoring function Score(t). It can
proceed by comparing tuple item ti to tj, when
Score(ti) > Score(tj) and Pro(ti) > Pro(tj), denoted
by ti tj.
Concerning the threshold assumptions and
definitions mentioned above, the k-vectors with the
most significant values can be defined in a particular
way:
Definition 9 (Discovering the k-vectors with the
highest values): When dealing with an uncertain
top-k query, one can consider a top-k vector rule
(vi). First, a set of k-tuple items is selected based on
their ranking according to the notations of q(hsi),
Probmax(hsi), and Probtoplatest(n). The maximum
values of these items are described as follows:
1. For each iteration i, the criteria for hsi belonging
to X are considered. The expression q(hsi)
represents the chance that a v-tuple of q(hsi) is
absent. This can be summarized as [q(hsi) = 1 -
p(tj)], where p(tj) is the probability of
the original tuple item tj occurring in the possible
world.
2. The original tuple item in set hsi with the highest
probability is represented as Probmax(hsi). The
aggregated top-k vector with the highest
WSEAS TRANSACTIONS on SYSTEMS and CONTROL
DOI: 10.37394/23203.2024.19.31
Raja Azhan Syah Raja Wahab, Siti Nurulain Mohd Rum,
Hamidah Ibrahim, Iskandar Ishak
E-ISSN: 2224-2856
287
Volume 19, 2024
probability (which becomes a full top-k vector)
is denoted as Probtoplatest(n).
Definition 10 (Optimization using Clearing, OC):
The task entails effectively executing the OC
probclear(l) operation on a v-tuple l, with an OC-
probability denoted as Prob(l) (0 ≤ Prob(l) ≤ 1). The
purpose is to ensure that OC probclear(l) can
achieve a successful performance with a probability
of Prob(l). If probclear(l) is successful, l is
substituted with a v-tuple that consists of a single
tuple τl. If the attempt to clear the probability (τl) is
unsuccessful, τl will stay unchanged.
4.2 Problem Statement
Processing Top-k queries over uncertain data
streams has emerged as a promising approach to
developing an intuitive information system. Data's
underlying and unavoidable uncertainty is typically
attributed to unexpected and unreliable signal
readings, sensor update delays, or insufficient
expertise. Uncertainty data mainly refers to factors
such as indeterminacy, unreliability,
unpredictability, randomness, inconsistency,
variability, incompleteness, unknown bounding, and
irregularity. Our current research focuses on the
challenges and uncertainties surrounding top-k
query processing in Real-Time Traffic Management
applications. These issues are particularly prominent
and require careful examination.
Following the above research problem, we have
identified three challenges that need to be addressed
in this thesis, focusing on analyzing and computing
Top-k on uncertain data streams. Limitation 1: An
efficient implementation of a sliding window
approach is needed to handle Top-k queries on
uncertain data streams. If the sliding window model
is not employed correctly, candidate tuple items
within the window frame may cause significant
overlapping computing costs. Therefore, it is
essential to ensure that the sliding window approach
is used effectively. Limitation 2: An efficient
algorithm is needed to improve the retrieval of the
top-k query results from uncertain data streams,
particularly when computing the top-k score and
probabilities expected to have significant
computational expenses for generating the set of
possible worlds. Limitation 3: The number of
possible world instances grows exponentially,
affecting top-k continuous query processing time to
be high.
5 Proposed Top-k Query Processing
Framework
To achieve the primary goal, we offer a framework
called the SWMTop-kDelta. This framework has
three stages: In the first stage, we will use the
necessary SWM representation to convert a set of
tuple elements into a window fragment. Phase II is
essential in determining the optimal processing time
utilizing the suggested algorithm. This aspect
focuses on the complexity of top-k computation,
which is directly influenced by factors such as
continuous queries, score value, probability,
timestamp interval (delta-attribute), and possible
world rules. Phase III builds upon the information
obtained in Phases I and II. It implements
optimization techniques to prevent unnecessary top-
k computations when determining and computing
possible world vector rules.
5.1 Phase I
It addresses the challenge by introducing a
condition, expressed as [ SW(S) t time - SW(S,win) t time2
>0; where delta >= query time ]. The framework
design of Phase I is depicted in Figure 1. The study
presents the development of a method for
enhancement by utilizing the delta attribute with a
constant timestamp as a slide value.
Fig. 1: SWMTop-kDelta Flow for Utilizing SWM
Over the Initial UDS – Phase I
As illustrated in Table 4 and Figure 2, the
computations on registers R6, R5, R4, R3, R2, and
R1 are entered into the sliding window frame based
on the Delta-based sliding condition mentioned
above, with SW=1. The computation for the
expression SW=2 will be performed on registers
R10, R9, R8, R7, R5, and R4. However, registers
R3, R2, and R1 will be excluded as a later tuple item
will push them. The candidate R6 was eliminated
due to its lowest score among all candidates, making
it ineligible for the top spot.
WSEAS TRANSACTIONS on SYSTEMS and CONTROL
DOI: 10.37394/23203.2024.19.31
Raja Azhan Syah Raja Wahab, Siti Nurulain Mohd Rum,
Hamidah Ibrahim, Iskandar Ishak
E-ISSN: 2224-2856
288
Volume 19, 2024
Table 4. Example of tuple items from UDS
Timest
amp
(secs)
5
10
10
15
20
20
25
26
26
30
Tuple
items
R1
R2
R3
R4
R5
R5
R7
R8
R9
R1
0
score
80
65
45
30
50
50
40
55
78
90
prob
0.3
0.4
0.5
1
0.8
0.8
0.5
0.6
0.4
0.8
SW
1
1
1
1|2
1|2
1|2
2
2
2
2
index
1
2
3
4
5
5
7
8
9
10
Fig. 2: SWM (Delta-based - tuple insertion & exit)
The algorithm described in Figure 3 is based on
or-set probability distribution, scoring, and tuple-
level confidence values. In the process, each bucket
Bk will store the following information and a
timestamp: (i) The bucket number with a probability
level greater than 5.0 will be retained on the bucket
list; (ii) The bucket number with the highest score
but a probability level less than 5.0 will also be kept
on the bucket list; (iii) If a bucket contains items
expiring quickly, it will be removed if the sliding
window period is exceeded. This is because it falls
below the confidence level boundaries and has a
low-rank score.
The GML section serves two primary purposes.
Firstly, it presents the grouping of tuple items with
indexes, Ij (1 ≤ j s), over UDS. Secondly, the list of
attributes of the GML synopsis should include all
tuple items S[q] from UDS being considered for the
Top-k candidate list before the period expires. In the
selection process, Bk is used to efficiently retrieve all
tuple items (x) from the repository Bucket Top-k,
BTk SW and assign their corresponding values to
each segmentation group. This feature enables the
efficient computation of UDS attributes, where ti ϵ
SW [t-|SW|+1, t]. To maintain data tracking for each
k, an index structure called Ij will be created over SW
for characteristics. This will enable us to scan the
group bucket Bk only once. The solution presented
here eliminates the expense of refuse collection
caused by empty buckets, as BTk is always
guaranteed to possess a value. Figure 3 depicts the
procedure described. This scanning process results in
the creation of two sets, namely I BT1 {R1 dj , R2
dj, …., R6 dj} and I BT2 {R4 dj , R7 dj,
…., R10 dj}. It is important to note that the access
order is insignificant in this context.
Input: Bucket of Instance, B = {B1, B2, ..., Bi…}
Output: Bucket Top-k, BTk = {BT1, BT2, ..., BTi…}
1. Begin
2. t = 0; Initial parameters
3. While UDS is active do
4. t++;
5. Read a new data tuple (x, Bi), map to
index, d;
6. If (d is mapping to membership density
group in Bi) then
Create SW of d, and insert into BTk;
7. Else Update
BTk according to check
expired/removed x
SW from membership group in Bi
8. End While
9. End
Fig. 3: The Algorithm to Output Bucket of Top-k,
BTk
To reduce the processing complexity of
developing rules for all possible worlds to almost a
minimum, it is necessary to further segment the
sliding window between tuple items from each
bucket. The Top-k Segmental Set Queue and Buffer
(TSQB) is constructed over BTk tuple items. Since all
bucket sets are used by the SWMTop-kDelta method,
only the oldest bucket set, BTk, is required to retrieve
the top-k query results. When many sliding windows
overlap, it increases computational complexity and
takes longer to process data.
Theorem 1: When the TSQB algorithm and the
preceding algorithms are executed, each tuple item
P(s[q]) UDS will be identified as one of the Top-k
potential candidates. This will occur every time the
sliding window is moved in each composition, which
is determined by the discretization function.
Proof 1: To minimize the processing complexity
associated with creating rules for all possible worlds,
it is essential to further divide the sliding window
between tuple items within each bucket. Our
proposed SWMTop-kDelta approach utilizes all
bucket sets, but only the oldest bucket set, BTk, is
necessary for retrieving the top-k query results. The
Top-k Segmental Set Queue and Buffer (TSQB)
method is built on BTk tuple components, as seen in
Figure 4 and Figure 5.
WSEAS TRANSACTIONS on SYSTEMS and CONTROL
DOI: 10.37394/23203.2024.19.31
Raja Azhan Syah Raja Wahab, Siti Nurulain Mohd Rum,
Hamidah Ibrahim, Iskandar Ishak
E-ISSN: 2224-2856
289
Volume 19, 2024
1. Indexing
2. Tuple Items
3. Panes Breakdown, p1
4. Segmentation, p2 (p1 = Slidingp2)
5. Periodic Top-k
(potential candidates)
6. Segmental Buffer
7. Discretization
Function, fdisc
Fig. 4: An Overview of TSQB
Input: Bucket Top-k, BTk = {BT1, BT2, ..., BTi…}
Output: Top-k Segmental Sets Queue, Top-k Ψ and Buffer, SB(Sl1) =
{SB(Sl1), …, SB(Sln)…}
1. Begin Tuple set si = ø; segmental set queue, Ψ = ; let B be a buffer
with size kH;
2. t = 0; Initial parameters from si
3. For (each arriving tuple t from BTk )
4. Group-by tuple t with tid; //partition with summation tid is
generated;
5. insert t into si;
6. Begin
7. If (successfully create a segmental sets SWPa(si) for si)
8. append SWPa(si) to Ψ;
9. remove tuples in si older than t′′ (including t′′), where
t′′ is the oldest tuple in SWPa(si);
10. For (each segmental sets SWPa(si) Ψ from new to old)
11. Execute Final SWMTop-kDelta Algorithm;
12. If (t become highest ranked tuple in SWPa(si))
13. Update SWPa(si)
14. insert t into SB;
15. If (SB is full);
16. find the smallest i such that SBi // Admits a
segmental set;
17. starting from i, build SWPa on SB;
18. update the existing SWPa;
19. SB = ;
20. If (SWPa(si) is affected)
21. update SWPa(si);
22. Else
23. break;
24. If (the expiring tuple SWPa(si))
25. remove SWPa(si) from Ψ; //remove Expired segmental
sets queue in
26. SWPa(si) := first segmental sets queue in Ψ;
27. compute the array r on the new SWPa(si)
28. End
29.End
Fig. 5: The TSQB algorithm
The acquisition and utilization of TSQB
properties are necessary for the subsequent stages of
the SWMTop-kDelta algorithm's execution. The Top-
k potential candidates are obtained from the BTk
bucket by performing the following operations:
1. Property 1 (Breakdown): The number of
subwindows in |SWPa| is represented by SWPa
and a sub-window linked to the value of the
partition-by attribute v is expressed as SWPa=v.
2. Property 2 (Hybrid Method): The technique of
segmental slicing is an orthogonal approach that
can be employed with the buffer set to enable an
implicit class of windows, which is a superclass
of periodic windows, while still maintaining the
continuous appearance of the SWM.
3. Property 3 (Buffering): The first component,
contains the current top-k potential candidates
set, which satisfies the condition PrSWMTop-kDelta
(S[q]) > α); Secondly, data buffering synopsis
built on SWPa, where SB(Sln) represents the set
of tuple items that are not currently top-k
potential candidates where 0 < PrSWMTop-kDelta
(S[q]) α holds at delta attribute).
Table 5 clearly shows the total TSBQ results for
segmentation and merging multiple overlapping SW.
An example of this is the SWPa(si) which is
established by P(s[R1-3]), P(s[R4-6]), P(s[R7-10]),
P(s[R11-13]), P(s[R14-16]), and P(s[R17-19]).
During a basic evaluation segmentation, it is
observed that P(s[R4-6]) includes BT1 = {R4, R5,
R6} and BT2 = {R4, R5, R6}. As a result, it is
necessary to invoke this function twice, once for
SW=1 and once for SW=2. Increased window overlap
results in a decrease in the number of operations
required for each window merging process.
Table 5. The results of the SWPa(si)
Potential
Candidates
SW
Bucket Top-k, BTk
Segmental
Buffer, SBk
P(s[R1-3])
1
BT1 = {R1, R2, R3}
SB1 = {R1, R2,
R3, R4, R5, R6}
*{R6} pruned
P(s[R4-6])
1&2
BT1 = {R4, R5, R6}
BT2 = {R4, R5, R6}
P(s[R7-10])
2&3
BT2 = {R7, R8, R9, R10}
SB2 = {R7, R8,
R9, R10}
P(s[R11-13])
3
BT3 = {R11, R12, R13}
SB3 = {R11,
R12, R13, R14,
R15, R16}
*{R13} pruned
P(s[R14-16])
3&4
BT3 = {R14, R15, R16}
BT4 = {R14, R15, R16}
P(s[R17-19])
4
BT4 = {R17, R18, R19}
SB4 = {R17,
R18, R19}
*{R19} pruned
5.2 Phase II
Phase II will first clarify the explanation and actions
in obtaining the top-k potential candidates. This
phase proposed three key elements, and each case
was specifically designed to address a different score
and probability scenario in SWM. The proposed
framework can be implemented by following the
procedures depicted in Figure 6. The utilization of
Tuple-level Uncertainty (TLU) to generate possible
world semantics, reveals that the UDS encompasses
WSEAS TRANSACTIONS on SYSTEMS and CONTROL
DOI: 10.37394/23203.2024.19.31
Raja Azhan Syah Raja Wahab, Siti Nurulain Mohd Rum,
Hamidah Ibrahim, Iskandar Ishak
E-ISSN: 2224-2856
290
Volume 19, 2024
a total of 12 possible worlds (illustrated in Table 6
based from Table 2). The feasibility of providing
answers to top-k queries relies on probability-based
considerations.
Fig. 6: The Proposed SWMTop-kDelta Framework
for Utilizing SWM Over the Initial UDS – Phase II
Table 6. Evaluation of query Q: σscore≥40 (UDS) and
within initial 20 seconds, which returns tuple items
whose score (speed) is >= 40
possible
world p(w)
probability of W
possible
answer
answer probability
{t1, t2, t4, t5}
p1*p2*p4*p5 = 0.096
A1 = {t1,
t2, t5}
P1 = 0.096+0.024
= 0.12
{t1, t2, t4, t6}
p1*p2*p4*p6 = 0.024
{t1, t3, t4, t5}
p1*p3*p4*p5 = 0.12
A2 = {t1,
t3, t5}
P2 = 0.12+0.03
= 0.15
{t1, t3, t4, t6}
p1*p3*p4*p6 = 0.03
{t1, t4, t5}
p1*(1-p2-p3)*p4*p5
= 0.024
A3 = {t1,
t5}
P3 = 0.024+0.006
= 0.03
{t1, t4, t6}
p1*(1-p2-p3)*p4*p6
= 0.006
{t2, t4, t5}
(1-p1)*(p2)*p4*p5 =
0.224
A4 = {t2,
t5}
P4 = 0.224+0.056
=0.28
{t2, t4, t6}
(1-p1)*(p2)*p4*p6 =
0.056
{t3, t4, t5}
(1-p1)*(p3)*p4*p5 =
0.28
A5 = {t3,
t5}
P5 = 0.28+0.07
= 0.35
{t3, t4, t6}
(1-p1)*(p3)*p4*p6 =
0.07
{t4, t5}
(1-p1) *(1-p2-
p3)*p4*p5 = 0.056
A6 = {t5}
P6 = 0.056+0.014
0.07
{t4, t6}
(1-p1) *(1-p2-
p3)*p4*p6 = 0.014
Theorem 2: If the stopping condition Probtoplatest(n) is
met, it is possible to identify the top-k vector with
the highest combined probability of being the top-k
answer for a v-relation of X.
Probtoplatest(n)
{ Probmax(hsi), q(hsi) }
Proof 2: To obtain the top-k result list with the
highest aggregated probability full vector (FV), it is
necessary to realign the top-k potential candidates
for generating the k-ReduceSet for UDS. This is
accomplished by using a data structure known as
ReducePSW where it captures and records the q(hsi)
tuple items for each member of the segmental sets
SWPa(si) in UDS. The k-ReduceSet can be obtained
by scanning the UDS once within a sliding window
segment. The time and space complexity of the
process is O(|UDS|2). Using examples from Table 7
and Table 8, the summary of the k-ReduceSet can be
explained as follows:
i. P(s[R1-3]): 1-ReduceSet = {t1}; 2-ReduceSet =
{t1, t2}; k- ReduceSet = {t1, t2, t3} where k ≥ 3;
ii. P(s[R4-6]): 1-ReduceSet = {t5}; 2-ReduceSet =
{t4, t5}; k- ReduceSet = {t4, t5, t6}; where k 3.
Table 7. The New Vector of Combination Tuple
Items in vi
Potential
Candidates
Bucket Top-k,
BTk
Pair Combination of
(Score(vi), Prob(vi)) Tuples
in vi
P(s[R1-3])
BT1 = {t1, t2, t3}
v1 = (80, 0.3); v2 = (65,
0.28); v3 = (45, 0.07);
P(s[R4-6])
BT2 = {t4, t5, t6}
v4 = (30, 0.014); v5 = (50,
0.336); v6 = (25, 0);
Table 8. The process of generating k-ReduceSet
Position
Initiali
ze
Retrie
ve
{t1}
Retrie
ve
{t2}
Retrie
ve
{t3}
ReducePSW [t1]
Ø
Ø
Ø
Ø
ReducePSW [t2]
Ø
Ø
{t1}
{t1}
ReducePSW [t3]
Ø
Ø
Ø
{t1, t2}
Position
Initiali
ze
Retrie
ve
{t4}
Retrie
ve
{t5}
Retrie
ve
{t6}
ReducePSW [t4]
Ø
{t5}
Ø
{t5}
ReducePSW [t5]
Ø
Ø
Ø
Ø
ReducePSW [t6]
Ø
Ø
Ø
{t4, t5}
Theorem 3: The top-k result list comprises tuple
items with the highest ranked score and aggregated
probability combination of two top-k vectors, vi and
vj. These combinations are represented by (Score(vi),
Prob(vi)) and (Score(vj), Prob(vj)), respectively.
Assuming that vi outperforms vj, denoted as vi > vj,
the algorithm ensures the logical chain of conditions
as described below:
i. Score(vi) > Score(vj) where Prob(vi) == Prob(vj)
ii. Score(vi) == Score(vj) where Prob(vi) > Prob(vj)
iii. Score(vi) > Score(vj) where Prob(vi) > Prob(vj)
WSEAS TRANSACTIONS on SYSTEMS and CONTROL
DOI: 10.37394/23203.2024.19.31
Raja Azhan Syah Raja Wahab, Siti Nurulain Mohd Rum,
Hamidah Ibrahim, Iskandar Ishak
E-ISSN: 2224-2856
291
Volume 19, 2024
Proof 3: This algorithm is represented by
{(Score(v1), Prob(v1), …, (Score(vn), Prob(vn))},
which combines the top-k ranked scores and
aggregated probabilities for a set of n likely top-k
vector rules {vi, ..., vn}. Therefore, to establish the
supremacy of vi as the top-k vector rule p(w), it is
essential to demonstrate that there is no other vector
vj {v1, ..., vn}, where vj vi, that surpasses vi in
supremacy. The proposed SWMTop-kDelta
algorithm approach is depicted in Figure 7. This
approach builds upon the prior description in Phase I
and demonstrates superior performance compared to
the Naive algorithm method.
The operation commences by executing the
proposed window segmentation for every incoming
tuple item t from BTk as the sliding window
progresses with the initial state of v-tuple items T. If
Prob(hs) < 1, the v-tuple is unlikely to occur, and
none of the elements in hs will perform. To compute
possible permutations of world rules for the variables
Score(v'') and Prob(v''), you need to add the list of
array hs to v. This completes the cycle of generating
new vector rules for v''. Moreover, Prob(v) X (1 -
Prob(hs)), where (1 - Prob(hs)), where (1 -
Prob(hs)) represents an ideal set of virtual tuple
items possessed by hs.
Input: v-tuples T (Onsite tuples of Table 8)
Output: Top-k vectors rules of p(w) has the supreme ranked
score
1. Begin Tuple set of Score(v0) = 0; Prob(v0) = 1;
2. NFV = {v0} //not full vector set (length < k);
3. NFV' = {};
4. FV = {} // full vector set (length = k);
5. Computation Theorem 2 // means hs performs
6. While NFV {}
7. For (each arriving v-tuple hs from BTk) do
8. For each tuple t in v-tuple hs do
9. For each not full vector v in NFV do
10. Append t to v to get a new vector v'
11. Compute Score(v') and Prob(v') for Top-2
and Top-3
12. Score(v') = Score(v) + Score(t)
13. Prob(v') = Prob(v) + Prob(t)
14. Computation Theorem 3 // Make sure that v’
is a redundant vector
15. If (v') is not being redundant
16. If (v') is not full vector rules then
17. Add v' to NFV’
18. Else
19. Add v' to FV
20. End if
21. Else
22. Computation ReducePSW (v')
23. Computation Pruning Strategies (v' from (t))
24. End if
25. End for
26. End for
27. If Prob(hs) < 1
28. Append hs to v to acquire new vector rules of
v'' // means that none of t Є hs performs
29. Compute Score(v'') and Prob(v'') for Top-2
and Top-3
30. Score(v'') = Score(v) + Score(t)
31. Prob(v'') = Prob(v) x (1 - Prob(hs))
32. Computation Theorem 3 // Make sure that
v’ is a redundant vector
33. If (v'') is not a redundant vector
34. If (v’') is not full vector rules then
35. Add v'' to NFV’
36. Else
37. Add v'' to FV
38. End if
39. Else
40. Computation ReducePSW (v')
41. Computation Pruning Strategies (v' from
(t))
42. End if
43. End if
44. NFV = NFV'
45. NFV' = {}
46. End for
47. End While
48. Computation Theorem 4 // Where supreme ranked score of
each vj Є FV will return vector rules of having the highest
ranked score
Fig. 7: The main algorithm of the framework for
SWMTop-kDelta query answering – Phase II
Upon arrival, the probability method determines
whether an object should be included in the top-k
candidate objects for a given query, as indicated in
Table 9. The segmentation window SWPa(si)
achieves the necessary hs before recognizing each
tuple item in the sequence. To compute the
probability of event PW (W1) = {R1, R2, R4, R5}
occurring, we need to multiply the individual
probabilities of each event. The probabilities for R5
and R6 must sum up to 1, and the total number of
possible worlds can be calculated by multiplying the
values of each factor: 2 x 3 x 1 x 2 = 12.
To determine the highest probability of a tuple
item, we need to apply the extended generation rules
of possible world semantics to each potential
candidate within sliding windows. The sum of the
probabilities of R5 occurring among the top-3 can be
obtained by summing 0.12+0.024+0.224+0.28+
0.056=0.704, as illustrated in Table 10. The query,
which includes a top-2 clause with a condition where
SW=1, yields R5 and R2 as the current results from
SWPa(s1) and SWPa(s2) occurrences. The SWMTop-
kDelta algorithm computes the probabilities of the
top-2 and top-3 tuple items. The top-2 probability for
R5 represents the number of possible worlds in
which R5 is ranked within the top 3 among the
possibilities of W3, W5, W7, W9, and W11. The
cumulative probabilities of R5 and R2 are presented
in the third column of Table 11, which is indicated
by bold font.
WSEAS TRANSACTIONS on SYSTEMS and CONTROL
DOI: 10.37394/23203.2024.19.31
Raja Azhan Syah Raja Wahab, Siti Nurulain Mohd Rum,
Hamidah Ibrahim, Iskandar Ishak
E-ISSN: 2224-2856
292
Volume 19, 2024
Table 9. The Computation Probability of PW
(SW=1) using v-Tuple, T Onsite approach
PW
Calculation
Prob.
Prob.
Top-2
Top-3
W1= R1, R2,
R4, R5
0.3 X 0.4 X 1 X
0.8
0.096
R1, R2
R1, R2,
R5
W2= R1, R2,
R4, R6
0.3 X 0.4 X 1 X
0.2
0.024
R1, R2
R1, R2,
R4
W3= R1, R3,
R4, R5
0.3 X 0.5 X 1 X
0.8
0.12
R1, R5
R1, R5,
R3
W4= R1, R3,
R4, R6
0.3 X 0.5 X 1 X
0.2
0.03
R1, R3
R1, R3,
R4
W5= R1, R4,
R5
0.3 X (1-0.4-
0.5) X 1 X 0.8
0.024
R1, R5
R1, R4,
R5
W6= R1, R4,
R6
0.3 X (1-0.4-
0.5) X 1 X 0.2
0.006
R1, R4
R1, R4,
R6
W7= R2, R4,
R5
(1-0.3) X 0.4 X
1 X 0.8
0.224
R2, R5
R2, R4,
R5
W8= R2, R4,
R6
(1-0.3) X 0.4 X
1 X 0.2
0.056
R2, R4
R2, R4,
R6
W9= R3, R4,
R5
(1-0.3) X 0.5 X
1 X 0.8
0.28
R5, R3
R5, R3,
R4
W10= R3, R4,
R6
(1-0.3) X 0.5 X
1 X 0.2
0.07
R3, R4
R3, R4,
R6
W11= R4, R5
(1-0.3) X (1-
0.4-0.5) X 1 X
0.8
0.056
R5, R4
R5, R4
W12= R4, R6
(1-0.3) X (1-
0.4-0.5) X 1 X
0.2
0.014
R4, R6
R4, R6
Table 10. SW=1 consist tuple item for Top-2 and
Top-3 probability
ID
Top-2 Prob.
Calculation
Top-2
Prob
Top-3 Prob.
Calculation
Top-3
Prob
R1
0.096+0.024+0.12
+0.03+0.024+0.00
6
0.3
0.096+0.024+0.12
+0.03+0.024+0.00
6
0.3
R2
0.096+0.024+0.22
4+0.056
0.4
0.096+0.024+0.22
4+0.056+
0.4
R3
0.03+0.28+0.07
0.38
0.12+0.03+0.28+0
.07
0.5
R4
0.006+0.056+0.07
+0.056+0.014
0.202
0.024+0.03+0.024
+0.006+0.224+0.0
56+0.28+0.07+0.0
56+0.014
0.784
R5
0.12+0.024+0.224
+0.28+0.056
0.704
0.096+0.12+0.024
+0.224+0.28+0.05
6
0.8
R6
0.014
0.014
0.006+0.056+0.07
+0.014
0.173
Table 11. Top-2 and Top-3 Result List based on
Probability
Top-2
Prob
Top-3
Prob
R5
R5
R2
R4
R3
5.3 Phase III
The focus of this phase is to propose SWM
operations to enhance the efficiency of top-k query
processing. The framework comprises two key
elements:
i. Clearing the possible world;
ii. Implementing pruning strategies on an SWM
and top-k potential candidates to compute
sort-rank tuple items.
Thus, implementing PWC as an additional
mechanism on a large scale can effectively reduce
the time complexity from exponential to polynomial.
Specifically, this reduction is achieved by
considering the size of each segmental UDS. Two
cases need to be considered as deliberated in Table
12 and Figure 8.
Table 12. The two Cases of an probclear( l)
optimization with explanation
UDS
state
Before
probclear(l)
After
probclear(l)
Explanation
Case 1
z1
z2
z3
(9
possible
world
vector
rules for
1, 2, 3)
1, 2, ..., [x
1, 2, ..., [x]
probclear(l)
is performed
for Ul times,
the
uncertainty of
l with the
number of
times
performed for
Ml
X = { 1, 2},
so it can be
Z1 = ( 1,
{t1}, {t2});
z2 = ( 2, {t3},
{t4}, {t5},
{t6}); and z3
= ( 3, {t7}).
Let z1
× z2.
If ti 1 and
probclear(l)
is successful,
the
probability
Let (l) =
{ti} is similar
to ei n. Result:
(1, {t3}),
({t1}, {t4},
{t6}), and ( 1,
2). {t5} is
cleared from
2.
To derive the
optimization
solution for
the clearing
problem, it is
necessary to
determine the
best possible
world vector
rules for
probclear(l).
Case 2:
9
possible
world
vector
rules for
1, 2, 3
1 to be
cleared
equals to
where
top-k
probability
of each v-
tuple.
Therefore,
the v-tuple
is more
likely to be
cleaned.
Let costl
represent the
computational
effort
required to
execute the
probclear(l)
function once.
Let "O"
denote the
allocated
memory,
specifically
for removing
unnecessary
vector rules.
Operation is
based on the
anticipated
optimization
query
processing
improvement
level and the
associated
cost.
The achievement of this task involves the
execution of probclear(l) between the output of the
WSEAS TRANSACTIONS on SYSTEMS and CONTROL
DOI: 10.37394/23203.2024.19.31
Raja Azhan Syah Raja Wahab, Siti Nurulain Mohd Rum,
Hamidah Ibrahim, Iskandar Ishak
E-ISSN: 2224-2856
293
Volume 19, 2024
clearing algorithm, depicted in Figure 9, which is
implemented based on the two cases that were
previously deliberated.
Fig. 8: Clearing operation based on Case 1 is
performed successfully
Input: Top-k Segmental Sets Queue, Top-km Ψ
Output: Zn = { 1, 2, ..., [X] }
1. Begin
2. For (each data tuple X consisting ti 1 from Top-k
Ψ)
3. Let LZl = ( 1, {t1}, {t2}, …, {tn}), the clearing data
tuples of ti
1
4. For (each data tuple X consisting ti 1)
5. Begin
6. If (successfully create 1 sets of probclear(l) for
ti)
7. Remove < LZl with low probability> tuples
with
low probability in LZl but remain single tuple
item,
Ul times
8. Append ti to v to get a new vector v'
(means probclear(l)
9. Else
10 If (1 >= Zcostl ) then only when O has
high
memory
11. Insert tuples with high probability in Z and
remain
array tuple of ti that belong to Top-k Ψ
12. Append ti to v to get any remaining vector v'
within
adjacent window
13. End
14. End
14. Bucket of Zn = probclear({ 1, 2, ..., [X]})
15. End
Fig. 9: The Clearing Algorithm with PWC Indicator
Table 13 shows a sample of the improved
Bucket of Zn. The bucket consists of a sliding
window that moves every 5 seconds. Each
movement of the window demonstrates how the
Clearing algorithm transforms tuple items, resulting
in the tuples ( 1, {t3}), ({t1}, {t4}, {t6}), and ( 1, 2).
{t5} is cleared from 2.
Table 13. Function probclear( l) of assigned a higher
rank if its score and probability surpass 0.4
id
C Task (slide for 5 secs)
Cleared
(yes|no)
Marginal Prob.
t1
If 35 secs, t1 score > t2 and
prob >= 0.4
no
y(0.5) n(0.5)
t2
If 35 secs, t2 score > t1 and
prob >= 0.4
no
y(0.4) n(0.6)
t3
If 40 secs, t3 score > t4 and
prob >= 0.4
no
y(0.5) n(0.5)
t4
If 40 secs, t4 score > t3 and
prob >= 0.4
no
y(0.5) n(0.5)
t5
If 44 secs, t5 score > t6 and
prob >= 0.4
yes
0
t6
If 44 secs, t6 score > t5 and
prob >= 0.4
no
y(0.7) n(0.3)
t7
If 50 secs, t7 score > tn+1
and prob >= 0.4
no
y(1) n(0)
The concept of pruning tuple items within a
sliding window can be explained through different
methods of identifying the top-k (top-2 and top-3)
potential candidates. In this example, the elements
indexed as 1-2 depend on their SWPa(si), and each
index is associated with two segmentation frames.
When evaluating a sliding window that moves every
5 seconds, the earliest element (R1) will be removed
when the succeeding elements (R7, R8, R9, and
R10) enter. However, when SW=1, the window will
still have the other elements. Let's assume that {R6}
is one of the elements in the sliding window
currently being evaluated for pruning. To determine
whether at least two elements within the window are
more recent and greater than {R6}, we need to
compare {R6} with each window component.
According to the given condition, it is clear that
{R6} will not be selected as one of the top two
options in the current scenario where SW=1.
To perform sliding window frame pruning, it is
essential to establish and clarify Theorems 5 and 6:
Theorem 4: Figure 10 illustrates if there are at least
RankF elements in the sequence of an array {e(m+1),
e(m+2), ..., e(N)} that are all greater than e(m), then
e(m) is not eligible to be included in the RankF set
for future incoming tuple items of the current
SWPa(si) window frame. However, it is still possible
to incorporate it within the complete sliding window
frame, which represents the total duration before the
window recommences sliding.
Proof 4: To satisfy e(m) within the sliding
window, it needs to ensure that there are at least
RankF elements in the sequence {e(m+1), e(m+2),
..., e(N)} that are greater than e(m). Additionally, we
WSEAS TRANSACTIONS on SYSTEMS and CONTROL
DOI: 10.37394/23203.2024.19.31
Raja Azhan Syah Raja Wahab, Siti Nurulain Mohd Rum,
Hamidah Ibrahim, Iskandar Ishak
E-ISSN: 2224-2856
294
Volume 19, 2024
require the sequence {e(m+1), e(m+2), ..., e(N)} and
RankF + 1. Assuming that the level e(m) appears
before any other element in the sequence, except for
a certain number of (m-1) arbitrary elements, we can
determine this by comparing the timestamp attributes
for the exit policy. In this scenario, the iterations of
the sliding window that contain e(m) will include the
sequence e(m+1), e(m+2),..., e(N)} in the order they
appear. It's important to note that the number of
iterations cannot be fewer than the sequence
{e(m+1), e(m+2),..., e(N)} in the order of time.
Therefore, in subsequent iterations of the sliding
window frame, the element e(m) will consistently be
excluded as a RankF element.
Fig. 10: The Initial sliding window frame with
elements before SWPa(si) pruning
Theorem 5: Figure 11 illustrates for each subsequent
instance of the SWPa(si), element e(m) should not
possess a RankF if there are at least (N RankF +1)
elements in the sequence e(m+1), e(m+2),..., e(N)
that are smaller than or equal to e(m) but have the
possibility of being among the top k candidates in
the entire sliding window frame, which represents
the time required before the window resumes sliding.
Proof 5: To determine the variable timestamp for
elements on RankF in the sliding window, we can
use the sliding window semantics of delta (attribute,
delta). RankF of e(m) refers to the position of e(m) in
the sliding window that contains e(m), e(m+1),
e(m+2),..., e(N) and other (m−1) elements can be
limited to a range of 1 to (N (N RankF +1)) =
RankF − 1. This is because there is a minimum of (N
RankF +1) elements in the sequence e(m+1),
e(m+2),..., e(N) that are less than or equal to e(m). In
any subsequent occurrence of the sliding window
containing e(m), the sequence e(m+1), e(m+2),...,
e(N) will also be present. This is because e(m) comes
before all the elements in the series (m+1),
e(m+2),..., e(N). Therefore, e(m) is not a part of
RankF and cannot be included.
Again, this study suggests an efficient
optimization method to process only the latest tuple,
prevent multiple scans of data sets, and enhance
memory usage. This significant contribution will be
further discussed in the following chapter dedicated
to experimentation and discussion. Table 14 provides
the symbols used in this section and concise
descriptions.
Fig. 11: The Initial sliding window frame with
elements after SWPa(si) pruning
Table 14. Symbols used in this section and their
explanations
Symbol
Description
UDS
Uncertain Data Stream
ti
Uncertain tuple item / Tuple at timestamp i
from UDSi
s
Subsequence of stream from UDSi
SW
Sliding Window Model
W
Window Size of |SW|
Score(t)
Score function of tuple item
Prob(t)
Probability function of tuple item
SWtop-
k(t, S)
Probabilities cumulative of all possible
worlds
p(w)
Possible world model rules where w is a
collection
GM
Group membership
Bk
Bucket of Instance / tuple
BTk
Bucket of potential candidates to be Top-k
SWPa
Sliding window panes where partition-by
attribute
Fdisc : T
Discretization function on Si
P(s[q])
Top-k Potential Candidates
SB(Sln)
Segmentation buffer
Top-k Ψ
Top-k segmental sets queue
d
Indexes
6 Result and Discussion
An extensive experiment was conducted to evaluate
the performance and efficacy of the SWMTop-kDelta
framework. The algorithms and baseline approaches
that have been compared are as follows:
i. The Top-k Incomplete Data Stream (Topk-
iDS) technique, is a recent approach that
utilizes a Count-based SWM to handle top-k
queries in a possible world scenario.
ii. The Dominant Relationship Analysis (DRA)
approach, determines the tuple items that are
most likely to be the top-k results in query
processing by minimizing the quantity of
data. This technique does not rely on SWM.
iii. The summaries for directly adapting the
current Count-based and Time-based sliding
WSEAS TRANSACTIONS on SYSTEMS and CONTROL
DOI: 10.37394/23203.2024.19.31
Raja Azhan Syah Raja Wahab, Siti Nurulain Mohd Rum,
Hamidah Ibrahim, Iskandar Ishak
E-ISSN: 2224-2856
295
Volume 19, 2024
windows prioritize recent data to generate a
list of top-k potential candidates.
Two baseline algorithms can be used as
reference points for comparing the framework:
i. The Count-based SWM determines the window's
size based on the number of events it contains
without a predetermined time duration for its
validity. This method maintains the specified
number of rows within the window frame. If a
certain number of new points, denoted as n, are
added, then the same number of the oldest
points, also n in number, are removed.
ii. The Time-based SWM uses the constant number
of active points. A point's termination time is
independent of the reaching or termination of
other points. Tuple items that have exceeded the
specified duration will be eliminated from the
window frame, regardless of whether new rows
were received.
In addition, the DRA algorithm conducted
comparisons across three (3) types of Top-k query
processing (including U-Topk, U-kRanks, and Pk-
Topk), and these entries could serve as enhanced or
optimized solutions without relying on SWM. The
DRA method is evaluated based on the number of
possible world instances and their corresponding
scores and probabilities. The Topk-iDS algorithm
specifically deals with top-k queries on an
incomplete and uncertain data stream. The data
structure used for this purpose was top-k dual layers
(TDL).
In the first phase, Phase I, we utilize the initial
data to derive a group set of tuple items. The final
top-k result list is generated by Phase II of the
SWMTop-kDelta by comparing its result lists, R (R'),
with the set of Top-k result lists produced by the
previous methods, Rp (Rp') where these variables
seem to represent the same SWMTop-kDelta result
list. The experiment is conducted in five repetitions,
and the resulting average value is recorded. The
performance metrics utilized in our experiments are
the expected SWM Top-k query result answer and
processing time.
The measurements are evaluated for multiple
parameter configurations, which include the size of
the data set (K), the parameter (k), the window size
(S), the probability threshold (d), and the K-Pruning
(r). The sizes of the available datasets, including the
synthetic data set, range from 5K (Chicago Trip
Taxi) to over 200K (AirBeijing). The dimensions of
the Chicago Trip Taxi and AirBeijing data sets are 5
and 9, respectively, while the dimension of the
AirBeijing data set is 6. The value of this parameter
is modified in multiple experiments, as outlined in
Table 15.
Table 15. The parameter settings of the synthetic and
real data sets
An evaluation was conducted on efficiency and
scalability, focusing on the number of comparisons
made between possible worlds and the processing
time required. The top-k on a UDS is influenced by
two parameters and compares the settings for the two
measures. Phase I generates the initial set of
candidate lists and collects other required
information. The Phase II component is triggered
based on the type of data modification to minimize
overlapping computation and apply an insertion/exit
policy to the relevant tuple candidates. Thus, Phase
III builds upon the information that was obtained in
Phases I and II by implementing optimization
techniques to prevent unnecessary top-k
computations when determining and computing
possible world vector rules until the potential
candidates expired from the window frame. We
assume that lower checkpoint values on Green graft
are better than higher ones when creating the
collection of successful top-k final lists.
6.1 Effect of Data size
The results are present in Figures 12(a - c). As the
number of tuple items in a data set increases, the
number of possible world comparisons required to
determine the validity of the data set also increases.
When the size of data sets reaches 20K or more, it
becomes evident that the SWMTop-kDelta algorithm
outperforms the Topk-iDS, DRA, and baseline
Count-based/Time-based SWM algorithm. The goal
is to minimize redundant computation by
implementing SWMTop-kDelta segments with an
eviction policy delta. This approach aims to prevent
the generation of unnecessary possible world rules.
Using synthetic data, the SWMTop-kDelta method
effectively decreases the number of potential world
comparisons by 35% compared to the Topk-iDS
approach. The subsequent step involves a decrease
of 60% compared to Time-based SWM and a more
than 95% reduction compared to Count-based SWM
WSEAS TRANSACTIONS on SYSTEMS and CONTROL
DOI: 10.37394/23203.2024.19.31
Raja Azhan Syah Raja Wahab, Siti Nurulain Mohd Rum,
Hamidah Ibrahim, Iskandar Ishak
E-ISSN: 2224-2856
296
Volume 19, 2024
and DRA algorithms. Possible world number
comparisons involve comparing the items of a tuple
that will be included in the future Top-k list of
occurrences in a SWM.
As the volume of the data set increases, the
processing time for the mentioned algorithm will
also increase. Our proposed SWMTop-kDelta
algorithm was able to cut down on the number of
possible worlds generated and based on these graft, it
is clear that it has demonstrated superior
performance by reducing overlapping computation
segments using an eviction policy delta and avoiding
generating unnecessary possible world rules.
However, the main goal is to resolve the ambiguity
caused by the uncertain data stream. Therefore, as
the data set volume grows, our proposed algorithm
maintains consistent performance, regardless of the
data set size. Assuming the candidate's tuple has
reached the first sliding window, the combined
possible world is subjected to number comparisons
with the same top-k tuple items. The number of
possible worlds required was significantly reduced
by implementing the following steps in SWMTop-
kDelta. The analysis of the numbers indicates that
SWMTop-kDelta consistently maintains its
performance across data sets of varying sizes.
(a) Synthetic
(b) Chicago Trip Taxi
(c) AirBeijing
Fig. 12 (a - c): The results of possible world number
comparisons with varying data set size
Figures 13 (a c) present the reduction in the
number of possible worlds leads to a decrease in the
time required for query processing. Both the DRA
and the baseline count-based SWM algorithm exhibit
inadequate efficiency when comparing possible
world numbers and processing time. The Topk-iDS
algorithm is considered to be the closest performance
rival to SWMTop-kDelta. The results demonstrate the
percentage improvement of SWMTop-kDelta
compared to the Topk-iDS, Time-based SWM,
Count-based SWM, and DRA algorithms, as shown
in Table 16.
(a) Synthetic
(b) Chicago Trip Taxi
WSEAS TRANSACTIONS on SYSTEMS and CONTROL
DOI: 10.37394/23203.2024.19.31
Raja Azhan Syah Raja Wahab, Siti Nurulain Mohd Rum,
Hamidah Ibrahim, Iskandar Ishak
E-ISSN: 2224-2856
297
Volume 19, 2024
(c) AirBeijing
Fig. 13 (a - c): The results of processing time with
varying data set size
Table 16. Percentage improvement of SWMTop-
kDelta in terms of Data Set Size
Algorithms
compared
Dataset
(independent)
Number of
possible world
comparisons
Processing
time
Topk-iDS
Synthetic
61.68%
56.74%
Chicago Trip
Taxi
61.50%
51.76%
AirBeijing
62.23%
33.76%
DRA
Synthetic
89.34%
96.26%
Chicago Trip
Taxi
89.35%
95.05%
AirBeijing
89.06%
74.73%
Count-based
Synthetic
90.86%
97.07%
Chicago Trip
Taxi
90.85%
96.20%
AirBeijing
90.52%
81.22%
Time-based
Synthetic
72.22%
80.51%
Chicago Trip
Taxi
71.86%
77.45%
AirBeijing
71.33%
58.70%
6.2 Effect of Number of Parameter (k)
The objective is to evaluate the performance
of SWMTop-kDelta in handling Top-k continuous
queries over uncertain data streams with varying
parameter values (k). We vary the parameter (k)
values as 2, 4, 6, 8, and 10. In addition, instead of
using the usual three score distributions d
u, n, e (uniform, normal, and exponential), only the
uniform distribution d u was chosen. The
dimensions for the synthetic, Chicago Trip Taxi,
and AirBeijing data sets are fixed at 5, 6, 6, and 9,
respectively. Figures 14 (a - c) display the number of
possible world comparisons that have been
accomplished. . The results of the algorithms suggest
that as the number of parameter (k) values increases,
the number of conducted possible world
comparisons also increases. This second comparison
measurement is about the effect of Number
Parameter k, where our proposed algorithm
outperformed others regarding possible world
number comparison because it remains a consistent
performance and robust scalability, even where the
number of parameter (k) values changes until it
reaches 10 value.
(a) Synthetic
(b) Chicago Trip Taxi
(c) AirBeijing
Fig. 14 (a - c): The results of possible world number
comparisons with varying numbers of parameter (k)
values
In terms of processing time, our algorithm
outperformed others when the number of parameter
(k) values increased through delta segmentation in
the eviction policy. This is because the parameter k,
which affects the top-10 values, takes longer to
compute, resulting in more potential candidates
being chosen. The processing time achieved with
SWMTop-kDelta consistent performance across
different parameter (k) values than Topk-iDS, DRA,
WSEAS TRANSACTIONS on SYSTEMS and CONTROL
DOI: 10.37394/23203.2024.19.31
Raja Azhan Syah Raja Wahab, Siti Nurulain Mohd Rum,
Hamidah Ibrahim, Iskandar Ishak
E-ISSN: 2224-2856
298
Volume 19, 2024
and the baseline count-based/time-based SWM
algorithms. Figures 15 (a - c) exhibit comparable
patterns from figures before, as a result of the
reduction in the number of potential world
comparisons, leading to a decrease in processing
time. However, SWMTop-kDelta consistently
achieves shorter processing times than other
algorithms, such as Topk-iDS, DRA, and baseline
count-based/time-based SWM algorithms.
(a) Synthetic
(b) Chicago Trip Taxi
(c) AirBeijing
Fig. 15 (a - c): The results of processing time with
varying number of parameter (k) values
This is due to its use of tuple item comparison
with pruning, which reduces the need to compute
potential candidate lists from the specified sliding
window. This approach is more efficient than
comparing tuple items across the entire data set. The
results of all algorithms demonstrate this
relationship. As shown in Table 17, SWMTop-kDelta
outperforms Topk-iDS, time-based SWM, count-
based SWM, and DRA algorithms regarding an
improvement percentage of over 30 % for both the
number of possible world and processing time
comparison.
Table 17. Percentage improvement of SWMTop-
kDelta in terms of Number of Parameter (k)
Algorithms
compared
Dataset
(independent)
Number of
possible world
comparisons
Processing
time
Topk-iDS
Synthetic
31.93%
41.47%
Chicago Trip
Taxi
33.68%
24.07%
AirBeijing
28.77%
29.23%
DRA
Synthetic
59.61%
65.78%
Chicago Trip
Taxi
57.29%
43.81%
AirBeijing
56.08%
40.08%
Count-based
Synthetic
48.59%
61.33%
Chicago Trip
Taxi
46.38%
47.29%
AirBeijing
44.83%
45.08%
Time-based
Synthetic
44.86%
57.95%
Chicago Trip
Taxi
31.93%
41.47%
AirBeijing
33.68%
24.07%
6.3 Effect of Window Size (W) and
Segmentation
This study examines the impact of varying the
window size (W) during segmentation on the
effectiveness of SWMTop-kDelta. It can be inferred
that the computation of window sizes can be reduced
when more window segments are created. The
eviction of a tuple item in this scenario is determined
by the attribute delta, with a value of 10 seconds.
This attribute enables a sliding window mechanism.
To maintain the delta invariant, it is essential to
calculate the combination of score and probability
before UDS timestamp. The range of window sizes
(W) is from 10 to 104.
By utilizing a dataset size of 20K and
implementing a sliding window that covers a range
of 10 segments to >= 2,000 segmentation units,
significant improvements can be observed in the
performance of SWMTop-kDelta. At this iteration, a
higher amount of pruning is applied to the intra-
window segmentation comparisons, while a lower
amount of pruning is applied to the inter-window
segmentation comparisons. It makes sense to assume
that having fewer window sizes would result in more
computational work when done in large quantities.
This is because it would be necessary to construct a
more significant number of window segments, which
WSEAS TRANSACTIONS on SYSTEMS and CONTROL
DOI: 10.37394/23203.2024.19.31
Raja Azhan Syah Raja Wahab, Siti Nurulain Mohd Rum,
Hamidah Ibrahim, Iskandar Ishak
E-ISSN: 2224-2856
299
Volume 19, 2024
can restrict the computational possibilities within
smaller segments.
Number of Possible World
Processing Time
(a) Synthetic
Number of Possible World
Processing Time
(b) Chicago Trip Taxi
Number of Possible World
P
rocessing Time
(c) AirBeijing
Fig. 16 (a - c): The results of the number of possible
world comparisons and processing time with varying
numbers of Window Size (W) and Segmentation
Table 18. Percentage improvement of SWMTop-
kDelta in terms of Window Size (W) and
Segmentation
Algorithms
compared
Dataset
(independent)
Number of
possible world
comparisons
Processing
time
Topk-iDS
Synthetic
46.45%
40.58%
Chicago Trip
Taxi
38.76%
25.41%
AirBeijing
33.75%
17.13%
DRA
Synthetic
92.08%
95.83%
Chicago Trip
Taxi
89.77%
91.32%
AirBeijing
87.22%
88.54%
Count-based
Synthetic
95.08%
96.96%
Chicago Trip
Taxi
93.51%
93.49%
AirBeijing
91.99%
90.24
Time-based
Synthetic
80.63%
87.08%
Chicago Trip
Taxi
75.17%
76.86%
AirBeijing
70.22%
74.44%
More pruning is done on greater intra-window
segmentation and less on more extensive inter-
window segmentation comparisons at this iteration.
Whenever we increase the number of window
segmentations while decreasing the window size, the
performance remains intact and even improves
slightly. Based on the analysis of Figure 16 (a-c) and
WSEAS TRANSACTIONS on SYSTEMS and CONTROL
DOI: 10.37394/23203.2024.19.31
Raja Azhan Syah Raja Wahab, Siti Nurulain Mohd Rum,
Hamidah Ibrahim, Iskandar Ishak
E-ISSN: 2224-2856
300
Volume 19, 2024
the data provided in Table 18, it can be concluded
that SWMTop-kDelta outperforms Topk-iDS, Time-
based SWM, Count-based SWM, and DRA algorithms
regarding improvement percentage.
6.4 Effect of Probability Threshold (d)
The performance of top-k algorithms in handling
continuous queries over UDS is greatly influenced
by the frequency of changes to the probability
threshold (d) and the state of the sliding window.
The process includes adjusting the probability
threshold (d) values to compute the number of
possible world comparisons and processing time.
The probability threshold rate (d) is set at 90% of the
probability attribute from each dataset. The number
of probability thresholds (d) varies in the following
manner: the initial probability is set at 0.3, 0.4, 0.6,
0.7, and finally reaches 0.8. The presented numbers
provide proof of the consistent performance of
SWMTop-kDelta as shown in Figures 17 (a - c).
Possible world comparisons rise as the
probability thresholds (d) decrease to 0.5 and below.
The comparison of processing time is significantly
impacted by the frequency of changes to the
probability threshold and the status of the sliding
window, as shown by the results of the previous
algorithms. The SWMTop-kDelta algorithm has
shown superior performance compared to Topk-iDS,
DRA, and baseline count-based/time-based SWM
algorithms. Despite its slight increase, SWMTop-
kDelta outperforms these algorithms by eliminating
unnecessary Top-k scores and probabilities
computations.
Figures 18 (a-c) illustrate the strong performance
of SWMTop-kDelta in reducing the distributions of
minimal probability threshold (d) values within the
sliding window. This reduction impacts the k
possible answers while having minimal effect on the
performance of SWMTop-kDelta. The processing
time can be improved by reducing the number of
possible world comparisons. The task involves
calculating the precise probabilities that surpass the
score threshold to generate potential candidate lists.
This approach requires significantly less effort than
comparing individual tuple items from the entire
dataset against all possible world generation rules.
The SWMTop-kDelta algorithm outperforms the
Topk-iDS, Time-based SWM, Count-based SWM,
and DRA methods in terms of improvement
percentage, as demonstrated in Table 19. Based on
percentage improvement, SWMTopk-Delta reduces
unnecessary computations by focusing on fewer
possible world vector rules and optimizing sorting
for remaining tuples using a sliding window
approach based on a combination score and
probability function.
(a) Synthetic
(b) Chicago Trip Taxi
(c) AirBeijing
Fig. 17 (a - c): The results of the number of possible
world comparisons with varying numbers of
probability thresholds (d)
(a) Synthetic
WSEAS TRANSACTIONS on SYSTEMS and CONTROL
DOI: 10.37394/23203.2024.19.31
Raja Azhan Syah Raja Wahab, Siti Nurulain Mohd Rum,
Hamidah Ibrahim, Iskandar Ishak
E-ISSN: 2224-2856
301
Volume 19, 2024
(b) Chicago Trip Taxi
(c) AirBeijing
Fig. 18 (a - c): The results of processing time with
varying numbers of probability thresholds (d)
Table 19. Percentage improvement of SWMTop-
kDelta in terms of Probability Threshold (d)
Algorithms
compared
Dataset
(independent)
Number of
possible world
comparisons
Processing
time
Topk-iDS
Synthetic
56.82%
68.08%
Chicago Trip
Taxi
54.51
36.48%
AirBeijing
49.99%
30.62%
DRA
Synthetic
88.23%
96.58%
Chicago Trip
Taxi
87.22%
88.17%
AirBeijing
85.33%
76.46%
Count-based
Synthetic
90.54%
97.62%
Chicago Trip
Taxi
89.70%
90.59%
AirBeijing
88.33%
83.22%
Time-based
Synthetic
70.68%
88.22%
Chicago Trip
Taxi
68.58%
64.59%
AirBeijing
65.18%
46.81%
6.5 Effect of Number of Queries
We examined the impact of the execution query
number on the performance of SWMTop-kDelta. The
construction of highly correlated probabilistic
streams, both physically and temporally, is directly
proportional to the number of continuous queries
performed. The queries are executed on data sets that
initiate every 10 seconds and continue until they are
complete. The parameter settings we employ are as
follows: the probability threshold rate is configured
to be 20% or higher, and the data sets for synthetic,
Chicago Trip Taxi, and AirBeijing are each set to
5K. Figures 19 (a) demonstrate that the growth of all
algorithms is linear as the number of single static
queries u increases. When the number of queries
reaches 20 seconds, there is a slight improvement in
the performance of SWMTop-kDelta, Topk-iDS, and
Time-based SWM. SWMTop-kDelta algorithm
outperformed previous algorithm regarding the
number of possible worlds and processing time. This
is because the more continuous queries are
performed, the more probabilistic streams highly
correlated physically and temporally are constructed.
Initially, it shows consistent performance with a
steady increase as the number of queries grows.
However, when the number of queries hits 20, the
performance of SWMTop-kDelta surpasses the
others.
Figure 19 (b) exhibits patterns comparable to
Figure 19 (a), as the processing time demonstrates a
slight increase at 20 seconds. This occurrence is
because the number of possible worlds generated
depends on the tuple items maintained throughout
the continuous query time frame. Reducing query
execution time directly leads to a decrease in
processing time. The number of possible worlds
generated for percentage improvement depends on
the tuple items maintained during the continuous
query time frame. If the query execution time is
reduced, the processing time will also be reduced.
Based on the findings presented in Table 20, it can
be observed that the SWMTop-kDelta algorithm
outperforms the Topk-iDS, the Time-based SWM, the
Count-based SWM, and the DRA algorithm.
Number of Possible World
WSEAS TRANSACTIONS on SYSTEMS and CONTROL
DOI: 10.37394/23203.2024.19.31
Raja Azhan Syah Raja Wahab, Siti Nurulain Mohd Rum,
Hamidah Ibrahim, Iskandar Ishak
E-ISSN: 2224-2856
302
Volume 19, 2024
Processing Time
(a) Synthetic
Number of Possible World
Processing Time
(b) Chicago Trip Taxi
Number of Possible World
Processing Time
(c) AirBeijing
Fig. 19: The results of the number of possible world
comparisons and processing time with varying
numbers of queries execution
Table 20. Percentage improvement of SWMTop-
kDelta in terms of Number of Queries
Algorithms
compared
Dataset
(independent)
Number of
possible world
comparisons
Processing
time
Topk-iDS
Synthetic
53.67%
49.03%
Chicago Trip
Taxi
44.52%
15.39%
AirBeijing
33.75%
16.36%
DRA
Synthetic
82.22%
95.55%
Chicago Trip
Taxi
76.06%
81.33%
AirBeijing
67.46%
72.58%
Count-based
Synthetic
90.10%
97.42%
Chicago Trip
Taxi
86.36%
87.97%
AirBeijing
81.16%
80.55%
Time-based
Synthetic
49.50%
78.41%
Chicago Trip
Taxi
52.31%
47.74%
AirBeijing
44.41%
42.70%
7 Conclusion and Future Work
Top-k queries are commonly employed in various
critical applications to support data analysis and
decision-making processes. This study presents the
concept of tuple items probability theory and
explores its application in creating anticipated rules
for potential scenarios using a method involving
SWM. We have already discussed the limitations of
current approaches in dealing with UDS and
semantic possibilities, and we will evaluate them
alongside our proposed methods. The main goal of
top-k queries is to provide users with relevant tuple
items based on their preferences.
The research aims to propose an efficient top-k
computation framework that can process continuous
WSEAS TRANSACTIONS on SYSTEMS and CONTROL
DOI: 10.37394/23203.2024.19.31
Raja Azhan Syah Raja Wahab, Siti Nurulain Mohd Rum,
Hamidah Ibrahim, Iskandar Ishak
E-ISSN: 2224-2856
303
Volume 19, 2024
queries on uncertain data streams over a SWM. We
propose a solution, named SWMTop-kDelta
framework, which consists of three main phases,
namely: Phase I - aims to transform a collection of
tuple items into a window fragment using the
required SWM representation whenever the record
vector tuple's necessary fragmentation is inserted,
processed, and discarded, Phase II - executing an
essential aspect of establishing the most efficient
processing time using the proposed algorithm, which
involves the complexity of top-k computation, which
has a direct relationship starting with the continuous
queries, score value, probability, timestamp interval
(delta-attribute), and the possible world rules, which
is exponentially correlated to the size of the data set,
and Phase III - builds upon the information that was
obtained in Phases I and II by implementing
optimization techniques to prevent unnecessary top-k
computations when determining and computing
possible world vector rules until the potential
candidates expired from the window frame. For each
phase mentioned, a framework is proposed and
designed.
Several evaluations have been conducted to
evaluate the performance and efficiency of our
proposed SWMTop-kDelta on uncertain data streams.
The experiments employ performance metrics, such
as comparisons of possible world numbers and
processing time, across different parameter settings.
These settings include data set size, number of
parameters (k), window size (W) and segmentation,
probability threshold (d), and number of queries. The
study thoroughly investigated different scenarios,
utilizing both synthetic and real data, to showcase
the efficiency and performance of the proposed
SWMTop-kDelta algorithm. Additionally, it
surpasses the baseline SWM algorithm in terms of
percentage improvement. The reported results are an
improvement exhibited by SWMTop-kDelta, as
evidenced in Table 16, Table 17, Table 18, Table 19
and Table 20, concerning the number of possible
world comparisons. The analysis reveals a
significant reduction of more than 61% in the
generation of possible world rules. Consequently,
there is an average improvement in processing time
efficiency exceeding 55% across all the performance
metrics. The research conducted in this study aims to
reduce costs and enhance the efficiency of decision-
making processes by enabling faster and more
effective results. To demonstrate the effectiveness
and significance of these studies, based on
experimental results in contributing to the existing
body of knowledge of query processing, we propose
several vital suggestions:
Enhanced Real-Time Data Processing: Our
proposed SWMTop-kDelta approach enables
efficient handling of real-time data streams by
avoiding frequent full recomputations. Through
efficiency improvements, it allows for incremental
updates, reducing computational overhead and
improving response times compared to traditional
top-k query processing methods. Efficiently
processing high-velocity data streams is crucial for
modern applications, as it allows for scalability and
improved performance.
Handling Uncertainty: Demonstrating the effective
handling of uncertainty when processing top-k
queries through a combination of score and
probability value computation. This will improve
understanding of managing and disseminating
uncertain data effectively, providing models and
techniques that can be applied to other uncertain data
processing tasks.
Algorithmic Innovations: In our research, we have
developed and tested the SWMTop-kDelta algorithm,
showcasing its effectiveness and making a valuable
contribution to algorithm design. Our algorithm
efficiently handles top-k queries within the context
of SWM on uncertain data streams, outperforming
previous research efforts.
Three (3) essential future research areas have
been identified as suggestions for other researchers
to pursue.
i. Crowdsourcing - In the context of the current
diverse, complicated, and complex data
management landscape, improving the
efficiency and effectiveness of crowdsourcing
data is of utmost importance. The presented
approach offers a novel opportunity to
incorporate human intelligence in addressing
top-k queries.
ii. Distributed Data Streams In the context of
the current diverse, complicated, and complex
landscape of data management, it is of utmost
importance to improve the efficiency and
effectiveness of crowdsourcing data. The
presented approach offers a novel opportunity
to incorporate human intelligence in
addressing top-k queries.
iii. Advanced uncertain data stream analysis in
Cloud Environment The scenario presents
a significant amount of data that is both
extensive and intriguing. This is related to the
availability of numerous regions for
exploration and discovery, made possible by
transmitting a large volume of data over the
cloud environment. The investigation of
processing large uncertain data streams and
WSEAS TRANSACTIONS on SYSTEMS and CONTROL
DOI: 10.37394/23203.2024.19.31
Raja Azhan Syah Raja Wahab, Siti Nurulain Mohd Rum,
Hamidah Ibrahim, Iskandar Ishak
E-ISSN: 2224-2856
304
Volume 19, 2024
incomplete and dynamic data on the cloud to
derive an advantageous top-k result list for
users is a noteworthy subject.
To ensure the reproducibility of future research
studies, it is essential to provide other researchers
with the specific details necessary to replicate our
study. These details are as follows:
Presenting a comprehensive methodology
explanation: SWM presents detailed explanations of
algorithms and data structures. The Top-kDelta
framework thoroughly explains top-k query
processing models, encompassing delta-based SWM
evaluations and score-probabilities values methods.
Open data and algorithm: We will ensure that all
datasets used and source code can be shared on the
repository website upon request, enabling other
researchers to replicate our experiments.
Comprehensive comments and documentation
accompany the code to facilitate comprehension and
utilization.
Parameter Settings: The parameter settings include
factors such as the size of the data set, the number of
parameters (k), the window size (W) and
segmentation, the probability threshold (d), and the
number of queries. These settings are crucial for
accurately reproducing our experimental conditions.
The number of possible world comparisons and
processing time are also considered dependent
factors. By providing detailed information about
these settings, we enable others to recreate our
experiments precisely.
Statistical Analysis: Thoroughly demonstrate the
statistical analyses used to validate our outcome
results, such as significance assessments and
confidence intervals. This ensures the reliability and
reproducibility of research findings, which can
benefit future research aiming to achieve similar or
improved results.
Adhering to these practices guarantees that our
research on top-k query processing for uncertain data
streams is transparent, replicable, and verifiable by
the broader research community.
Declaration of Generative AI and AI-assisted
Technologies in the Writing Process
During the preparation of this work the authors used
ChatGPT in order to acquire some information about
literature review and baseline technique example.
After using this tool/service, the authors reviewed
and edited the content as needed and take full
responsibility for the content of the publication.
References:
[1] Ilyas, I. F., Beskales, G., & Soliman, M. A.
(2008). A survey of top-k query processing
techniques in relational database systems.
ACM Computing Surveys (CSUR), 40(4),
11:1-11:58.
https://doi.org/10.1145/1391729.1391730.
[2] Fagin, R., Kumar, R., Sivakumar, D., Road,
H., & Jose, S. (2003). Comparing top k lists.
ACM-SIAM Symposium on Discrete
Algorithms (SODA), San Jose, California,
1(2003), 134–160.
https://doi.org/10.1137/S0895480102412856.
[3] Mingyi, D., & Yinju, L. (2015). An Effective
Uncertain Data Streams Top-K Query
Algorithm. The Open Automation and
Control Systems Journal, 7(1), 1549–1553.
https://doi.org/10.2174/187444430150701154
9.
[4] Aggarwal, C. C., Member, S., & Yu, P. S.
(2009). A Survey of Uncertain Data
Algorithms and Applications. IEEE
Transactions on Knowledge and Data
Engineering, 21(5), 609–623.
https://doi.org/10.1109/TKDE.2008.190.
[5] Wahab, R. A. S. R., Mohd Rum, S. N.,
Ibrahim, H., Sidi, F., & Ishak, I. (2021). A
Method for Processing Top-k Continuous
Query on Uncertain Data Stream in Sliding
Window Model. WSEAS Transactions on
Systems and Control, 16, 261–269.
https://doi.org/10.37394/23203.2021.16.22.
[6] Jin, C., Chen, L., Yu, J. X., & Lin, X. (2008).
Sliding-Window Top-k Queries on Uncertain
Streams. VLDB 08, (August 24-30), 24–30.
https://doi.org/10.14778/1453856.1453892.
[7] Sarma, A. D., Benjelloun, O., Halevy, A., and
Widom, J.:Working models for uncertain
data, in 22nd International Conference on
Data Engineering (ICDE'06), Atlanta, GA,
USA, 2006, pp. 7-7.
https://doi.org/10.1109/ICDE.2006.174.
[8] Ré, C., Letchner, J., Balazinksa, M., and
Suciu, D., Event queries on correlated
probabilistic streams, in Proceedings of the
2008 ACM SIGMOD International
Conference on Management of data, 2008,
Vancouver, BC, Canada, pp. 715-728.
https://doi.org/10.1145/1376616.1376688.
[9] Burdick, D., Deshpande, P., Jayram, T.,
Ramakrishnan, R., and Vaithyanathan, S:
OLAP over uncertain and imprecise data, in
VLDB, 2007, pp. 970-981.
https://doi.org/10.1007/s00778-006-0033-y.
WSEAS TRANSACTIONS on SYSTEMS and CONTROL
DOI: 10.37394/23203.2024.19.31
Raja Azhan Syah Raja Wahab, Siti Nurulain Mohd Rum,
Hamidah Ibrahim, Iskandar Ishak
E-ISSN: 2224-2856
305
Volume 19, 2024
[10] Jiang, W., Wang, T., & Wang, Z. (2020). A
Top-K Query Scheme with Privacy
Preservation for Intelligent Vehicle Network
in Mobile IoT. IEEE Access, 8, 81698–
81710.
https://doi.org/10.1109/ACCESS.2020.29909
32.
[11] Fuhr, N. and Rölleke, T.: A probabilistic
relational algebra for the integration of
information retrieval and database systems,
ACM Transactions on Information Systems
(TOIS), vol. 15, pp. 32-66, 1997.
https://doi.org/10.1145/239041.239045.
[12] Lakshmanan, L. V., Leone, N., Ross, R., and
Subrahmanian, V. S.: Probview: A flexible
probabilistic database system, ACM
Transactions on Database Systems (TODS),
vol. 22, pp. 419-469, 1997.
https://doi.org/10.1145/261124.261131.
[13] Abiteboul, S., Kanellakis, P., and Grahne, G.:
On the representation and querying of sets of
possible worlds, Theoretical computer
science, vol. 78, pp. 159-187, 1991.
https://doi.org/10.1016/0304-3975(51)90007-
2.
[14] Green, T.J., Tannen, V. (2006). Models for
Incomplete and Probabilistic Information. In:
Grust, T., Current Trends in Database
Technology EDBT 2006. EDBT 2006.
Lecture Notes in Computer Science, vol 4254.
Springer, Berlin, Heidelberg.
https://doi.org/10.1007/11896548_24.
[15] Li, L., Wang, H., Li, J., and Gao, H.: A
survey of uncertain data management,
Frontiers of Computer Science, vol. 14, pp.
162-190, 2020.
https://doi.org/10.1007/s11704-017-7063-z.
[16] Carbone, P., Katsifodimos, A., & Haridi, S.
(2019). Stream Window Aggregation
Semantics and Optimization. Encyclopedia of
Big Data Technologies,1615–1623.
https://doi.org/10.1007/978-3-319-77525-
8_154.
[17] Dallachiesa, M., Jacques-Silva, G., Gedik, B.,
Wu, K.-L., and Palpanas, T.: Sliding
windows over uncertain data streams,
Knowledge and Information Systems, vol. 45,
pp. 159-190, 2015.
https://doi.org/10.1007/s10115-014-0804-5.
[18] Chen, T., Chen, L., Oezsu, M. T., and Xiao,
N.: Optimizing multi-top-k queries over
uncertain data streams, IEEE Transactions on
Knowledge and Data Engineering, vol. 25,
pp. 1814-1829, 2012.
https://doi.org/10.1109/TKDE.2012.126.
[19] Minh, T., Le, N., Cao, J., & He, Z. (2013).
Data & Knowledge Engineering Top-k best
probability queries and semantics ranking
properties on probabilistic databases. Data &
Knowledge Engineering, 88, 248–266.
https://doi.org/10.1016/j.datak.2013.04.005.
[20] Agarwal, P. K., Sintos, S., & Steiger, A.
(2020). Efficient Indexes for Diverse Top-k
Range Queries. Proceedings of the 39th ACM
SIGMOD-SIGACT-SIGAI Symposium on
Principles of Database Systems. Association
for Computing Machinery, New York, NY,
USA, 213–227.
https://doi.org/10.1145/3375395.3387667.
[21] Gong, Z., Sun, G., Yuan, J., & Zhong, Y.
(2009). Efficient Top- k Query Algorithms
Using K -emSkyband. Infoscale Journal, 18,
288–305.
https://doi.org/10.1007/978-3-642-10485-
5_21.
[22] Ilyas, I. F., & Chang, K. C. (2007). Top- k
Query Processing in Uncertain Databases.
Proceedings of the 23rd International
Conference on Data Engineering (ICDE
2007), Istanbul (Turkey), 896–905.
https://doi.org/10.1109/ICDE.2007.367935.
[23] Khosla, C., & Kakkar, P. (2015). Top-k
Query Processing Techniques in Uncertain
Databases: A Review. International Journal
of Computer Applications, 120(20), 33–37.
https://doi.org/10.5120/21345-4358.
[24] Lin, S. (2010). Rank aggregation methods.
John Wiley & Sons, Inc,
2(September/October 2010), 555–570.
https://doi.org/10.1002/wics.111.
[25] Xiao, G., Li, K., Zhou, X., & Li, K. (2017).
Journal of Computer and System Sciences
Efficient monochromatic and bichromatic
probabilistic reverse top- k query processing
for uncertain big data. Journal of Computer
and System Sciences, 89, 92–113.
https://doi.org/10.1016/j.jcss.2016.05.010.
[26] Vlachou, A., Doulkeridis, C., Nørvåg, K.,
Vazirgiannis, M., Management, H. D., &
Query, S. (2008). On Efficient Top-k Query
Processing in Highly Distributed
Environments. Proceedings of the
International Conference on Management of
Data (ICMD08), Vancouver (Canada), 753–
764.
https://doi.org/10.1145/1376616.1376692.
[27] Xie, M., & Wood, P. T. (2013). Efficient
Top-k Query Answering using Cached
Views. ACM Journal, (March 18-22), 489–
WSEAS TRANSACTIONS on SYSTEMS and CONTROL
DOI: 10.37394/23203.2024.19.31
Raja Azhan Syah Raja Wahab, Siti Nurulain Mohd Rum,
Hamidah Ibrahim, Iskandar Ishak
E-ISSN: 2224-2856
306
Volume 19, 2024
500.
https://doi.org/10.1145/2452376.2452433.
[28] Zhao, K., Tao, Y., & Zhou, S. (2007).
Efficient top- k processing in large-scaled
distributed environments. Data & Knowledge
Engineering, 63, 315–335.
https://doi.org/10.1016/j.datak.2007.03.012.
[29] M. Hua, J. Pei, W. Zhang and X. Lin,
"Efficiently Answering Probabilistic
Threshold Top-k Queries on Uncertain Data,"
2008 IEEE 24th International Conference on
Data Engineering, Cancun, Mexico, 2008,
pp. 1403-1405, doi:
10.1109/ICDE.2008.4497570.
[30] Bousnina, F. E., Chebbah, M., Anis, M.,
Tobji, B., Hadjali, A., & Yaghlane, B. Ben.
(2017). On Top-k Queries over Evidential
Data. Proceedings Ofthe 19th International
Conference on Enterprise Information
Systems (ICEIS 2017), 1(Iceis), Porto,
Portugal, 106–113.
https://doi.org/10.5220/0006317701060113.
[31] Ge, S., U, L. H., Mamoulis, N., & Cheung, D.
W. (2013). Efficient All Top- k Computation
- A Unified Solution for All Top- k, Reverse
Top- k and Top- m Influential Queries. IEEE
Transactions on Knowledge and Data
Engineering, 25(5), 1015–1027.
https://doi.org/10.1109/TKDE.2012.34.
[32] Ge, T., Zdonik, S., & Madden, S. (2009).
Top-k queries on uncertain data. Proceedings
of the 2009 ACM SIGMOD International
Conference on Management of Data,
Providence, Rhode Island USA, 375–388.
https://doi.org/10.1145/1559845.1559886.
[33] Lin, C., Lu, J., Wei, Z., Wang, J., & Xiao, X.
(2017). Optimal algorithms for selecting top-
k combinations of attributes: theory and
applications. The VLDB Journal, 27, 27–52.
https://doi.org/10.1007/s00778-017-0485-2.
[34] Papadopoulos, A. N., Tiakas, E.,
Tzouramanis, T., Georgiadis, N., &
Manolopoulos, Y. (2021). Top-k Dominating
Queries BT - Skylines and Other Dominance-
Based Queries. Skylines and Other
Dominance-Based Queries, Synthesis
Lectures on Data Management ((SLDM)), pp.
63-90. https://doi.org/10.1007/978-3-031-
01876-3_4.
[35] Zhang, Z., Xie, X., & Pan, H. (2018). An
Efficient Optimization Approach for Top-k
Queries on Uncertain Data. International
Journal of Cooperative Information Systems,
27(01), 1741002.
https://doi.org/10.1142/S0218843017410027.
[36] Chen, J., & Feng, L. (2017). Efficient pruning
for top-k ranking queries on attribute-wise
uncertain datasets. Journal of Intelligent
Information Systems, 48(1), 215–242.
https://doi.org/10.1007/s10844-016-0403-x.
[37] Dai, C., Chen, L., & Chen, Y. (2012). An
efficient algorithm for top-k queries on
uncertain data streams. 2012 11th
International Conference on Machine
Learning and Applications, Boca Raton, FL,
USA, 19, 294-299.
https://doi.org/10.1109/ICMLA.2012.57.
[38] Jiang, H., Zhu, R., & Wang, B. (2020). EPF:
A General Framework for Supporting
Continuous Top-k Queries over Streaming
Data. Cognition Computing, 12, 176–194.
https://doi.org/10.1007/s12559-019-09661-z.
[39] Liu, H., Zhou, K., Zhao, P., & Yao, S. (2018).
Mining frequent itemsets over uncertain data
streams. Int. J. High Performance Computing
and Networking, 11(4), 312–321.
https://doi.org/10.1504/IJHPCN.2018.093234
[40] Ren, W., Lian, X., & Ghazinour, K. (2021).
Effective and efficient top- k query
processing over incomplete data streams.
Information Sciences, 544, 343–371.
https://doi.org/10.1016/j.ins.2020.08.011.
[41] Shen, Z., Cheema, M. A., Lin, X., Zhang, W.,
& Wang, H. (2014). A Generic Framework
for Top- k Pairs and Top- k Objects Queries
over Sliding Windows. IEEE Transactions on
Knowledge and Data Engineering, 26(6),
1349-1366.
https://doi.org/10.1109/TKDE.2012.181.
[42] Xiao, N., Chen, T., Chen, L., & Tamer, M. O.
(2013). Optimizing Multi-Top-k Queries over
Uncertain Data Streams. Transactions on
Knowledge and Data Engineering, 25(8),
1814–1829.
https://doi.org/10.1109/TKDE.2012.126.
[43] Zarko, I. P., & Zi, P. (2015). Time- and
Space-Efficient Sliding Window Top-k
Query Processsing. ACM Transactions on
Database Systems, 40(1), 1–44.
https://doi.org/10.1145/2736701.
[44] Cormode, G., & Li, F. (2009). Semantics of
Ranking Queries for Probabilistic Data and
Expected Ranks. 2009 IEEE 25th
International Conference on Data
Engineering, Shanghai, China, 16, 305-316.
https://doi.org/10.1109/ICDE.2009.75.
[45] Rai, N., & Lian, X. (2023). Distributed
probabilistic top-k dominating queries over
uncertain databases. Knowledge and
WSEAS TRANSACTIONS on SYSTEMS and CONTROL
DOI: 10.37394/23203.2024.19.31
Raja Azhan Syah Raja Wahab, Siti Nurulain Mohd Rum,
Hamidah Ibrahim, Iskandar Ishak
E-ISSN: 2224-2856
307
Volume 19, 2024
Information Systems, 65(11), 4939–4965.
https://doi.org/10.1007/s10115-023-01917-3.
[46] Li, L., & Wang, H. (2020). A survey of
uncertain data management. Frontiers of
Computer Science, 14(1), 162–190.
https://doi.org/10.1007/s11704-017-7063-z.
[47] Wang, Y., Li, X., Li, X., & Wang, Y. (2013).
A survey of queries over uncertain data.
Knowledge and Information Systems, 37(3),
485–530. https://doi.org/10.1007/s10115-
013-0638-6.
Contribution of Individual Authors to the
Creation of a Scientific Article (Ghostwriting
Policy)
Dr. Siti Nurulain Mohd Rum served as my advisor
and mentor. I am extremely grateful to her for all of
the hard work, invaluable research skills, helpful
support, and valuable criticism that she provided. In
addition, I would like to express my appreciation to
Professor Dr. Hamidah Ibrahim and Associate
Professor TS. Dr. Iskandar Ishak, both of whom
were members of our advisory group, for the
insightful criticism, positive reinforcement, smart
counsel, and creative suggestions that they provided.
Sources of Funding for Research Presented in a
Scientific Article or Scientific Article Itself
There are no sources of Funding from me
Conflict of Interest
The authors have no conflicts of interest to declare
that are relevant to the content of this article.
Creative Commons Attribution License 4.0
(Attribution 4.0 International, CC BY 4.0)
This article is published under the terms of the
Creative Commons Attribution License 4.0
https://creativecommons.org/licenses/by/4.0/deed.en
_US
WSEAS TRANSACTIONS on SYSTEMS and CONTROL
DOI: 10.37394/23203.2024.19.31
Raja Azhan Syah Raja Wahab, Siti Nurulain Mohd Rum,
Hamidah Ibrahim, Iskandar Ishak
E-ISSN: 2224-2856
308
Volume 19, 2024