Multi-Fractional Gradient Descent: A Novel Approach to Gradient
Descent for Robust Linear Regression
ROBAB KALANTARI, KHASHAYAR RAHIMI, SAMAN NADERI MEZAJIN
Finance Department
Khatam University
Tehran
IRAN
Abstract: This work introduces a novel gradient descent method by generalizing fractional gradient descent (FGD): instead of using the same fractional order for all variables, we assign a different fractional order to each variable depending on its characteristics and its relation to the other variables. We name this method Multi-Fractional Gradient Descent (MFGD). Using it in linear regression to minimize the loss function (residual sum of squares) on four financial time series datasets, with tuned hyperparameters, we observe that unlike GD and FGD, MFGD is robust to multicollinearity in the data, can detect the real information in it, and obtains considerably lower error.
Key-Words: multicollinearity, gradient descent, fractional gradient descent, Multi-Fractional Gradient Descent, fractional calculus
Received: March 12, 2024. Revised: August 14, 2024. Accepted: September 13, 2024. Published: October 14, 2024.
1 Introduction
Fractional calculus is concerned with the study of fractional-order integral and derivative operators over real or complex domains, as well as their applications. Its origins may be traced back to a letter from de l'Hospital to Leibniz in 1695. Questions like "What does a fractional derivative mean?", for instance, "What does the derivative of order 1/4 or 3 of a function mean?", encouraged many talented scientists in the 18th and 19th centuries to concentrate their efforts on this issue; for example, consider [1], [2], [3], [4], [5], [6], [7], [8], [9], [10], [11], [12], [13]. From a mathematical perspective, many interesting publications in the last decades have been related to applications of classical fixed point theorems on abstract spaces to study the existence and uniqueness of solutions to various types of initial value problems and boundary value problems for fractional operators (see, e.g., [14], [15], [16], [17], [18], [19], [20]); on the other hand, there are also many applications in different sciences, for example [21], [22], [23], [24], [25], [26], [27].
The fields of machine learning and fractional calculus have each independently played vital roles in
understanding and modeling complex real-life phe-
nomena. Machine learning has emerged as a pow-
erful tool for extracting patterns and behaviors from
historical data, making it a cornerstone in various sci-
entific disciplines, while fractional calculus provides
a unique framework for describing complex dynam-
ics with non-integer-valued derivatives. Fractional
derivatives, which originated from a 17th-century in-
quiry into the concept of non-integer orders, have be-
come essential in capturing the memory and inherent
non-local behavior of systems. As these two modern-
day topics hold substantial potential for synergistic
approaches in modeling complex dynamics, this in-
troduction sets the stage for a broader exploration of
their combined potential.
Fractional calculus, often associated with its appli-
cations in physics, image processing, environmental
sciences, and even biology, introduces the concept of
memory into modeling processes. The fractional or-
der of a process is closely tied to the degree of mem-
ory exhibited by that process, making it particularly
relevant in fields where historical context and spa-
tiotemporal memory are key considerations. As re-
searchers continue to uncover the utility of fractional
derivatives in modeling complex natural phenomena,
it is evident that these derivatives provide valuable
tools to enhance machine learning approaches [28].
The integration of machine learning and fractional
calculus is a burgeoning field of research, with sev-
eral recent papers exploring their combined potential.
Fractional calculus, which involves derivatives and
integrals of non-integer order, is gaining attention due
to its ability to model complex dynamics and phenom-
ena in various fields. Machine learning, on the other
hand, is a powerful tool for data analysis and predic-
tion. The combination of these two fields can lead to
more accurate and powerful models.
A recent review paper titled ”Combining Frac-
tional Derivatives and Machine Learning: A Review”
discusses the potential of combining approaches from
fractional derivatives and machine learning. The pa-
per categorizes past combined approaches into three
categories: preprocessing, machine learning and frac-
tional dynamics, and optimization. The contributions
of fractional derivatives to machine learning are man-
ifold as they provide powerful preprocessing and fea-
ture augmentation techniques, can improve physically
informed machine learning, and are capable of im-
proving hyperparameter optimization [29].
Another paper titled ”Efficient Machine Learning
and Fractional Calculus Based Mathematical Model
for Early COVID Prediction” discusses the use of
fractional calculus-based models for disease predic-
tion and detection. The paper highlights that frac-
tional calculus has non-local memory characteristics,
which makes function approximation more accurate.
The authors combined mathematical models based on
fractional calculus with machine learning models for
early estimation of COVID spread[30].
A paper titled ”Machine Learning of Space-
Fractional Differential Equations” discusses the ben-
efits of implementing fractional derivatives. The pa-
per highlights that fractional derivatives allow for dis-
covering fractional-order PDEs for systems charac-
terized by heavy tails or anomalous diffusion. The
paper also mentions that a single fractional-order
archetype allows for a derivative of arbitrary order to
be learned, with the order itself being a parameter in
the regression[31].
In the paper ”Fractional differentiation and its use
in machine learning”, the authors discuss the imple-
mentation of fractional (non-integer order) differen-
tiation on real data of four datasets based on stock
prices. The paper concludes that fractional differ-
entiation plays an important role and leads to more
accurate predictions in the case of artificial neural
networks[32].
In summary, the combination of machine learn-
ing and fractional calculus is a promising area of re-
search that can lead to more accurate and powerful
models. The use of fractional calculus in machine
learning can provide powerful preprocessing and fea-
ture augmentation techniques, improve physically in-
formed machine learning, and enhance hyperparame-
ter optimization.
This work serves as an initial step in unraveling the synergy between the gradient descent algorithm, as part of machine learning, and fractional calculus. In Section 2 we introduce the gradient descent algorithm; in Section 3 we explain fractional calculus and FGD; in Section 4 we define MFGD. In the numerical experiments of Section 5 we show the results of implementing the main idea on four financial time series datasets (Gold, S&P 500, NASDAQ, and Dow Jones) and compare the three methods (gradient descent, FGD, and MFGD) in classical linear regression, showing how MFGD finds the informative features. Finally, in Section 6 we conclude and briefly discuss future challenges and work.
2 Gradient Descent Algorithm in
Linear Regression
In its simplest form, a linear regression model can be
expressed as:
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_p X_p + \varepsilon$$
The cost function for linear regression is often defined as the mean squared error (MSE), which measures the average squared difference between the predicted ($\hat{Y}_i$) and actual ($Y_i$) values:

$$J(\beta) = \frac{1}{2N} \sum_{i=1}^{N} \left( \hat{Y}_i - Y_i \right)^2$$

where $N$ is the number of observations.
The goal of gradient descent is to minimize this
cost function by adjusting the coefficients (β).
The gradient descent algorithm starts with an initial guess for the coefficients ($\beta$) and iteratively updates them by moving in the direction of the steepest decrease in the cost function. The update rule for the coefficients is given by:

$$\beta = \beta - \alpha \nabla J(\beta)$$

where $\alpha$ is the learning rate, a small positive constant that determines the step size, and $\nabla J(\beta)$ is the gradient vector of the cost function with respect to the coefficients.
The components of the gradient vector are the partial derivatives of the cost function with respect to each coefficient:

$$\nabla J(\beta) = \left[ \frac{\partial J(\beta)}{\partial \beta_0}, \frac{\partial J(\beta)}{\partial \beta_1}, \dots, \frac{\partial J(\beta)}{\partial \beta_p} \right]^T$$

$$\frac{\partial J(\beta)}{\partial \beta} = -\frac{1}{N} X^T (Y - X\beta)$$
The algorithm continues this process until con-
vergence, where the changes in the coefficients be-
come negligible, or a predefined number of iterations
is reached.
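As a concrete illustration, the following minimal Python sketch implements the update rule above for the MSE cost; the function name, synthetic data, and stopping rule are illustrative choices of ours, not prescribed by the paper.

```python
# A minimal sketch of batch gradient descent for linear regression,
# minimizing J(beta) = (1/2N) * ||Y - X beta||^2.
import numpy as np

def gradient_descent(X, Y, alpha=0.01, n_iter=10_000, tol=1e-8):
    """X: (N, p+1) design matrix with a leading column of ones."""
    N, P = X.shape
    beta = np.zeros(P)                              # initial guess
    for _ in range(n_iter):
        grad = -X.T @ (Y - X @ beta) / N            # gradient of the MSE cost
        beta_new = beta - alpha * grad              # steepest-descent step
        if np.max(np.abs(beta_new - beta)) < tol:   # negligible change: stop
            return beta_new
        beta = beta_new
    return beta

# Usage: recover known coefficients from noiseless synthetic data.
rng = np.random.default_rng(0)
X = np.hstack([np.ones((100, 1)), rng.normal(size=(100, 2))])
Y = X @ np.array([1.0, 2.0, -0.5])
print(gradient_descent(X, Y))                       # approximately [1.0, 2.0, -0.5]
```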
2.1 Advantages and Considerations
Gradient descent offers several advantages in the con-
text of linear regression:
1. Scalability: It is particularly useful for large data-
sets as it processes data in small batches, making
it computationally efficient.
2. Flexibility: It can be applied to a wide range of
cost functions, not limited to the mean squared
error.
However, the choice of the learning rate is crucial. A
learning rate that is too small may result in slow con-
vergence, while a learning rate that is too large may
cause overshooting or divergence. Additionally, the
cost function should be convex to ensure that gradient
descent converges to the global minimum.
In summary, gradient descent is a powerful optimiza-
tion algorithm applied to linear regression, enabling
the model to find optimal coefficients efficiently and
effectively.
3 Fractional Calculus
In this section, we explain fractional gradient descent and review background on fractional derivatives, such as the Riemann–Liouville fractional derivative, Caputo's fractional derivative, and the Grünwald–Letnikov derivative; see [33].
Theorem 1 ([33]). Let $(c, d)$, $-\infty < c < d < +\infty$, be an open interval in $\mathbb{R}$, and let $[a, b] \subset (c, d)$ be such that for each $t \in [a, b]$ the closed ball $B_{b-a}(t)$, with center at $t$ and radius $b-a$, lies in $(c, d)$. If $x(\cdot)$ is analytic in $(c, d)$, then

$${}_a D_t^{\alpha} x(t) = \sum_{k=0}^{\infty} \frac{(-1)^{k-1} \, \alpha \, x^{(k)}(t)}{k! \, (k-\alpha) \, \Gamma(1-\alpha)} \, (t-a)^{k-\alpha}.$$
3.1 Fractional Gradient Descent
In various research studies, the utilization of fractional derivatives in optimization has been explored, particularly in the formulation of a gradient vector that incorporates fractional partial derivatives [34]. To be more precise, fractional gradient descent in linear regression can be defined as follows:

$$\nabla^{\alpha} J(\beta) = \left[ {}_a D_{\beta_0}^{\alpha} J(\beta), \; {}_a D_{\beta_1}^{\alpha} J(\beta), \; \dots, \; {}_a D_{\beta_p}^{\alpha} J(\beta) \right]^T$$

$$\beta_{k+1} = \beta_k - \gamma \, \nabla^{\alpha} J(\beta_k)$$
Corollary 1. The Riemann–Liouville fractional derivative of $J(\beta)$ can be calculated with finitely many terms of the approximation in Theorem 1.

Proof. Since the approximation involves integer-order derivatives, by calculating them we have:

$$\frac{\partial^0 J(\beta)}{\partial \beta^0} = J(\beta) = \frac{1}{2N} (Y - X\beta)^T (Y - X\beta)$$

$$\frac{\partial^1 J(\beta)}{\partial \beta^1} = -\frac{1}{N} X^T (Y - X\beta)$$

$$\frac{\partial^2 J(\beta)}{\partial \beta \, \partial \beta^T} = \frac{1}{N} X^T X$$

and obviously, for higher orders $i > 2$,

$$\frac{\partial^i J(\beta)}{\partial \beta^i} = 0.$$

Therefore

$${}_a D_{\beta}^{\alpha} J(\beta) = \sum_{k=0}^{2} \frac{(-1)^{k-1} \, \alpha}{k! \, (k-\alpha) \, \Gamma(1-\alpha)} \, \frac{\partial^k J(\beta)}{\partial \beta^k} \, (\beta - a)^{k-\alpha}.$$
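A short sketch of this truncated computation might look as follows, keeping the three non-vanishing terms above; the function name, the restriction to non-integer orders, and the requirement $\beta_j - a > 0$ (so the fractional power is real-valued) are our assumptions, as the paper gives no implementation.

```python
# Truncated Riemann-Liouville fractional gradient of the linear-regression
# loss, keeping only the k = 0, 1, 2 terms of the Theorem 1 series.
import numpy as np
from math import factorial
from scipy.special import gamma

def frac_gradient(X, Y, beta, alpha, a=0.0):
    """Elementwise fractional derivative of J at beta; alpha is a scalar,
    non-integer order. Assumes beta_j - a > 0 for every coefficient."""
    N = len(Y)
    resid = Y - X @ beta
    derivs = [
        resid @ resid / (2 * N),                 # k = 0: J(beta) itself
        -X.T @ resid / N,                        # k = 1: first partials
        np.einsum("ij,ij->j", X, X) / N,         # k = 2: diag of X^T X / N
    ]
    grad = np.zeros_like(beta, dtype=float)
    for k, dk in enumerate(derivs):
        coef = (-1.0) ** (k - 1) * alpha / (
            factorial(k) * (k - alpha) * gamma(1 - alpha))
        grad += coef * dk * (beta - a) ** (k - alpha)
    return grad
```

The FGD iteration of the definition above is then simply `beta = beta - gamma * frac_gradient(X, Y, beta, alpha)`.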
4 Multi-Fractional Gradient Descent
Different studies show that fractional gradient descent performs better than standard gradient descent in several senses, such as model accuracy, CPU/GPU time consumption, and the number of iterations needed for convergence.

These improvements stem from relaxing the strict first-order gradient to more flexible fractional orders. So, a natural idea for generalizing fractional gradient descent, which uses the same order $\alpha$ for all partial derivatives, is to assign a different fractional order to each of them with respect to its nature and other properties that are influenced by the order of the fractional derivative.
For a mathematical formulation, we can define the multi-fractional gradient descent as follows:

$$\nabla^{A} J(\beta) = \left[ {}_a D_{\beta_0}^{\alpha_0} J(\beta), \; {}_a D_{\beta_1}^{\alpha_1} J(\beta), \; \dots, \; {}_a D_{\beta_p}^{\alpha_p} J(\beta) \right]^T$$

$$\beta_{k+1} = \beta_k - \gamma \, \nabla^{A} J(\beta_k)$$

where $A = (\alpha_0, \alpha_1, \dots, \alpha_p)$.

We can generalize this definition even further by replacing the single starting point $a$ with different points and forming the following gradient vector:

$$\nabla^{A, \bar{A}} J(\beta) = \left[ {}_{a_0} D_{\beta_0}^{\alpha_0} J(\beta), \; {}_{a_1} D_{\beta_1}^{\alpha_1} J(\beta), \; \dots, \; {}_{a_p} D_{\beta_p}^{\alpha_p} J(\beta) \right]^T$$

where $\bar{A} = (a_0, a_1, \dots, a_p)$.
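Under the same truncated series as in Corollary 1, a hedged sketch of one MFGD update with per-coefficient orders and starting points could be written as follows; all names are illustrative, and non-integer orders with $\beta_j - a_j > 0$ are assumed.

```python
# One multi-fractional gradient descent step: each coefficient beta_j gets
# its own fractional order alphas[j] and starting point a[j].
import numpy as np
from math import factorial
from scipy.special import gamma

def mfgd_step(X, Y, beta, alphas, a, lr):
    """beta, alphas, a: arrays of shape (p+1,); assumes beta - a > 0
    and non-integer entries in alphas."""
    N = len(Y)
    resid = Y - X @ beta
    derivs = [
        resid @ resid / (2 * N),                 # k = 0 term
        -X.T @ resid / N,                        # k = 1 term
        np.einsum("ij,ij->j", X, X) / N,         # k = 2 term
    ]
    grad = np.zeros_like(beta, dtype=float)
    for k, dk in enumerate(derivs):
        coef = (-1.0) ** (k - 1) * alphas / (
            factorial(k) * (k - alphas) * gamma(1 - alphas))
        grad += coef * dk * (beta - a) ** (k - alphas)
    return beta - lr * grad                      # update with learning rate lr

# Iterate beta = mfgd_step(X, Y, beta, alphas, a, lr) until convergence.
```

Passing the same scalar order in every entry of `alphas` recovers FGD; tuning a distinct order per feature is what distinguishes MFGD.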
5 Numerical Experiment
To assess the effectiveness of the suggested approach,
which involves using multi-fractional gradient de-
scent, we conducted experiments on four financial
datasets: S&P 500, Dow Jones, NASDAQ, and Gold.
These datasets encompass diverse sets of features.
The data spans daily closing prices from January 1,
2000, to January 1, 2023.
The dataset was divided into training and test sets,
with over 56 percent allocated for training purposes.
Specifically, the first 3000 records of the data were
used for training, and the rest were reserved for test-
ing. This partitioning strategy allows for robust eval-
uation and validation of the proposed framework’s
performance on the financial datasets.
5.1 Multi-Fractional Gradient Descent is
Robust to Multicollinearity
We designed an experiment to investigate the performance of multi-fractional gradient descent in a linear regression model when dealing with multicollinearity in the data. For this, we add the first to the tenth lag of the four mentioned financial time series, evaluate the classic, fractional, and multi-fractional gradient descent linear regression models on each of them, and finally compare their error rates.
As we show in the following plots, unlike the classic LR, and to some extent the fractional version, our multi-fractional model is robust to multicollinearity: its performance is not ruined by increasing the number of collinear features, and in some cases it even attains a better (lower) mean absolute error. In the presented heatmaps, it is evident that
the correlation scores between each pair of variables
exceed 0.99, indicating a high degree of correlation
between these variables. This observation prompted
us to investigate the performance of three gradient de-
scent algorithms applied to each of the four time se-
ries. The ensuing analysis includes the assessment of
error rates. Remarkably, our proposed model demon-
strates robustness in the face of collinearity, show-
casing consistent performance even when new corre-
lated lags are introduced to the dataset. Notably, the
standard fractional gradient descent outperforms the
classical gradient descent approach, emphasizing its
superiority in handling complex relationships within
the data. Despite this enhanced performance, it is
worth noting that the standard fractional gradient de-
scent remains susceptible to the challenges posed by
collinearity, presenting a nuanced trade-off between
performance and sensitivity to correlated variables.
The outcomes of the aforementioned experiment are
presented below. It is essential to highlight that each
of the three models underwent a tuning process to op-
timize their hyperparameters, ensuring the attainment
of the best possible performance for each model.
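For concreteness, the structure of this lag-augmentation experiment can be sketched as follows; the synthetic random-walk series and the ordinary-least-squares fit standing in for the three tuned models are illustrative assumptions, not the paper's actual data or code.

```python
# Sketch of the multicollinearity experiment: add lagged copies of the
# series one at a time and track the test MAE as collinear features grow.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
price = pd.Series(1000 + rng.normal(0, 5, 6000).cumsum(), name="close")

for n_lags in range(1, 11):
    df = pd.concat({f"lag{k}": price.shift(k) for k in range(1, n_lags + 1)}, axis=1)
    df["target"] = price
    df = df.dropna()
    train, test = df.iloc[:3000], df.iloc[3000:]   # first 3000 rows train, as in the paper
    X_tr = np.hstack([np.ones((3000, 1)), train.drop(columns="target").values])
    X_te = np.hstack([np.ones((len(test), 1)), test.drop(columns="target").values])
    # Least squares stands in here for the tuned GD / FGD / MFGD fits.
    beta, *_ = np.linalg.lstsq(X_tr, train["target"].values, rcond=None)
    mae = np.abs(X_te @ beta - test["target"].values).mean()
    print(f"{n_lags} collinear lags: test MAE = {mae:.3f}")
```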
5.2 Multi-Fractional Gradient Descent
Detects the Information
In the next numerical experiment, we delve into as-
sessing the effectiveness of our proposed MFGD LR
model, highlighting its superior performance relative
to the other two models. The datasets comprise four
financial time series: S&P 500, Gold, Dow Jones, and
NASDAQ. Notably, within the domain of financial
time series analysis, the extraction of meaningful fea-
tures presents a substantial challenge. It is essential
to underscore that feature extraction falls outside the
scope of this work. The datasets for each financial in-
strument are constructed with the following features:
1. Fractional_diff_lag1: Fractional Difference at
Lag 1 - A transformation applied to time series
data to enhance stationarity and capture long-
term dependencies, promoting stability in the
dataset.
In time series analysis, fractional differencing
is a technique used to enhance stationarity by
applying fractional differences to a time series.
The fractional difference operator $(1-B)^d$, where $B$ is the backshift operator ($B X_t = X_{t-1}$), is defined as follows. For a given time series $\{X_t\}$, the $d$-th fractional difference is calculated as:

$$(1-B)^d = \sum_{k=0}^{\infty} \binom{d}{k} (-B)^k = \sum_{k=0}^{\infty} \frac{\prod_{i=0}^{k-1} (d-i)}{k!} (-B)^k = 1 - dB + \frac{d(d-1)}{2!} B^2 - \frac{d(d-1)(d-2)}{3!} B^3 + \dots$$
The parameter ddetermines the degree of differ-
encing. When dis an integer, fractional differ-
encing reduces to the regular differencing opera-
tor.
Fractional differencing is particularly useful for capturing long-term dependencies in time series data. It allows for a flexible approach to achieving stationarity without relying on traditional differencing methods (a sketch of the weight computation follows this list).
2. WMA (Weighted Moving Average): Weighted
Average - Averages time series data with higher
weights assigned to recent observations, providing emphasis on more recent trends.

Figure 1: Historical Price Plots
3. EMA (Exponential Moving Average): Expo-
nential Average - A moving average that assigns
greater weight to recent observations, enabling
the capture of short-term trends in the data.
4. SMA (Simple Moving Average): Simple Av-
erage - An average calculated over a specified
number of past data points, effectively smooth-
ing fluctuations and revealing underlying trends.
5. RSI (Relative Strength Index): RSI - A
momentum oscillator that gauges the speed
and magnitude of price movements, signaling
potential overbought or oversold conditions in
the market.
6. Close_lag:1: Lag 1 of Closing Prices - Repre-
sents the previous day’s closing price, providing
insight into the immediate historical perfor-
mance of the financial instrument.
7. Close_difference: Closing Price Difference -
The discrepancy between consecutive closing
prices, offering a measure of the directional
movement in the financial instrument.
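As referenced in feature 1 above, a minimal sketch of fixed-window fractional differencing follows; the weight recursion comes from the binomial expansion given earlier, while the function names and window length are illustrative.

```python
# Fractional differencing: compute the first n weights of (1 - B)^d via
# w_0 = 1, w_k = -w_{k-1} * (d - k + 1) / k, then apply them over a window.
import numpy as np

def frac_diff_weights(d, n):
    w = [1.0]
    for k in range(1, n):
        w.append(-w[-1] * (d - k + 1) / k)
    return np.array(w)

def frac_diff(x, d, window=100):
    """Fixed-window fractional difference of a 1-D array x."""
    w = frac_diff_weights(d, window)[::-1]         # oldest observation first
    return np.array([w @ x[t - window + 1 : t + 1]
                     for t in range(window - 1, len(x))])

# Sanity check: d = 1 reproduces ordinary first differences.
x = np.arange(10.0)
print(frac_diff(x, d=1.0, window=2))               # [1. 1. 1. ...]
```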
In our analysis, the target variable is defined as
the closing price of the day. Notably, the sum of
Close_lag:1 and Close_difference directly corre-
sponds to the closing price, encapsulating the entirety
of relevant information. This combination serves as
a comprehensive and informative set of features in
our datasets.
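A possible pandas construction of these features is sketched below; the window lengths, the RSI period, and the column names are illustrative assumptions, and the fractional-difference feature can be added with the frac_diff sketch above.

```python
# Sketch of the feature set described in the list above.
import numpy as np
import pandas as pd

def build_features(close: pd.Series, window: int = 14) -> pd.DataFrame:
    delta = close.diff()
    gain = delta.clip(lower=0).rolling(window).mean()
    loss = (-delta.clip(upper=0)).rolling(window).mean()
    rsi = 100 - 100 / (1 + gain / loss)                 # Relative Strength Index
    w = np.arange(1, window + 1, dtype=float)           # linear weights for WMA
    wma = close.rolling(window).apply(lambda v: v @ w / w.sum(), raw=True)
    return pd.DataFrame({
        "WMA": wma,                                     # weighted moving average
        "EMA": close.ewm(span=window).mean(),           # exponential moving average
        "SMA": close.rolling(window).mean(),            # simple moving average
        "RSI": rsi,
        "Close_lag1": close.shift(1),                   # previous day's close
        "Close_difference": close.diff(),               # day-over-day change
    })
```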
However, our experimental results demonstrate a
noteworthy observation: the Multi-Fractional Gra-
dient Descent Linear Regression model uniquely
exhibits the capability to fully discern and exploit
this intrinsic relationship between Close_lag:1,
Close_difference, and the closing price. In contrast,
the other two models in consideration do not exhibit
the same level of proficiency in capturing this vital feature. This highlights the distinctive effectiveness of the MFGD LR model in recognizing and utilizing the inherent information encapsulated within these features.

Figure 2: Correlation Tables
The following four tables present comprehensive
summaries of the performance of three gradient
descent methods employed in linear regression. It
is important to highlight that the numerical values
in the weights and derivative order columns corre-
spond to the order of the features mentioned earlier.
Additionally, it is essential to emphasize that the
hyperparameters of all three models have been metic-
ulously tuned to achieve optimal results, aiming for
the lowest possible error for each respective model.
6 Conclusion
This paper provides a comprehensive
examination of linear regression models utilizing the
gradient descent method to determine optimal values
for the loss function. The initial section of the study
delves into the fundamental aspects of linear regres-
sion, emphasizing the significance of the gradient de-
scent approach in optimizing the loss function.
A critical aspect of the paper involves an in-depth ex-
ploration of various definitions of fractional deriva-
tives, setting the stage for a subsequent discussion on
their application in the context of fractional gradient
descent. This analysis contributes to the theoretical
foundation of the study and establishes the ground-
work for proposing a novel definition for fractional
gradient descent.
The innovative approach introduced in the paper is supported by a series of numerical experiments.

Figure 3: Performance Comparison of Tuned Models. The plot illustrates the results of the experiment, highlighting the robustness of the MFGD LR to multicollinearity, the degradation of GD LR performance due to collinearity, and the intermediate performance of the FGD LR model between these two scenarios. (MFGD LR: Multi-Fractional Gradient Descent Linear Regression; FGD LR: Fractional Gradient Descent Linear Regression; GD LR: Gradient Descent Linear Regression.)

The
results of these experiments reveal the efficacy of the
multi-fractional gradient descent method in enhanc-
ing the robustness of linear regression models, partic-
ularly in the presence of multicollinearity within the
dataset. This finding is crucial as it addresses a com-
mon challenge in regression analysis and provides a
potential solution to improve model performance un-
der such conditions.
Furthermore, the paper demonstrates that the multi-
fractional gradient descent method exhibits a capa-
bility to identify and leverage information embedded
in the dataset. This adaptability leads to improved
training and prediction outcomes when there is valu-
able information present in the data. The emphasis
on leveraging information for enhanced model per-
formance aligns with contemporary trends in machine
learning and contributes to the broader understanding
of gradient descent methods.
6.1 Future Works
In conclusion, our examination of the presented tables
reveals a discernible correlation between the frac-
tional derivative order (α) and the resulting weights
in the model. This implies a connection between α
and the information encoded in the independent vari-
ables (features). The investigation of this relationship
offers promising avenues for future research.
Table 1: Gold

Model                 Alpha                                      Weights                                             MAE
Classic LR            1                                          0.137, 0.218, 0.223, 0.215, 0.007, 0.226, -0.014    10.88
Fractional LR         0.97                                       0.087, 0.223, 0.290, 0.210, 0.035, 0.200, 0.021     9.62
Multi-Fractional LR   0.98, 0.85, 0.74, 0.82, 0.76, 0.53, 0.8    0.018, 0.055, 0.129, 0.069, 0.032, 0.728, 0.023     2.84
Table 2: S&P 500

Model                 Alpha                                      Weights                                               MAE
Classic LR            1                                          0.129, 0.214, 0.226, 0.198, 0.041, 0.262, -0.00023    23.65
Fractional LR         0.97                                       0.079, 0.184, 0.187, 0.163, 0.051, 0.420, 0.032       21.03
Multi-Fractional LR   0.61, 0.58, 0.96, 0.82, 0.96, 0.32, 0.49   0.011, 0.057, 0.011, -0.022, -0.007, 0.944, 0.075     1.97
Table 3: NASDAQ

Model                 Alpha                                      Weights                                             MAE
Classic LR            1                                          0.097, 0.219, 0.226, 0.198, 0.070, 0.275, 0.012     110.77
Fractional LR         0.94                                       0.096, 0.198, 0.208, 0.187, 0.076, 0.332, 0.047     81.04
Multi-Fractional LR   0.93, 0.75, 0.98, 0.76, 0.73, 0.41, 0.60   0.010, 0.040, 0.026, 0.035, 0.009, 0.891, 0.071     11.72
Table 4: Dow Jones

Model                 Alpha                                      Weights                                             MAE
Classic LR            1                                          0.138, 0.213, 0.226, 0.194, 0.044, 0.265, -0.004    217.40
Fractional LR         0.99                                       0.082, 0.182, 0.173, 0.146, 0.040, 0.442, 0.017     97.57
Multi-Fractional LR   0.81, 0.94, 0.97, 0.96, 0.98, 0.41, 0.60   0.012, 0.024, 0.029, 0.026, 0.004, 0.911, 0.073     20.64
Moreover, the applicability of the proposed gradi-
ent descent method extends beyond linear regression.
It can be seamlessly implemented in other machine
learning models, such as deep learning architectures
like recurrent neural networks (RNNs).
Another avenue for future exploration involves the
rigorous analysis of the convergence properties of
the multi-fractional gradient descent method. Under-
standing the convergence behavior is essential for es-
tablishing the reliability and efficiency of the method
across various scenarios.
It is worth noting that, due to the computation of
first and second-order derivatives in the loss function
for fractional derivative calculation, the presented
method involves a higher computational burden com-
pared to classic gradient descent. Future endeavors
should focus on investigating methods to mitigate
computational demands, making the approach more
feasible for real-world applications. These consider-
ations underscore the potential for refinement and en-
hancement in the proposed gradient descent method,
paving the way for its broader adoption and practical
utility.
References:
[1] L. Euler, De progressionibus transcendentibus
seu quarum termini generales algebraice dari
nequeunt, Commentarii academiae scientiarum
Petropolitanae (1738) 36–57.
[2] P. Laplace, Théorie analytique des probabilités,
courcier, paris, Oeuvres Complètes de Laplace
7 (1812) 523–525.
[3] J. B. J. Fourier, Théorie analytique de la chaleur,
Gauthier-Villars et fils, 1888.
[4] N. H. Abel, œuvres complètes de Niels Henrik
Abel, Vol. 1, Grøndahl, 1881.
[5] A. Letnikov, On historical development of differentiation theory with an arbitrary index, Mat. Sb. 3 (1868) 85–112.
[6] A. Letnikov, Theory of differentiation with an arbitrary index, Mat. Sb., Moscow (1868).
[7] A. Letnikov, On explanation of the main propo-
sitions of differentiation theory with an arbitrary
index, Sb. Math 6 (1872) 413–445.
[8] J. Liouville, Mémoire sur quelques questions
de géométrie et de mécanique, et sur un nou-
veau genre de calcul pour résoudre ces ques-
tions, 1832.
[9] J. Liouville, Mémoire sur le changement de la
variable indépendante, dans le calcul des dif-
férentielles a indices quelconques, 1835.
[10] A. K. Grünwald, Über "begrenzte" Derivationen und deren Anwendung, Z. Angew. Math. Phys. 12 (1867) 441–480.
[11] B. Riemann, Versuch einer allgemeinen Auffassung der Integration und Differentiation, Gesammelte Werke (1876).
[12] H. Laurent, Sur le calcul des dérivées à indices
quelconques, Nouvelles annales de mathéma-
tiques: journal des candidats aux écoles poly-
technique et normale 3 (1884) 240–252.
[13] O. Heaviside, III. On operators in physical mathematics, Part I, Proceedings of the Royal Society of London 52 (1893) 504–529.
[14] P. Kulczycki, J. Korbicz, J. Kacprzyk, Fractional
Dynamical Systems: Methods, Algorithms and
Applications, Vol. 402, Springer, 2022.
[15] R. P. Agarwal, Y. Zhou, Y. He, Existence of
fractional neutral functional differential equa-
tions, Computers & Mathematics with Applica-
tions 59 (3) (2010) 1095–1100.
[16] R. P. Agarwal, D. O’Regan, S. Staněk, Posi-
tive solutions for dirichlet problems of singular
nonlinear fractional differential equations, Jour-
nal of Mathematical Analysis and Applications
371 (1) (2010) 57–68.
[17] R. P. Agarwal, M. Benchohra, S. Hamani,
A survey on existence results for boundary
value problems of nonlinear fractional differen-
tial equations and inclusions, Acta Applicandae
Mathematicae 109 (2010) 973–1033.
[18] N.-e. Tatar, Mild solutions for a problem involv-
ing fractional derivatives in the nonlinearity and
in the non-local conditions, Advances in Differ-
ence Equations 2011 (2011) 1–12.
[19] K. Diethelm, N. J. Ford, Volterra integral equa-
tions and fractional calculus: do neighboring so-
lutions intersect?, The Journal of Integral Equa-
tions and Applications (2012) 25–37.
[20] D. Baleanu, K. Diethelm, E. Scalas, J. J. Tru-
jillo, Fractional calculus: models and numerical
methods, Vol. 3, World Scientific, 2012.
[21] C. Ionescu, A. Lopes, D. Copot, J. T. Machado,
J. H. Bates, The role of fractional calculus
in modeling biological phenomena: A review,
Communications in Nonlinear Science and Nu-
merical Simulation 51 (2017) 141–159.
[22] J. S. Jacob, J. H. Priya, A. Karthika, Applica-
tions of fractional calculus in science and engi-
neering, J. Crit. Rev 7 (13) (2020) 4385–4394.
[23] T.-Q. Tang, Z. Shah, R. Jan, E. Alzahrani, Mod-
eling the dynamics of tumor–immune cells in-
teractions via fractional calculus, The European
Physical Journal Plus 137 (3) (2022) 367.
[24] T. Alinei-Poiana, E.-H. Dulf, L. Kovacs, Frac-
tional calculus in mathematical oncology, Sci-
entific Reports 13 (1) (2023) 10083.
[25] M. Joshi, S. Bhosale, V. A. Vyawahare, A sur-
vey of fractional calculus applications in artifi-
cial neural networks, Artificial Intelligence Re-
view (2023) 1–54.
[26] D. Baleanu, Y. Karaca, L. Vázquez, J. E.
Macías-Díaz, Advanced fractional calculus, dif-
ferential equations and neural networks: anal-
ysis, modeling and numerical computations,
Physica Scripta 98 (11) (2023) 110201.
[27] S. Shahmorad, R. Kalantari, A. Assadzadeh,
Numerical solution of fractional black-scholes
model of american put option pricing via a non-
standard finite difference method: Stability and
convergent analysis, Mathematical Methods in
the Applied Sciences 44 (4) (2021) 2790–2805.
[28] S. Raubitzek, K. Mallinger, T. Neubauer, Combining fractional derivatives and machine learning: A review, Entropy 25 (1) (2023) 35.
[29] S. Raubitzek, K. Mallinger, T. Neubauer, Combining fractional derivatives and machine learning: A review, Entropy 25 (1) (2023) 35.
[30] S. K. Chandra, M. K. Bajpai, Efficient machine learning and fractional calculus based mathematical model for early COVID prediction, Human-Centric Intelligent Systems (2023) 1–13.
[31] M. Gulian, M. Raissi, P. Perdikaris, G. Kar-
niadakis, Machine learning of space-fractional
differential equations, SIAM Journal on Scien-
tific Computing 41 (4) (2019) A2485–A2509.
[32] R. Walasek, J. Gajda, Fractional differentiation
and its use in machine learning, International
Journal of Advances in Engineering Sciences
and Applied Mathematics 13 (2-3) (2021) 270–
277.
[33] R. Almeida, S. Pooseh, D. F. Torres, Computa-
tional methods in the fractional calculus of vari-
ations, World Scientific Publishing Company,
2015.
[34] Y. Chen, Q. Gao, Y. Wei, Y. Wang, Study
on fractional order gradient methods, Applied
Mathematics and Computation 314 (2017) 310–
321.
Contribution of individual authors to
the creation of a scientific article
(ghostwriting policy)
Robab Kalantari and Khashayar Rahimi developed a
novel method for gradient descent, conducted simula-
tions, performed optimization, and were responsible
for the writing and implementation of the proposed
approach.
Saman Naderi contributed to editing the manuscript.
Sources of Funding for Research Presented in a
Scientific Article or Scientific Article Itself
No funding was received for conducting this study.
Conflict of Interest
The authors have no conflicts of interest to declare
that are relevant to the content of this article.
Creative Commons Attribution License 4.0
(Attribution 4.0 International, CC BY 4.0)
This article is published under the terms of the
Creative Commons Attribution License 4.0
https://creativecommons.org/licenses/by/4.0/deed.en_US