Multi-Fractional Gradient Descent: A Novel Approach to Gradient
Descent for Robust Linear Regression
ROBAB KALANTARI, KHASHAYAR RAHIMI, SAMAN NADERI MEZAJIN
Finance Department
Khatam University
Tehran
IRAN
Abstract: This work introduces a novel gradient descent method by generalizing fractional gradient descent (FGD): instead of using the same fractional order for all variables, we assign a different fractional order to each variable depending on its characteristics and its relation to the other variables. We name this method Multi-Fractional Gradient Descent (MFGD). Using it in linear regression to minimize the loss function (residual sum of squares) on four financial time series datasets, with tuned hyperparameters, we observe that unlike GD and FGD, MFGD is robust to multicollinearity in the data, can detect the real information in it, and obtains considerably lower error.
Key-Words: multicollinearity, gradient descent, fractional gradient descent, Multi-Fractional Gradient Descent, fractional calculus
Received: March 12, 2024. Revised: August 14, 2024. Accepted: September 13, 2024. Published: October 14, 2024.
1 Introduction
Fractional calculus is concerned with the study of fractional-order integral and derivative operators over real or complex domains, as well as their applications. Its origins may be traced back to a letter from de l'Hospital to Leibniz in 1695. Questions like "What does a fractional derivative mean?", for instance, "What does the derivative of order 1/4 or 3 of a function mean?", encouraged many talented scientists in the 18th and 19th centuries to concentrate their efforts on this issue; for example, consider [1], [2], [3], [4], [5], [6], [7], [8], [9], [10], [11], [12], [13]. From a mathematical perspective, many interesting publications in the last decades have been related to applications of classical fixed point theorems on abstract spaces to study the existence and uniqueness of solutions to various types of initial value problems and boundary value problems for fractional operators (see, e.g., [14], [15], [16], [17], [18], [19], [20]); on the other hand, there are also many applications in different sciences, for example [21], [22], [23], [24], [25], [26], [27].
The fields of machine learning and fractional calculus have each independently played vital roles in
understanding and modeling complex real-life phe-
nomena. Machine learning has emerged as a pow-
erful tool for extracting patterns and behaviors from
historical data, making it a cornerstone in various sci-
entific disciplines, while fractional calculus provides
a unique framework for describing complex dynam-
ics with non-integer-valued derivatives. Fractional
derivatives, which originated from a 17th-century in-
quiry into the concept of non-integer orders, have be-
come essential in capturing the memory and inherent
non-local behavior of systems. As these two modern-
day topics hold substantial potential for synergistic
approaches in modeling complex dynamics, this in-
troduction sets the stage for a broader exploration of
their combined potential.
Fractional calculus, often associated with its appli-
cations in physics, image processing, environmental
sciences, and even biology, introduces the concept of
memory into modeling processes. The fractional or-
der of a process is closely tied to the degree of mem-
ory exhibited by that process, making it particularly
relevant in fields where historical context and spa-
tiotemporal memory are key considerations. As re-
searchers continue to uncover the utility of fractional
derivatives in modeling complex natural phenomena,
it is evident that these derivatives provide valuable
tools to enhance machine learning approaches [28].
The integration of machine learning and fractional
calculus is a burgeoning field of research, with sev-
eral recent papers exploring their combined potential.
Fractional calculus, which involves derivatives and
integrals of non-integer order, is gaining attention due
to its ability to model complex dynamics and phenom-
ena in various fields. Machine learning, on the other
hand, is a powerful tool for data analysis and predic-
tion. The combination of these two fields can lead to
more accurate and powerful models.
A recent review paper titled ”Combining Frac-
tional Derivatives and Machine Learning: A Review”
discusses the potential of combining approaches from
fractional derivatives and machine learning. The pa-
per categorizes past combined approaches into three
categories: preprocessing, machine learning and frac-
tional dynamics, and optimization. The contributions
of fractional derivatives to machine learning are man-
ifold as they provide powerful preprocessing and fea-
ture augmentation techniques, can improve physically
informed machine learning, and are capable of im-
proving hyperparameter optimization [29].
Another paper titled ”Efficient Machine Learning
and Fractional Calculus Based Mathematical Model
for Early COVID Prediction” discusses the use of
fractional calculus-based models for disease predic-
tion and detection. The paper highlights that frac-
tional calculus has non-local memory characteristics,
which makes function approximation more accurate.
The authors combined mathematical models based on
fractional calculus with machine learning models for
early estimation of COVID spread[30].
A paper titled ”Machine Learning of Space-
Fractional Differential Equations” discusses the ben-
efits of implementing fractional derivatives. The pa-
per highlights that fractional derivatives allow for dis-
covering fractional-order PDEs for systems charac-
terized by heavy tails or anomalous diffusion. The
paper also mentions that a single fractional-order
archetype allows for a derivative of arbitrary order to
be learned, with the order itself being a parameter in
the regression[31].
In the paper ”Fractional differentiation and its use
in machine learning”, the authors discuss the imple-
mentation of fractional (non-integer order) differen-
tiation on real data of four datasets based on stock
prices. The paper concludes that fractional differ-
entiation plays an important role and leads to more
accurate predictions in the case of artificial neural
networks[32].
In summary, the combination of machine learn-
ing and fractional calculus is a promising area of re-
search that can lead to more accurate and powerful
models. The use of fractional calculus in machine
learning can provide powerful preprocessing and fea-
ture augmentation techniques, improve physically in-
formed machine learning, and enhance hyperparame-
ter optimization.
This work serves as an initial step in unraveling the synergy between the gradient descent algorithm, as part of machine learning, and fractional calculus. In Section 2 we introduce the gradient descent algorithm; in Section 3 we explain fractional calculus and FGD; in Section 4 we define MFGD. In the numerical experiments of Section 5 we show the results of implementing the main idea on four financial time series datasets (Gold, S&P 500, NASDAQ, and Dow Jones) and compare the three methods (gradient descent, FGD, and MFGD) in classical linear regression, showing how MFGD finds the informative features. Finally, in Section 6 we conclude and briefly discuss future challenges and work.
2 Gradient Descent Algorithm in
Linear Regression
In its simplest form, a linear regression model can be
expressed as:
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_p X_p + \varepsilon$$
The cost function for linear regression is often defined as the mean squared error (MSE), which measures the average squared difference between the predicted ($\hat{Y}_i$) and actual ($Y_i$) values:

$$J(\beta) = \frac{1}{2N} \sum_{i=1}^{N} \left( \hat{Y}_i - Y_i \right)^2$$

where $N$ is the number of observations.
The goal of gradient descent is to minimize this
cost function by adjusting the coefficients (β).
The gradient descent algorithm starts with an initial guess for the coefficients ($\beta$) and iteratively updates them by moving in the direction of the steepest decrease in the cost function. The update rule for the coefficients is given by:

$$\beta = \beta - \alpha \nabla J(\beta)$$

where $\alpha$ is the learning rate, a small positive constant that determines the step size, and $\nabla J(\beta)$ is the gradient vector of the cost function with respect to the coefficients.
The components of the gradient vector are the partial derivatives of the cost function with respect to each coefficient:

$$\nabla J(\beta) = \left[ \frac{\partial J(\beta)}{\partial \beta_0}, \frac{\partial J(\beta)}{\partial \beta_1}, \dots, \frac{\partial J(\beta)}{\partial \beta_p} \right]^T$$

$$\frac{\partial J(\beta)}{\partial \beta} = -\frac{1}{N} X^T (Y - X\beta)$$
The algorithm continues this process until con-
vergence, where the changes in the coefficients be-
come negligible, or a predefined number of iterations
is reached.
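As a concrete illustration, the following minimal Python sketch implements the update rule above for the MSE cost; the function name, synthetic data, and stopping rule are illustrative choices of ours, not prescribed by the paper.

```python
# A minimal sketch of batch gradient descent for linear regression,
# minimizing J(beta) = (1/2N) * ||Y - X beta||^2.
import numpy as np

def gradient_descent(X, Y, alpha=0.01, n_iter=10_000, tol=1e-8):
    """X: (N, p+1) design matrix with a leading column of ones."""
    N, P = X.shape
    beta = np.zeros(P)                              # initial guess
    for _ in range(n_iter):
        grad = -X.T @ (Y - X @ beta) / N            # gradient of the MSE cost
        beta_new = beta - alpha * grad              # steepest-descent step
        if np.max(np.abs(beta_new - beta)) < tol:   # negligible change: stop
            return beta_new
        beta = beta_new
    return beta

# Usage: recover known coefficients from noiseless synthetic data.
rng = np.random.default_rng(0)
X = np.hstack([np.ones((100, 1)), rng.normal(size=(100, 2))])
Y = X @ np.array([1.0, 2.0, -0.5])
print(gradient_descent(X, Y))                       # approximately [1.0, 2.0, -0.5]
```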
2.1 Advantages and Considerations
Gradient descent offers several advantages in the con-
text of linear regression:
1. Scalability: It is particularly useful for large data-
sets as it processes data in small batches, making
it computationally efficient.
2. Flexibility: It can be applied to a wide range of
cost functions, not limited to the mean squared
error.
However, the choice of the learning rate is crucial. A
learning rate that is too small may result in slow con-
vergence, while a learning rate that is too large may
cause overshooting or divergence. Additionally, the
cost function should be convex to ensure that gradient
descent converges to the global minimum.
In summary, gradient descent is a powerful optimiza-
tion algorithm applied to linear regression, enabling
the model to find optimal coefficients efficiently and
effectively.
3 Fractional Calculus
In this section, we explain fractional gradient descent and review background on fractional derivatives, such as the Riemann–Liouville fractional derivative, Caputo's fractional derivative, and the Grünwald–Letnikov derivative; see [33].
Theorem 1 ([33]). Let $(c, d)$, $-\infty < c < d < +\infty$, be an open interval in $\mathbb{R}$, and let $[a, b] \subset (c, d)$ be such that for each $t \in [a, b]$ the closed ball $B_{b-a}(t)$, with center at $t$ and radius $b-a$, lies in $(c, d)$. If $x(\cdot)$ is analytic in $(c, d)$, then

$${}_a D_t^{\alpha} x(t) = \sum_{k=0}^{\infty} \frac{(-1)^{k-1} \, \alpha \, x^{(k)}(t)}{k! \, (k-\alpha) \, \Gamma(1-\alpha)} \, (t-a)^{k-\alpha}.$$
3.1 Fractional Gradient Descent
In various research studies, the utilization of fractional derivatives in optimization has been explored, particularly in the formulation of a gradient vector that incorporates fractional partial derivatives [34]. To be more precise, fractional gradient descent in linear regression can be defined as follows:

$$\nabla^{\alpha} J(\beta) = \left[ {}_a D_{\beta_0}^{\alpha} J(\beta), \; {}_a D_{\beta_1}^{\alpha} J(\beta), \; \dots, \; {}_a D_{\beta_p}^{\alpha} J(\beta) \right]^T$$

$$\beta_{k+1} = \beta_k - \gamma \, \nabla^{\alpha} J(\beta_k)$$
Corollary 1. The Riemann–Liouville fractional derivative of $J(\beta)$ can be calculated with finitely many terms of the approximation in Theorem 1.

Proof. Since the approximation involves integer-order derivatives, by calculating them we have:

$$\frac{\partial^0 J(\beta)}{\partial \beta^0} = J(\beta) = \frac{1}{2N} (Y - X\beta)^T (Y - X\beta)$$

$$\frac{\partial^1 J(\beta)}{\partial \beta^1} = -\frac{1}{N} X^T (Y - X\beta)$$

$$\frac{\partial^2 J(\beta)}{\partial \beta \, \partial \beta^T} = \frac{1}{N} X^T X$$

and obviously, for higher orders $i > 2$,

$$\frac{\partial^i J(\beta)}{\partial \beta^i} = 0.$$

Therefore

$${}_a D_{\beta}^{\alpha} J(\beta) = \sum_{k=0}^{2} \frac{(-1)^{k-1} \, \alpha}{k! \, (k-\alpha) \, \Gamma(1-\alpha)} \, \frac{\partial^k J(\beta)}{\partial \beta^k} \, (\beta - a)^{k-\alpha}.$$
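A short sketch of this truncated computation might look as follows, keeping the three non-vanishing terms above; the function name, the restriction to non-integer orders, and the requirement $\beta_j - a > 0$ (so the fractional power is real-valued) are our assumptions, as the paper gives no implementation.

```python
# Truncated Riemann-Liouville fractional gradient of the linear-regression
# loss, keeping only the k = 0, 1, 2 terms of the Theorem 1 series.
import numpy as np
from math import factorial
from scipy.special import gamma

def frac_gradient(X, Y, beta, alpha, a=0.0):
    """Elementwise fractional derivative of J at beta; alpha is a scalar,
    non-integer order. Assumes beta_j - a > 0 for every coefficient."""
    N = len(Y)
    resid = Y - X @ beta
    derivs = [
        resid @ resid / (2 * N),                 # k = 0: J(beta) itself
        -X.T @ resid / N,                        # k = 1: first partials
        np.einsum("ij,ij->j", X, X) / N,         # k = 2: diag of X^T X / N
    ]
    grad = np.zeros_like(beta, dtype=float)
    for k, dk in enumerate(derivs):
        coef = (-1.0) ** (k - 1) * alpha / (
            factorial(k) * (k - alpha) * gamma(1 - alpha))
        grad += coef * dk * (beta - a) ** (k - alpha)
    return grad
```

The FGD iteration of the definition above is then simply `beta = beta - gamma * frac_gradient(X, Y, beta, alpha)`.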
4 Multi-Fractional Gradient Descent
Different studies show that fractional gradient descent performs better than standard gradient descent in several senses, such as model accuracy, CPU/GPU time consumption, and the number of iterations needed for convergence.

These improvements stem from relaxing the strict first-order gradient to more flexible fractional orders. So, a natural idea for generalizing fractional gradient descent, which uses the same order $\alpha$ for all partial derivatives, is to assign a different fractional order to each of them with respect to its nature and other properties that are influenced by the order of the fractional derivative.
For a mathematical formulation, we can define the multi-fractional gradient descent as follows:

$$\nabla^{A} J(\beta) = \left[ {}_a D_{\beta_0}^{\alpha_0} J(\beta), \; {}_a D_{\beta_1}^{\alpha_1} J(\beta), \; \dots, \; {}_a D_{\beta_p}^{\alpha_p} J(\beta) \right]^T$$

$$\beta_{k+1} = \beta_k - \gamma \, \nabla^{A} J(\beta_k)$$

where $A = (\alpha_0, \alpha_1, \dots, \alpha_p)$.

We can generalize this definition even further by replacing the single starting point $a$ with different points and forming the following gradient vector:

$$\nabla^{A, \bar{A}} J(\beta) = \left[ {}_{a_0} D_{\beta_0}^{\alpha_0} J(\beta), \; {}_{a_1} D_{\beta_1}^{\alpha_1} J(\beta), \; \dots, \; {}_{a_p} D_{\beta_p}^{\alpha_p} J(\beta) \right]^T$$

where $\bar{A} = (a_0, a_1, \dots, a_p)$.
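Under the same truncated series as in Corollary 1, a hedged sketch of one MFGD update with per-coefficient orders and starting points could be written as follows; all names are illustrative, and non-integer orders with $\beta_j - a_j > 0$ are assumed.

```python
# One multi-fractional gradient descent step: each coefficient beta_j gets
# its own fractional order alphas[j] and starting point a[j].
import numpy as np
from math import factorial
from scipy.special import gamma

def mfgd_step(X, Y, beta, alphas, a, lr):
    """beta, alphas, a: arrays of shape (p+1,); assumes beta - a > 0
    and non-integer entries in alphas."""
    N = len(Y)
    resid = Y - X @ beta
    derivs = [
        resid @ resid / (2 * N),                 # k = 0 term
        -X.T @ resid / N,                        # k = 1 term
        np.einsum("ij,ij->j", X, X) / N,         # k = 2 term
    ]
    grad = np.zeros_like(beta, dtype=float)
    for k, dk in enumerate(derivs):
        coef = (-1.0) ** (k - 1) * alphas / (
            factorial(k) * (k - alphas) * gamma(1 - alphas))
        grad += coef * dk * (beta - a) ** (k - alphas)
    return beta - lr * grad                      # update with learning rate lr

# Iterate beta = mfgd_step(X, Y, beta, alphas, a, lr) until convergence.
```

Passing the same scalar order in every entry of `alphas` recovers FGD; tuning a distinct order per feature is what distinguishes MFGD.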
5 Numerical Experiment
To assess the effectiveness of the suggested approach,
which involves using multi-fractional gradient de-
scent, we conducted experiments on four financial
datasets: S&P 500, Dow Jones, NASDAQ, and Gold.
These datasets encompass diverse sets of features.
The data spans daily closing prices from January 1,
2000, to January 1, 2023.
The dataset was divided into training and test sets,
with over 56 percent allocated for training purposes.
Specifically, the first 3000 records of the data were
used for training, and the rest were reserved for test-
ing. This partitioning strategy allows for robust eval-
uation and validation of the proposed framework’s
performance on the financial datasets.
5.1 Multi-Fractional Gradient Descent is
Robust to Multicollinearity
We designed an experiment to investigate the performance of multi-fractional gradient descent in a linear regression model when dealing with multicollinearity in the data. For this, we add the first to the tenth lag of the four mentioned financial time series, evaluate the classic, fractional, and multi-fractional gradient descent linear regression models on each of them, and finally compare their error rates.
As we show in the following plots, unlike the classic LR, and to some extent the fractional version, our multi-fractional model is robust to multicollinearity: its performance is not ruined by increasing the number of collinear features, and in some cases it even attains a better (lower) mean absolute error. In the presented heatmaps, it is evident that
the correlation scores between each pair of variables
exceed 0.99, indicating a high degree of correlation
between these variables. This observation prompted
us to investigate the performance of three gradient de-
scent algorithms applied to each of the four time se-
ries. The ensuing analysis includes the assessment of
error rates. Remarkably, our proposed model demon-
strates robustness in the face of collinearity, show-
casing consistent performance even when new corre-
lated lags are introduced to the dataset. Notably, the
standard fractional gradient descent outperforms the
classical gradient descent approach, emphasizing its
superiority in handling complex relationships within
the data. Despite this enhanced performance, it is
worth noting that the standard fractional gradient de-
scent remains susceptible to the challenges posed by
collinearity, presenting a nuanced trade-off between
performance and sensitivity to correlated variables.
The outcomes of the aforementioned experiment are
presented below. It is essential to highlight that each
of the three models underwent a tuning process to op-
timize their hyperparameters, ensuring the attainment
of the best possible performance for each model.
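For concreteness, the structure of this lag-augmentation experiment can be sketched as follows; the synthetic random-walk series and the ordinary-least-squares fit standing in for the three tuned models are illustrative assumptions, not the paper's actual data or code.

```python
# Sketch of the multicollinearity experiment: add lagged copies of the
# series one at a time and track the test MAE as collinear features grow.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
price = pd.Series(1000 + rng.normal(0, 5, 6000).cumsum(), name="close")

for n_lags in range(1, 11):
    df = pd.concat({f"lag{k}": price.shift(k) for k in range(1, n_lags + 1)}, axis=1)
    df["target"] = price
    df = df.dropna()
    train, test = df.iloc[:3000], df.iloc[3000:]   # first 3000 rows train, as in the paper
    X_tr = np.hstack([np.ones((3000, 1)), train.drop(columns="target").values])
    X_te = np.hstack([np.ones((len(test), 1)), test.drop(columns="target").values])
    # Least squares stands in here for the tuned GD / FGD / MFGD fits.
    beta, *_ = np.linalg.lstsq(X_tr, train["target"].values, rcond=None)
    mae = np.abs(X_te @ beta - test["target"].values).mean()
    print(f"{n_lags} collinear lags: test MAE = {mae:.3f}")
```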
5.2 Multi-Fractional Gradient Descent
Detects the Information
In the next numerical experiment, we delve into as-
sessing the effectiveness of our proposed MFGD LR
model, highlighting its superior performance relative
to the other two models. The datasets comprise four
financial time series: S&P 500, Gold, Dow Jones, and
NASDAQ. Notably, within the domain of financial
time series analysis, the extraction of meaningful fea-
tures presents a substantial challenge. It is essential
to underscore that feature extraction falls outside the
scope of this work. The datasets for each financial in-
strument are constructed with the following features:
1. Fractional_diff_lag1: Fractional Difference at
Lag 1 - A transformation applied to time series
data to enhance stationarity and capture long-
term dependencies, promoting stability in the
dataset.
In time series analysis, fractional differencing
is a technique used to enhance stationarity by
applying fractional differences to a time series.
The fractional difference operator $(1-B)^d$, where $B$ is the backshift operator ($B X_t = X_{t-1}$), is defined as follows. For a given time series $\{X_t\}$, the $d$-th fractional difference is calculated as:

$$(1-B)^d = \sum_{k=0}^{\infty} \binom{d}{k} (-B)^k = \sum_{k=0}^{\infty} \frac{\prod_{i=0}^{k-1} (d-i)}{k!} (-B)^k = 1 - dB + \frac{d(d-1)}{2!} B^2 - \frac{d(d-1)(d-2)}{3!} B^3 + \dots$$
The parameter ddetermines the degree of differ-
encing. When dis an integer, fractional differ-
encing reduces to the regular differencing opera-
tor.
Fractional differencing is particularly useful for capturing long-term dependencies in time series data. It allows for a flexible approach to achieving stationarity without relying on traditional differencing methods (a sketch of the weight computation follows this list).
2. WMA (Weighted Moving Average): Weighted
Average - Averages time series data with higher
weights assigned to recent observations, providing emphasis on more recent trends.

Figure 1: Historical Price Plots
3. EMA (Exponential Moving Average): Expo-
nential Average - A moving average that assigns
greater weight to recent observations, enabling
the capture of short-term trends in the data.
4. SMA (Simple Moving Average): Simple Av-
erage - An average calculated over a specified
number of past data points, effectively smooth-
ing fluctuations and revealing underlying trends.
5. RSI (Relative Strength Index): RSI - A
momentum oscillator that gauges the speed
and magnitude of price movements, signaling
potential overbought or oversold conditions in
the market.
6. Close_lag:1: Lag 1 of Closing Prices - Repre-
sents the previous day’s closing price, providing
insight into the immediate historical perfor-
mance of the financial instrument.
7. Close_difference: Closing Price Difference -
The discrepancy between consecutive closing
prices, offering a measure of the directional
movement in the financial instrument.
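As referenced in feature 1 above, a minimal sketch of fixed-window fractional differencing follows; the weight recursion comes from the binomial expansion given earlier, while the function names and window length are illustrative.

```python
# Fractional differencing: compute the first n weights of (1 - B)^d via
# w_0 = 1, w_k = -w_{k-1} * (d - k + 1) / k, then apply them over a window.
import numpy as np

def frac_diff_weights(d, n):
    w = [1.0]
    for k in range(1, n):
        w.append(-w[-1] * (d - k + 1) / k)
    return np.array(w)

def frac_diff(x, d, window=100):
    """Fixed-window fractional difference of a 1-D array x."""
    w = frac_diff_weights(d, window)[::-1]         # oldest observation first
    return np.array([w @ x[t - window + 1 : t + 1]
                     for t in range(window - 1, len(x))])

# Sanity check: d = 1 reproduces ordinary first differences.
x = np.arange(10.0)
print(frac_diff(x, d=1.0, window=2))               # [1. 1. 1. ...]
```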
In our analysis, the target variable is defined as
the closing price of the day. Notably, the sum of
Close_lag:1 and Close_difference directly corre-
sponds to the closing price, encapsulating the entirety
of relevant information. This combination serves as
a comprehensive and informative set of features in
our datasets.
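A possible pandas construction of these features is sketched below; the window lengths, the RSI period, and the column names are illustrative assumptions, and the fractional-difference feature can be added with the frac_diff sketch above.

```python
# Sketch of the feature set described in the list above.
import numpy as np
import pandas as pd

def build_features(close: pd.Series, window: int = 14) -> pd.DataFrame:
    delta = close.diff()
    gain = delta.clip(lower=0).rolling(window).mean()
    loss = (-delta.clip(upper=0)).rolling(window).mean()
    rsi = 100 - 100 / (1 + gain / loss)                 # Relative Strength Index
    w = np.arange(1, window + 1, dtype=float)           # linear weights for WMA
    wma = close.rolling(window).apply(lambda v: v @ w / w.sum(), raw=True)
    return pd.DataFrame({
        "WMA": wma,                                     # weighted moving average
        "EMA": close.ewm(span=window).mean(),           # exponential moving average
        "SMA": close.rolling(window).mean(),            # simple moving average
        "RSI": rsi,
        "Close_lag1": close.shift(1),                   # previous day's close
        "Close_difference": close.diff(),               # day-over-day change
    })
```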
However, our experimental results demonstrate a
noteworthy observation: the Multi-Fractional Gra-
dient Descent Linear Regression model uniquely
exhibits the capability to fully discern and exploit
this intrinsic relationship between Close_lag:1,
Close_difference, and the closing price. In contrast,
the other two models in consideration do not exhibit
the same level of proficiency in capturing this vital feature. This highlights the distinctive effectiveness of the MFGD LR model in recognizing and utilizing the inherent information encapsulated within these features.

Figure 2: Correlation Tables
The following four tables present comprehensive
summaries of the performance of three gradient
descent methods employed in linear regression. It
is important to highlight that the numerical values
in the weights and derivative order columns corre-
spond to the order of the features mentioned earlier.
Additionally, it is essential to emphasize that the
hyperparameters of all three models have been metic-
ulously tuned to achieve optimal results, aiming for
the lowest possible error for each respective model.
6 Conclusion
This paper provides a comprehensive
examination of linear regression models utilizing the
gradient descent method to determine optimal values
for the loss function. The initial section of the study
delves into the fundamental aspects of linear regres-
sion, emphasizing the significance of the gradient de-
scent approach in optimizing the loss function.
A critical aspect of the paper involves an in-depth ex-
ploration of various definitions of fractional deriva-
tives, setting the stage for a subsequent discussion on
their application in the context of fractional gradient
descent. This analysis contributes to the theoretical
foundation of the study and establishes the ground-
work for proposing a novel definition for fractional
gradient descent.
The innovative approach introduced in the paper is supported by a series of numerical experiments.

Figure 3: Performance Comparison of Tuned Models. The plot illustrates the results of the experiment, highlighting the robustness of the MFGD LR to multicollinearity, the degradation of GD LR performance due to collinearity, and the intermediate performance of the FGD LR model between these two scenarios. (MFGD LR: Multi-Fractional Gradient Descent Linear Regression; FGD LR: Fractional Gradient Descent Linear Regression; GD LR: Gradient Descent Linear Regression.)

The
results of these experiments reveal the efficacy of the
multi-fractional gradient descent method in enhanc-
ing the robustness of linear regression models, partic-
ularly in the presence of multicollinearity within the
dataset. This finding is crucial as it addresses a com-
mon challenge in regression analysis and provides a
potential solution to improve model performance un-
der such conditions.
Furthermore, the paper demonstrates that the multi-
fractional gradient descent method exhibits a capa-
bility to identify and leverage information embedded
in the dataset. This adaptability leads to improved
training and prediction outcomes when there is valu-
able information present in the data. The emphasis
on leveraging information for enhanced model per-
formance aligns with contemporary trends in machine
learning and contributes to the broader understanding
of gradient descent methods.
6.1 Future Works
In conclusion, our examination of the presented tables
reveals a discernible correlation between the frac-
tional derivative order (α) and the resulting weights
in the model. This implies a connection between α
and the information encoded in the independent vari-
ables (features). The investigation of this relationship
offers promising avenues for future research.
Table 1: Gold

Model                 Alpha                                      Weights                                             MAE
Classic LR            1                                          0.137, 0.218, 0.223, 0.215, 0.007, 0.226, -0.014    10.88
Fractional LR         0.97                                       0.087, 0.223, 0.290, 0.210, 0.035, 0.200, 0.021     9.62
Multi-Fractional LR   0.98, 0.85, 0.74, 0.82, 0.76, 0.53, 0.8    0.018, 0.055, 0.129, 0.069, 0.032, 0.728, 0.023     2.84
Table 2: S&P 500

Model                 Alpha                                      Weights                                               MAE
Classic LR            1                                          0.129, 0.214, 0.226, 0.198, 0.041, 0.262, -0.00023    23.65
Fractional LR         0.97                                       0.079, 0.184, 0.187, 0.163, 0.051, 0.420, 0.032       21.03
Multi-Fractional LR   0.61, 0.58, 0.96, 0.82, 0.96, 0.32, 0.49   0.011, 0.057, 0.011, -0.022, -0.007, 0.944, 0.075     1.97
Table 3: NASDAQ

Model                 Alpha                                      Weights                                             MAE
Classic LR            1                                          0.097, 0.219, 0.226, 0.198, 0.070, 0.275, 0.012     110.77
Fractional LR         0.94                                       0.096, 0.198, 0.208, 0.187, 0.076, 0.332, 0.047     81.04
Multi-Fractional LR   0.93, 0.75, 0.98, 0.76, 0.73, 0.41, 0.60   0.010, 0.040, 0.026, 0.035, 0.009, 0.891, 0.071     11.72
Table 4: Dow Jones

Model                 Alpha                                      Weights                                             MAE
Classic LR            1                                          0.138, 0.213, 0.226, 0.194, 0.044, 0.265, -0.004    217.40
Fractional LR         0.99                                       0.082, 0.182, 0.173, 0.146, 0.040, 0.442, 0.017     97.57
Multi-Fractional LR   0.81, 0.94, 0.97, 0.96, 0.98, 0.41, 0.60   0.012, 0.024, 0.029, 0.026, 0.004, 0.911, 0.073     20.64
Moreover, the applicability of the proposed gradi-
ent descent method extends beyond linear regression.
It can be seamlessly implemented in other machine
learning models, such as deep learning architectures
like recurrent neural networks (RNNs).
Another avenue for future exploration involves the
rigorous analysis of the convergence properties of
the multi-fractional gradient descent method. Under-
standing the convergence behavior is essential for es-
tablishing the reliability and efficiency of the method
across various scenarios.
It is worth noting that, due to the computation of
first and second-order derivatives in the loss function
for fractional derivative calculation, the presented
method involves a higher computational burden com-
pared to classic gradient descent. Future endeavors
should focus on investigating methods to mitigate
computational demands, making the approach more
feasible for real-world applications. These consider-
ations underscore the potential for refinement and en-
hancement in the proposed gradient descent method,
paving the way for its broader adoption and practical
utility.
References:
[1] L. Euler, De progressionibus transcendentibus
seu quarum termini generales algebraice dari
nequeunt, Commentarii academiae scientiarum
Petropolitanae (1738) 36–57.
[2] P. Laplace, Théorie analytique des probabilités,
courcier, paris, Oeuvres Complètes de Laplace
7 (1812) 523–525.
[3] J. B. J. Fourier, Théorie analytique de la chaleur,
Gauthier-Villars et fils, 1888.
[4] N. H. Abel, œuvres complètes de Niels Henrik
Abel, Vol. 1, Grøndahl, 1881.
[5] A. Letnikov, On historical development of differentiation theory with an arbitrary index, Mat. Sb. 3 (1868) 85–112.
[6] A. Letnikov, Theory of differentiation with an arbitrary index, Mat. Sb., Moscow (1868).
[7] A. Letnikov, On explanation of the main propo-
sitions of differentiation theory with an arbitrary
index, Sb. Math 6 (1872) 413–445.
[8] J. Liouville, Mémoire sur quelques questions
de géométrie et de mécanique, et sur un nou-
veau genre de calcul pour résoudre ces ques-
tions, 1832.
[9] J. Liouville, Mémoire sur le changement de la
variable indépendante, dans le calcul des dif-
férentielles a indices quelconques, 1835.
[10] A. K. Grünwald, Über "begrenzte" Derivationen und deren Anwendung, Z. Angew. Math. Phys. 12 (1867) 441–480.
[11] B. Riemann, Versuch einer allgemeinen Auffassung der Integration und Differentiation, Gesammelte Werke (1876).
[12] H. Laurent, Sur le calcul des dérivées à indices
quelconques, Nouvelles annales de mathéma-
tiques: journal des candidats aux écoles poly-
technique et normale 3 (1884) 240–252.
[13] O. Heaviside, III. On operators in physical mathematics, Part I, Proceedings of the Royal Society of London 52 (1893) 504–529.
[14] P. Kulczycki, J. Korbicz, J. Kacprzyk, Fractional
Dynamical Systems: Methods, Algorithms and
Applications, Vol. 402, Springer, 2022.
[15] R. P. Agarwal, Y. Zhou, Y. He, Existence of
fractional neutral functional differential equa-
tions, Computers & Mathematics with Applica-
tions 59 (3) (2010) 1095–1100.
[16] R. P. Agarwal, D. O’Regan, S. Staněk, Posi-
tive solutions for dirichlet problems of singular
nonlinear fractional differential equations, Jour-
nal of Mathematical Analysis and Applications
371 (1) (2010) 57–68.
[17] R. P. Agarwal, M. Benchohra, S. Hamani,
A survey on existence results for boundary
value problems of nonlinear fractional differen-
tial equations and inclusions, Acta Applicandae
Mathematicae 109 (2010) 973–1033.
[18] N.-e. Tatar, Mild solutions for a problem involv-
ing fractional derivatives in the nonlinearity and
in the non-local conditions, Advances in Differ-
ence Equations 2011 (2011) 1–12.
[19] K. Diethelm, N. J. Ford, Volterra integral equa-
tions and fractional calculus: do neighboring so-
lutions intersect?, The Journal of Integral Equa-
tions and Applications (2012) 25–37.
[20] D. Baleanu, K. Diethelm, E. Scalas, J. J. Tru-
jillo, Fractional calculus: models and numerical
methods, Vol. 3, World Scientific, 2012.
[21] C. Ionescu, A. Lopes, D. Copot, J. T. Machado,
J. H. Bates, The role of fractional calculus
in modeling biological phenomena: A review,
Communications in Nonlinear Science and Nu-
merical Simulation 51 (2017) 141–159.
[22] J. S. Jacob, J. H. Priya, A. Karthika, Applica-
tions of fractional calculus in science and engi-
neering, J. Crit. Rev 7 (13) (2020) 4385–4394.
[23] T.-Q. Tang, Z. Shah, R. Jan, E. Alzahrani, Mod-
eling the dynamics of tumor–immune cells in-
teractions via fractional calculus, The European
Physical Journal Plus 137 (3) (2022) 367.
[24] T. Alinei-Poiana, E.-H. Dulf, L. Kovacs, Frac-
tional calculus in mathematical oncology, Sci-
entific Reports 13 (1) (2023) 10083.
[25] M. Joshi, S. Bhosale, V. A. Vyawahare, A sur-
vey of fractional calculus applications in artifi-
cial neural networks, Artificial Intelligence Re-
view (2023) 1–54.
[26] D. Baleanu, Y. Karaca, L. Vázquez, J. E.
Macías-Díaz, Advanced fractional calculus, dif-
ferential equations and neural networks: anal-
ysis, modeling and numerical computations,
Physica Scripta 98 (11) (2023) 110201.
[27] S. Shahmorad, R. Kalantari, A. Assadzadeh,
Numerical solution of fractional black-scholes
model of american put option pricing via a non-
standard finite difference method: Stability and
convergent analysis, Mathematical Methods in
the Applied Sciences 44 (4) (2021) 2790–2805.
[28] S. Raubitzek, K. Mallinger, T. Neubauer, Combining fractional derivatives and machine learning: A review, Entropy 25 (1) (2023) 35.
[29] S. Raubitzek, K. Mallinger, T. Neubauer, Combining fractional derivatives and machine learning: A review, Entropy 25 (1) (2023) 35.
[30] S. K. Chandra, M. K. Bajpai, Efficient machine learning and fractional calculus based mathematical model for early COVID prediction, Human-Centric Intelligent Systems (2023) 1–13.
[31] M. Gulian, M. Raissi, P. Perdikaris, G. Kar-
niadakis, Machine learning of space-fractional
differential equations, SIAM Journal on Scien-
tific Computing 41 (4) (2019) A2485–A2509.
[32] R. Walasek, J. Gajda, Fractional differentiation
and its use in machine learning, International
Journal of Advances in Engineering Sciences
and Applied Mathematics 13 (2-3) (2021) 270–
277.
[33] R. Almeida, S. Pooseh, D. F. Torres, Computa-
tional methods in the fractional calculus of vari-
ations, World Scientific Publishing Company,
2015.
[34] Y. Chen, Q. Gao, Y. Wei, Y. Wang, Study
on fractional order gradient methods, Applied
Mathematics and Computation 314 (2017) 310–
321.
Contribution of individual authors to
the creation of a scientific article
(ghostwriting policy)
Robab Kalantari and Khashayar Rahimi developed a
novel method for gradient descent, conducted simula-
tions, performed optimization, and were responsible
for the writing and implementation of the proposed
approach.
Saman Naderi contributed to editing the manuscript.
Sources of Funding for Research Presented in a
Scientific Article or Scientific Article Itself
No funding was received for conducting this study.
Conflict of Interest
The authors have no conflicts of interest to declare
that are relevant to the content of this article.
Creative Commons Attribution License 4.0
(Attribution 4.0 International, CC BY 4.0)
This article is published under the terms of the
Creative Commons Attribution License 4.0
https://creativecommons.org/licenses/by/4.0/deed.en_US