A New Robust Molding of Heat and Mass Transfer Process in
MHD Based on Adaptive-Network-Based Fuzzy Inference System
AHMAD A. ALHARBI1, AMR R. KAMEL2*, SAMAH A. ATIA3
1Department of Mathematics, Faculty of Science and Arts, Northern Border University,
Arar, SAUDI ARABIA
2Department of Applied Statistics and Econometrics, Faculty of Graduate Studies for Statistical
Research (FGSSR), Cairo University, Giza 12613, EGYPT
2Data Processing and Tabulation at Central Agency for Public Mobilization and
Statistics (CAPMAS), Nasser City 2086, EGYPT
3Department of Mathematical Statistics, Faculty of Graduate Studies for Statistical Research
(FGSSR), Cairo University, Giza 12613, EGYPT
Abstract:- Process intensification deals with complex fluids in the mixing
processes of many industries, and its performance depends on fluid flow and on magnetohydrodynamic (MHD)
heat and mass transfer. This paper proposes a dynamic control model based on an adaptive-network-based fuzzy
inference system (ANFIS), weighted logistic regression and a robust relevance vector machine (RRVM).
Suitable similarity variables are applied to convert the flow equations into higher-order ordinary differential
equations, which are solved numerically. Surface-contour plots are used to visualize the influence of the active
parameters on the velocity, temperature, nanoparticle concentration and motile microorganism density. A hybrid
learning algorithm combining gradient descent and the least-squares method is employed to train the ANFIS.
A novel RRVM is presented to predict the endpoint. The RRVM addresses the sensitivity to outliers that
characterizes the classical relevance vector machine (RVM), and thus obtains higher prediction accuracy. The key
idea of the proposed RRVM is to introduce an individual noise variance coefficient for each training sample.
During training, the noise variance coefficients of outliers gradually decrease, which reduces the impact of
outliers and improves the robustness of the model. A Monte Carlo simulation study was performed to compare the
proposed RRVM with other methods in the presence of outliers. Based on the mean squared error (MSE), mean
absolute error (MAE), root mean squared error (RMSE) and coefficient of determination ($R^2$) criteria, the
simulation results show that the proposed RRVM performs better than the other methods when the data
contain outliers. When the dataset does not contain outliers, the results show that the classical RVM is
more efficient than the other methods.
Key-Words:- ANFIS, Heat and Mass Transfer, MHD flow, Monte Carlo Simulation, Outliers, Robust
Classification, Robust RVM, Sparsity, Weighted Logistic Regression.
Received: June 25, 2021. Revised: January 15, 2022. Accepted: February 26, 2022. Published: March 30, 2022.
1 Introduction
The magnetohydrodynamic (MHD) heat and mass
transfer processes over a moving surface are of
interest in engineering and geophysical applications
such as geothermal reservoirs, thermal insulation,
enhanced oil recovery, packed-bed catalytic
reactors, and the cooling of nuclear reactors. Many
chemical engineering processes, such as metallurgy
and polymer extrusion, require cooling a molten
liquid as it is stretched into a cooling system; the
fluid mechanical characteristics of the final product
are mostly determined by the cooling liquid
employed and the velocity of stretching. Some
polymer fluids with higher electromagnetic
characteristics, such as polyethylene oxide and
polyisobutylene solution in cetane, are commonly
employed as cooling liquids because their flow
may be managed by external magnetic fields to
improve the quality of the final product. Many
transport processes in the industrial world include
simultaneous heat and mass transfer as a result of
the combined buoyancy effects of thermal diffusion
and chemical species diffusion. The study of
combined heat and mass transfer is useful in a
variety of technological transfer processes, and a
few attempts have been made in this direction. The
study of magnetic fields and the movement of
electrically conducting fluids in porous media has
attracted significant interest [1].
WSEAS TRANSACTIONS on HEAT and MASS TRANSFER
DOI: 10.37394/232012.2022.17.9
Ahmad A. Alharbi, Amr R. Kamel, Samah A. Atia
E-ISSN: 2224-3461
80
Volume 17, 2022
Nomenclature
Greek Symbols
target value
forecast value
average target value
number of data
power law index
material time constant
fluid density
kinematic viscosity
mean absorption coefficient
similarity variable
velocity slip factor
thermal slip factor
concentration slip factor
microorganism slip factor
fuzzy rule
logistic loss function
variance matrix
mean value vector
weight associated
kernel width
unique hyperparameter individually
Stefan-Boltzmann constant
Toki and Tokis [2] studied unsteady free convection
flows of an incompressible viscous fluid near a
porous infinite plate with arbitrary time-dependent
heating of the plate. Senapati et al. [3] reported the
effects of chemical reaction on an electrically
conducting viscous fluid in two-dimensional steady
free convection flow through a porous medium
along a vertical surface in the slip flow regime.
Moreover, non-Newtonian fluids and their
properties play an important role in the
intensification of mixing processes in a variety of
sectors, including plastics, paper, rubber, food, and
minerals. The Carreau rheological model is a
non-Newtonian model in which the constitutive
relation holds for both high and low shear rates.
Because of its numerous uses in engineering and
technology, Carreau fluid flow has received
considerable attention. Several studies have been
conducted on the heat and mass transfer properties
of magneto Carreau nanofluids with diverse features
such as heat source/sink, thermal radiation,
suction/injection, and variable thermal conductivity
over a permeable/impermeable stretched sheet,
see [4,5]. Nanofluids improve heat
transmission and can be used to improve the
efficiency of heat exchangers and reactors. In
nanofluids, bioconvection improves mass transfer,
induces microvolume mixing, improves stability,
and prevents nanoparticle clustering. Bio-nano
cooling systems, microfluidic devices, enhanced
energy conservation devices, medical filtration, and
microbial fuel cell technologies are all possible uses
of bioconvection phenomena in nanofluids.
Understanding MHD is inextricably linked to an
understanding of the physical consequences that
occur in MHD. Electric current is induced in the
conductor as it travels into a magnetic field, and the
conductor develops its own magnetic field. The
magnetic field lines will be excluded from the
conductor because the generated magnetic field
seeks to eradicate the original and externally
supported field. The induced field enhances the
applied field when the magnetic field forces the
conductor to move it out of the field. As a result of
this procedure, the force lines appear to be pulled
together with the conductor. The conductor in this
article is the fluid with its complicated motion. To
comprehend the dynamical impact, we must first
understand that when currents are induced in a
conducting fluid moving through a magnetic field, a
Lorentz force acts on the fluid and alters its
velocity. In MHD, the motion affects the field and
vice versa. As a result, the theory is strongly
non-linear.
Data-processing techniques such as the artificial
neural network (ANN), the adaptive-network-based
fuzzy inference system (ANFIS) and the genetic
algorithm (GA) have attracted researchers because of
their applications in many non-linear systems. An
ANFIS can assist us in determining the best
distribution of membership functions by
determining the mapping between input and output
data via hybrid learning. This inference system is
made up of five layers, each containing several
nodes described by a node function. Fixed nodes,
shown by circles, have parameter sets that are fixed
in the system, whereas adaptive nodes, denoted by
squares, have parameter sets that are adjustable.
The outputs of the nodes in one layer serve as the
inputs to the next layer.
To alleviate the above drawbacks, Tipping [6,7]
proposed the relevance vector machine (RVM). The
RVM is a nonlinear probabilistic model based on
Bayesian evidence. To optimize the
hyperparameters of the model and obtain a sparse
solution, it employs the type-II maximum likelihood
approach, often known as the "evidence procedure."
For each of the model coefficients, an independent
zero-mean Gaussian prior is assumed, together with
an independent Gamma hyperprior for each
hyperparameter. A training data set is then used to
determine the posterior distributions of the model
coefficients and hyperparameters. Originally, the
posterior distributions were calculated using the
type-II maximum likelihood approach. An
alternative strategy for the approximation is
variational inference, which maximizes a variational
lower bound on the marginal log likelihood.
The posterior distributions of several of the
model coefficients are strongly peaked around zero,
so those coefficients may be omitted from the
final model, thanks to the hierarchical prior structure
known as the automatic relevance determination
(ARD) prior. As a result, a sparse solution is
obtained. The relevance vectors are the training
observations with non-zero coefficient values. The
support vector machine (SVM) is another common
kernel-based learning technique that also delivers a
sparse solution, see [8]. The support vectors in the
SVM are the observations that contribute to the final
decision boundary. In practice, the RVM offers
significant benefits over the SVM. First, the number
of relevance vectors is substantially smaller than the
number of support vectors, resulting in a higher
degree of sparsity. Second, it generates probabilistic
results (e.g., class probability estimates). Finally,
model complexity is controlled automatically,
without the need for an extra regularization
parameter. However, the RVM has a serious
weakness: it assumes that all of the training samples
are corrupted by independent Gaussian noise,
$\varepsilon_n \sim N(0, \sigma^2)$. A well-known
disadvantage of the Gaussian noise model is that it
is not robust. The accuracy of the RVM model will
be considerably harmed if the training samples are
contaminated by outliers.
In this paper, a novel robust relevance vector
machine (RRVM) is proposed, which assumes that
each training sample has its own noise variance
coefficient. During model training, the coefficients
corresponding to outliers are sharply reduced, so
that outliers are detected and their influence is
removed. We use the proposed RRVM as an
identifier to estimate the endpoint carbon content
and temperature of molten steel. Measured data
are frequently contaminated by outlying observations
in MHD heat and mass transfer processes, but the
RRVM can lessen the impact of outliers and has
strong generalization capacity. It is therefore
appropriate for building the endpoint prediction model.
The remainder of this paper is organized as
follows. Section 2 reviews the literature on MHD
heat and mass transfer processes. Section 3
presents the mathematical formulation of the
problem. Section 4 introduces the methods used in
this paper: ANFIS, weighted logistic regression with
transformation of the logistic function, and RRVM.
RRVM for classification using variational
inference is given in Section 5. Section 6 contains
the Monte Carlo simulation study. Conclusions are
drawn in Section 7.
2 Modeling Studies: Literature Review
There is a growing body of research in the topic of
nanofluids, and multiple examinations of their
thermal conductivities have been carried out to
assess the impact of various factors. While
experimental work necessitates a significant
investment in a well-equipped laboratory and
appropriate instruments, which is a significant
barrier for some scholars, predictive approaches are
increasingly popular for a faster and less expensive
view of various influential parameters on desired
parameters. Actually, predicting the impact of
thermal conductivity of nanofluids is quite difficult,
and this has been a focus of intense research for
scientists.
Naveed et al. [9] examined unsteady MHD
boundary layer (BL) flow over a curved stretching
surface. Abbas et al. numerically examined
radiation effects on the MHD flow of a nanofluid
over a curved stretching surface, incorporating the
combined effects of slip, radiation and heat generation.
Sahoo [10] investigated heat and mass transfer
in the MHD flow of a viscoelastic fluid through a
porous medium bounded by an oscillating plate in
the slip flow regime.
Singh et al. [11] investigated heat and mass transfer
in the MHD flow of a viscous fluid past a vertical
plate with oscillatory suction velocity. Noor et al. [11]
used the shooting method to investigate MHD
flow on an inclined surface with heat source/sink
effects. MHD fluid flow over a rotating disc was
studied by Turkyilmazoglu [12], who analysed the
viscous dissipation and Joule heating terms
using a spectral numerical integration approach.
Chen [13] studied heat and mass transfer in MHD
free convective flow with Ohmic heating and
viscous dissipation using a numerical technique.
In recent years, statistical learning theory has
developed rapidly. It is based on the principle of
structural risk minimization and focuses on
controlling the generalization ability of the learning
process, see [14]. The SVM was created based on
this principle. It improves processing capability by
mapping the data into a high-dimensional space
through kernel functions. In addition, Müller et al.
[15] introduced a regularization parameter to adjust
the trade-off between model complexity and
training error. As a result, SVM has proved to be an
effective tool for identifying non-linear systems,
with several successful applications, see [16]. SVM
has also performed well in steelmaking process
control. To forecast the endpoint parameters of
electric arc furnace steelmaking, Yuan et al. [17]
combined multiple support vector machines with
principal component regression. Valyon and Horváth
[18] suggested a sparse and robust extension of the
least-squares SVM (LS-SVM) for calculating the
quantity of oxygen blown in basic oxygen furnace
(BOF) steelmaking, and showed that LS-SVM
outperformed ANNs. Despite its popularity,
however, SVM has a number of significant practical
drawbacks. Predictions are not probabilistic, and the
kernel function must satisfy Mercer's condition.
Cross-validation is required to estimate the
error/margin trade-off parameter, which is
time-consuming. Furthermore, although SVM is a
sparse model, the number of support vectors grows
linearly with the size of the training sample set.
These drawbacks limit the scope of future uses of
SVM.
On the other hand, several efforts have been
made to construct robust kernel-based learning
algorithms, see Hwang et al. [19,20]. The robust
truncated hinge loss SVM was proposed by Wu and
Liu [21]. Because the underlying optimization
problem involves nonconvex minimization, they
used the difference-of-convex (DC) approach to
solve it through a series of convex sub-problems.
However, because they were built on the SVM
technique, these studies are unable to provide
statistical information such as a class probability.
For logistic regression, Park and Liu [22] used a
truncated logistic loss function to remove the effect
of outlying observations. Although this approach
can estimate the class probability, it does not
provide a sparse solution.
Furthermore, if a dataset contains outliers, a
decision boundary derived from the RVM may be
severely warped. Because data sets with outliers are
regularly encountered in practice, a robust learning
algorithm for the RVM that is insensitive to outliers
is sought. In this paper, the influence of an outlier
on the decision boundaries from the SVM (dotted
line), RVM (dashed line), and the new approach,
which is dubbed the RRVM (full line), is illustrated
using a simulated dataset example in Figures 1-2.
Figure 1 represents the decision boundaries obtained
by employing the linear kernel, while Figure 2
displays the decision boundaries obtained by radial
basis function (RBF) kernel with. For the
SVM, the regularization constant C is set to 1. From
the figures, it is observed that the decision
boundaries from the SVM and RVM are pulled
toward the outlier regardless of the type of
kernel.
Fig. 1 A simulated dataset with outliers: plots of
the decision boundaries from SVM, RVM and
RRVM by employing the linear kernel.
An adaptive-network-based fuzzy inference
system (ANFIS) is used to generate the values of
these control variables, which is based on operator
control experience and production data from a steel
factory. ANFIS can learn from a set of input-output
data and offers competitive computation accuracy.
Combining ANFIS with RRVM, a dynamic control
model of MHD heat and mass transport processes is
created. In order to achieve the intended control
effect, the RRVM model must be well-trained as an
identifier to approximate the link between input and
output and correctly anticipate the endpoint carbon
content and temperature. The simulations in the
final section of this paper will demonstrate that
RRVM has a high degree of approximation ability
and robustness.
Fig. 2 A simulated dataset with outliers: plots of
the decision boundaries from SVM, RVM and
RRVM obtained by RBF kernel with.
Finally, a trimmed relevance vector machine
(TRVM) was suggested by Yuan et al. [17], which
redefined the likelihood function as a trimmed one.
During model training, outliers are removed, and a
weighted technique is used to determine the
trimmed subset. The new technique can detect
outliers and improve the model's robustness. Many
robust methods are discussed by many papers in
several models, see e.g. [23-26].
3 The Mathematical Formulation of
the Problem
In this section, our new robust modeling
approach is described. First, a description
of how the MHD heat and mass transfer
processes are handled is presented. The
adaptation to convective effects is also
included.
3.1 Modeling Description
Consider an unsteady 2D flow of a
magnetohydrodynamic Carreau nano-fluid
containing gyrotactic micro-organisms influenced
by a slendering stretching surface in the presence of
thermal radiation and multiple slips. The heat
transfer and mass transfer features are examined
with the effects of Brownian motion and
thermophoresis.
The slendering sheet is stretched in the
-direction with velocity 󰇛󰇜
󰇛󰇜
and -axis is normal to the
flow, see Figure 3. The surface is assumed to be
impermeable 󰇛󰇜 with the thickness
󰇛󰇜
where . A uniform magnetic
field of strength 󰇛󰇜󰇛󰇜
 is
imposed in the direction transverse to the flow. The
temperature 󰇛󰇜, nanoparticle concentration
󰇛󰇜, and density of motile microorganisms
󰇛󰇜, at the stretching sheet are assumed to be
greater than the ambient values ,
respectively.
Fig. 3 Schematic form of the physical model
3.2 Boundary Conditions and Governing
Equations
Based on the foregoing assumptions, the governing
equations for mass, momentum, energy,
nanoparticle concentration, and microorganisms are
as follows: 

 (1)




󰇡
󰇢
󰇛󰇜
󰇡
󰇢
󰇡
󰇢
󰇛󰇜
(2)




󰇡
󰇢󰇡
󰇢



 (3)




󰇡
󰇢
 (4)





 󰇣
󰇡
󰇢󰇤 (5)
The problem boundary conditions are defined as
follows:
󰇛󰇜
󰇡
󰇢
󰇛󰇜

󰇛󰇜
󰇛󰇜

󰇛󰇜

 as (6)
where 󰇛󰇜 denotes the components of velocity
along 󰇛󰇜directions, t is the time, 
represent the fluid temperature, nanoparticles
volume fraction and motile micro-organisms
density,  stand for material time constant
and power law index,  are the density,
kinematic viscosity, electrical conductivity and
thermal diffusivity,  is the specific heat,
󰇛󰇜󰇛󰇜
where 󰇛󰇜 is effective heat
capacity of nanoparticles and󰇛󰇜 is heat capacity
of base fluid,  indicate the Brownian,
thermophoretic and microorganism diffusion
coefficients,  are the chemotaxis constant
and maximum cell swimming speed ( -
constant),  signify the velocity,
thermal, concentration and microorganism slip
factors.
󰇛󰇜 󰇛󰇜

󰇛󰇜󰇛󰇜
 󰇛󰇜
󰇛󰇜
 (7)
The radiative heat flux can be expressed, using the
Rosseland approximation, as
$q_r = -\frac{4\sigma^*}{3k^*} \frac{\partial T^4}{\partial y}$ (8)
where $\sigma^*$ denotes the Stefan-Boltzmann constant and
$k^*$ is the mean absorption coefficient.
Using Eq. (8), Eq. (3) reduces to






󰇡
󰇢󰇡
󰇢

 (9)
4 Methods Utilized
In this section, the methods utilized in the dynamic
control model, such as ANFIS, Weighted logistic
regression with Transformation of the logistic
function, and RRVM, will be presented.
4.1 Adaptive-Network-Based Fuzzy
Inference System
An adaptive-network-based fuzzy inference system
(ANFIS) is an off-line learning model. It is a type of
artificial neural network that uses the Takagi-Sugeno
fuzzy inference system as its foundation. The
approach was developed in the early 1990s. Because
it integrates neural networks and fuzzy logic
principles, it has the potential to capture the benefits
of both in a single framework. Its inference system
is made up of a set of fuzzy IF-THEN rules with the
capacity to approximate nonlinear functions through
learning. As a result, ANFIS is regarded as a
universal estimator. By building a collection of
fuzzy if-then rules with appropriate membership
functions, it has been widely employed in the
modeling and control of nonlinear systems, see [27].
Generally, an ANFIS model consists of five layers.
The architecture is shown in Figure 4.
The fuzzy rules extracted from input-output pairs
are described as
$R^i$: if $x_1$ is $A_1^i$ and $x_2$ is $A_2^i$ and ... and $x_m$ is $A_m^i$,
then $y^i = f^i(x_1, x_2, \ldots, x_m)$ (10)
where $R^i$ denotes the $i$th fuzzy rule, and $A_1^i, \ldots, A_m^i$
are the fuzzy sets associated with the input variables
$x_1, \ldots, x_m$. The function $f^i(x_1, \ldots, x_m)$ is the
output of the $i$th fuzzy rule. The functions
of the five layers are described as follows:
Layer 1: The input variables are fuzzified and the
membership degrees of $x_j$ ($j = 1, 2, \ldots, m$) on the different fuzzy
sets are calculated according to
$\mu_j^i = \mu_{A_j^i}(x_j)$ (11)
where $\mu_{A_j^i}(\cdot)$ denotes the membership function of
variable $x_j$ on fuzzy set $A_j^i$ and $\mu_j^i$ is the
membership degree.
Layer 2: Calculate the confidence degrees of the fuzzy
rules. For the $i$th fuzzy rule, the degree of
confidence is calculated as
$w^i = \prod_{j=1}^{m} \mu_j^i$ (12)
Layer 3: All of the confidence degrees are
normalized as
$\bar{w}^i = w^i / \sum_{k} w^k$ (13)
Layer 4: Calculate the output of each fuzzy rule
according to formula (14). Here Takagi-Sugeno
type fuzzy rules are adopted.
$y^i = p_0^i + p_1^i x_1 + \cdots + p_m^i x_m$ (14)
where $p_0^i, p_1^i, \ldots, p_m^i$ are the fuzzy consequent
parameters, which can be determined by least-squares
regression.
Layer 5: Calculate the final output of ANFIS. It is
the weighted sum of the $y^i$, and the weights are
the normalized confidence degrees $\bar{w}^i$.
$y = \sum_{i} \bar{w}^i y^i$ (15)
Fig. 4 The architecture of ANFIS
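The five layers described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the Gaussian membership function, the helper names (`gaussian_mf`, `anfis_forward`) and all parameter values are assumptions made for the example, since the text does not fix the shape of the membership functions.

```python
import math


def gaussian_mf(x, center, width):
    """Assumed Gaussian membership function (layer 1)."""
    return math.exp(-((x - center) ** 2) / (2.0 * width ** 2))


def anfis_forward(x, premise, consequent):
    """Forward pass of a first-order Takagi-Sugeno ANFIS.

    x          : list of input values
    premise    : per rule, a list of (center, width) pairs, one per input
    consequent : per rule, [p0, p1, ..., pm] of the linear rule output
    """
    # Layers 1-2: membership degrees and their product (confidence degree)
    firing = []
    for rule in premise:
        w = 1.0
        for xj, (center, width) in zip(x, rule):
            w *= gaussian_mf(xj, center, width)
        firing.append(w)
    # Layer 3: normalize the confidence degrees
    total = sum(firing)
    normalized = [w / total for w in firing]
    # Layer 4: linear (Takagi-Sugeno) output of each rule
    rule_out = [p[0] + sum(pj * xj for pj, xj in zip(p[1:], x))
                for p in consequent]
    # Layer 5: weighted sum of the rule outputs
    y = sum(wb * yi for wb, yi in zip(normalized, rule_out))
    return y, normalized


# Hypothetical two-input, two-rule system
premise = [[(0.0, 1.0), (0.0, 1.0)], [(2.0, 1.0), (2.0, 1.0)]]
consequent = [[0.0, 1.0, 1.0], [1.0, 2.0, 3.0]]
y, weights = anfis_forward([1.0, 2.0], premise, consequent)
print(y, weights)
```

In hybrid learning, the premise parameters (centers and widths here) would be tuned by gradient descent and the consequent parameters by least squares; the sketch only evaluates the network for fixed parameters.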
4.2 Weighted Logistic Regression
In this section, we briefly describe standard
logistic regression and its weighted version for
achieving robustness. Consider a data set of
input-target pairs $\{\mathbf{x}_n, t_n\}_{n=1}^{N}$, where $\mathbf{x}_n$ represents a
$d$-dimensional input vector and $t_n \in \{0, 1\}$ represents the class
label indicating which of the two classes the $n$th
observation belongs to. A decision boundary can be defined as a
linear combination of basis functions as follows:
$y(\mathbf{x}; \mathbf{w}) = \sum_{m=1}^{M} w_m \phi_m(\mathbf{x}) + w_0 = \mathbf{w}^T \boldsymbol{\phi}(\mathbf{x})$ (16)
where $\mathbf{w} = (w_0, w_1, \ldots, w_M)^T$ is a vector of model
coefficients and $\boldsymbol{\phi}(\mathbf{x}) = (1, \phi_1(\mathbf{x}), \ldots, \phi_M(\mathbf{x}))^T$ is
a vector of basis functions. By employing
nonlinear basis functions, the decision boundary
$y(\mathbf{x}; \mathbf{w})$ becomes a nonlinear function of
$\mathbf{x}$. Some commonly used basis functions are the
polynomial kernel, $K(\mathbf{x}, \mathbf{x}') = (\mathbf{x}^T \mathbf{x}' + 1)^p$, where
the parameter $p$ is the degree of the polynomial,
and the Gaussian RBF kernel,
$K(\mathbf{x}, \mathbf{x}') = \exp\{-(\mathbf{x} - \mathbf{x}')^T (\mathbf{x} - \mathbf{x}') / (2\sigma^2)\}$,
where the parameter $\sigma$ is the kernel width.
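The two kernels named above can be written as short Python functions. This is a sketch under assumptions: the `+ 1` offset in the polynomial kernel and the `2 * width ** 2` scaling in the Gaussian RBF kernel are common conventions, and the function and parameter names are illustrative rather than taken from the paper.

```python
import math


def polynomial_kernel(x, z, degree=2):
    """Polynomial kernel (x . z + 1) ** degree; the +1 offset is an assumed convention."""
    return (sum(a * b for a, b in zip(x, z)) + 1.0) ** degree


def rbf_kernel(x, z, width=1.0):
    """Gaussian RBF kernel exp(-||x - z||^2 / (2 * width^2)); scaling is an assumed convention."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-squared_distance / (2.0 * width ** 2))


# Evaluating both kernels on a pair of two-dimensional points
print(polynomial_kernel([1.0, 2.0], [3.0, 4.0], degree=2))
print(rbf_kernel([0.0, 0.0], [1.0, 1.0], width=1.0))
```

A larger kernel width makes the RBF kernel decay more slowly with distance, which yields smoother decision boundaries.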
In standard logistic regression, the conditional
distribution of $t_n$ is given by
$p(t_n \mid \mathbf{w}) = \sigma(y(\mathbf{x}_n; \mathbf{w}))^{t_n} \{1 - \sigma(y(\mathbf{x}_n; \mathbf{w}))\}^{1 - t_n}$
where $\sigma(\cdot)$ is the logistic function defined as
$\sigma(a) = 1 / (1 + e^{-a})$. Assuming independent and
identically distributed data, the likelihood function
can be written as
$p(\mathbf{t} \mid \mathbf{w}) = \prod_{n=1}^{N} p(t_n \mid \mathbf{w}) = \prod_{n=1}^{N} \sigma(y(\mathbf{x}_n; \mathbf{w}))^{t_n} \{1 - \sigma(y(\mathbf{x}_n; \mathbf{w}))\}^{1 - t_n}$ (17)
The model coefficients can be estimated by the
maximum likelihood approach, which can be
formulated as the following optimization problem in
the loss function framework:
$\min_{\mathbf{w}} \sum_{n=1}^{N} L(t_n, y(\mathbf{x}_n; \mathbf{w}))$ (18)
where $L(\cdot, \cdot)$ denotes the logistic loss
function. It should be noted that minimizing the
sum of loss functions is equivalent
to maximizing the log likelihood function,
that is,
$\arg\min_{\mathbf{w}} \sum_{n=1}^{N} L(t_n, y(\mathbf{x}_n; \mathbf{w})) = \arg\max_{\mathbf{w}} \sum_{n=1}^{N} \log p(t_n \mid \mathbf{w}) = \arg\max_{\mathbf{w}} \log \prod_{n=1}^{N} p(t_n \mid \mathbf{w})$ (19)
To obtain a robust classification result, a
weighting strategy can be applied to the standard
logistic regression model in the loss function
framework as follows:
$\min_{\mathbf{w}} \sum_{n=1}^{N} v_n L(t_n, y(\mathbf{x}_n; \mathbf{w}))$ (20)
where $v_n$ is a weight associated with the $n$th
observation. If a small weight is given to an
outlying observation, the effect of the outlier is
reduced and therefore a robust decision boundary
can be obtained. This raises the question of how
the concept of a weighted loss can be carried over
to the maximum likelihood approach. From Eq.
(19), the following relationship can be obtained:
$\arg\min_{\mathbf{w}} \sum_{n=1}^{N} v_n L(t_n, y(\mathbf{x}_n; \mathbf{w})) = \arg\max_{\mathbf{w}} \sum_{n=1}^{N} v_n \log p(t_n \mid \mathbf{w}) = \arg\max_{\mathbf{w}} \log \prod_{n=1}^{N} p(t_n \mid \mathbf{w})^{v_n}$ (21)
Therefore, the concept of a weighted loss can be
handled in the maximum likelihood approach by
replacing $p(t_n \mid \mathbf{w})$ with $p(t_n \mid \mathbf{w})^{v_n}$.
To avoid overfitting when a complex model is
considered, the regularization concept is used in
machine learning. By applying regularization to the
original logistic regression, the formulation in
Eq. (18) can be extended as follows:
$\min_{\mathbf{w}} \sum_{n=1}^{N} L(t_n, y(\mathbf{x}_n; \mathbf{w})) + \lambda J(\mathbf{w})$ (22)
where $\lambda$ is a regularization parameter which
controls the smoothness of the decision boundary and
$J(\mathbf{w})$ denotes a regularization term which
represents a penalty for a complex decision
boundary.
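The weighted and regularized objective described in this section can be illustrated with a small Python sketch. It is not the authors' code: the squared-norm regularizer, the plain gradient-descent solver, the function names (`weighted_loss`, `fit`) and the toy data are all assumptions chosen to keep the example self-contained. Each input row starts with a constant 1 so that the first coefficient acts as the bias.

```python
import math


def sigmoid(a):
    """Numerically stable logistic function."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)


def weighted_loss(w, xs, ts, vs, lam=0.0):
    """Sum of per-sample weighted logistic losses plus lam * ||w||^2 (assumed regularizer)."""
    total = 0.0
    for x, t, v in zip(xs, ts, vs):
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
        total -= v * (t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total + lam * sum(wi * wi for wi in w)


def fit(xs, ts, vs, lam=0.01, step=0.1, n_iter=200):
    """Minimize the weighted, regularized loss by plain gradient descent."""
    w = [0.0] * len(xs[0])
    for _ in range(n_iter):
        grad = [2.0 * lam * wi for wi in w]
        for x, t, v in zip(xs, ts, vs):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
            for j, xj in enumerate(x):
                grad[j] += v * (p - t) * xj
        w = [wi - step * gj for wi, gj in zip(w, grad)]
    return w


# Toy data: the last point (x = 3, label 0) plays the role of an outlier
xs = [[1.0, -2.0], [1.0, -1.0], [1.0, -0.5], [1.0, 0.5],
      [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
ts = [0, 0, 0, 1, 1, 1, 0]
equal_weights = [1.0] * 7
robust_weights = [1.0] * 6 + [0.05]   # small weight on the outlier
print(fit(xs, ts, equal_weights))
print(fit(xs, ts, robust_weights))
```

Giving the outlier a small weight lets the remaining observations determine the boundary; how the weights are chosen is a separate question that this sketch does not address.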
4.3 Classical Relevance Vector Machine
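The relevance vector machine (RVM) is a Bayesian
probabilistic model. Consider the training
samples to be a data collection of input-target pairs
$\{\mathbf{x}_n, t_n\}_{n=1}^{N}$, where $\mathbf{x}_n$ signifies an
$m$-dimensional input vector and $t_n$ denotes a
scalar measured output. Assume that the targets
are sampled independently from the regression model
with additive noise as follows:
$t_n = y(\mathbf{x}_n; \mathbf{w}) + \varepsilon_n$ (23)
where $\varepsilon_n$ is assumed to be zero-mean Gaussian
noise with variance $\sigma^2$, namely $\varepsilon_n \sim N(0, \sigma^2)$.
Similar to SVM, the prediction function $y(\mathbf{x}; \mathbf{w})$ of
RVM is defined as a linear combination of
weighted basis functions:
$y(\mathbf{x}; \mathbf{w}) = \sum_{n=1}^{N} w_n K(\mathbf{x}, \mathbf{x}_n) + w_0$ (24)
where $K(\mathbf{x}, \mathbf{x}_n)$ is a basis function, so that effectively
one basis function is defined for each sample in the training data
set. The weight parameter vector is defined as
$\mathbf{w} = [w_0, w_1, \ldots, w_N]^T$. According to Eq. (23) and the noise
assumption on $\varepsilon_n$, $t_n$ follows a Gaussian distribution
with mean $y(\mathbf{x}_n; \mathbf{w})$ and variance $\sigma^2$, viz.,
$p(t_n \mid \mathbf{x}_n) = N(t_n \mid y(\mathbf{x}_n; \mathbf{w}), \sigma^2)$. For convenience,
a hyperparameter is defined as $\beta = \sigma^{-2}$.
Therefore, the likelihood function of the complete
training data set is expressed as
$p(\mathbf{t} \mid \mathbf{w}, \beta) = (\beta / 2\pi)^{N/2} \exp\{-(\beta / 2) \|\mathbf{t} - \boldsymbol{\Phi}\mathbf{w}\|^2\}$ (25)
where $\mathbf{t} = [t_1, \ldots, t_N]^T$ and $\boldsymbol{\Phi} \in \mathbb{R}^{N \times (N+1)}$,
defined as $\boldsymbol{\Phi} = [\boldsymbol{\phi}(\mathbf{x}_1), \boldsymbol{\phi}(\mathbf{x}_2), \ldots, \boldsymbol{\phi}(\mathbf{x}_N)]^T$,
is called the design matrix. The definition of $\boldsymbol{\phi}(\mathbf{x}_n)$ is
$\boldsymbol{\phi}(\mathbf{x}_n) = [1, K(\mathbf{x}_n, \mathbf{x}_1), K(\mathbf{x}_n, \mathbf{x}_2), \ldots, K(\mathbf{x}_n, \mathbf{x}_N)]^T$.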
The goal of RVM training is to determine the
posterior distribution over the weight
vector $\mathbf{w}$. The prior distribution over
$w_i$ ($i = 0, 1, \ldots, N$) should be specified first in
order to obtain a sparse model. Assume that each
$w_i$ follows a Gaussian distribution with mean zero
and variance $\alpha_i^{-1}$; the prior distribution
over $\mathbf{w}$ is then
$p(\mathbf{w} \mid \boldsymbol{\alpha}) = \prod_{i=0}^{N} N(w_i \mid 0, \alpha_i^{-1})$ (26)
where $\alpha_i$ is the hyperparameter individually
associated with each weight parameter $w_i$, and
$\boldsymbol{\alpha} = [\alpha_0, \alpha_1, \ldots, \alpha_N]^T$. The posterior distribution over
$\mathbf{w}$ may be obtained using Bayes' rule with the
prior distribution Eq. (26) and the likelihood
function Eq. (25):
$p(\mathbf{w} \mid \mathbf{t}, \boldsymbol{\alpha}, \beta) = p(\mathbf{t} \mid \mathbf{w}, \beta)\, p(\mathbf{w} \mid \boldsymbol{\alpha}) / p(\mathbf{t} \mid \boldsymbol{\alpha}, \beta)$ (27)
Since $p(\mathbf{t} \mid \mathbf{w}, \beta)$ and $p(\mathbf{w} \mid \boldsymbol{\alpha})$ are both Gaussian, the
product of these two distributions is also Gaussian.
Furthermore, $p(\mathbf{t} \mid \boldsymbol{\alpha}, \beta)$ does not include $\mathbf{w}$, so it is
considered as a normalization coefficient.
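The posterior distribution over $\mathbf{w}$ is also Gaussian
and can be expressed as:
$p(\mathbf{w} \mid \mathbf{t}, \boldsymbol{\alpha}, \beta) = N(\mathbf{w} \mid \boldsymbol{\mu}, \boldsymbol{\Sigma})$ (28)
where $\boldsymbol{\mu}$ is the mean value vector and $\boldsymbol{\Sigma}$ is the
variance matrix, which are expressed as formulas
(29) and (30), respectively:
$\boldsymbol{\mu} = \beta \boldsymbol{\Sigma} \boldsymbol{\Phi}^T \mathbf{t}$ (29)
$\boldsymbol{\Sigma} = (\beta \boldsymbol{\Phi}^T \boldsymbol{\Phi} + \mathbf{A})^{-1}$ (30)
where $\mathbf{A} = \mathrm{diag}(\alpha_0, \alpha_1, \ldots, \alpha_N)$. The posterior
distribution over $\mathbf{w}$ is determined by the
hyperparameters $\boldsymbol{\alpha}$ and $\beta$, thus the hyperparameters
are optimized by using the evidence procedure. The
iterative optimization formulas for the hyperparameters
are
$\alpha_i^{new} = (1 - \alpha_i \Sigma_{ii}) / \mu_i^2$ (31)
$\beta^{new} = (N - \sum_{i} (1 - \alpha_i \Sigma_{ii})) / \|\mathbf{t} - \boldsymbol{\Phi}\boldsymbol{\mu}\|^2$ (32)
where $\mu_i$ denotes the $i$th element of vector $\boldsymbol{\mu}$ and
$\Sigma_{ii}$ denotes the $i$th diagonal element of matrix
$\boldsymbol{\Sigma}$. In the process of training,
Equations (29)-(32) are calculated iteratively.
Most of the $\alpha_i$ tend toward infinity and the
corresponding $w_i$ tend toward zero. Training
stops when all the hyperparameters have converged or
the maximum number of iterations is reached.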
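The iterative training loop can be sketched in pure Python using the standard evidence-procedure updates from Tipping's RVM: posterior covariance and mean, then re-estimation of each weight hyperparameter and of the noise precision. This is an illustrative sketch, not the authors' implementation. The design matrix below is a generic one (bias, one relevant feature, one irrelevant feature) rather than a kernel matrix, and the helper `invert`, the cap `alpha_max`, the numerical guards and the toy data are all assumptions.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def rvm_fit(phi, t, n_iter=50, alpha_max=1e9):
    """Evidence-procedure training for RVM regression on design matrix phi."""
    n, m = len(phi), len(phi[0])
    alpha, beta = [1.0] * m, 1.0
    for _ in range(n_iter):
        # Posterior covariance: Sigma = (beta * Phi^T Phi + A)^-1
        precision = [[beta * sum(phi[k][i] * phi[k][j] for k in range(n))
                      + (alpha[i] if i == j else 0.0)
                      for j in range(m)] for i in range(m)]
        sigma = invert(precision)
        # Posterior mean: mu = beta * Sigma Phi^T t
        phi_t = [sum(phi[k][i] * t[k] for k in range(n)) for i in range(m)]
        mu = [beta * sum(sigma[i][j] * phi_t[j] for j in range(m))
              for i in range(m)]
        # Hyperparameter re-estimation (guards and cap are added for safety)
        gamma = [max(1.0 - alpha[i] * sigma[i][i], 0.0) for i in range(m)]
        alpha = [min(gamma[i] / max(mu[i] ** 2, 1e-12), alpha_max)
                 for i in range(m)]
        resid = sum((t[k] - sum(phi[k][i] * mu[i] for i in range(m))) ** 2
                    for k in range(n))
        beta = max(n - sum(gamma), 1e-12) / max(resid, 1e-12)
    return mu, alpha, beta


# Toy data: t is roughly 2 * x; the third column is an irrelevant feature
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
zs = [1.0, -1.0, 0.0, -1.0, 1.0]
ts = [-4.1, -1.9, 0.1, 2.0, 3.9]
phi = [[1.0, x, z] for x, z in zip(xs, zs)]
mu, alpha, beta = rvm_fit(phi, ts)
print(mu, alpha, beta)
```

Weights whose hyperparameter grows very large are effectively pruned, which is the source of sparsity; in a kernel RVM the surviving columns correspond to the relevance vectors.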
Classic RVM is based on the assumption that
the noise of each training sample follows a
zero-mean Gaussian distribution with the same
variance (equivalently, the same noise
hyperparameter). In actual applications, measured
data are usually contaminated by outlying
observations, making the Gaussian-noise assumption
untenable. This weakens the robustness of the RVM
regression model and diminishes its prediction
accuracy. To alleviate this problem, researchers
have proposed several modified methods. Tipping
and Lawrence [28] improved RVM by using the
Student-t noise model, which has heavier tails than
the Gaussian noise model. That technique, however,
relies on variational approximation, which takes
longer to compute.
4.4 Proposed New Modeling
In this section, our new modeling approach is
described. The modified strategies above are
mainly based on variational inference or on
trimming the data set. The proposed robust
relevance vector machine (RRVM) reduces the
impact of outliers while still being implementable
with the evidence procedure. Rather than using the
same noise variance for all samples, we assume that
each training sample has its own noise variance
coefficient. The iteration formulae are then derived
within the Bayesian evidence framework to optimize
the hyperparameters and the noise variance
coefficients. The noise variance coefficients of
outliers decrease during the optimization process,
allowing outliers to be detected and eliminated. The
following is a full description of the
optimization technique.
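The effect of a per-sample coefficient can be illustrated with weighted least squares. The sketch assumes, as in Bayesian weighted regression, that a sample with coefficient `b` has noise variance proportional to `1 / b`, so that for fixed coefficients the most probable weight is the weighted least-squares estimate. The single-slope model, the function name `weighted_slope`, the data and the hand-picked coefficients are illustrative assumptions; in the RRVM the coefficients are estimated during training rather than set by hand.

```python
def weighted_slope(xs, ts, coeffs):
    """Weighted least-squares slope for the model t = w * x.

    A small coefficient means a large assumed noise variance for that
    sample, so the sample has little influence on the estimate.
    """
    numerator = sum(b * x * t for x, t, b in zip(xs, ts, coeffs))
    denominator = sum(b * x * x for x, b in zip(xs, coeffs))
    return numerator / denominator


# Data generated by t = 2 * x, except for the last point, which is an outlier
xs = [1.0, 2.0, 3.0, 4.0]
ts = [2.0, 4.0, 6.0, 20.0]

plain = weighted_slope(xs, ts, [1.0, 1.0, 1.0, 1.0])
robust = weighted_slope(xs, ts, [1.0, 1.0, 1.0, 0.01])
print(plain, robust)
```

With equal coefficients the outlier drags the slope well away from 2, while a small coefficient on the outlier brings the estimate back close to 2.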
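Following the Bayesian weighted linear regression of Ting et al. [29], the noise of the $i$-th training sample is assumed to be distributed as
$$p(\varepsilon_i) = \mathcal{N}\left(0,\ \sigma^2/\beta_i\right), \quad i = 1, \dots, N, \tag{33}$$
where $\sigma^2$ denotes the average variance of all the training samples and $\beta_i$ denotes the noise variance coefficient of the $i$-th sample. The prior distribution of $\beta_i$ is assumed to be a Gamma distribution, namely
$$p(\beta_i) = \mathrm{Gamma}(\beta_i \mid a_i, b_i) = \frac{b_i^{a_i}}{\Gamma(a_i)}\,\beta_i^{a_i - 1} e^{-b_i \beta_i},$$
with gamma function $\Gamma(\cdot)$.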
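Define the vector $\boldsymbol{\beta} = [\beta_1, \beta_2, \dots, \beta_N]^{\mathsf T}$. The likelihood function of the complete training sample set then changes from Eq. (25) to
$$p(\mathbf{t} \mid \mathbf{w}, \boldsymbol{\beta}, \sigma^2) = (2\pi\sigma^2)^{-N/2}\,|B|^{1/2} \exp\left\{-\frac{1}{2\sigma^2}(\mathbf{t} - \Phi\mathbf{w})^{\mathsf T} B\,(\mathbf{t} - \Phi\mathbf{w})\right\}, \tag{34}$$
where $B = \mathrm{diag}(\beta_1, \beta_2, \dots, \beta_N)$ and $|B|$ is the determinant of the matrix $B$. The definitions of $\mathbf{t}$ and $\Phi$ are the same as before. The prior distribution over $\mathbf{w}$ is still expressed as Eq. (26). According to Bayes' rule, the posterior distribution of $\mathbf{w}$ is computed as
$$p(\mathbf{w} \mid \mathbf{t}, \boldsymbol{\alpha}, \boldsymbol{\beta}, \sigma^2) = \frac{p(\mathbf{t} \mid \mathbf{w}, \boldsymbol{\beta}, \sigma^2)\,p(\mathbf{w} \mid \boldsymbol{\alpha})}{p(\mathbf{t} \mid \boldsymbol{\alpha}, \boldsymbol{\beta}, \sigma^2)} = \mathcal{N}(\mathbf{w} \mid \boldsymbol{\mu}, \Sigma),$$
where the variance matrix $\Sigma$ and the mean vector $\boldsymbol{\mu}$ are computed using the following formulas:
$$\Sigma = \left(\sigma^{-2}\,\Phi^{\mathsf T} B\,\Phi + A\right)^{-1}, \tag{35}$$
$$\boldsymbol{\mu} = \sigma^{-2}\,\Sigma\,\Phi^{\mathsf T} B\,\mathbf{t}, \tag{36}$$
with $A = \mathrm{diag}(\boldsymbol{\alpha})$ the diagonal matrix of the prior hyperparameters in Eq. (26).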
WSEAS TRANSACTIONS on HEAT and MASS TRANSFER
DOI: 10.37394/232012.2022.17.9
Ahmad A. Alharbi, Amr R. Kamel, Samah A. Atia
E-ISSN: 2224-3461
88
Volume 17, 2022
Since the formulas for the variance matrix $\Sigma$ and the mean vector $\boldsymbol{\mu}$ both depend on $\boldsymbol{\alpha}$, $\boldsymbol{\beta}$ and $\sigma^2$, these hyperparameters need to be optimized so as to maximize the posterior distribution of $\mathbf{w}$.
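To make the role of the noise variance coefficients concrete, the following NumPy sketch evaluates the posterior covariance and mean for a toy problem. It assumes the standard weighted Bayesian linear regression forms $\Sigma = (\sigma^{-2}\Phi^{\mathsf T} B \Phi + A)^{-1}$ and $\boldsymbol{\mu} = \sigma^{-2}\Sigma\Phi^{\mathsf T} B \mathbf{t}$ for Eqs. (35) and (36); the function name `posterior`, the two-column basis and the toy data are hypothetical and are not taken from the paper.

```python
import numpy as np

def posterior(Phi, t, alpha, beta, sigma2):
    """Posterior covariance and mean of the weights (assumed forms of Eqs. 35-36).

    Sigma = (Phi^T B Phi / sigma2 + A)^{-1},  mu = Sigma Phi^T B t / sigma2,
    with A = diag(alpha) and B = diag(beta).
    """
    A = np.diag(alpha)
    PhiT_B = Phi.T * beta            # equals Phi.T @ diag(beta) without forming B
    Sigma = np.linalg.inv(PhiT_B @ Phi / sigma2 + A)
    mu = Sigma @ PhiT_B @ t / sigma2
    return Sigma, mu

# Toy data (hypothetical): the line t = 2x with one gross outlier at the end.
x = np.linspace(0.0, 1.0, 6)
Phi = np.column_stack([np.ones_like(x), x])   # basis: bias and linear term
t = 2.0 * x
t[-1] = 10.0                                   # outlier; the clean value is 2.0
alpha = np.full(2, 1e-6)                       # nearly flat prior on the weights

# Equal coefficients: the outlier distorts the fitted slope.
_, mu_equal = posterior(Phi, t, alpha, np.ones(6), sigma2=0.01)

# A tiny coefficient for the outlier effectively removes it from the fit.
beta_down = np.ones(6)
beta_down[-1] = 1e-8
_, mu_robust = posterior(Phi, t, alpha, beta_down, sigma2=0.01)
```

With equal coefficients the single outlier pulls the estimated slope far from 2, whereas giving that sample a very small coefficient recovers an intercept near 0 and a slope near 2. This is the mechanism the iterative procedure exploits.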
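The marginal likelihood function is computed as follows:
$$p(\mathbf{t} \mid \boldsymbol{\alpha}, \boldsymbol{\beta}, \sigma^2) = \int p(\mathbf{t} \mid \mathbf{w}, \boldsymbol{\beta}, \sigma^2)\,p(\mathbf{w} \mid \boldsymbol{\alpha})\,d\mathbf{w} = (2\pi)^{-N/2}\,|C|^{-1/2} \exp\left\{-\frac{1}{2}\,\mathbf{t}^{\mathsf T} C^{-1}\,\mathbf{t}\right\}, \tag{37}$$
where $C = \sigma^2 B^{-1} + \Phi A^{-1} \Phi^{\mathsf T}$, $A = \mathrm{diag}(\boldsymbol{\alpha})$ and $B = \mathrm{diag}(\boldsymbol{\beta})$. Equivalently, we can optimize the logarithm of the product of $p(\mathbf{t} \mid \boldsymbol{\alpha}, \boldsymbol{\beta}, \sigma^2)$ and $p(\boldsymbol{\beta})$. Moreover, we maximize this quantity with respect to $\log \boldsymbol{\alpha}$ and $\log \boldsymbol{\beta}$ for convenience of computation. Therefore, the objective to be optimized is
$$\log p(\mathbf{t} \mid \boldsymbol{\alpha}, \boldsymbol{\beta}, \sigma^2) + \log p(\log \boldsymbol{\beta}).$$
Noting that $p(\log \beta_i) = \beta_i\,p(\beta_i)$ and deleting the terms that are independent of $\boldsymbol{\alpha}$, $\boldsymbol{\beta}$ and $\sigma^2$, we get the objective function
$$\mathcal{L} = -\frac{1}{2}\left[\log |C| + \mathbf{t}^{\mathsf T} C^{-1}\,\mathbf{t}\right] + \sum_{i=1}^{N}\left(a_i \log \beta_i - b_i \beta_i\right). \tag{38}$$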
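The optimal values of $\boldsymbol{\alpha}$, $\boldsymbol{\beta}$ and $\sigma^2$ cannot be obtained in closed form and have to be re-estimated iteratively. Taking the partial derivatives of Eq. (38) with respect to $\log \alpha_i$, $\log \beta_i$ and $\sigma^2$, setting them to zero and rearranging gives the iteration formulas for $\alpha_i$, $\beta_i$ and $\sigma^2$:
$$\alpha_i^{\text{new}} = \frac{\gamma_i}{\mu_i^2}, \tag{39}$$
$$\beta_i^{\text{new}} = \frac{2a_i + 1}{2b_i + \sigma^{-2}\left[\left(t_i - \boldsymbol{\phi}(x_i)\boldsymbol{\mu}\right)^2 + \mathrm{tr}\left(\Sigma\,\boldsymbol{\phi}(x_i)^{\mathsf T}\boldsymbol{\phi}(x_i)\right)\right]}, \tag{40}$$
$$(\sigma^2)^{\text{new}} = \frac{(\mathbf{t} - \Phi\boldsymbol{\mu})^{\mathsf T} B\,(\mathbf{t} - \Phi\boldsymbol{\mu})}{N - \sum_i \gamma_i}, \tag{41}$$
where $\gamma_i = 1 - \alpha_i \Sigma_{ii}$, $\Sigma_{ii}$ is the $i$-th diagonal element of the variance matrix $\Sigma$, $\boldsymbol{\phi}(x_i)$ is the $i$-th row of $\Phi$, and $\mathrm{tr}(\cdot)$ denotes the trace of a matrix. This completes the set of iterative formulas for optimization: formulas (35) and (36) give the estimates of $\Sigma$ and $\boldsymbol{\mu}$, and formulas (39), (40) and (41) give the estimates of the hyperparameters $\boldsymbol{\alpha}$, $\boldsymbol{\beta}$ and $\sigma^2$, respectively.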
5 RRVM for Classification Using Variational Inference
In classification, it is not possible to directly seek
the posterior distributions over the model
coefficients since the logistic likelihood function is
not suitable to be combined with a Gaussian prior.
To resolve this issue, Jaakkola and Jordan [30]
introduced a transformed logistic function that is
quadratically dependent on the model coefficients in
the exponent and used it to assess a logistic
regression model with a Gaussian prior over the
model coefficients in a Bayesian framework. Bishop
and Tipping [31] used these findings to develop an
alternate training procedure for the RVM in the
context of variational inference.
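A lower bound on the logistic function that has the functional form of a Gaussian is obtained as follows; see [30]. To begin, decompose the logarithm of the logistic function $\sigma(x) = 1/(1 + e^{-x})$ as
$$\log \sigma(x) = -\log\left(1 + e^{-x}\right) = \frac{x}{2} - \log\left(e^{x/2} + e^{-x/2}\right). \tag{42}$$
Note that the function $f(x) = -\log\left(e^{x/2} + e^{-x/2}\right)$ is convex with respect to the variable $x^2$. Since a tangent surface to a convex function is a global lower bound for the function, a global lower bound on $f(x)$ can be obtained from a first-order Taylor expansion in the variable $x^2$ at the point $\xi^2$ (where $\xi$ is called a variational parameter in the variational inference framework). That is,
$$f(x) \ge f(\xi) + \frac{\partial f(\xi)}{\partial (\xi^2)}\left(x^2 - \xi^2\right) = -\frac{\xi}{2} + \log \sigma(\xi) - \frac{1}{4\xi}\tanh\left(\frac{\xi}{2}\right)\left(x^2 - \xi^2\right).$$
Combining this lower bound on $f(x)$ with Eq. (42), the lower bound on the logistic function is obtained as
$$\sigma(x) \ge \sigma(\xi)\exp\left\{\frac{x - \xi}{2} - \lambda(\xi)\left(x^2 - \xi^2\right)\right\}, \tag{43}$$
where $\lambda(\xi) = \frac{1}{4\xi}\tanh\left(\frac{\xi}{2}\right) = \frac{1}{2\xi}\left\{\sigma(\xi) - \frac{1}{2}\right\}$. The bound is the exponential of a quadratic function of $x$, which makes the Bayesian approach analytically tractable.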
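The bound can be checked numerically. The short sketch below assumes the standard Jaakkola-Jordan form $\sigma(x) \ge \sigma(\xi)\exp\{(x-\xi)/2 - \lambda(\xi)(x^2-\xi^2)\}$ with $\lambda(\xi) = \tanh(\xi/2)/(4\xi)$; the function names, the grid and the value of $\xi$ are illustrative choices only.

```python
import math

def sigmoid(x):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-x))

def lam(xi):
    """lambda(xi) = tanh(xi / 2) / (4 * xi), with its limit 1/8 at xi = 0."""
    if abs(xi) < 1e-8:
        return 0.125
    return math.tanh(xi / 2.0) / (4.0 * xi)

def lower_bound(x, xi):
    """Gaussian-form lower bound on sigmoid(x) with variational parameter xi."""
    return sigmoid(xi) * math.exp((x - xi) / 2.0 - lam(xi) * (x * x - xi * xi))

xi = 2.5                                         # illustrative variational parameter
grid = [i / 10.0 for i in range(-60, 61)]        # x from -6.0 to 6.0
gaps = [sigmoid(x) - lower_bound(x, xi) for x in grid]
```

The gap is non-negative everywhere on the grid and vanishes at $x = \pm\xi$, which is why $\xi$ is re-estimated during training to keep the bound tight around the current fit.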
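Again, the conditional distribution for a binary target $t_n \in \{0, 1\}$ can be written as
$$p(t_n \mid \mathbf{w}) = \sigma(y_n)^{t_n}\left[1 - \sigma(y_n)\right]^{1 - t_n} = \left(\frac{1}{1 + e^{-y_n}}\right)^{t_n}\left(1 - \frac{1}{1 + e^{-y_n}}\right)^{1 - t_n} = e^{y_n t_n}\,\sigma(-y_n),$$
where $y_n = \boldsymbol{\phi}(x_n)\mathbf{w}$ is the model output for the $n$-th sample. Then, the following relationship holds due to Eq. (43):
$$p(t_n \mid \mathbf{w}) = e^{y_n t_n}\,\sigma(-y_n) \ge e^{y_n t_n}\,\sigma(\xi_n)\exp\left\{-\frac{y_n + \xi_n}{2} - \lambda(\xi_n)\left(y_n^2 - \xi_n^2\right)\right\} = F(t_n, \mathbf{w}, \xi_n).$$
Therefore, the likelihood function can be bounded as
$$p(\mathbf{t} \mid \mathbf{w}) = \prod_{n=1}^{N} p(t_n \mid \mathbf{w}) \ge \prod_{n=1}^{N} F(t_n, \mathbf{w}, \xi_n).$$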
Consequently, from Eq. (21), the modified
likelihood function to downweight outliers is given
by;
󰇛
󰇜
󰇛󰇜
󰇛󰇜
󰇛󰇜
 󰇛󰇜󰇛󰇜󰇛󰇜
󰇛󰇜󰇡󰇛󰇜󰇢
 󰇛󰇜󰇛󰇜
6 Simulation Study
Monte Carlo experiments were performed in the presence of outliers, using benchmark and industrial data to evaluate the performance of the dynamic control model. Different simulation factors are used to investigate the performance of the models in different situations. Summarizing the above arguments, the practical setup and the whole training procedure of RRVM are as follows.
In practical use of this algorithm, the priors used in Eqs. (35)-(41) must be initialized. First, $\boldsymbol{\alpha}$ and $\sigma^2$ can be initialized according to the characteristics of the data set, for example in terms of $\mathrm{var}(\mathbf{t})$, the variance of the targets $\mathbf{t}$. Second, the parameters $a_i$ and $b_i$ of the prior distribution $\mathrm{Gamma}(a_i, b_i)$ of $\beta_i$ should be selected so that the prior mean of $\beta_i$ is 1. For example, when the parameters are set so that $a_i = b_i$, the noise variance coefficient $\beta_i$ has a prior mean of $a_i / b_i = 1$ with a variance of $a_i / b_i^2 = 1 / a_i$. This means that we start by assuming that the noise distributions of all the samples are Gaussian with the same variance, that is, all of the training samples are treated as inliers. With such values, the range of $\beta_i$ is bounded, as can be inferred from Eq. (40). This setting of the prior parameters is generally valid for most applications and data sets. During the iterations, the $\beta_i$ corresponding to outliers gradually become small.
Eq. (40) shows that the squared prediction error $\left(t_i - \boldsymbol{\phi}(x_i)\boldsymbol{\mu}\right)^2$ of data point $\{x_i, t_i\}$ appears in the denominator. If this prediction error is so large that it dominates the other denominator terms, the corresponding noise variance coefficient $\beta_i$ of that point will be very small. As the prediction error term in the denominator tends to infinity, $\beta_i$ approaches zero. As can be seen from Eqs. (35) and (36), the formulas for $\Sigma$ and $\boldsymbol{\mu}$ of the posterior distribution over $\mathbf{w}$ both include a term that is a linearly weighted combination of all the samples, with weights given exactly by $\beta_i$. A sample with an extremely small coefficient $\beta_i$ therefore contributes little to the estimates of $\Sigma$ and $\boldsymbol{\mu}$. If the coefficient $\beta_i$ of the data sample $\{x_i, t_i\}$ is small enough, this is equivalent to detecting and removing an outlier, which improves the robustness of the model. After training, RRVM can be used to make predictions based on the posterior distribution over $\mathbf{w}$. For a new input $x_*$, the output is $y_* = \boldsymbol{\phi}(x_*)\boldsymbol{\mu}$.
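The down-weighting mechanism can be illustrated with a few lines of arithmetic. The sketch below assumes that the re-estimate of a noise variance coefficient has the form $\beta_i = (2a + 1) / \left(2b + \sigma^{-2}[e_i^2 + \tau_i]\right)$, where $e_i$ is the prediction error and $\tau_i$ the trace term; this is the form obtained by setting the derivative of the objective with respect to $\log \beta_i$ to zero, and it may differ in detail from the original Eq. (40). The values of `a`, `b`, `sigma2` and the error grid are hypothetical.

```python
def gamma_prior_moments(a, b):
    """Mean and variance of a Gamma(a, b) prior with shape a and rate b."""
    return a / b, a / (b * b)

def beta_update(a, b, sq_error, trace_term, sigma2):
    """Assumed re-estimate of one noise variance coefficient (cf. Eq. 40)."""
    return (2.0 * a + 1.0) / (2.0 * b + (sq_error + trace_term) / sigma2)

a = b = 5.0                      # hypothetical values, chosen so the prior mean is 1
mean, var = gamma_prior_moments(a, b)

# Coefficient as a function of the prediction error (trace term set to 0 here).
errors = [0.0, 0.1, 1.0, 10.0, 100.0]
betas = [beta_update(a, b, e * e, trace_term=0.0, sigma2=0.25) for e in errors]
```

Under this assumed form the coefficient is largest, $(2a + 1)/(2b)$, for a perfectly fitted sample and decays towards zero as the prediction error grows, so a gross outlier ends up with almost no weight in Eqs. (35) and (36).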
The sample sizes are $n$ = 100, 150, 300 and 400. First, we investigate the approximation performance of RRVM with the clean training sample set. Then, outliers generated from a standard Gaussian distribution are added to the training sample set, mixing 10%, 20%, 30% and 40% outliers with the clean training samples, respectively. To evaluate the generalization performance in terms of robustness, each data set is randomly divided into training (60%) and test (40%) data sets.
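The simulation design can be sketched as follows. The benchmark function is not recoverable from the text, so a sinc-type target is used purely as a hypothetical stand-in; the sketch also assumes that the data are split first and that standard Gaussian perturbations are then added to a fraction of the training targets only. The function name `make_split` and the seed are illustrative.

```python
import math
import random

def make_split(n, outlier_frac, seed=0, train_frac=0.6):
    """Return (train, test, outlier_idx) for one simulated data set.

    train and test are lists of (x, t) pairs; outlier_idx lists the positions
    in train whose targets were perturbed by a standard Gaussian draw.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(-10.0, 10.0)
        t = math.sin(x) / x if x != 0.0 else 1.0   # hypothetical clean target
        data.append((x, t))
    rng.shuffle(data)
    n_train = int(round(train_frac * n))
    train, test = data[:n_train], data[n_train:]
    n_out = int(round(outlier_frac * n_train))
    outlier_idx = set(rng.sample(range(n_train), n_out))
    train = [(x, t + rng.gauss(0.0, 1.0)) if i in outlier_idx else (x, t)
             for i, (x, t) in enumerate(train)]
    return train, test, sorted(outlier_idx)

# One data set per combination of sample size and outlier percentage.
splits = {(n, p): make_split(n, p, seed=42)
          for n in (100, 150, 300, 400)
          for p in (0.0, 0.1, 0.2, 0.3, 0.4)}
```

Fixing the seed means every method is evaluated on exactly the same simulated data, which matches the requirement that all experiments use the same series of random numbers.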
6.1 Iterative algorithm
The update equations (35) to (41) are applied iteratively. The training procedure of the proposed method can be summarized as follows:
Step 1: Initialize the hyperparameters $\boldsymbol{\alpha}$ and $\sigma^2$ as well as $a$ and $b$.
Step 2: Compute the variance matrix $\Sigma$ and the mean vector $\boldsymbol{\mu}$ of the posterior distribution over $\mathbf{w}$ using Eqs. (35) and (36), respectively.
Step 3: Re-estimate the hyperparameters $\boldsymbol{\alpha}$, $\boldsymbol{\beta}$ and $\sigma^2$ according to Eqs. (39)-(41). Many of the $\alpha_i$ tend to infinity during the optimization (in practice, they exceed a large threshold). By Eq. (39), this indicates that the corresponding $\mu_i$, and hence the weight $w_i$, tend to zero. Model sparsity is obtained by pruning the corresponding basis functions.
Step 4: Check whether all of the parameters have converged or the maximum number of iterations has been reached. If so, stop training; otherwise, return to Step 2. When training is completed, the basis functions corresponding to non-zero weights are referred to as "relevance vectors".
Step 5: All Monte Carlo experiments are replicated, and the results of all separate experiments are obtained using precisely the same series of random numbers.
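Steps 1 to 4 can be sketched in NumPy as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the update forms $\alpha_i = \gamma_i/\mu_i^2$, $\beta_i = (2a+1)/(2b + \sigma^{-2}[e_i^2 + \boldsymbol{\phi}_i\Sigma\boldsymbol{\phi}_i^{\mathsf T}])$ and $\sigma^2 = \sum_i \beta_i e_i^2 / (N - \sum_i \gamma_i)$ are assumed, the freshly updated $\beta_i$ are used in the $\sigma^2$ update, and the initial values, the threshold `alpha_max`, the function name `train_rrvm` and the toy data are all hypothetical.

```python
import numpy as np

def train_rrvm(Phi, t, a=5.0, b=5.0, max_iter=500, tol=1e-6, alpha_max=1e9):
    """Sketch of Steps 1-4 for regression; returns (mu, alpha, beta, sigma2, active)."""
    N, M = Phi.shape
    # Step 1: initialise the hyperparameters (assumed starting values).
    alpha, beta = np.ones(M), np.ones(N)
    sigma2 = 0.1 * np.var(t)
    active = np.ones(M, dtype=bool)
    mu = np.zeros(M)
    for _ in range(max_iter):
        if not active.any():
            break
        P = Phi[:, active]
        # Step 2: posterior covariance and mean (cf. Eqs. 35-36).
        PtB = P.T * beta
        Sigma = np.linalg.inv(PtB @ P / sigma2 + np.diag(alpha[active]))
        m = Sigma @ PtB @ t / sigma2
        # Step 3: re-estimate alpha, beta and sigma2 (assumed update forms).
        gamma = np.clip(1.0 - alpha[active] * np.diag(Sigma), 1e-12, 1.0)
        resid2 = (t - P @ m) ** 2
        lever = np.einsum("ij,jk,ik->i", P, Sigma, P)    # phi_i Sigma phi_i^T
        alpha_new = gamma / np.maximum(m ** 2, 1e-300)
        beta_new = (2.0 * a + 1.0) / (2.0 * b + (resid2 + lever) / sigma2)
        sigma2_new = max(float(beta_new @ resid2) / (N - gamma.sum()), 1e-12)
        change = max(np.max(np.abs(np.log(alpha_new) - np.log(alpha[active]))),
                     np.max(np.abs(beta_new - beta)),
                     abs(sigma2_new - sigma2))
        alpha[active] = alpha_new
        beta, sigma2 = beta_new, sigma2_new
        mu = np.zeros(M)
        mu[active] = m
        # Prune basis functions whose alpha exceeds the threshold (sparsity).
        active &= alpha < alpha_max
        mu[~active] = 0.0
        # Step 4: stop once all parameters have converged.
        if change < tol:
            break
    return mu, alpha, beta, sigma2, active

# Toy data (hypothetical): a noisy-looking line with three shifted targets.
x = np.linspace(0.0, 1.0, 30)
t = 1.0 + 2.0 * x + 0.05 * np.sin(40.0 * x)
out = np.array([5, 15, 25])
t[out] += 5.0
Phi = np.column_stack([np.ones_like(x), x])
mu, alpha, beta, sigma2, active = train_rrvm(Phi, t)
```

On this toy problem the three shifted samples receive clearly smaller coefficients than the remaining samples, and the fitted slope stays close to 2. A kernel basis, as used in the RVM, would replace the two-column `Phi` without changing the loop.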
6.2 Error Estimation Methods
For comparison, five other methods are also implemented in the experiment: one-nearest neighbor (1-NN), $k$-nearest neighbor ($k$-NN), SVM, classical RVM and TRVM. To verify the robustness of the proposed RRVM relative to these algorithms, the generalization performance of each method is evaluated in terms of three performance measures, the mean square error (MSE), the mean absolute error (MAE) and the root mean square error (RMSE):
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2, \qquad \mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|, \qquad \mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}.$$
Moreover, we use the coefficient of determination ($R^2$), defined as
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2},$$
where $y_i$, $\hat{y}_i$, $\bar{y}$ and $n$ are the target value, forecast value, average target value and the number of data points, respectively.
The RMSE depends on the predicted values themselves, not on how the values fall relative to a threshold or relative to each other. It measures how much the predictions deviate from the true target values. Smaller values of the MSE, MAE and RMSE indicate better predictive ability of the model, while for $R^2$ higher is better.
6.3 Results and Discussions
The SVM is implemented using the LIBSVM software [32], and the source code for the classical RVM is obtained from Tipping's website (see footnote 1). Moreover, a hybrid learning algorithm was employed to update the network parameters, and the optimum ANFIS model [33] was constructed by a trial-and-error process. MATLAB 7.5 is used to implement the proposed RRVM algorithm.
The number of nearest neighbors $k$ must be chosen for $k$-NN. In this simulation study, the training data set is subjected to five-fold cross validation, and the value of $k$ resulting in the lowest error rate is selected. The SVM and RVM model parameters are optimized in a similar way. The SVM has two model parameters, the regularization parameter and the kernel parameter (e.g., the width of the kernel function in the case of the RBF kernel), whereas the RVM has only one, the kernel parameter. The proposed RRVM also has the kernel parameter as its single model parameter. While the parameters of the SVM must be optimized through cross validation, which is computationally demanding, the parameter of the RVM can be selected efficiently by comparing the lower bound values.
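The five-fold selection of $k$ can be sketched in plain Python. The example assumes one-dimensional inputs, a $k$-NN predictor that averages the targets of the nearest neighbors, and mean squared error as the validation criterion; the synthetic data, the candidate list and the function names are hypothetical.

```python
import random

def knn_predict(train, x, k):
    """Average target of the k training points nearest to x (1-D inputs)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(t for _, t in nearest) / k

def cv_error(data, k, folds=5):
    """Mean squared validation error of k-NN under `folds`-fold cross validation."""
    sq_err, count = 0.0, 0
    for f in range(folds):
        val = data[f::folds]                                   # every folds-th point
        trn = [p for i, p in enumerate(data) if i % folds != f]
        for x, t in val:
            sq_err += (knn_predict(trn, x, k) - t) ** 2
            count += 1
    return sq_err / count

# Synthetic training set (hypothetical): a noisy parabola.
rng = random.Random(0)
data = []
for _ in range(100):
    x = rng.uniform(-3.0, 3.0)
    data.append((x, x * x + rng.gauss(0.0, 0.5)))

candidates = [1, 3, 5, 7, 9, 15]
errors = {k: cv_error(data, k) for k in candidates}
best_k = min(errors, key=errors.get)
```

The same loop applies to a kernel width or a regularization parameter by swapping the predictor, which is why the cost grows quickly for the SVM with its two parameters.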
1 Tipping's website: http://www.miketipping.com/.
Table 1: Generalization Performance of Classification Methods for 
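| Method | Measure | 0% outliers | 10% outliers | 20% outliers | 30% outliers | 40% outliers |
|---|---|---|---|---|---|---|
| 1-NN | MSE | 2.9658 | 4.2119 | 4.7871 | 5.9100 | 7.2963 |
| 1-NN | MAE | 3.4365 | 4.5104 | 5.5024 | 6.7931 | 8.7091 |
| 1-NN | RMSE | 1.7222 | 2.0523 | 2.1879 | 2.4311 | 2.7012 |
| 1-NN | $R^2$ | 0.9034 | 0.8722 | 0.9013 | 0.8925 | 0.8860 |
| $k$-NN | MSE | 1.2021 | 2.8330 | 3.1983 | 3.9486 | 4.8748 |
| $k$-NN | MAE | 3.3571 | 4.1632 | 5.1848 | 6.4010 | 8.2064 |
| $k$-NN | RMSE | 1.0964 | 1.6832 | 1.7884 | 1.9871 | 2.2079 |
| $k$-NN | $R^2$ | 0.9126 | 0.9347 | 0.9061 | 0.8694 | 0.9224 |
| SVM | MSE | 1.9404 | 2.1270 | 2.9807 | 3.6799 | 4.5431 |
| SVM | MAE | 2.9488 | 4.1296 | 5.0418 | 6.2244 | 7.9800 |
| SVM | RMSE | 1.3930 | 1.4584 | 1.7265 | 1.9183 | 2.1314 |
| SVM | $R^2$ | 0.9041 | 0.8305 | 0.9164 | 0.9237 | 0.9039 |
| Classical RVM | MSE | 0.5156 | 1.4561 | 2.5820 | 3.1877 | 3.9354 |
| Classical RVM | MAE | 2.2480 | 3.9856 | 4.8214 | 5.9523 | 7.6312 |
| Classical RVM | RMSE | 0.7181 | 1.2067 | 1.6069 | 1.7854 | 1.9838 |
| Classical RVM | $R^2$ | 0.9930 | 0.9287 | 0.8866 | 0.9388 | 0.9378 |
| TRVM | MSE | 0.8174 | 1.0361 | 1.7075 | 2.1080 | 2.6025 |
| TRVM | MAE | 2.4096 | 3.3568 | 4.0352 | 4.9817 | 6.3868 |
| TRVM | RMSE | 0.9041 | 1.0179 | 1.3067 | 1.4519 | 1.6132 |
| TRVM | $R^2$ | 0.8959 | 0.9362 | 0.9208 | 0.9612 | 0.9459 |
| RRVM | MSE | 0.6289 | 0.9580 | 1.4346 | 1.7711 | 2.1865 |
| RRVM | MAE | 2.5381 | 2.9485 | 3.3294 | 4.1104 | 5.2698 |
| RRVM | RMSE | 0.7930 | 0.9788 | 1.1977 | 1.3308 | 1.4787 |
| RRVM | $R^2$ | 0.8711 | 0.9634 | 0.9886 | 0.9802 | 0.9708 |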
The best performance for each percentage of outliers is given in bold.
The simulation results are presented in Tables 1 to 3 for the different sample sizes ($n$ = 100, 150, 300 and 400). Each table has five columns corresponding to the percentages of outliers. From these tables, the effects of the main simulation factors on the MSE, MAE, RMSE and $R^2$ values of all methods can be summarized as follows:
As $n$ increases, the values of MSE, MAE and RMSE decrease in all situations.
As the percentage of outliers increases, the values of MSE, MAE and RMSE increase in all situations.
The MSE, MAE, RMSE and $R^2$ of the six methods are compared in Tables 1 to 3. When the training sample set contains no outliers, the MSE, MAE and RMSE of RRVM are very close to those of TRVM but worse than those of classical RVM. We can conclude that, in the absence of outliers, the classical RVM is more efficient than the other methods, because it has the minimum MSE, MAE and RMSE and the highest $R^2$. When outliers are added, the approximation performance of classical RVM deteriorates drastically, while TRVM and RRVM still give good results. As the number of outliers increases, RRVM obtains better results than classical RVM and TRVM, which demonstrates that RRVM effectively resists the impact of outliers and has good robustness.
The results show that as the contamination percentage increases, the predictive performance of the competing methods becomes steadily worse, while the RRVM clearly shows its robustness. In addition, the RRVM gives a sparse solution. Furthermore, it is confirmed from Tables 1-2 that the RRVM is competitive with the other methods in terms of computation time, since it takes a relatively short time to optimize its model parameters. Table 3 clearly shows that the generalization performance of the RRVM is consistently better than that of the other methods even if the training data set is contaminated by outliers.
Table 2: Generalization Performance of Classification Methods for 
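| Method | Measure | 0% outliers | 10% outliers | 20% outliers | 30% outliers | 40% outliers |
|---|---|---|---|---|---|---|
| 1-NN | MSE | 0.9491 | 1.3478 | 1.5319 | 1.8912 | 2.3348 |
| 1-NN | MAE | 0.8591 | 1.1276 | 1.3756 | 1.6983 | 2.1773 |
| 1-NN | RMSE | 0.9742 | 1.1610 | 1.2377 | 1.3752 | 1.5280 |
| 1-NN | $R^2$ | 0.8925 | 0.8616 | 0.8904 | 0.8817 | 0.8753 |
| $k$-NN | MSE | 0.3847 | 0.9066 | 1.0235 | 1.2635 | 1.5599 |
| $k$-NN | MAE | 0.8393 | 1.0408 | 1.2962 | 1.6002 | 2.0516 |
| $k$-NN | RMSE | 0.6202 | 0.9521 | 1.0117 | 1.1241 | 1.2490 |
| $k$-NN | $R^2$ | 0.9015 | 0.9234 | 0.8952 | 0.8589 | 0.9112 |
| SVM | MSE | 0.6209 | 0.6806 | 0.9538 | 1.1776 | 1.4538 |
| SVM | MAE | 0.7372 | 1.0324 | 1.2604 | 1.5561 | 1.9950 |
| SVM | RMSE | 0.7880 | 0.8250 | 0.9766 | 1.0852 | 1.2057 |
| SVM | $R^2$ | 0.8932 | 0.8205 | 0.9053 | 0.9125 | 0.8929 |
| Classical RVM | MSE | 0.1650 | 0.4659 | 0.8263 | 1.0201 | 1.2593 |
| Classical RVM | MAE | 0.5620 | 0.9964 | 1.2053 | 1.4881 | 1.9078 |
| Classical RVM | RMSE | 0.4062 | 0.6826 | 0.9090 | 1.0100 | 1.1222 |
| Classical RVM | $R^2$ | 0.9810 | 0.9175 | 0.8758 | 0.9274 | 0.9265 |
| TRVM | MSE | 0.2616 | 0.3316 | 0.5464 | 0.6746 | 0.8328 |
| TRVM | MAE | 0.6024 | 0.8392 | 1.0088 | 1.2454 | 1.5967 |
| TRVM | RMSE | 0.5114 | 0.5758 | 0.7392 | 0.8213 | 0.9126 |
| TRVM | $R^2$ | 0.8851 | 0.9248 | 0.9096 | 0.9496 | 0.9344 |
| RRVM | MSE | 0.2013 | 0.3066 | 0.4591 | 0.5667 | 0.6997 |
| RRVM | MAE | 0.6345 | 0.7371 | 0.8324 | 1.0276 | 1.3174 |
| RRVM | RMSE | 0.4486 | 0.5537 | 0.6775 | 0.7528 | 0.8365 |
| RRVM | $R^2$ | 0.8605 | 0.9517 | 0.9766 | 0.9684 | 0.9590 |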
The best performance for each percentage of outliers is given in bold.
Figures 5 and 6 show 3D graphs of the MSE and RMSE values for the different methods and percentages of outliers at a fixed sample size $n$. They illustrate the effect of outliers on the decision boundaries obtained from the SVM, classical RVM, TRVM and RRVM. Note that the SVM does not provide such probabilistic information. The figures show that the SVM and classical RVM are not robust to outliers, i.e. their decision boundaries are distorted by a few outliers. In contrast, the TRVM and RRVM are less sensitive to outliers, since they reduce the effect of outliers by assigning them small weights. In terms of sparsity, the RRVM preserves sparsity, i.e. the number of non-zero coefficients remains small even though the training data set contains outliers. See Abonazel [34] for more details on creating 3D graphs using R software.
7 Conclusions
In this paper, we proposed a robust RVM based on ANFIS and a weighting scheme, which is insensitive to outliers while retaining the advantages of the original RVM. Given a prior distribution over the weights, the weight values are determined probabilistically and computed automatically during training. Our theoretical result indicates that the influence of outliers is bounded through the probabilistic weights. A guideline for determining the hyperparameters governing the prior was also discussed. For comparison, five other methods were implemented in the experiments to verify the robustness of the proposed RRVM. The simulation results showed that, based on the MSE, MAE, RMSE and $R^2$ criteria, the proposed RRVM performs better than the other methods when the data contain outliers. When the data set does not contain outliers, the results showed that the classical RVM is more efficient than the other methods.
Table 3: Generalization Performance of Classification Methods for 
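| Method | Measure | 0% outliers | 10% outliers | 20% outliers | 30% outliers | 40% outliers |
|---|---|---|---|---|---|---|
| 1-NN | MSE | 0.3037 | 0.4313 | 0.4902 | 0.6052 | 0.7471 |
| 1-NN | MAE | 0.2148 | 0.2819 | 0.3439 | 0.4246 | 0.5443 |
| 1-NN | RMSE | 0.5511 | 0.6567 | 0.7001 | 0.7779 | 0.8644 |
| 1-NN | $R^2$ | 0.8817 | 0.9051 | 0.9268 | 0.8710 | 0.8647 |
| $k$-NN | MSE | 0.1231 | 0.2901 | 0.3275 | 0.4043 | 0.4992 |
| $k$-NN | MAE | 0.2098 | 0.2602 | 0.3241 | 0.4001 | 0.5129 |
| $k$-NN | RMSE | 0.3509 | 0.5386 | 0.5723 | 0.6359 | 0.7065 |
| $k$-NN | $R^2$ | 0.8906 | 0.9122 | 0.9402 | 0.8485 | 0.9002 |
| SVM | MSE | 0.1987 | 0.2178 | 0.3052 | 0.3768 | 0.4652 |
| SVM | MAE | 0.1843 | 0.2581 | 0.3151 | 0.3890 | 0.4987 |
| SVM | RMSE | 0.4458 | 0.4667 | 0.5525 | 0.6139 | 0.6821 |
| SVM | $R^2$ | 0.9203 | 0.8939 | 0.9419 | 0.9137 | 0.9179 |
| Classical RVM | MSE | 0.0528 | 0.1491 | 0.2644 | 0.3264 | 0.4030 |
| Classical RVM | MAE | 0.1405 | 0.2491 | 0.3013 | 0.3720 | 0.4769 |
| Classical RVM | RMSE | 0.2298 | 0.3861 | 0.5142 | 0.5713 | 0.6348 |
| Classical RVM | $R^2$ | 0.9691 | 0.9064 | 0.9469 | 0.9162 | 0.9153 |
| TRVM | MSE | 0.0837 | 0.1061 | 0.1749 | 0.2159 | 0.2665 |
| TRVM | MAE | 0.1506 | 0.2098 | 0.2522 | 0.3114 | 0.3992 |
| TRVM | RMSE | 0.2893 | 0.3257 | 0.4182 | 0.4646 | 0.5162 |
| TRVM | $R^2$ | 0.9394 | 0.9368 | 0.9525 | 0.9381 | 0.9231 |
| RRVM | MSE | 0.0644 | 0.0981 | 0.1469 | 0.1814 | 0.2239 |
| RRVM | MAE | 0.1586 | 0.1843 | 0.2081 | 0.2569 | 0.3294 |
| RRVM | RMSE | 0.2538 | 0.3132 | 0.3833 | 0.4259 | 0.4732 |
| RRVM | $R^2$ | 0.9398 | 0.9402 | 0.9648 | 0.9567 | 0.9474 |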
The best performance for each percentage of outliers is given in bold.
Fig. 5 The MSE values for all methods with different percentages of outliers when 
Fig. 6 The RMSE values for all methods with different percentages of outliers when 
References:
[1] Das, S. S., Biswal, S. R., Tripathy, U. K., & Das, P.
(2011). Mass transfer effects on unsteady
hydromagnetic convective flow past a vertical
porous plate in a porous medium with heat source.
Journal of Applied Fluid Mechanics, 4(4), 91-100.
[2] Toki, C. J., & Tokis, J. N. (2007). Exact solutions
for the unsteady free convection flows on a porous
plate with time-dependent heating. ZAMM - Journal
of Applied Mathematics and Mechanics/Zeitschrift
für Angewandte Mathematik und Mechanik, 87(1), 4-13.
[3] Senapati, N., Dhal, R. K., & Das, T. K. (2012).
Effects of chemical reaction on free convection
MHD flow through porous medium bounded by
vertical surface with slip flow region. American
Journal of Computational and Applied
Mathematics, 2(3), 124-135.
[4] Khan, M., & Azam, M. (2017). Unsteady heat and
mass transfer mechanisms in MHD Carreau
nanofluid flow. Journal of Molecular Liquids, 225,
554-562.
[5] Eid, M. R., Mahny, K. L., Muhammad, T., &
Sheikholeslami, M. (2018). Numerical treatment for
Carreau nanofluid flow over a porous nonlinear
stretching surface. Results in physics, 8, 1185-1193.
[6] Tipping, M. E. (2001). Sparse Bayesian learning
and the relevance vector machine. Journal of
machine learning research, 1(Jun), 211-244.
[7] Tipping, M. E. (2000). The relevance vector
machine. In Advances in Neural Information
Processing Systems, 12, 652-658.
[8] Lee, K., Kim, N., & Jeong, M. K. (2014). The
sparse signomial classification and regression
model. Annals of Operations Research, 216(1), 257-
286.
[9] Naveed, M., Abbas, Z., & Sajid, M. (2016).
Hydromagnetic flow over an unsteady curved
stretching surface. Engineering Science and
Technology, an International Journal, 19(2), 841-
845.
[10] Sahoo, S. N. (2013). Heat and mass transfer effect
on MHD flow of a viscoelastic fluid through a
porous medium bounded by an oscillating porous
plate in slip flow regime. International Journal of
Chemical Engineering, 2013.
[11] Noor, N. F. M., Abbasbandy, S., & Hashim, I.
(2012). Heat and mass transfer of thermophoretic
MHD flow over an inclined radiate isothermal
permeable surface in the presence of heat
source/sink. International Journal of Heat and Mass
Transfer, 55(7-8), 2122-2128.
[12] Turkyilmazoglu, M. (2012). MHD fluid flow and
heat transfer due to a stretching rotating
disk. International journal of thermal sciences, 51,
195-201.
[13] Chen, C. H. (2004). Combined heat and mass
transfer in MHD free convection from a vertical
surface with Ohmic heating and viscous
dissipation. International journal of engineering
science, 42(7), 699-713.
[14] Vapnik, V. N. (2000). The nature of statistical
learning theory (second ed.). New York, USA:
Springer science & business media.
[15] Muller, K. R., Mika, S., Ratsch, G., Tsuda, K., &
Scholkopf, B. (2001). An introduction to kernel-
based learning algorithms. IEEE transactions on
neural networks, 12(2), 181-201.
[16] Zhang, R., & Wang, S. (2008). Support vector
machine based predictive functional control design
for output temperature of coking furnace. Journal of
Process Control, 18(5), 439-448.
[17] Yang, B., Zhang, Z., & Sun, Z. (2007). Robust
relevance vector regression with trimmed likelihood
function. IEEE Signal Processing Letters, 14(10),
746-749.
[18] Valyon, J., & Horváth, G. (2009). A sparse robust
model for a Linz-Donawitz steel converter. IEEE
Transactions on Instrumentation and
Measurement, 58(8), 2611-2617.
[19] Hwang, S., Jeong, M. K., & Yum, B. J. (2013).
Robust relevance vector machine with variational
inference for improving virtual metrology
accuracy. IEEE Transactions on Semiconductor
Manufacturing, 27(1), 83-94.
[20] Hwang, S., Kim, D., Jeong, M. K., & Yum, B. J.
(2015). Robust kernel-based regression with
bounded influence for outliers. Journal of the
Operational Research Society, 66(8), 1385-1398.
[21] Wu, Y., & Liu, Y. (2007). Robust truncated hinge
loss support vector machines. Journal of the
American Statistical Association, 102(479), 974-
983.
[22] Park, S. Y., & Liu, Y. (2011). Robust penalized
logistic regression with truncated loss
functions. Canadian Journal of Statistics, 39(2),
300-323.
[23] Abonazel, M., & Rabie, A. (2019). The impact of
using robust estimations in regression models: An
application on the Egyptian economy. Journal of
Advanced Research in Applied Mathematics and
Statistics, 4(2), 8-16.
[24] Abonazel, M., & Gad, A. A. E. (2020). Robust
partial residuals estimation in semiparametric
partially linear model. Communications in Statistics-
Simulation and Computation, 49(5), 1223-1236.
[25] Youssef, A. H., Kamel, A. R., & Abonazel, M. R.
(2021). Robust SURE estimates of profitability in
the Egyptian insurance market. Statistical Journal of
the IAOS, (Preprint), 1-13. DOI: 10.3233/SJI-200734.
[26] Kamel, A.R. (2021). Handling outliers in seemingly
unrelated regression equations model, MSc thesis,
Faculty of graduate studies for statistical research
(FGSSR), Cairo University, Egypt.
[27] Melin, P., & Castillo, O. (2005). Intelligent control
of a stepping motor drive using an adaptive
neuro-fuzzy inference system. Information
Sciences, 170(2-4), 133-151.
[28] Tipping, M. E., & Lawrence, N. D. (2005).
Variational inference for Student-t models: Robust
Bayesian interpolation and generalised component
analysis. Neurocomputing, 69(1-3), 123-141.
[29] Ting, J. A., D'Souza, A., & Schaal, S. (2007, April).
Automatic outlier detection: A Bayesian approach.
In Proceedings 2007 IEEE International
Conference on Robotics and Automation (pp. 2489-
2494). IEEE.
[30] Jaakkola, T. S., & Jordan, M. I. (2000). Bayesian
parameter estimation via variational
methods. Statistics and Computing, 10(1), 25-37.
[31] Bishop, C. M., & Tipping, M. (2013). Variational
relevance vector machines. arXiv preprint
arXiv:1301.3838, available at:
https://arxiv.org/ftp/arxiv/papers/1301/1301.3838.pdf
[32] Chang, C. C., & Lin, C. J. (2011). LIBSVM: a
library for support vector machines. ACM
transactions on intelligent systems and technology
(TIST), 2(3), 1-27.
[33] Jang, J. S. (1993). ANFIS: adaptive-network-based
fuzzy inference system. IEEE transactions on
systems, man, and cybernetics, 23(3), 665-685.
[34] Abonazel, M. R. (2018). A practical guide for
creating Monte Carlo simulation studies using
R. International Journal of Mathematics and
Computational Science, 4(1), 18-33.
Creative Commons Attribution License 4.0 (Attribution 4.0 International, CC BY 4.0)
This article is published under the terms of the Creative Commons Attribution License 4.0
https://creativecommons.org/licenses/by/4.0/deed.en_US