On a derivative-free optimization approach
to some problems of civil engineering
JIŘÍ VALA, PETRA JAROŠOVÁ
Institute of Mathematics and Descriptive Geometry
Brno University of Technology, Faculty of Civil Engineering
CZ - 602 00 Brno, Veveří 331/95
CZECH REPUBLIC
Abstract: Development of advanced materials and structures for civil engineering, due to the requirements of
green and sustainable building, including the reduction of energy consumption and the balance between occupant
comfort and environmental friendliness, needs proper analysis of related physical, chemical, etc. processes, whose
mathematical description leads to direct, sensitivity and inverse initial and boundary value problems for nonlin-
ear partial differential equations, analysed numerically using finite element, difference and similar techniques.
Design optimization requires to implement a set of additional variable parameters into all related computations,
which is very expensive or quite impossible in most cases. Thus realistic computational strategies work with
the minimizations of some cost functions with unknown parameters using certain kind of numerical differentia-
tion, like quasi-Newton, inexact Newton or conjugate gradient methods, some derivative-free approach, or, as a
much-favoured alternative, some heuristic soft-computing algorithm. A reasonable compromise seems to be the
exploitation of an algorithm coming from the non-gradient Nelder-Mead simplex approach. In this paper, referring
to the experience with i) the direct problem of thermal design of a residential building and ii) the inverse problem
of identification of material characteristics as thermal conductivity and diffusivity from well-advised laboratory
experiments, after several remarks to the history and progress of the Nelder-Mead method and its improvements,
we shall demonstrate some convergence properties of such approach, regardless of the highly cited evaluation of
the original Nelder-Mead algorithm: “Mathematicians hate it because you cannot prove convergence; engineers
seem to love it because it often works.”
Key-Words: Derivative-free optimization; design in civil engineering; direct and inverse problems; finite
element and difference techniques.
Received: April 23, 2023. Revised: July 26, 2023. Accepted: August 19, 2023. Published: September 26, 2023.
1 Introduction
Buildings are responsible for a substantial part of
global energy consumption, as well as of greenhouse
gas emissions, both in developed and developing
countries, at all levels of standard of living. Although
the estimates of these parts are not quite consistent,
the related research of advanced materials and struc-
tures for civil engineering, including the reduction of
energy consumption and the balance between occu-
pant comfort and environmental friendliness, accel-
erates in 2 last decades: for its development from the
preference of massive insulation of separate residen-
tial houses to contemporary green and sustainable ur-
ban planning, [1], [2]. However, such research re-
quires deeper analysis of relevant physical, chemi-
cal, etc. processes, whose mathematical description
leads to initial and boundary value problems for non-
linear partial differential equations, analysed numeri-
cally using finite element, difference and similar tech-
niques; in addition to direct problems, also the sen-
sitivity and inverse ones, [3], are substantial. De-
sign optimization requires to implement a set of ad-
ditional variable parameters into all related computa-
tions, which is still very expensive or quite impossible
in most cases, although the progress in computational
hardware and software, parallel and distributed archi-
tectures, etc., has a potential to make such argumen-
tation less valuable.
Effective computational strategies try to minimize
some cost functions with unknown parameters using
certain kind of numerical differentiation, like quasi-
Newton, inexact Newton or conjugate gradient meth-
ods, [4], [5], Chaps. 5, 6, 7, some derivative-free ap-
proach, [6], or, as a much-favoured alternative, some
soft-computing heuristic algorithm, [7]. A reasonable
compromise seems to be the exploitation of an al-
gorithm coming from the non-gradient Nelder-Mead
simplex approach, suggested by [8] originally, al-
though the historical summary, [9], popularizes the
interview, [10]: “Mathematicians hate it because you
cannot prove convergence; engineers seem to love it
because it often works.” Nevertheless, just this ap-
proach succeeded in the concurrence of other classical
direct search algorithms, [11], [12], [13], as demon-
strated by their critical historical overview, [14].
The strong motivation for this study comes from
WSEAS TRANSACTIONS on MATHEMATICS
DOI: 10.37394/23206.2023.22.71
Jiří Vala, Petra Jarošová
E-ISSN: 2224-2880
641
Volume 22, 2023
the engineering analysis of i) the direct problem of
thermal design of a residential building, [15], [16],
and ii) the inverse problem of identification of ma-
terial characteristics as thermal conductivity and dif-
fusivity from well-advised laboratory experiments,
[17], [18]. Parallel to the development of special soft-
ware packages for i) and ii), some convergence prop-
erties of ad hoc modifications of the original Nelder-
Mead simplex algorithm, [16], have been studied by
the authors of this paper, with the concluding require-
ment of their deeper analysis. In such context, a
more general class of promising non-expensive com-
putational algorithms will be sketched in this article,
with the preference of those implemented in the above
mentioned software. After this Introduction (Section
1) and the Available results (Section 2) with numer-
ous notations and historical remarks, the Preliminary
considerations (Section 3) will prepare all means for
the Convergence of numerical algorithms (Section 4),
followed by the two-dimensional Illustrative exam-
ple (Section 5), supplied by instructive figures, the
Selected applications (Section 6), referring to previ-
ous authors’research activities, containing experience
with optimization strategies, and the brief Conclusion
(Section 7).
2 Available results
To be able to explain the Nelder-Mead approach, in-
cluding various modifications and potential improve-
ments, we have to introduce some basic notations.
As a model introductory problem let us consider a
real-valued function Fof a finite integer number n
of real variables x= (x1, . . . , xn)
>Rn. Let us
suppose that Fhas an infimum F×, whereas G(x) =
(F(x)/x1, . . . , F(x)/xn)
>, i. e. the gradient of
F, is a vector of Lipschitz continuous functions. All
vectors from the Euclidean space Rnlike xhere will
be considered as column ones, supplied by lower in-
dices referring to their components and by upper ones
(if needed) referring for special positions. Especially
an ordered set {x1, x2, . . . , xn+1}of simplex vertices
will define such simplex Sin Rnthat
F(x1) F(x2). . . F(xn+1)(1)
is satisfied. In the following text Gwithout any ar-
gument will denote G(xn)for brevity. We shall use
the additional notations vi=xixn+1 for any
i {1, . . . , n}and v= (v1+. . . +vn)/n. We shall
need also VRn×n, introduced as
V=
v1
1. . . vn
1
.
.
.....
.
.
v1
n. . . vn
n
,(2)
with h=max(|v1|, . . . , |vn|).(3)
Let us consider some direction S Rnand such real
constant γthat the point
x0=xn+1 +v0, v0=γS,(4)
lies in the (n1)-dimensional hyperplane Hin Rn
defined by {x1, x2, . . . , xn}. The evaluation of γis
then easy: the hyperplane His characterized by
V>a=e , a = (V1)
>e , a
>=e
>V1(5)
where e= (1,1, . . . , 1)
>; an important consequence
of (5) is
a
>γS=γe
>V1S= 1 , γ1=e
>V1S.(6)
To support the notations and considerations of Sec-
tion 3, let us introduce the following straightforward
7-step ad hoc generalization of the original Nelder-
Mead simplex algorithm. Using, in addition to the a
priori choice of S, five constants αR,αE,αCO,αCI
and αS, satisfying the inequality
min(αR, αE, αCO, αCI, αS,1αCI,1αS)>0,(7)
such algorithm reads:
1. Sort {x1, x2, . . . , xn+1}to guarantee (1).
2. Evaluate γfrom (6), set (ςR, ςE, ςCO, ςCI) =
γ(αR, αE, αCO, αCI)and x0from (4).
3. Try reflection xR=x0+ςRS.
If F(x2)<F(xR)<F(xn+1)then replace
xn+1 by xRand go to step 7.
4. If F(xR) F(xN)then go to step 5.
Try expansion xE=x0+ςES.
If F(xE)<F(xR)then replace xn+1 by xE, oth-
erwise replace xn+1 by xR, and (in both cases)
go to step 7.
5. Try outside contraction xCO =x0+ςCOSand
inside contraction xCI =x0ςCOS.
If min(F(xCO),F(xCI)) F(xn)then go to
step 6.
If F(xCO) F(xCI)then replace xn+1 by
xCO, otherwise replace xn+1 by xCI, and (in both
cases) go to step 7.
6. Perform shrink: take (x1+αS(x2x1), . . . , x1+
αS(xn+1 x1)) instead of (x2, . . . , xn+1).
7. Check convergence and stop, or return to step 1.
Let us emphasize that none of the constants occurring
in (7) is allowed to be equal to 0: such αSwould force
the reduction of Sto one point, four remaining con-
stants would lead to the unwanted degeneration of S
to a simplex S0with nvertices in H, whereas the hy-
pothetical choice αCI = 1 or αS= 1 could not work,
preserving Sunchanged.
In particular, the classical choice, [8], referred as
SNMS (Standard Nelder-Mead Simplex) frequently,
making use of our notation (2), is
WSEAS TRANSACTIONS on MATHEMATICS
DOI: 10.37394/23206.2023.22.71
Jiří Vala, Petra Jarošová
E-ISSN: 2224-2880
642
Volume 22, 2023
S=v=1
nV e , (8)
supplied with
αR= 1 , αE= 2 , αCO =αCI =αS= 1/2 ; (9)
consequently γ= 1, thus step 2 can be reduced to
x0= (x1+. . . +xn)/n, i. e. to x0=xn+1 +v,
whereas a vector (ςR, ςE, ςCO, ςCI)is identical with
(αR, αE, αCO, αCI). This case, whose original terms
reflection,expansion,outer /inner contraction and
shrink have their intuitive meanings, is still signif-
icant in computational practice. Indeed, by (6) we
have γ1=1
ne
>V1V e = 1 .(10)
Unfortunately, various well-documented counterex-
amples of stagnation or divergence of this algorithm
can be found both in the classical literature, [19], and
in their later extensive survey, [9]. This has moti-
vated numerous attempts to derive some “better con-
vergent” variants of SNMS.
To improve SNMS, especially for n>>2, [20] sug-
gested ANMS (Adaptive Nelder-Mead Simplex) with
αR= 1 , αE= 1 + 2/n , (11)
αCO =αCI = 3/4 1/(2n), αS= 1/2
instead of (9), thus for n= 2 ANMS coincides with
SNMS; moreover, certain sufficient descent property
is required for expansion and contraction steps. Po-
tential stagnation trends in both SNMS and ANMS
can be handled using the sufficient decrease moti-
vated oriented restarts, [21], reinitializing the sim-
plex to a smaller one with orthogonal edges, or, in the
more complicated way, introducing certain pseudo-
expansion on so-called “ghost simplices”, additional
to S, coupled with the sequences of quasi-minimal
frames, reshaping the simplices Susing the matrix
QR-decomposition, [22], with help of the knowledge
of properties of positive bases and frames, [23]. Such
algorithm seems to be also implemented in the MAT-
LAB function fminsearch from the toolbox optimiza-
tion.
Unlike ANMS, GBNM (Globalized Bounded
Nelder-Mead), [24], still preserving γ= 1, relies
on another generalization (so-called globalization)
of searching for appropriate αR,αE,αCO,αCI and
αS, satisfying (7), together with probabilistic restarts,
whereas RMNM (Restarted Modified Nelder-Mead),
[25], works with an advanced deterministic modifica-
tion of the shrink step. Still other attempt to improve
SNMS rely on its hybridization with appropriate soft
computing techniques, as with differential evolution,
[26], with genetic algorithms, [27], [28], [29], with ar-
tificial neural networks (NM-ANN), [30], with parti-
cle swarm optimization (NM-PSO), [31], or with ma-
chine learning, [32], Chap. 12.
The existing convergence theory for SNMS and
related algorithms is far from being satisfactory
and comparable with that available for gradient ap-
proaches. The following results try to handle strictly
convex Fwith bounded level sets. Namely SNMS
is verified to converge to the minimizer for n= 1
(where, in our notation, the directions of Gand Sal-
ways coincide), [33] ; for n= 2 only the conver-
gence of hby (3) to zero during SNMS can be guar-
anteed, [33]. The strongest result, up to now, [34], for
n= 2 seems to be: SNMS with disabled expansion
converges always to the minimizer. The computer-
assisted tedious proof of this result, presented at 25
pages, needs the step-by-step elimination of all pos-
sible non-convergent cases; one can see the origin
of such proof technique, [9], in the Sherlock Holmes
argument, [35], Chap. 6: “How often have I said to
you that when you have eliminated the impossible,
whatever remains, however improbable, must be the
truth ?” Nevertheless, none of these results, includ-
ing recommended corrections, [21], [22], guarantees
the convergence even for a probably simplest example
with n= 2, i. e. that of revolution paraboloid surface
F(x1, x2) = x2
1+x2
2,(12)
with an arbitrary regular initial triangle!
Apart from the various advanced strategies for the
initial choice of S, [36], [37], [38], avoiding the di-
vergence (not only) for (12), some results working
with approximation of derivatives, [39], [40], can be
useful, too. Unfortunately, the hypothetical standard
approximation of Fby a quadratic polynomial, whose
minimum or other stationary point could be detected
easily, needs to calculate N= 1 + 1 + 1 = 3 values
for n= 1,N= 1 + 2 + 3 = 6 values for n= 2,
N=1+3+6 =10 values for n=3, etc., in general
N=
2
X
k=0 n+k1
k=
2
X
k=0
(n+k1)!
k! (n1)! (13)
= 1 + n+1
2n(n+ 1) = 1 + 1
2n(n+ 3)
=n+ 1
1+n+ 1
2,
thus e. g. N= 66 values for n= 10, which is dif-
ficult to implement into any effective numerical al-
gorithm, namely if such values must be obtained in
some non-trivial way, typically as outputs from the
numerical analysis of partial differential equations of
evolution, as mentioned above. However, the last
equality in (13) exhibits the possibility of a simple
choice of evaluation points in all vertices and in all
centres of one-dimensional edges of S. For com-
parison N {1,2,3,4}in every step of SNMS, as
well as in our generalization of SNMS, except shrink
with N=n(as number of edges of S), which is
WSEAS TRANSACTIONS on MATHEMATICS
DOI: 10.37394/23206.2023.22.71
Jiří Vala, Petra Jarošová
E-ISSN: 2224-2880
643
Volume 22, 2023
much smaller than Nreceived from (13), especially
for large n. Thus also the progress in the conver-
gence theory of related matrix algorithms and iter-
ative subspaces in the last several years, [41], [42],
[43], [44], must be mentioned, regardless of the (at
least partial) loss of user-friendliness, simplicity and
transparency, appreciated as advantages of algorithms
closer to SNMS, namely in the control theory, [45],
[46].
3 Preliminary considerations
For simplicity we shall refer the classical choice of
S=vby SNMS as a), without a deeper study on set-
tings of parameters αR,αE,αCO,αCI and αSlike (9) or
(11) here; The evident drawback of a) is its ignorance
of differences DFi=FiFn+1 where Fi=F(xi)
for i {1,2, . . . , n}and Fn+1 =F(xn+1); the brief
notation DF= (DF1, DF2, . . . , DFn)
>will be ap-
plied here. Let us show some examples of alternative
choices of Snow.
As b) we can take the orthogonal projection S=a
by (5). Clearly (6) is applicable to any choice of S.
Therefore, modifying (10), we receive
γ1= (e
>V1)(e
>V1)
>.(14)
A positive γcan be evaluated from (14) for any reg-
ular Sevidently. The length of u0is the lowest pos-
sible, but x0can lie in Houtside S, which cannot be
excluded in all following cases, too; DFis still not
taken into account.
The motivation for c) can be seen in the application
of the Lagrange theorem
DFi=vi· G(exi) = vi· G +vi·(G(exi) G)(15)
for any i {1,2, . . . , n}; central dots here and in
the following formulae will always refer to standard
scalar products in Rn, whereas |.|will denote lengths
of vectors in Rn,exiare some (a priori unknown)
points lying in the line segment between xiand xn+1.
Let us notice such idea of “steepest descent” (with-
out SMNS) for numerical evaluation of function ex-
trema, [47], as well as more historical matters of inter-
est, [14]. Thanks to the assumed Lipschitz continuity,
working with some positive constant L, taking also
|ui| hinto account, we can rewrite (15) as
DF=V>G+LhV>ϕ(16)
G= (V1)
>DF L
where a vector ϕin Rnis allowed to return only real
values between 1and 1. As a simplified analogue
to (16), neglecting its second right-hand-side additive
term, let us set
DF=V>S,S= (V1)
>DF.(17)
By (6) we obtain
γ1=e
>V1(V1)
>DF.(18)
Unfortunately, unlike (14), we cannot guarantee the
correct evaluation of γfrom (18) in all situations; this
fails in the hypothetical case of the orthogonality of
vectors (V1)
>DF,(V1)
>ein Rn, in other words:
of the V1(V1)
>- orthogonality of vectors DF,e.
Consequently some correction strategy should be im-
plemented.
The above mentioned disadvantage of c) can be
compensated (at least theoretically) by the following
setting d). Let us replace (17) by
DF=V>e
S,S=1
h2V V>e
S=1
h2V DF.(19)
Thus (6) yields
γ1=1
h2e
>DF,(20)
e
>DFon the right-hand side being just the average
value of DF, multiplied by n. Therefore, exclud-
ing the unwanted stagnation case, a positive γcan be
evaluated from (20) for any regular S, similarly to b),
unlike c). Nevertheless, the gradient motivation of c)
is weakened by the application of an artificial matrix
weight V V>in (19), not needed in (17).
All cases a), b), c), d) admit various modifications,
including linear combinations as the simplest ones.
As a representative one, in the following considera-
tion we shall try to improve c) utilizing a small per-
turbation from a), i. e.
S= (V1)
>DF+β
nV e , (21)
DF=V>S β
nV>V e ,
dependent on a positive parameter β. From (21) and
(6) we have
γ1=e
>V1(V1)
>DF+β
ne
>e(22)
=e
>V1(V1)
>DF+β .
Thus at least an adaptive choice of βin (22), whose
upper bound can be a priori prescribed, is able to sup-
press the danger of fail of the evaluation of γby (18)
in c). However, this does not handle potential degen-
eration of Sin general, as we shall see in more details
in Section 4.
Before the formulation of convergence criteria, let
us prepare some a priori estimates. Since S G was
intended by c) in a reasonable sense, the estimates for
S·G and |S|2=S·S in comparison with |G|2=G·G
are desirable. Confronting (16) with (21), we obtain
S=G+β
nV e +L , (23)
=G+βh
ne
V e +L ,
S · G =|G|2+βh
nG · e
V e +LhG · ϕ
WSEAS TRANSACTIONS on MATHEMATICS
DOI: 10.37394/23206.2023.22.71
Jiří Vala, Petra Jarošová
E-ISSN: 2224-2880
644
Volume 22, 2023
where e
V=h1V. Applying the spectral norms of
matrices from Rn×n, [48], Chap. 9, we are allowed to
take ke
Vkas the square root Nof the maximal eigen-
value (spectral radius) of e
V>e
V, independent of h; the
much cheaper estimate by the Schur norm is Nn.
Clearly |ϕ|2 |e|2=n, too. Thus (23) together with
the Young inequality implies
S · G |G|2βh
n|G|ke
Vk|e|−Lh|G||ϕ|(24)
|G|2(β+L)nh|G|
(1 )|G|2(β+L)2n
4h2.
In the quite analogous way we can derive
|S| |G| +βh
nke
Vk|e|+Lh|ϕ|(25)
|G| + (β+L)nh ,
|S|22|G|2+ 2(β+L)2nh2.
Let us exploit the Lipschitz continuity of a gradient
of Fonce more, similarly to (15), (16). In a sequence
of standard steps (except shrink where the modifica-
tion of the following discussion is obvious) of our
algorithm we come to some x=xn+1 +ςSwith
ς {ςR, ςE, ςCO, ςCI}where F=F(x)>F×.
Thus we have
Fn+1 F=ςS · G(ex)(26)
=ςS · G +ςS · (G(ex) G)
with exsomewhere in the line segment between x
and xn+1. Then (26) implies
Fn+1 FςS · G |ςS|L|ςS| (27)
=ςS · G ς2L|S|2.
Inserting (24) and (25) into (27), we receive
Fn+1 FCς|G|2b
Cςh2,(28)
Cς=ς((1 )2ςL),
b
Cς=ς1
4+ 2ςL(β+L)2n .
We shall need positive Cςand b
Cςin the following
considerations, thus ς < (1 )/(2L)is needed here;
for admissible exceptions see the final part of Sec-
tion 4. Let us notice that the careful selection of Cς
and b
Cςis welcome: e. g. the improper choice of very
small can produce a negative right-hand-side esti-
mate in (28), whereas the left-hand side of (28) is non-
negative by its definition.
Let us now introduce the simplified notations F=
(F1+. . . +Fn+Fn+1)/(n+ 1) and F= (F1+
. . . +Fn+F)/(n+ 1). Simulating the substitution
of xn+1 by certain new x(before resorting), i. e. the
reshaping of Sin a current k-th iteration, from (27)
we come to the result
(n+ 1) FFCς|G|2b
Cςh2.(29)
Such result is then applicable to any iteration k
{0,1,2, . . .}, starting from an initial simplex Swith
k= 0.
4 Convergence of numerical
algorithms
Let us consider xk,h(k),G(k),V(k)and Fas x,h,G,
Vand F(k)from the k-th iteration, k {0,1,2, . . .},
F(0) being identical with Fin the initial iteration. Let
the selection of h(k)satisfy the condition
X
k=1
h(k)2 H (30)
where His some finite positive constant. If |G(k)|= 0
for some finite k, the minimum of Gis just attained;
this lucky case will be excluded from further consider-
ations. In the general (computationally realistic) case
we have
F(0) F×
X
k=1 F(k1) F(k)(31)
C
X
k=1 |G(k)|2b
C
X
k=1
h(k)2,
taking some lower bound of applied values Cςfrom
(29) as Cand some upper bound of applied values b
Cς
from (29) as b
Cinto account. Thanks to (30), from
(31) we obtain
X
k=1 |G(k)|2C1F0 F×+b
C.(32)
The existence of some (at least) accumulation point
for x(k)with k follows from (3), forcing h
(k)
0; then G(k)tends to (0, . . . , 0)
>in Rnby (32). More-
over, if Fhas a positive definite Hessian matrix than
the standard differential calculus, [49], Chap. 4, re-
sults that Fhas exactly one minimum F×in Rn. The
limit passage x(k)x×,F(x×) = F×, can be per-
formed according to [4], Chap. 1.
Let us notice that the condition (30) is rather re-
strictive: introducing e. g. two simple choices
h(k)=1
2h(k1) =1
2kh(0) ,(33)
h(k)=rk
k+ 1 h(k1) =1
k+ 1 h(0) ,(34)
we can see easily that (30) is satisfied applying h(k)by
WSEAS TRANSACTIONS on MATHEMATICS
DOI: 10.37394/23206.2023.22.71
Jiří Vala, Petra Jarošová
E-ISSN: 2224-2880
645
Volume 22, 2023
(33) with H=h(0)2(as a sum of geometric se-
ries), which cannot be repeated for h(k)by (34) lead-
ing to H . Only a formal computational rem-
edy is available: since F(k1) F(k)in the first right-
hand-side estimate of (31) is always non-negative, it
is possible to consider the above presented limit pas-
sage up to a subsequence, i. e. all sums in (30), (31)
and (32) are allowed to be calculated over any infinite
set of integers, instead of the whole set {1,2,3, . . .},
with the same qualitative result. Nevertheless, sub-
stantial decrease of the convergence rate can be ex-
pected in such case. The following comments refer to
a possible remeshing strategy and further limitations
of this approach.
Numerous more expensive shrink steps, or even
some restarting ones, not included in the basic al-
gorithm of Section 3 explicitly, with various moti-
vations, [21], [22], [24], [25], can be required to
force (30). Restarting steps also prevent the de-
generation of Sto a simplex in a less-dimensional
space than Rn. This could be checked using det V
from (2) related to hform (3), but such evalua-
tion is too expensive. Thus, as a typical example
for practical computations, let consider the remesh-
ing strategy based on the following idea: all u2=
x2x1, . . . , un+1 =xn+1 x1replace by such
(|u2|/|bu2|)bu2, . . . , (|u2|/|bun+1|)bun+1 that i) bu2=u2
and ii) bui+1 =ui+1 +ζ2bu2+. . . ζibuiwhere
bui+1 ·bu2=. . . =bui+1 ·bui=Z(35)
for i {2, . . . , n},Zbeing a fixed constant, 0<
Z 1. Clearly (35) requires to solve an auxiliary sys-
tem of i1linear algebraic equations with unknown
parameters ζ2, . . . , ζi. Let us also notice that, in com-
parison with (2), x1=xn+1 +v1,x1+ui=xn+1 +vi
for i {2, . . . , n},xn+1 +v1=x1. Taking i)
bx1=x1and ii) bxi+1 =xi+1 + (|u2|/|bui+1|)bui+1 for
any i {1, . . . , n}(in particular, also bx2=x2), we
are able to replace xiby bxifor all i {1, . . . , n + 1}.
Let us notice that Sis always regular for the choice
Z= 1/2, with the length of all edges h=|u2|; un-
like this, the choice Z= 0 coincides with the Gram-
Schmidt orthonormalization, [48], Chap. 2. However,
no such choice guarantees the decrease of (F1+. . .+
Fn+Fn+1)/(n+ 1) in general, therefore additional
shrink steps cannot be avoided.
Consequently no reasonable a priori information
on the convergence rate is available; one should still
rely on a repeated check of the validity of (30) dur-
ing the computation. This is replaced by cheaper ad
hoc remeshing tricks in engineering software pack-
ages, whose priorities must be reassessed in re-
cent parallel and distributed computational systems:
namely the above sketched rearrangement of S(i. e.
the remeshing of x1, . . . , xn+1) represents a classi-
cal sequential algorithm, but the parallel evaluation
of F(x1), . . . , F(xn+1), expected as the most expen-
sive, is possible.
5 Illustrative example
As an illustrative example, we shall present the min-
imization of (12), using the in-house experimental
software package created at the Institute of Mathe-
matics and Descriptive Geometry (founded 1899) of
Brno University of Technology, Faculty of Civil En-
gineering (abbreviated as FCE BUT). Since the exact
minimizing point x×= 0,F(x×) = 0 is well-known
in this case, an absolute error lesser than E= 3 .103
for the stopping criterion can be required: such value
is chosen to support an appropriate visualization, us-
ing the standard MATLAB functions; the initial sim-
plex Slies rather far from the minimizing point for
the testing purposes. The computational results are
shown for the classical SNMS algorithm on Fig. 1;
for its modification c) with a small perturbation by
(17), taking β= 0.1(only formally, rarely occurring
in practice, switching to a non-zero βfor |γ| E
only), supplied by the recommended potential reini-
tialization of S, [21], as discussed in Section 3, on
Fig. 2. Finally Fig. 3 corresponds to the same ap-
proach, switching such cautious reinitialization off.
The dotted lines show the orthogonal projection of re-
lated simplices Sto the plane (x1, x2)everywhere.
Figure 1: Classical algorithm SNMS, [8]: 12 itera-
tions, 36 evaluations of F.
F(x1, x2)
x1x2
Of course, this example, including its strange ini-
tial setting of S, has been prepared intentionally to
demonstrate some interesting properties of the dis-
cussed class of algorithms. As evident from Fig. 1,
the classical SNMS algorithm exhibits the lowest ten-
dency to the degeneration of S, whereas the improved
one tends to the minimum x×of Fmore efficiently,
but at the cost of more shrink steps by Fig. 3, or even
at the cost of reinitialization to certain “more regular”
Sby Fig. 2; this can be observed here especially in the
WSEAS TRANSACTIONS on MATHEMATICS
DOI: 10.37394/23206.2023.22.71
Jiří Vala, Petra Jarošová
E-ISSN: 2224-2880
646
Volume 22, 2023
Figure 2: Modified algorithm by c), Section 3,
reinitialization of simplices enabled: 12 iterations
and 1 reset, 41 evaluations of F.
F(x1, x2)
x1x2
Figure 3: Modified algorithm by c), Section 3,
reinitialization of simplices disabled: 11 iterations,
34 evaluations of F.
F(x1, x2)
x1x2
last (still visible) iterations near x×because the first
5 iterations shown on Fig. 2 and Fig. 3 are exactly the
same, unlike those on Fig. 1. However, the number of
sequential calculations of particular values of F, as
presented here, cannot be understood as a unique cri-
terion for the algorithm effectiveness (especially from
such unconvincing differences in number of iterations
occurring in an example) because of the contempo-
rary offer of parallel and distributed computations.
For more realistic applications (but worse transpar-
ent from graphical presentations) in civil engineering,
as referenced in Section 6, (at least) the modification
c) promises to bring certain non-negligible computa-
tional benefit.
6 Selected applications
Let us now present some useful references to two
representative kinds of such problems, whose experi-
mental, physical, mathematical and numerical analy-
ses including software implementation are still in de-
velopment at FCE BUT. To retain a reader-friendly
extent of this article, we shall avoid the formulation
details, which can be found in parts in 6 referenced
publications, [15], [16], [17], [18], [50], [51].
As such first kind we can mention the thermal
design of buildings, as a special part of significant
environmental research in civil engineering, oriented
to advanced materials and structures, announced by
Section 1; A residential building can be considered,
[15], [16], in a simplified way as a thermal system,
whose particular rooms have a constant temperature,
whereas all outer and inner walls, ceilings, floors, etc.,
are represented by a one-dimensional, typically lay-
ered interfaces both between such rooms and between
rooms and exterior environment. Now i) the composi-
tion of all such interfaces must be prescribed, as well
as ii) the required temperature in particular rooms dur-
ing a computational year and iii) all climatic data,
namely the temperature and thermal radiation due to
the Stefan-Boltzmann law, corrected by their wave-
length ranges (because of their selective absorption by
the Earth’s atmosphere), assumed shading by vegeta-
tion, etc., in daily quasi-cycles for a complete refer-
ence year; also iv) some initial distribution of temper-
ature in a building must be given. All exterior build-
ing surfaces are exposed to such climatic conditions
due to the apparent motion of the Sun on the sky at
the prescribed geographic location.
Such type of direct problems are usually analysed
using the principles of classical thermodynamics: the
first one can be reduced to the conservation of ther-
mal energy here, the second one should be satisfied
automatically for an appropriate design of constitu-
tive relations, namely of the empirical Fourier, Stefan-
Boltzmann, etc. laws, whereas the third one forbids
the temperature lesser than absolute Kelvin zero for-
mally (which is far from being realistic in our consid-
erations). The above sketched simplification gener-
ates only one parabolic partial differential equation of
evolution with a unknown temperature field θ(ξ, t),
i. e. of the second order in ξand in the first order in
t, where ξrefers to the “spacial” coordinate in lay-
ers and tis the time, starting from its initial value,
typically t= 0. This equation contains certain ma-
terial characteristics, namely the thermal conductiv-
ity λ(ξ, θ), responsible for the heat insulation ability
of particular layers, and the thermal capacity κ(ξ, θ),
corresponding to their heat accumulation properties.
For the evaluation of thermal consumption of resi-
dential buildings both λand κare considered as con-
stant in any such layer frequently, which seemingly
preserves the linearity of such equation. However, at
least two sources of nonlinearity cannot be removed:
i) the power-law form of the Stefan-Boltzmann law
WSEAS TRANSACTIONS on MATHEMATICS
DOI: 10.37394/23206.2023.22.71
Jiří Vala, Petra Jarošová
E-ISSN: 2224-2880
647
Volume 22, 2023
and ii) the strategy of switching of heating (and also
air-conditioning, if required) and choice of heating
power, bringing some additional inequalities and re-
ferring to selected results from the control theory.
The numerical analysis for a direct problem can
rely on the Fourier multiplicative decomposition of
θ(ξ, t): the resulting approximation, exploiting e. g.
the finite element method (FEM), can be then writ-
ten of a sparse system of ordinary differential equa-
tions. Without above mentioned nonlinearities, such
θcan be then expressed analytically, using the gen-
eralized eigenvalues and eigenvectors of real square
matrices; however, proper implementation of nonlin-
earities needs additional iterative procedures in suf-
ficiently small time steps, using the finite difference
technique. Potential dependence of λand κon ξ(in-
side layers) can be handled using a standard approach
of numerical integration, involved in FEM; their de-
pendence on θforces more complicated iterative pro-
cedures.
The aim function, realizable as Fin the previous
considerations, is now the energy consumption dur-
ing a reference year. The choice of unknown param-
eters can be rather delicate: as their instructive exam-
ples we can mention a) the angle of horizontal rota-
tion of the house around a fixed axis, b) the fenes-
tration on a selected wall, or c) the dislocation and
operating power of one or more heating devices. Un-
fortunately, numerous limitations from technical stan-
dards must be taken into account via artificial penalty
terms, as minimal fenestration for natural lighting, or
minimal air exchange in rooms per hour. The valida-
tion of the computational results has been performed
on a small family house in Moravian Karst, about
30 km northern from Brno, with the slightly modified
climatic data from the international airport in Brno-
Tuřany. Nevertheless, the chance to much bigger en-
ergy savings can be seen in the buildings with con-
trolled temperature for industrial applications, namely
in freezing and cooling plants, [50].
The announced second kind of problems is ad-
dressed to the identification ones. Continuing the pre-
vious considerations, we can now study the thermal
transfer in the general 3-dimensional Euclidean space,
simulating the laboratory conditions computationally,
without the knowledge of values of λand κ, but with
the knowledge of time distribution of θat pre-defined
locations during some standard, carefully controlled
heating process. Such experiments work with very
special geometric configuration of specimens under
strictly allotted conditions, to suppress the unwanted
influence of further physical processes: thus for the
identification of λand κ, [17], [18], we can distin-
guish between hot-plate, hot-ball and hot-wire mea-
surement. If λand κare constant and the material
is isotropic then the analytical solutions are avail-
able, or at least the semi-analytical ones, containing
Fourier series, non-elementary integrals, Bessel func-
tions, etc.; the large collection of useful classical for-
mulae, whose tedious development was motivated by
the limited possibilities of numerical analysis, [52].
This offers the possibility to intervene into the iden-
tification procedure directly, without time-consuming
repeated waiting for the evaluation of Ffrom certain
“black box” with suggested values of parameters λ
and κ. As Fwe can consider the quadratic error be-
tween the predicted and simulated values of θ(ξ, t)for
several fixed points ξ(e. g. 2 points can be sufficient
for hot-wire measurements on long cylindrical spec-
imens in the radially symmetric configuration) and
sufficiently large number of time steps, such prob-
lem then refers to nonlinear regression in mathemati-
cal statistics.
Unfortunately, some of the above presented as-
sumptions are violated frequently: e. g. the influence
of the thermal transfer factor (which may be not con-
stant) at the interface(s) between the specimen and
other part(s) of the measurement apparatus is non-
negligible, or the dependence of λand κon θcannot
be neglected during the experiment in some expected
temperature range. All such significant disturbances
in semi-analytical formulations argue for switching
back to the above discussed approaches of minimiza-
tion of F, if not to soft computing recipes with artifi-
cial neural networks, genetic algorithms, fuzzy logic
or cluster analysis, [53]. Identification of λand κfor
refractory materials at high temperatures, utilizing a
special oven, [51], can serve as an indubitable exam-
ple.
7 Conclusion
We have demonstrated some convergence properties
of a class of rather simple and computationally trans-
parent algorithms, derived from the classical Nelder-
Mead one. Avoiding the difficulties with the construc-
tion of a complete set of derivatives of a (sufficiently
smooth) real function F, or a quadratic polynomial
approximating F, the crucial step of the modification
of the classical SNMS is the replacement of the grav-
ity centres x0of the n-dimensional simplices S0in
Hby various other points of H, not belonging to S0
(as introduced in Section 2) necessarily, e. g. those in
the intersection of lines, respecting the roughly es-
timated gradient directions from a remaining vertex
of xn+1 S(xn+1 /S0), with Hagain. Clearly
this generates a much larger class of directions Spro-
portional to x0xn+1, which can be seen as a non-
negligible contribution of this article, not covered by
approaches reviewed by Section 2.
However, some disadvantages and limitations of
this approach must be mentioned: i) The conver-
gence results presented in this article do not overcome
WSEAS TRANSACTIONS on MATHEMATICS
DOI: 10.37394/23206.2023.22.71
Jiří Vala, Petra Jarošová
E-ISSN: 2224-2880
648
Volume 22, 2023
the criticized too strong assumptions in convergence
proofs for SNMS completely: cf. the disabled expan-
sion in SMNS, not respected in effective engineer-
ing computations, to obtain a sufficient decrease of h,
even for n= 2, [34], as well the purpose-built restart-
ing strategies to handle published counterexamples of
algorithm divergence or stagnation. ii) Other than suf-
ficiently smooth functions Fare needed in applica-
tions frequently, even some additional conditions in
the form in inequalities, coming from technical stan-
dards, [16]; for such cases no proper verification tech-
nique is available, only some validation one, relying
on the comparison with experiments or other com-
putations, [54]. iii) Restarting steps, as described
in Section 4, may require a comparable number of
evaluations of Fwith simplest descent methods, [4],
Chap. 2, with potential support of parallel processing.
Also some soft-computing techniques can be capa-
ble of competing, regardless of their heuristic nature.
Thus the research plans of the authors’ team at BUT
for the near future cover i. a. both some deeper conver-
gence considerations than those sketched by Section 4
for an appropriate class of algorithms compatible with
Section 2 and the development of hybrid methods for
special civil engineering applications, [55].
References:
[1] W. Feist: Gestaltungsgrundlagen Passivhäuser.
Das Beispiel, Darmstadt, 2001. (In German.)
[2] D. B. Corner: Passive House Details: Solu-
tions for High-Performance Design. Routledge,
Abingdon (UK), 2017.
[3] H. R. B. Orlande: Inverse problems in heat
transfer: New trends on solution methodologies
and applications. J. Heat Transfer 134 (2012),
031011 / 1–13.
[4] L. Lukšan: Numerické optimalizační metody.
Technical report 1152, Institute of Informatics
CAS, Prague, 2017. (In Czech.)
[5] A. Neculai: Modern Numerical Nonlinear Opti-
mization. Springer, New York, 2022.
[6] J. Larson, M. Menickelly, S. M. Wild:
Derivative-free optimization methods. Acta
Numer. 28, 287–404.
[7] P. O. Omolaye, J. M. Mom, G. A. Igwue: A
holistic review of soft computing techniques.
Appl. Comput. Math. 6 (2017), 93–110.
[8] J. A. Nelder, R. Mead: A simplex method for
function minimization. Computer J. 7 (1965),
308–313.
[9] M. H. Wright: Nelder, Mead, and the other sim-
plex method. Doc. Math., Extra Volume ISMP
(2012), 271–276.
[10] S. Senn: A conversation with John Nelder. Stat.
Sci. 18 (2003), 118–131.
[11] H. H. Rosenbrock: An automatic method for
finding the greatest or least value of a function.
Computer J. 3 (1960), 175–184.
[12] M. J. D. Powell: An efficient method for finding
the minimum of a function of several variables
without calculating derivatives. Computer J. 7
(1964), 155–162.
[13] W. H. Swann: Direct search methods, in: Nu-
merical Methods for Unconstrained Optimiza-
tion (W. Murray, ed.). Academic Press, London,
1972, 13–28.
[14] R. M. Lewis, V. Torczon, M. W. Troset: Direct
search methods: then and now. J. Comput. Appl.
Math. 124 (2000), 191–207.
[15] J. Vala, P. Jarošová: Optimization of de-
sign parameters of low-energy buildings.
Proc. 22nd Thermophysics in Terchová (Slovak
Rep., 2017), AIP Conf. Proc. 1866 (2017),
040041 / 1–6.
[16] J. Vala, P. Jarošová: Optimization approaches to
some problems of building design. Appl. Math.
63 (2018), 305–331.
[17] J. Vala: On some non-gradient approaches to
inverse and optimization problems of thermal
transfer. Proc. 23th Thermophysics in Smolenice
(Slovak Rep., 2018), AIP Conf. Proc. 1988
(2018), 020047 / 1–6.
[18] J. Vala: On the computational heat transfer eval-
uation in buildings with controlled temperature.
Proc. 27th Thermophysics in Dalešice (Czech
Rep., 2022), AIP Conf. Proc. ? (2023), ? / 1–6,
to appear.
[19] K. I. M. McKinnon: Convergence of the Nelder-
Mead simplex algorithm to a non-stationary
point. SIAM J. Optim. 9 (1998), 308–313.
[20] F. Gao, L. Han: Implementing the Nelder-
Mead simplex algorithm with adaptive parame-
ters. Comput. Optim. Appl. 51 (2012), 259–277.
[21] C. T. Kelley: Detection and remediation of stag-
nation in the Nelder-Mead algorithm using a suf-
ficient decrease condition. SIAM J. Optim. 10
(1999), 43–55.
[22] C. J. Price, I. D. Coope, D. Byatt: A convergent
variant of the Nelder-Mead algorithm. J. Optim.
Theory Appl. 113 (2002), 5–19.
[23] I. D. Coope, C. J. Price: Frame based methods
for unconstrained optimization. J. Optim. The-
ory Appl. 107 (2000), 261–274.
WSEAS TRANSACTIONS on MATHEMATICS
DOI: 10.37394/23206.2023.22.71
Jiří Vala, Petra Jarošová
E-ISSN: 2224-2880
649
Volume 22, 2023
[24] M. A. Luersen, R. Le Riche, F. Guyon: A con-
strained, globalized, and bounded Nelder-Mead
method for engineering optimization. Struct.
Multidiscip. Optim. 27 (2004), 43–54.
[25] Q. H. Zhao, D. Urosević, N. Mladenović,
P. Hansen: A restarted and modified simplex
search for unconstrained optimization. Comput.
Oper. Res. 36 (2009), 3263–3271.
[26] Z. Gao, T. Xiao, W. Fan: Hybrid differential
evolution and Nelder-Mead algorithm with re-
optimization. Soft Comput. 15 (2011), 581–594.
[27] N. E. Mastorakis: Genetic algorithms with
Nelder-Mead optimization in the variational
methods of boundary value problems. WSEAS
Trans. Math. 8 (2009), 107–116.
[28] H. Assimi, A. Jamali: A hybrid algorithm cou-
pling genetic programming and Nelder–Mead
for topology and size optimization of trusses
with static and dynamic constraints. Expert Syst.
Appl. 95 (2018), 127–141.
[29] A. A. Reddy, M. Vundela and G. Manju: Port-
folio optimization using genetic algorithms with
Nelder–Mead algorithm. Proc. 6th Int. Conf.
on Recent Trends in Computing (ICRTC) in
Ghaziabad (India, 2020), in: Lect. Notes
Netw. Syst. (R. J. Mahapatra, B. K. Panigrahi,
B. K. Kaushik, S. Roy, eds.) 177 (2021),
197–206, Springer, Singapore, 2021.
[30] A. Nawaz, T. H. Malik, N. Saleem, E. Mustafa:
Globalized Nelder-Mead trained artificial neural
networks for short term load forecasting. J. of
Basic and Applied Scientific Research 5 (2015),
1–13.
[31] A. Abdelhalim, K. Nakata, M. El-Alem,
A. Eltawil: Guided particle swarm optimization
method to solve general nonlinear optimization
problems. Eng. Optim. 50 (2018), 568–583.
[32] J. Brownlee: Optimization for Machine Learn-
ing: Finding Function Optima with Python. Ma-
chine Learning Mastery, Melbourne, 2021.
[33] J. C. Lagarias, J. A. Reeds, M. H. Wrigth,
P. E. Wrigth: Convergence properties of the
Nelder-Mead simplex method in low dimen-
sions. SIAM J. Optim. 9 (1998), 112–147.
[34] J. C. Lagarias, B. Poonen, M. H. Wrigth: Con-
vergence of the restricted Nelder-Mead algo-
rithm in two dimensions. SIAM J. Optim. 22
(2012), 501–532.
[35] A. Conan Doyle: The Sign of the Four. Lip-
pincott’s Monthly Magazine 45 (February 1890),
145–223.
[36] S. Wessing: Proper initialization is crucial for
the Nelder–Mead simplex search. Optim. Lett.
13 (2019), 847–856.
[37] I. Fajfar, Á. Bürmen, J. Puhan: The Nelder-
Mead simplex algorithm with perturbed centroid
for high-dimensional function optimization. Op-
tim. Lett. 13 (2019), 1011–1025.
[38] S. Takenaga, Y. Ozaki, M. Onishi: Practical ini-
tialization of the Nelder–Mead method for com-
putationally expensive optimization problems.
Optim. Lett. 17 (2023), 283–297.
[39] N. Pham, B. M. Wilamowski: Improved Nelder-
Mead’s simplex method and applications. J.
Comput. 3 (2011), 55–63.
[40] S. Chen, X. Wang: A derivative-free optimiza-
tion algorithm using space grid integration. Am.
J. Comput. Math. 3 (2013), 16–26.
[41] A. Galántai: Convergence of the Nelder-
Mead method. Numer. Algorithms 90 (2022),
1043–1072.
[42] A. Galántai: A stochastic convergence result for
the Nelder-Mead simplex method. MDPI Math-
ematics 11 (2023), 11 / 1–12.
[43] H. A. Tekile, M. Fedrizzi, M. Brunelli: Con-
strained eigenvalue minimization of incom-
plete pairwise comparison matrices by Nelder-
Mead algorithm. MDPI Algorithms 14 (2021),
222 / 1–20.
[44] H. Kitaoka, K. Amano, N. Nishi, T. Sakka: Im-
provement of the Nelder-Mead method using di-
rect inversion in iterative subspace. Optim. Eng.
23 (2022), 1033–1055.
[45] S. S. N. Zulkifli, M. Abdullah, A. Zaharim:
Ionospheric effects on GPS range finding us-
ing 3D ray-tracing and Nelder-Mead optimisa-
tion algorithm. WSEAS Trans. Circuits Syst. 8
(2009), 1–10.
[46] A. Yasmine Begum, G. V. Marutheeswar: Op-
timal tuning of controllers for cascade tem-
perature control of superheater using Nelder
Mead algorithm. WSEAS Trans. Circuits Syst.
15 (2016), 236–241.
[47] W. Spendley, G. R. Hext, F. R. Himsworth: Se-
quential application of simplex designs in op-
timisation and evolutionary operation. Techno-
metrics 4 (1962), 441–461.
[48] M. Fiedler: Special Matrices and Their Appli-
cation to Numerical Mathematics. Dover Publi-
cations, Mineola (USA), 2008.
[49] K. Binmore, J. Davies: Calculus Concepts and
Methods. Cambridge University Press, Cam-
bridge (UK), 2001.
WSEAS TRANSACTIONS on MATHEMATICS
DOI: 10.37394/23206.2023.22.71
Jiří Vala, Petra Jarošová
E-ISSN: 2224-2880
650
Volume 22, 2023
[50] S. Šťastník: Computational evaluation of heat
demand of a freezing plant. Proc. 23th Int. Conf.
of Numerical Analysis and Applied Mathematics
(ICNAAM) in Rhodes (Greece, 2019), AIP Conf.
Proc. 2293 (2020), 340115 / 1–4.
[51] S. Šťastník, J. Vala, F. Šot: Physical proper-
ties of selected silicates for a long-term heat
container. Solid State Phenom. 276 (2018),
154–159.
[52] H. S. Carslaw, J. C. Jaeger: Conduction of Heat
in Solids. Clarendon Press, Oxford, 1959.
[53] A. Pacheco-Vega: Soft computing applications
in thermal energy systems, in: Soft Comput-
ing in Green and Renewable Energy Systems
(K. Gopalakrishnan, S. K. Khaitan, S. Kalo-
girou, eds.). Springer, Singapore, 2011.
[54] I. Babuška, J. T. Oden: Verification and valida-
tion in computational engineering and science:
basic concepts. Comput. Methods Appl. Mech.
Eng. 193 (2004), 4057–4066.
[55] T. Hanák, M. Tuscher, O. Přibyl: Hybrid genetic
algorithm-based approach for estimating flood
losses on structures of buildings. MDPI Sustain-
ability 12 (2020), 3047 / 1–16.
Contribution of individual authors
Jiří Vala was responsible for the algorithm design and
convergence analysis. Petra Jarošová carried out the
numerical simulation.
Conflicts of interest
The authors have no conflicts of interest to declare
that are relevant to the content of this article.
Sources of funding for research presented in
a scientific article or scientific article itself
This research has been supported from the project
of specific university research at Brno University of
Technology No. FAST-S-22-7867.
Creative Commons Attribution License 4.0
This article is published under the terms of Creative
Commons Attribution License (Attribution 4.0 Inter-
national, CC BY 4.0).
WSEAS TRANSACTIONS on MATHEMATICS
DOI: 10.37394/23206.2023.22.71
Jiří Vala, Petra Jarošová
E-ISSN: 2224-2880
651
Volume 22, 2023