On a derivative-free optimization approach

to some problems of civil engineering

JIŘÍ VALA, PETRA JAROŠOVÁ

Institute of Mathematics and Descriptive Geometry

Brno University of Technology, Faculty of Civil Engineering

CZ - 602 00 Brno, Veveří 331/95

CZECH REPUBLIC

Abstract: Development of advanced materials and structures for civil engineering, due to the requirements of

green and sustainable building, including the reduction of energy consumption and the balance between occupant

comfort and environmental friendliness, needs proper analysis of related physical, chemical, etc. processes, whose

mathematical description leads to direct, sensitivity and inverse initial and boundary value problems for nonlin-

ear partial differential equations, analysed numerically using finite element, difference and similar techniques.

Design optimization requires to implement a set of additional variable parameters into all related computations,

which is very expensive or quite impossible in most cases. Thus realistic computational strategies work with

the minimizations of some cost functions with unknown parameters using certain kind of numerical differentia-

tion, like quasi-Newton, inexact Newton or conjugate gradient methods, some derivative-free approach, or, as a

much-favoured alternative, some heuristic soft-computing algorithm. A reasonable compromise seems to be the

exploitation of an algorithm coming from the non-gradient Nelder-Mead simplex approach. In this paper, referring

to the experience with i) the direct problem of thermal design of a residential building and ii) the inverse problem

of identification of material characteristics as thermal conductivity and diffusivity from well-advised laboratory

experiments, after several remarks to the history and progress of the Nelder-Mead method and its improvements,

we shall demonstrate some convergence properties of such approach, regardless of the highly cited evaluation of

the original Nelder-Mead algorithm: “Mathematicians hate it because you cannot prove convergence; engineers

seem to love it because it often works.”

Key-Words: Derivative-free optimization; design in civil engineering; direct and inverse problems; finite

element and difference techniques.

Received: April 23, 2023. Revised: July 26, 2023. Accepted: August 19, 2023. Published: September 26, 2023.

1 Introduction

Buildings are responsible for a substantial part of

global energy consumption, as well as of greenhouse

gas emissions, both in developed and developing

countries, at all levels of standard of living. Although

the estimates of these parts are not quite consistent,

the related research of advanced materials and struc-

tures for civil engineering, including the reduction of

energy consumption and the balance between occu-

pant comfort and environmental friendliness, accel-

erates in 2 last decades: for its development from the

preference of massive insulation of separate residen-

tial houses to contemporary green and sustainable ur-

ban planning, [1], [2]. However, such research re-

quires deeper analysis of relevant physical, chemi-

cal, etc. processes, whose mathematical description

leads to initial and boundary value problems for non-

linear partial differential equations, analysed numeri-

cally using finite element, difference and similar tech-

niques; in addition to direct problems, also the sen-

sitivity and inverse ones, [3], are substantial. De-

sign optimization requires to implement a set of ad-

ditional variable parameters into all related computa-

tions, which is still very expensive or quite impossible

in most cases, although the progress in computational

hardware and software, parallel and distributed archi-

tectures, etc., has a potential to make such argumen-

tation less valuable.

Effective computational strategies try to minimize

some cost functions with unknown parameters using

certain kind of numerical differentiation, like quasi-

Newton, inexact Newton or conjugate gradient meth-

ods, [4], [5], Chaps. 5, 6, 7, some derivative-free ap-

proach, [6], or, as a much-favoured alternative, some

soft-computing heuristic algorithm, [7]. A reasonable

compromise seems to be the exploitation of an al-

gorithm coming from the non-gradient Nelder-Mead

simplex approach, suggested by [8] originally, al-

though the historical summary, [9], popularizes the

interview, [10]: “Mathematicians hate it because you

cannot prove convergence; engineers seem to love it

because it often works.” Nevertheless, just this ap-

proach succeeded in the concurrence of other classical

direct search algorithms, [11], [12], [13], as demon-

strated by their critical historical overview, [14].

The strong motivation for this study comes from

WSEAS TRANSACTIONS on MATHEMATICS

DOI: 10.37394/23206.2023.22.71

Jiří Vala, Petra Jarošová

E-ISSN: 2224-2880

641

Volume 22, 2023

the engineering analysis of i) the direct problem of

thermal design of a residential building, [15], [16],

and ii) the inverse problem of identification of ma-

terial characteristics as thermal conductivity and dif-

fusivity from well-advised laboratory experiments,

[17], [18]. Parallel to the development of special soft-

ware packages for i) and ii), some convergence prop-

erties of ad hoc modifications of the original Nelder-

Mead simplex algorithm, [16], have been studied by

the authors of this paper, with the concluding require-

ment of their deeper analysis. In such context, a

more general class of promising non-expensive com-

putational algorithms will be sketched in this article,

with the preference of those implemented in the above

mentioned software. After this Introduction (Section

1) and the Available results (Section 2) with numer-

ous notations and historical remarks, the Preliminary

considerations (Section 3) will prepare all means for

the Convergence of numerical algorithms (Section 4),

followed by the two-dimensional Illustrative exam-

ple (Section 5), supplied by instructive figures, the

Selected applications (Section 6), referring to previ-

ous authors’research activities, containing experience

with optimization strategies, and the brief Conclusion

(Section 7).

2 Available results

To be able to explain the Nelder-Mead approach, in-

cluding various modifications and potential improve-

ments, we have to introduce some basic notations.

As a model introductory problem let us consider a

real-valued function Fof a finite integer number n

of real variables x= (x1, . . . , xn)

>∈Rn. Let us

suppose that Fhas an infimum F×, whereas G(x) =

(∂F(x)/∂x1, . . . , ∂F(x)/∂xn)

>, i. e. the gradient of

F, is a vector of Lipschitz continuous functions. All

vectors from the Euclidean space Rnlike xhere will

be considered as column ones, supplied by lower in-

dices referring to their components and by upper ones

(if needed) referring for special positions. Especially

an ordered set {x1, x2, . . . , xn+1}of simplex vertices

will define such simplex Sin Rnthat

F(x1)≤ F(x2)≤. . . ≤ F(xn+1)(1)

is satisfied. In the following text Gwithout any ar-

gument will denote G(xn)for brevity. We shall use

the additional notations vi=xi−xn+1 for any

i∈ {1, . . . , n}and v= (v1+. . . +vn)/n. We shall

need also V∈Rn×n, introduced as

V=





1. . . vn

.....

n. . . vn





,(2)

with h=max(|v1|, . . . , |vn|).(3)

Let us consider some direction S ∈ Rnand such real

constant γthat the point

x0=xn+1 +v0, v0=γS,(4)

lies in the (n−1)-dimensional hyperplane Hin Rn

defined by {x1, x2, . . . , xn}. The evaluation of γis

then easy: the hyperplane His characterized by

V>a=e , a = (V−1)

>e , a

>=e

>V−1(5)

where e= (1,1, . . . , 1)

>; an important consequence

of (5) is

>γS=γe

>V−1S= 1 , γ−1=e

>V−1S.(6)

To support the notations and considerations of Sec-

tion 3, let us introduce the following straightforward

7-step ad hoc generalization of the original Nelder-

Mead simplex algorithm. Using, in addition to the a

priori choice of S, five constants αR,αE,αCO,αCI

and αS, satisfying the inequality

min(αR, αE, αCO, αCI, αS,1−αCI,1−αS)>0,(7)

such algorithm reads:

1. Sort {x1, x2, . . . , xn+1}to guarantee (1).

2. Evaluate γfrom (6), set (ςR, ςE, ςCO, ςCI) =

γ(αR, αE, αCO, αCI)and x0from (4).

3. Try reflection xR=x0+ςRS.

If F(x2)<F(xR)<F(xn+1)then replace

xn+1 by xRand go to step 7.

4. If F(xR)≥ F(xN)then go to step 5.

Try expansion xE=x0+ςES.

If F(xE)<F(xR)then replace xn+1 by xE, oth-

erwise replace xn+1 by xR, and (in both cases)

go to step 7.

5. Try outside contraction xCO =x0+ςCOSand

inside contraction xCI =x0−ςCOS.

If min(F(xCO),F(xCI)) ≥ F(xn)then go to

step 6.

If F(xCO)≤ F(xCI)then replace xn+1 by

xCO, otherwise replace xn+1 by xCI, and (in both

cases) go to step 7.

6. Perform shrink: take (x1+αS(x2−x1), . . . , x1+

αS(xn+1 −x1)) instead of (x2, . . . , xn+1).

7. Check convergence and stop, or return to step 1.

Let us emphasize that none of the constants occurring

in (7) is allowed to be equal to 0: such αSwould force

the reduction of Sto one point, four remaining con-

stants would lead to the unwanted degeneration of S

to a simplex S0with nvertices in H, whereas the hy-

pothetical choice αCI = 1 or αS= 1 could not work,

preserving Sunchanged.

In particular, the classical choice, [8], referred as

SNMS (Standard Nelder-Mead Simplex) frequently,

making use of our notation (2), is

WSEAS TRANSACTIONS on MATHEMATICS

DOI: 10.37394/23206.2023.22.71

Jiří Vala, Petra Jarošová

E-ISSN: 2224-2880

642

Volume 22, 2023

S=v=1

nV e , (8)

supplied with

αR= 1 , αE= 2 , αCO =αCI =αS= 1/2 ; (9)

consequently γ= 1, thus step 2 can be reduced to

x0= (x1+. . . +xn)/n, i. e. to x0=xn+1 +v,

whereas a vector (ςR, ςE, ςCO, ςCI)is identical with

(αR, αE, αCO, αCI). This case, whose original terms

reflection,expansion,outer /inner contraction and

shrink have their intuitive meanings, is still signif-

icant in computational practice. Indeed, by (6) we

have γ−1=1

>V−1V e = 1 .(10)

Unfortunately, various well-documented counterex-

amples of stagnation or divergence of this algorithm

can be found both in the classical literature, [19], and

in their later extensive survey, [9]. This has moti-

vated numerous attempts to derive some “better con-

vergent” variants of SNMS.

To improve SNMS, especially for n>>2, [20] sug-

gested ANMS (Adaptive Nelder-Mead Simplex) with

αR= 1 , αE= 1 + 2/n , (11)

αCO =αCI = 3/4 −1/(2n), αS= 1/2

instead of (9), thus for n= 2 ANMS coincides with

SNMS; moreover, certain sufficient descent property

is required for expansion and contraction steps. Po-

tential stagnation trends in both SNMS and ANMS

can be handled using the sufficient decrease moti-

vated oriented restarts, [21], reinitializing the sim-

plex to a smaller one with orthogonal edges, or, in the

more complicated way, introducing certain pseudo-

expansion on so-called “ghost simplices”, additional

to S, coupled with the sequences of quasi-minimal

frames, reshaping the simplices Susing the matrix

QR-decomposition, [22], with help of the knowledge

of properties of positive bases and frames, [23]. Such

algorithm seems to be also implemented in the MAT-

LAB function fminsearch from the toolbox optimiza-

tion.

Unlike ANMS, GBNM (Globalized Bounded

Nelder-Mead), [24], still preserving γ= 1, relies

on another generalization (so-called globalization)

of searching for appropriate αR,αE,αCO,αCI and

αS, satisfying (7), together with probabilistic restarts,

whereas RMNM (Restarted Modified Nelder-Mead),

[25], works with an advanced deterministic modifica-

tion of the shrink step. Still other attempt to improve

SNMS rely on its hybridization with appropriate soft

computing techniques, as with differential evolution,

[26], with genetic algorithms, [27], [28], [29], with ar-

tificial neural networks (NM-ANN), [30], with parti-

cle swarm optimization (NM-PSO), [31], or with ma-

chine learning, [32], Chap. 12.

The existing convergence theory for SNMS and

related algorithms is far from being satisfactory

and comparable with that available for gradient ap-

proaches. The following results try to handle strictly

convex Fwith bounded level sets. Namely SNMS

is verified to converge to the minimizer for n= 1

(where, in our notation, the directions of Gand Sal-

ways coincide), [33] ; for n= 2 only the conver-

gence of hby (3) to zero during SNMS can be guar-

anteed, [33]. The strongest result, up to now, [34], for

n= 2 seems to be: SNMS with disabled expansion

converges always to the minimizer. The computer-

assisted tedious proof of this result, presented at 25

pages, needs the step-by-step elimination of all pos-

sible non-convergent cases; one can see the origin

of such proof technique, [9], in the Sherlock Holmes

argument, [35], Chap. 6: “How often have I said to

you that when you have eliminated the impossible,

whatever remains, however improbable, must be the

truth ?” Nevertheless, none of these results, includ-

ing recommended corrections, [21], [22], guarantees

the convergence even for a probably simplest example

with n= 2, i. e. that of revolution paraboloid surface

F(x1, x2) = x2

1+x2

2,(12)

with an arbitrary regular initial triangle!

Apart from the various advanced strategies for the

initial choice of S, [36], [37], [38], avoiding the di-

vergence (not only) for (12), some results working

with approximation of derivatives, [39], [40], can be

useful, too. Unfortunately, the hypothetical standard

approximation of Fby a quadratic polynomial, whose

minimum or other stationary point could be detected

easily, needs to calculate N= 1 + 1 + 1 = 3 values

for n= 1,N= 1 + 2 + 3 = 6 values for n= 2,

N=1+3+6 =10 values for n=3, etc., in general

k=0 n+k−1

k=

k=0

(n+k−1)!

k! (n−1)! (13)

= 1 + n+1

2n(n+ 1) = 1 + 1

2n(n+ 3)

=n+ 1

1+n+ 1

2,

thus e. g. N= 66 values for n= 10, which is dif-

ficult to implement into any effective numerical al-

gorithm, namely if such values must be obtained in

some non-trivial way, typically as outputs from the

numerical analysis of partial differential equations of

evolution, as mentioned above. However, the last

equality in (13) exhibits the possibility of a simple

choice of evaluation points in all vertices and in all

centres of one-dimensional edges of S. For com-

parison N ∈ {1,2,3,4}in every step of SNMS, as

well as in our generalization of SNMS, except shrink

with N=n(as number of edges of S), which is

WSEAS TRANSACTIONS on MATHEMATICS

DOI: 10.37394/23206.2023.22.71

Jiří Vala, Petra Jarošová

E-ISSN: 2224-2880

643

Volume 22, 2023

much smaller than Nreceived from (13), especially

for large n. Thus also the progress in the conver-

gence theory of related matrix algorithms and iter-

ative subspaces in the last several years, [41], [42],

[43], [44], must be mentioned, regardless of the (at

least partial) loss of user-friendliness, simplicity and

transparency, appreciated as advantages of algorithms

closer to SNMS, namely in the control theory, [45],

[46].

3 Preliminary considerations

For simplicity we shall refer the classical choice of

S=vby SNMS as a), without a deeper study on set-

tings of parameters αR,αE,αCO,αCI and αSlike (9) or

(11) here; The evident drawback of a) is its ignorance

of differences DFi=Fi−Fn+1 where Fi=F(xi)

for i∈ {1,2, . . . , n}and Fn+1 =F(xn+1); the brief

notation DF= (DF1, DF2, . . . , DFn)

>will be ap-

plied here. Let us show some examples of alternative

choices of Snow.

As b) we can take the orthogonal projection S=a

by (5). Clearly (6) is applicable to any choice of S.

Therefore, modifying (10), we receive

γ−1= (e

>V−1)(e

>V−1)

>.(14)

A positive γcan be evaluated from (14) for any reg-

ular Sevidently. The length of u0is the lowest pos-

sible, but x0can lie in Houtside S, which cannot be

excluded in all following cases, too; DFis still not

taken into account.

The motivation for c) can be seen in the application

of the Lagrange theorem

DFi=vi· G(exi) = vi· G +vi·(G(exi)− G)(15)

for any i∈ {1,2, . . . , n}; central dots here and in

the following formulae will always refer to standard

scalar products in Rn, whereas |.|will denote lengths

of vectors in Rn,exiare some (a priori unknown)

points lying in the line segment between xiand xn+1.

Let us notice such idea of “steepest descent” (with-

out SMNS) for numerical evaluation of function ex-

trema, [47], as well as more historical matters of inter-

est, [14]. Thanks to the assumed Lipschitz continuity,

working with some positive constant L, taking also

|ui| ≤ hinto account, we can rewrite (15) as

DF=V>G+LhV>ϕ(16)

G= (V−1)

>DF − Lhϕ

where a vector ϕin Rnis allowed to return only real

values between −1and 1. As a simplified analogue

to (16), neglecting its second right-hand-side additive

term, let us set

DF=V>S,S= (V−1)

>DF.(17)

By (6) we obtain

γ−1=e

>V−1(V−1)

>DF.(18)

Unfortunately, unlike (14), we cannot guarantee the

correct evaluation of γfrom (18) in all situations; this

fails in the hypothetical case of the orthogonality of

vectors (V−1)

>DF,(V−1)

>ein Rn, in other words:

of the V−1(V−1)

>- orthogonality of vectors DF,e.

Consequently some correction strategy should be im-

plemented.

The above mentioned disadvantage of c) can be

compensated (at least theoretically) by the following

setting d). Let us replace (17) by

DF=V>e

S,S=1

h2V V>e

S=1

h2V DF.(19)

Thus (6) yields

γ−1=1

h2e

>DF,(20)

>DFon the right-hand side being just the average

value of DF, multiplied by n. Therefore, exclud-

ing the unwanted stagnation case, a positive γcan be

evaluated from (20) for any regular S, similarly to b),

unlike c). Nevertheless, the gradient motivation of c)

is weakened by the application of an artificial matrix

weight V V>in (19), not needed in (17).

All cases a), b), c), d) admit various modifications,

including linear combinations as the simplest ones.

As a representative one, in the following considera-

tion we shall try to improve c) utilizing a small per-

turbation from a), i. e.

S= (V−1)

>DF+β

nV e , (21)

DF=V>S − β

nV>V e ,

dependent on a positive parameter β. From (21) and

(6) we have

γ−1=e

>V−1(V−1)

>DF+β

>e(22)

>V−1(V−1)

>DF+β .

Thus at least an adaptive choice of βin (22), whose

upper bound can be a priori prescribed, is able to sup-

press the danger of fail of the evaluation of γby (18)

in c). However, this does not handle potential degen-

eration of Sin general, as we shall see in more details

in Section 4.

Before the formulation of convergence criteria, let

us prepare some a priori estimates. Since S ≈ G was

intended by c) in a reasonable sense, the estimates for

S·G and |S|2=S·S in comparison with |G|2=G·G

are desirable. Confronting (16) with (21), we obtain

S=G+β

nV e +Lhϕ , (23)

=G+βh

V e +Lhϕ ,

S · G =|G|2+βh

nG · e

V e +LhG · ϕ

WSEAS TRANSACTIONS on MATHEMATICS

DOI: 10.37394/23206.2023.22.71

Jiří Vala, Petra Jarošová

E-ISSN: 2224-2880

644

Volume 22, 2023

where e

V=h−1V. Applying the spectral norms of

matrices from Rn×n, [48], Chap. 9, we are allowed to

take ke

Vkas the square root Nof the maximal eigen-

value (spectral radius) of e

V>e

V, independent of h; the

much cheaper estimate by the Schur norm is N≤n.

Clearly |ϕ|2≤ |e|2=n, too. Thus (23) together with

the Young inequality implies

S · G ≥ |G|2−βh

n|G|ke

Vk|e|−Lh|G||ϕ|(24)

≥ |G|2−(β+L)√nh|G|

≥(1 −)|G|2−(β+L)2n

4h2.

In the quite analogous way we can derive

|S| ≤ |G| +βh

nke

Vk|e|+Lh|ϕ|(25)

≤ |G| + (β+L)√nh ,

|S|2≤2|G|2+ 2(β+L)2nh2.

Let us exploit the Lipschitz continuity of a gradient

of Fonce more, similarly to (15), (16). In a sequence

of standard steps (except shrink where the modifica-

tion of the following discussion is obvious) of our

algorithm we come to some x∗=xn+1 +ςSwith

ς∈ {ςR, ςE, ςCO, ςCI}where F∗=F(x∗)>F×.

Thus we have

Fn+1 −F∗=ςS · G(ex∗)(26)

=ςS · G +ςS · (G(ex∗)− G)

with ex∗somewhere in the line segment between x∗

and xn+1. Then (26) implies

Fn+1 −F∗≥ςS · G − |ςS|L|ςS| (27)

=ςS · G − ς2L|S|2.

Inserting (24) and (25) into (27), we receive

Fn+1 −F∗≥Cς|G|2−b

Cςh2,(28)

Cς=ς((1 −)−2ςL),

Cς=ς1

4+ 2ςL(β+L)2n .

We shall need positive Cςand b

Cςin the following

considerations, thus ς < (1 −)/(2L)is needed here;

for admissible exceptions see the final part of Sec-

tion 4. Let us notice that the careful selection of Cς

and b

Cςis welcome: e. g. the improper choice of very

small can produce a negative right-hand-side esti-

mate in (28), whereas the left-hand side of (28) is non-

negative by its definition.

Let us now introduce the simplified notations F=

(F1+. . . +Fn+Fn+1)/(n+ 1) and F∗= (F1+

. . . +Fn+F∗)/(n+ 1). Simulating the substitution

of xn+1 by certain new x∗(before resorting), i. e. the

reshaping of Sin a current k-th iteration, from (27)

we come to the result

(n+ 1) F−F∗≥Cς|G|2−b

Cςh2.(29)

Such result is then applicable to any iteration k∈

{0,1,2, . . .}, starting from an initial simplex Swith

k= 0.

4 Convergence of numerical

algorithms

Let us consider xk,h(k),G(k),V(k)and F∗as x,h,G,

Vand F(k)from the k-th iteration, k∈ {0,1,2, . . .},

F(0) being identical with Fin the initial iteration. Let

the selection of h(k)satisfy the condition

∞

k=1

h(k)2≤ H (30)

where His some finite positive constant. If |G(k)|= 0

for some finite k, the minimum of Gis just attained;

this lucky case will be excluded from further consider-

ations. In the general (computationally realistic) case

we have

F(0) − F×≥

∞

k=1 F(k−1) − F(k)(31)

≥C

∞

k=1 |G(k)|2−b

∞

k=1

h(k)2,

taking some lower bound of applied values Cςfrom

(29) as Cand some upper bound of applied values b

Cς

from (29) as b

Cinto account. Thanks to (30), from

(31) we obtain

∞

k=1 |G(k)|2≤C−1F0− F×+b

C.(32)

The existence of some (at least) accumulation point

for x(k)with k→∞ follows from (3), forcing h

(k)→

0; then G(k)tends to (0, . . . , 0)

>in Rnby (32). More-

over, if Fhas a positive definite Hessian matrix than

the standard differential calculus, [49], Chap. 4, re-

sults that Fhas exactly one minimum F×in Rn. The

limit passage x(k)→x×,F(x×) = F×, can be per-

formed according to [4], Chap. 1.

Let us notice that the condition (30) is rather re-

strictive: introducing e. g. two simple choices

h(k)=1

√2h(k−1) =1

√2kh(0) ,(33)

h(k)=rk

k+ 1 h(k−1) =1

√k+ 1 h(0) ,(34)

we can see easily that (30) is satisfied applying h(k)by

WSEAS TRANSACTIONS on MATHEMATICS

DOI: 10.37394/23206.2023.22.71

Jiří Vala, Petra Jarošová

E-ISSN: 2224-2880

645

Volume 22, 2023

(33) with √H=h(0)√2(as a sum of geometric se-

ries), which cannot be repeated for h(k)by (34) lead-

ing to H → ∞. Only a formal computational rem-

edy is available: since F(k−1) −F(k)in the first right-

hand-side estimate of (31) is always non-negative, it

is possible to consider the above presented limit pas-

sage up to a subsequence, i. e. all sums in (30), (31)

and (32) are allowed to be calculated over any infinite

set of integers, instead of the whole set {1,2,3, . . .},

with the same qualitative result. Nevertheless, sub-

stantial decrease of the convergence rate can be ex-

pected in such case. The following comments refer to

a possible remeshing strategy and further limitations

of this approach.

Numerous more expensive shrink steps, or even

some restarting ones, not included in the basic al-

gorithm of Section 3 explicitly, with various moti-

vations, [21], [22], [24], [25], can be required to

force (30). Restarting steps also prevent the de-

generation of Sto a simplex in a less-dimensional

space than Rn. This could be checked using det V

from (2) related to hform (3), but such evalua-

tion is too expensive. Thus, as a typical example

for practical computations, let consider the remesh-

ing strategy based on the following idea: all u2=

x2−x1, . . . , un+1 =xn+1 −x1replace by such

(|u2|/|bu2|)bu2, . . . , (|u2|/|bun+1|)bun+1 that i) bu2=u2

and ii) bui+1 =ui+1 +ζ2bu2+. . . ζibuiwhere

bui+1 ·bu2=. . . =bui+1 ·bui=Z(35)

for i∈ {2, . . . , n},Zbeing a fixed constant, 0<

Z ≤ 1. Clearly (35) requires to solve an auxiliary sys-

tem of i−1linear algebraic equations with unknown

parameters ζ2, . . . , ζi. Let us also notice that, in com-

parison with (2), x1=xn+1 +v1,x1+ui=xn+1 +vi

for i∈ {2, . . . , n},xn+1 +v1=x1. Taking i)

bx1=x1and ii) bxi+1 =xi+1 + (|u2|/|bui+1|)bui+1 for

any i∈ {1, . . . , n}(in particular, also bx2=x2), we

are able to replace xiby bxifor all i∈ {1, . . . , n + 1}.

Let us notice that Sis always regular for the choice

Z= 1/2, with the length of all edges h=|u2|; un-

like this, the choice Z= 0 coincides with the Gram-

Schmidt orthonormalization, [48], Chap. 2. However,

no such choice guarantees the decrease of (F1+. . .+

Fn+Fn+1)/(n+ 1) in general, therefore additional

shrink steps cannot be avoided.

Consequently no reasonable a priori information

on the convergence rate is available; one should still

rely on a repeated check of the validity of (30) dur-

ing the computation. This is replaced by cheaper ad

hoc remeshing tricks in engineering software pack-

ages, whose priorities must be reassessed in re-

cent parallel and distributed computational systems:

namely the above sketched rearrangement of S(i. e.

the remeshing of x1, . . . , xn+1) represents a classi-

cal sequential algorithm, but the parallel evaluation

of F(x1), . . . , F(xn+1), expected as the most expen-

sive, is possible.

5 Illustrative example

As an illustrative example, we shall present the min-

imization of (12), using the in-house experimental

software package created at the Institute of Mathe-

matics and Descriptive Geometry (founded 1899) of

Brno University of Technology, Faculty of Civil En-

gineering (abbreviated as FCE BUT). Since the exact

minimizing point x×= 0,F(x×) = 0 is well-known

in this case, an absolute error lesser than E= 3 .10−3

for the stopping criterion can be required: such value

is chosen to support an appropriate visualization, us-

ing the standard MATLAB functions; the initial sim-

plex Slies rather far from the minimizing point for

the testing purposes. The computational results are

shown for the classical SNMS algorithm on Fig. 1;

for its modification c) with a small perturbation by

(17), taking β= 0.1(only formally, rarely occurring

in practice, switching to a non-zero βfor |γ| ≤ E

only), supplied by the recommended potential reini-

tialization of S, [21], as discussed in Section 3, on

Fig. 2. Finally Fig. 3 corresponds to the same ap-

proach, switching such cautious reinitialization off.

The dotted lines show the orthogonal projection of re-

lated simplices Sto the plane (x1, x2)everywhere.

Figure 1: Classical algorithm SNMS, [8]: 12 itera-

tions, 36 evaluations of F.

F(x1, x2)

x1x2

Of course, this example, including its strange ini-

tial setting of S, has been prepared intentionally to

demonstrate some interesting properties of the dis-

cussed class of algorithms. As evident from Fig. 1,

the classical SNMS algorithm exhibits the lowest ten-

dency to the degeneration of S, whereas the improved

one tends to the minimum x×of Fmore efficiently,

but at the cost of more shrink steps by Fig. 3, or even

at the cost of reinitialization to certain “more regular”

Sby Fig. 2; this can be observed here especially in the

WSEAS TRANSACTIONS on MATHEMATICS

DOI: 10.37394/23206.2023.22.71

Jiří Vala, Petra Jarošová

E-ISSN: 2224-2880

646

Volume 22, 2023

Figure 2: Modified algorithm by c), Section 3,

reinitialization of simplices enabled: 12 iterations

and 1 reset, 41 evaluations of F.

F(x1, x2)

x1x2

Figure 3: Modified algorithm by c), Section 3,

reinitialization of simplices disabled: 11 iterations,

34 evaluations of F.

F(x1, x2)

x1x2

last (still visible) iterations near x×because the first

5 iterations shown on Fig. 2 and Fig. 3 are exactly the

same, unlike those on Fig. 1. However, the number of

sequential calculations of particular values of F, as

presented here, cannot be understood as a unique cri-

terion for the algorithm effectiveness (especially from

such unconvincing differences in number of iterations

occurring in an example) because of the contempo-

rary offer of parallel and distributed computations.

For more realistic applications (but worse transpar-

ent from graphical presentations) in civil engineering,

as referenced in Section 6, (at least) the modification

c) promises to bring certain non-negligible computa-

tional benefit.

6 Selected applications

Let us now present some useful references to two

representative kinds of such problems, whose experi-

mental, physical, mathematical and numerical analy-

ses including software implementation are still in de-

velopment at FCE BUT. To retain a reader-friendly

extent of this article, we shall avoid the formulation

details, which can be found in parts in 6 referenced

publications, [15], [16], [17], [18], [50], [51].

As such first kind we can mention the thermal

design of buildings, as a special part of significant

environmental research in civil engineering, oriented

to advanced materials and structures, announced by

Section 1; A residential building can be considered,

[15], [16], in a simplified way as a thermal system,

whose particular rooms have a constant temperature,

whereas all outer and inner walls, ceilings, floors, etc.,

are represented by a one-dimensional, typically lay-

ered interfaces both between such rooms and between

rooms and exterior environment. Now i) the composi-

tion of all such interfaces must be prescribed, as well

as ii) the required temperature in particular rooms dur-

ing a computational year and iii) all climatic data,

namely the temperature and thermal radiation due to

the Stefan-Boltzmann law, corrected by their wave-

length ranges (because of their selective absorption by

the Earth’s atmosphere), assumed shading by vegeta-

tion, etc., in daily quasi-cycles for a complete refer-

ence year; also iv) some initial distribution of temper-

ature in a building must be given. All exterior build-

ing surfaces are exposed to such climatic conditions

due to the apparent motion of the Sun on the sky at

the prescribed geographic location.

Such type of direct problems are usually analysed

using the principles of classical thermodynamics: the

first one can be reduced to the conservation of ther-

mal energy here, the second one should be satisfied

automatically for an appropriate design of constitu-

tive relations, namely of the empirical Fourier, Stefan-

Boltzmann, etc. laws, whereas the third one forbids

the temperature lesser than absolute Kelvin zero for-

mally (which is far from being realistic in our consid-

erations). The above sketched simplification gener-

ates only one parabolic partial differential equation of

evolution with a unknown temperature field θ(ξ, t),

i. e. of the second order in ξand in the first order in

t, where ξrefers to the “spacial” coordinate in lay-

ers and tis the time, starting from its initial value,

typically t= 0. This equation contains certain ma-

terial characteristics, namely the thermal conductiv-

ity λ(ξ, θ), responsible for the heat insulation ability

of particular layers, and the thermal capacity κ(ξ, θ),

corresponding to their heat accumulation properties.

For the evaluation of thermal consumption of resi-

dential buildings both λand κare considered as con-

stant in any such layer frequently, which seemingly

preserves the linearity of such equation. However, at

least two sources of nonlinearity cannot be removed:

i) the power-law form of the Stefan-Boltzmann law

WSEAS TRANSACTIONS on MATHEMATICS

DOI: 10.37394/23206.2023.22.71

Jiří Vala, Petra Jarošová

E-ISSN: 2224-2880

647

Volume 22, 2023

and ii) the strategy of switching of heating (and also

air-conditioning, if required) and choice of heating

power, bringing some additional inequalities and re-

ferring to selected results from the control theory.

The numerical analysis for a direct problem can

rely on the Fourier multiplicative decomposition of

θ(ξ, t): the resulting approximation, exploiting e. g.

the finite element method (FEM), can be then writ-

ten of a sparse system of ordinary differential equa-

tions. Without above mentioned nonlinearities, such

θcan be then expressed analytically, using the gen-

eralized eigenvalues and eigenvectors of real square

matrices; however, proper implementation of nonlin-

earities needs additional iterative procedures in suf-

ficiently small time steps, using the finite difference

technique. Potential dependence of λand κon ξ(in-

side layers) can be handled using a standard approach

of numerical integration, involved in FEM; their de-

pendence on θforces more complicated iterative pro-

cedures.

The aim function, realizable as Fin the previous

considerations, is now the energy consumption dur-

ing a reference year. The choice of unknown param-

eters can be rather delicate: as their instructive exam-

ples we can mention a) the angle of horizontal rota-

tion of the house around a fixed axis, b) the fenes-

tration on a selected wall, or c) the dislocation and

operating power of one or more heating devices. Un-

fortunately, numerous limitations from technical stan-

dards must be taken into account via artificial penalty

terms, as minimal fenestration for natural lighting, or

minimal air exchange in rooms per hour. The valida-

tion of the computational results has been performed

on a small family house in Moravian Karst, about

30 km northern from Brno, with the slightly modified

climatic data from the international airport in Brno-

Tuřany. Nevertheless, the chance to much bigger en-

ergy savings can be seen in the buildings with con-

trolled temperature for industrial applications, namely

in freezing and cooling plants, [50].

The announced second kind of problems is ad-

dressed to the identification ones. Continuing the pre-

vious considerations, we can now study the thermal

transfer in the general 3-dimensional Euclidean space,

simulating the laboratory conditions computationally,

without the knowledge of values of λand κ, but with

the knowledge of time distribution of θat pre-defined

locations during some standard, carefully controlled

heating process. Such experiments work with very

special geometric configuration of specimens under

strictly allotted conditions, to suppress the unwanted

influence of further physical processes: thus for the

identification of λand κ, [17], [18], we can distin-

guish between hot-plate, hot-ball and hot-wire mea-

surement. If λand κare constant and the material

is isotropic then the analytical solutions are avail-

able, or at least the semi-analytical ones, containing

Fourier series, non-elementary integrals, Bessel func-

tions, etc.; the large collection of useful classical for-

mulae, whose tedious development was motivated by

the limited possibilities of numerical analysis, [52].

This offers the possibility to intervene into the iden-

tification procedure directly, without time-consuming

repeated waiting for the evaluation of Ffrom certain

“black box” with suggested values of parameters λ

and κ. As Fwe can consider the quadratic error be-

tween the predicted and simulated values of θ(ξ, t)for

several fixed points ξ(e. g. 2 points can be sufficient

for hot-wire measurements on long cylindrical spec-

imens in the radially symmetric configuration) and

sufficiently large number of time steps, such prob-

lem then refers to nonlinear regression in mathemati-

cal statistics.

Unfortunately, some of the above presented as-

sumptions are violated frequently: e. g. the influence

of the thermal transfer factor (which may be not con-

stant) at the interface(s) between the specimen and

other part(s) of the measurement apparatus is non-

negligible, or the dependence of λand κon θcannot

be neglected during the experiment in some expected

temperature range. All such significant disturbances

in semi-analytical formulations argue for switching

back to the above discussed approaches of minimiza-

tion of F, if not to soft computing recipes with artifi-

cial neural networks, genetic algorithms, fuzzy logic

or cluster analysis, [53]. Identification of λand κfor

refractory materials at high temperatures, utilizing a

special oven, [51], can serve as an indubitable exam-

ple.

7 Conclusion

We have demonstrated some convergence properties

of a class of rather simple and computationally trans-

parent algorithms, derived from the classical Nelder-

Mead one. Avoiding the difficulties with the construc-

tion of a complete set of derivatives of a (sufficiently

smooth) real function F, or a quadratic polynomial

approximating F, the crucial step of the modification

of the classical SNMS is the replacement of the grav-

ity centres x0of the n-dimensional simplices S0in

Hby various other points of H, not belonging to S0

(as introduced in Section 2) necessarily, e. g. those in

the intersection of lines, respecting the roughly es-

timated gradient directions from a remaining vertex

of xn+1 ∈S(xn+1 /∈S0), with Hagain. Clearly

this generates a much larger class of directions Spro-

portional to x0−xn+1, which can be seen as a non-

negligible contribution of this article, not covered by

approaches reviewed by Section 2.

However, some disadvantages and limitations of

this approach must be mentioned: i) The conver-

gence results presented in this article do not overcome

WSEAS TRANSACTIONS on MATHEMATICS

DOI: 10.37394/23206.2023.22.71

Jiří Vala, Petra Jarošová

E-ISSN: 2224-2880

648

Volume 22, 2023

the criticized too strong assumptions in convergence

proofs for SNMS completely: cf. the disabled expan-

sion in SMNS, not respected in effective engineer-

ing computations, to obtain a sufficient decrease of h,

even for n= 2, [34], as well the purpose-built restart-

ing strategies to handle published counterexamples of

algorithm divergence or stagnation. ii) Other than suf-

ficiently smooth functions Fare needed in applica-

tions frequently, even some additional conditions in

the form in inequalities, coming from technical stan-

dards, [16]; for such cases no proper verification tech-

nique is available, only some validation one, relying

on the comparison with experiments or other com-

putations, [54]. iii) Restarting steps, as described

in Section 4, may require a comparable number of

evaluations of Fwith simplest descent methods, [4],

Chap. 2, with potential support of parallel processing.

Also some soft-computing techniques can be capa-

ble of competing, regardless of their heuristic nature.

Thus the research plans of the authors’ team at BUT

for the near future cover i. a. both some deeper conver-

gence considerations than those sketched by Section 4

for an appropriate class of algorithms compatible with

Section 2 and the development of hybrid methods for

special civil engineering applications, [55].

References:

[1] W. Feist: Gestaltungsgrundlagen Passivhäuser.

Das Beispiel, Darmstadt, 2001. (In German.)

[2] D. B. Corner: Passive House Details: Solu-

tions for High-Performance Design. Routledge,

Abingdon (UK), 2017.

[3] H. R. B. Orlande: Inverse problems in heat

transfer: New trends on solution methodologies

and applications. J. Heat Transfer 134 (2012),

031011 / 1–13.

[4] L. Lukšan: Numerické optimalizační metody.

Technical report 1152, Institute of Informatics

CAS, Prague, 2017. (In Czech.)

[5] A. Neculai: Modern Numerical Nonlinear Opti-

mization. Springer, New York, 2022.

[6] J. Larson, M. Menickelly, S. M. Wild:

Derivative-free optimization methods. Acta

Numer. 28, 287–404.

[7] P. O. Omolaye, J. M. Mom, G. A. Igwue: A

holistic review of soft computing techniques.

Appl. Comput. Math. 6 (2017), 93–110.

[8] J. A. Nelder, R. Mead: A simplex method for

function minimization. Computer J. 7 (1965),

308–313.

[9] M. H. Wright: Nelder, Mead, and the other sim-

plex method. Doc. Math., Extra Volume ISMP

(2012), 271–276.

[10] S. Senn: A conversation with John Nelder. Stat.

Sci. 18 (2003), 118–131.

[11] H. H. Rosenbrock: An automatic method for

finding the greatest or least value of a function.

Computer J. 3 (1960), 175–184.

[12] M. J. D. Powell: An efficient method for finding

the minimum of a function of several variables

without calculating derivatives. Computer J. 7

(1964), 155–162.

[13] W. H. Swann: Direct search methods, in: Nu-

merical Methods for Unconstrained Optimiza-

tion (W. Murray, ed.). Academic Press, London,

1972, 13–28.

[14] R. M. Lewis, V. Torczon, M. W. Troset: Direct

search methods: then and now. J. Comput. Appl.

Math. 124 (2000), 191–207.

[15] J. Vala, P. Jarošová: Optimization of de-

sign parameters of low-energy buildings.

Proc. 22nd Thermophysics in Terchová (Slovak

Rep., 2017), AIP Conf. Proc. 1866 (2017),

040041 / 1–6.

[16] J. Vala, P. Jarošová: Optimization approaches to

some problems of building design. Appl. Math.

63 (2018), 305–331.

[17] J. Vala: On some non-gradient approaches to

inverse and optimization problems of thermal

transfer. Proc. 23th Thermophysics in Smolenice

(Slovak Rep., 2018), AIP Conf. Proc. 1988

(2018), 020047 / 1–6.

[18] J. Vala: On the computational heat transfer eval-

uation in buildings with controlled temperature.

Proc. 27th Thermophysics in Dalešice (Czech

Rep., 2022), AIP Conf. Proc. ? (2023), ? / 1–6,

to appear.

[19] K. I. M. McKinnon: Convergence of the Nelder-

Mead simplex algorithm to a non-stationary

point. SIAM J. Optim. 9 (1998), 308–313.

[20] F. Gao, L. Han: Implementing the Nelder-

Mead simplex algorithm with adaptive parame-

ters. Comput. Optim. Appl. 51 (2012), 259–277.

[21] C. T. Kelley: Detection and remediation of stag-

nation in the Nelder-Mead algorithm using a suf-

ficient decrease condition. SIAM J. Optim. 10

(1999), 43–55.

[22] C. J. Price, I. D. Coope, D. Byatt: A convergent

variant of the Nelder-Mead algorithm. J. Optim.

Theory Appl. 113 (2002), 5–19.

[23] I. D. Coope, C. J. Price: Frame based methods

for unconstrained optimization. J. Optim. The-

ory Appl. 107 (2000), 261–274.

WSEAS TRANSACTIONS on MATHEMATICS

DOI: 10.37394/23206.2023.22.71

Jiří Vala, Petra Jarošová

E-ISSN: 2224-2880

649

Volume 22, 2023

[24] M. A. Luersen, R. Le Riche, F. Guyon: A con-

strained, globalized, and bounded Nelder-Mead

method for engineering optimization. Struct.

Multidiscip. Optim. 27 (2004), 43–54.

[25] Q. H. Zhao, D. Urosević, N. Mladenović,

P. Hansen: A restarted and modified simplex

search for unconstrained optimization. Comput.

Oper. Res. 36 (2009), 3263–3271.

[26] Z. Gao, T. Xiao, W. Fan: Hybrid differential

evolution and Nelder-Mead algorithm with re-

optimization. Soft Comput. 15 (2011), 581–594.

[27] N. E. Mastorakis: Genetic algorithms with

Nelder-Mead optimization in the variational

methods of boundary value problems. WSEAS

Trans. Math. 8 (2009), 107–116.

[28] H. Assimi, A. Jamali: A hybrid algorithm cou-

pling genetic programming and Nelder–Mead

for topology and size optimization of trusses

with static and dynamic constraints. Expert Syst.

Appl. 95 (2018), 127–141.

[29] A. A. Reddy, M. Vundela and G. Manju: Port-

folio optimization using genetic algorithms with

Nelder–Mead algorithm. Proc. 6th Int. Conf.

on Recent Trends in Computing (ICRTC) in

Ghaziabad (India, 2020), in: Lect. Notes

Netw. Syst. (R. J. Mahapatra, B. K. Panigrahi,

B. K. Kaushik, S. Roy, eds.) 177 (2021),

197–206, Springer, Singapore, 2021.

[30] A. Nawaz, T. H. Malik, N. Saleem, E. Mustafa:

Globalized Nelder-Mead trained artificial neural

networks for short term load forecasting. J. of

Basic and Applied Scientific Research 5 (2015),

1–13.

[31] A. Abdelhalim, K. Nakata, M. El-Alem,

A. Eltawil: Guided particle swarm optimization

method to solve general nonlinear optimization

problems. Eng. Optim. 50 (2018), 568–583.

[32] J. Brownlee: Optimization for Machine Learn-

ing: Finding Function Optima with Python. Ma-

chine Learning Mastery, Melbourne, 2021.

[33] J. C. Lagarias, J. A. Reeds, M. H. Wrigth,

P. E. Wrigth: Convergence properties of the

Nelder-Mead simplex method in low dimen-

sions. SIAM J. Optim. 9 (1998), 112–147.

[34] J. C. Lagarias, B. Poonen, M. H. Wrigth: Con-

vergence of the restricted Nelder-Mead algo-

rithm in two dimensions. SIAM J. Optim. 22

(2012), 501–532.

[35] A. Conan Doyle: The Sign of the Four. Lip-

pincott’s Monthly Magazine 45 (February 1890),

145–223.

[36] S. Wessing: Proper initialization is crucial for

the Nelder–Mead simplex search. Optim. Lett.

13 (2019), 847–856.

[37] I. Fajfar, Á. Bürmen, J. Puhan: The Nelder-

Mead simplex algorithm with perturbed centroid

for high-dimensional function optimization. Op-

tim. Lett. 13 (2019), 1011–1025.

[38] S. Takenaga, Y. Ozaki, M. Onishi: Practical ini-

tialization of the Nelder–Mead method for com-

putationally expensive optimization problems.

Optim. Lett. 17 (2023), 283–297.

[39] N. Pham, B. M. Wilamowski: Improved Nelder-

Mead’s simplex method and applications. J.

Comput. 3 (2011), 55–63.

[40] S. Chen, X. Wang: A derivative-free optimiza-

tion algorithm using space grid integration. Am.

J. Comput. Math. 3 (2013), 16–26.

[41] A. Galántai: Convergence of the Nelder-

Mead method. Numer. Algorithms 90 (2022),

1043–1072.

[42] A. Galántai: A stochastic convergence result for

the Nelder-Mead simplex method. MDPI Math-

ematics 11 (2023), 11 / 1–12.

[43] H. A. Tekile, M. Fedrizzi, M. Brunelli: Con-

strained eigenvalue minimization of incom-

plete pairwise comparison matrices by Nelder-

Mead algorithm. MDPI Algorithms 14 (2021),

222 / 1–20.

[44] H. Kitaoka, K. Amano, N. Nishi, T. Sakka: Im-

provement of the Nelder-Mead method using di-

rect inversion in iterative subspace. Optim. Eng.

23 (2022), 1033–1055.

[45] S. S. N. Zulkifli, M. Abdullah, A. Zaharim:

Ionospheric effects on GPS range finding us-

ing 3D ray-tracing and Nelder-Mead optimisa-

tion algorithm. WSEAS Trans. Circuits Syst. 8

(2009), 1–10.

[46] A. Yasmine Begum, G. V. Marutheeswar: Op-

timal tuning of controllers for cascade tem-

perature control of superheater using Nelder

Mead algorithm. WSEAS Trans. Circuits Syst.

15 (2016), 236–241.

[47] W. Spendley, G. R. Hext, F. R. Himsworth: Se-

quential application of simplex designs in op-

timisation and evolutionary operation. Techno-

metrics 4 (1962), 441–461.

[48] M. Fiedler: Special Matrices and Their Appli-

cation to Numerical Mathematics. Dover Publi-

cations, Mineola (USA), 2008.

[49] K. Binmore, J. Davies: Calculus – Concepts and

Methods. Cambridge University Press, Cam-

bridge (UK), 2001.

WSEAS TRANSACTIONS on MATHEMATICS

DOI: 10.37394/23206.2023.22.71

Jiří Vala, Petra Jarošová

E-ISSN: 2224-2880

650

Volume 22, 2023

[50] S. Šťastník: Computational evaluation of heat

demand of a freezing plant. Proc. 23th Int. Conf.

of Numerical Analysis and Applied Mathematics

(ICNAAM) in Rhodes (Greece, 2019), AIP Conf.

Proc. 2293 (2020), 340115 / 1–4.

[51] S. Šťastník, J. Vala, F. Šot: Physical proper-

ties of selected silicates for a long-term heat

container. Solid State Phenom. 276 (2018),

154–159.

[52] H. S. Carslaw, J. C. Jaeger: Conduction of Heat

in Solids. Clarendon Press, Oxford, 1959.

[53] A. Pacheco-Vega: Soft computing applications

in thermal energy systems, in: Soft Comput-

ing in Green and Renewable Energy Systems

(K. Gopalakrishnan, S. K. Khaitan, S. Kalo-

girou, eds.). Springer, Singapore, 2011.

[54] I. Babuška, J. T. Oden: Verification and valida-

tion in computational engineering and science:

basic concepts. Comput. Methods Appl. Mech.

Eng. 193 (2004), 4057–4066.

[55] T. Hanák, M. Tuscher, O. Přibyl: Hybrid genetic

algorithm-based approach for estimating flood

losses on structures of buildings. MDPI Sustain-

ability 12 (2020), 3047 / 1–16.

Contribution of individual authors

Jiří Vala was responsible for the algorithm design and

convergence analysis. Petra Jarošová carried out the

numerical simulation.

Conflicts of interest

The authors have no conflicts of interest to declare

that are relevant to the content of this article.

Sources of funding for research presented in

a scientific article or scientific article itself

This research has been supported from the project

of specific university research at Brno University of

Technology No. FAST-S-22-7867.

Creative Commons Attribution License 4.0

This article is published under the terms of Creative

Commons Attribution License (Attribution 4.0 Inter-

national, CC BY 4.0).

WSEAS TRANSACTIONS on MATHEMATICS

DOI: 10.37394/23206.2023.22.71

Jiří Vala, Petra Jarošová

E-ISSN: 2224-2880

651

Volume 22, 2023