Simulation assessment of Expectation-Maximization algorithm in
pseudo-convex mixtures generated by the exponential distribution
RUI SANTOS1,3, MIGUEL FELGUEIRAS 1,3,4, JOÃO MARTINS 2,3,5
1ESTG, Polytechnic Institute of Leiria, PORTUGAL
2ESS, Polytechnic Institute of Porto, PORTUGAL
3CEAUL, Faculdade de Ciências, Universidade de Lisboa, PORTUGAL
4CIDMA, University of Aveiro, PORTUGAL
5CEISUC/CIBB, Coimbra, PORTUGAL
Abstract: The use of pseudo-convex mixtures generated from stable distributions for extremes offers a valuable
approach for handling reliability-related data challenges. This framework encompasses pseudo-convex mixtures
stemming from the exponential distribution. However, precise parameter estimation, particularly in cases where the
weight parameter ω is negative, remains a challenge. This work assesses, through simulation, the performance of the
Expectation-Maximization algorithm in estimating the parameters of pseudo-convex mixtures generated by the
exponential distribution.
Key-Words: Expectation-Maximization algorithm, exponential distribution, generalized mixtures, parameter
estimation, simulation.
Received: November 27, 2023. Revised: March 19, 2024. Accepted: April 13, 2024. Published: May 10, 2024.
1 Introduction
The exponential (Exp) distribution plays a pivotal role
in reliability analysis owing to its constant hazard rate,
signifying a consistent probability of an event oc-
curring within a specific time interval, regardless of
elapsed time. This characteristic suits scenarios where failure rates are time-independent, rendering it a fundamental model in fields such as engineering, medicine, and finance (see, e.g., [1], [2], [3], [4]). The hazard
function within the exponential distribution frame-
work plays a crucial role in predicting and addressing
risks tied to system reliability. It enables proactive
planning to boost performance and mitigate potential
failures.
On another note, generalized mixture distributions are valuable statistical tools for building more flexible distributions that better model random phenomena. These mixtures, whose distribution function is a weighted average of other distribution functions, allow for the incorporation of negative weights, expanding the scope of modeling
possibilities. Preliminary work on this subject has
explored non-convex mixtures of exponentials (e.g.,
[5], [6], [7]) and Gaussian mixtures, [8], with recent
applications in various domains such as cluster analy-
sis, bioinformatics, biology, epidemiology, social sci-
ences, and finance (e.g., [9], [10], [11], [12]).
Further advancements include pseudo-convex
mixtures generated by the exponential distribution
(see, [13], [14]), which offer increased flexibility in
hazard functions while converging to the exponen-
tial distribution’s hazard function. However, estima-
tion techniques such as the method of moments or
maximum likelihood may exhibit limitations, prompt-
ing an evaluation of estimation performance using
the Expectation-Maximization (EM) algorithm. This
work aims to delve into such assessments, present-
ing parameter estimators and conducting a simulation
study to compare their performance.
Hence, Section 2 provides some preliminary concepts and notation concerning stable distributions for extremes, generalized mixtures and pseudo-convex mixtures (PCM) generated by shape-extended stable distributions for extremes. Afterwards, Section
3 delineates the pseudo-convex mixtures generated
by the exponential distribution and furnishes estima-
tors for the parameters derived through the method
of moments (MM), maximum likelihood (ML), and
Expectation-Maximization (EM) algorithm. In Sec-
tion 4, a simulation study is conducted to assess and
compare the performance of the provided estimators.
Lastly, Section 5 encapsulates the key findings and
provides final remarks.
2 Pseudo-convex0ixtures*enerated
by6hape-extended6table
'istributions for(xtremes
To establish pseudo-convex mixtures generated by
shape-extended stable distributions for extremes, this
section first outlines the definitions of min-stable and
max-stable distributions. Subsequently, it introduces
the concept of shape-extended stable distributions to
broaden the spectrum of available distributions, [14].
2.1 Distributions Stable for Extremes
Consider a sequence of independent and identically distributed (i.i.d.) absolutely continuous random variables (r.v.) denoted as $X_1,\ldots,X_n$, with distribution function (d.f.) $F$ and survival function (s.f.) $\bar F$, i.e., $\bar F(x) := 1 - F(x)$. Furthermore, let $X_{i:n}$ represent the $i$-th ascending order statistic associated with these random variables. Consequently, $X_{1:n}$ denotes the minimum of $X_1,\ldots,X_n$, while $X_{n:n}$ denotes the maximum of $X_1,\ldots,X_n$.
A r.v. $X$ with d.f. $F$ is stable for minima or min-stable (minS) if there exist normalizing sequences $\{\alpha_n \in \mathbb{R}^+\}$ and $\{\beta_n \in \mathbb{R}\}$ such that the equality in distribution $X_{1:n} \stackrel{d}{=} \alpha_n X + \beta_n$ holds $\forall n \in \mathbb{N}$, with $X \sim F$. This is equivalent to stating that the s.f. $\bar F$ satisfies
$$\bar F_{X_{1:n}}(x) = \bar F^{\,n}(x) = \bar F\!\left(\frac{x-\beta_n}{\alpha_n}\right),$$
for all $x \in \mathbb{R}$, where $\bar F_{X_{1:n}}$ denotes the s.f. of $X_{1:n}$. Therefore, if $F$ is minS, the minima of $n$ independent copies of $X \sim F$ also follow the $F$ distribution (potentially with a scale and location adjustment). The Extreme Value Distribution for minima (EVm$_\gamma$), with s.f. given by
$$\bar F_{\mathrm{EVm}_\gamma}(x) = \begin{cases} \exp\left\{-\left[1-\gamma x\right]^{-1/\gamma}\right\}, & 1-\gamma x > 0 \,\wedge\, \gamma \neq 0\\ \exp\left\{-\exp(x)\right\}, & x \in \mathbb{R} \,\wedge\, \gamma = 0,\end{cases}$$
represents the only possible min-stable distribution. This distribution applies $\alpha_n = n^{\gamma}$ and $\beta_n = \gamma^{-1}\left(1-n^{\gamma}\right)$ if $\gamma \neq 0$, or $\alpha_n = 1$ and $\beta_n = -\ln(n)$ if $\gamma = 0$.
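As a sanity check on the min-stability relation and the norming constants above, the following short numerical verification (written here in Python with NumPy; an illustration added to this text, not part of the original paper) confirms that $\bar F^{\,n}(x) = \bar F\left((x-\beta_n)/\alpha_n\right)$ for the EVm$_\gamma$ s.f. with $\gamma \neq 0$:

```python
import numpy as np

def sf_evm(x, gamma):
    """Survival function of the EV distribution for minima (EVm_gamma), gamma != 0."""
    return np.exp(-np.power(1.0 - gamma * x, -1.0 / gamma))

gamma, n = 0.3, 5
alpha_n = n ** gamma                      # alpha_n = n^gamma
beta_n = (1.0 - n ** gamma) / gamma       # beta_n = (1 - n^gamma) / gamma
x = np.linspace(-1.0, 2.0, 7)             # points satisfying 1 - gamma*x > 0
lhs = sf_evm(x, gamma) ** n               # s.f. of the minimum of n i.i.d. copies
rhs = sf_evm((x - beta_n) / alpha_n, gamma)
print(np.max(np.abs(lhs - rhs)))          # zero up to rounding: min-stability holds
```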
The EVm$_\gamma$ encompasses the Gumbel ($\gamma = 0$), Fréchet ($\gamma > 0$), and Weibull ($\gamma < 0$) minimum distributions. The parameter $\gamma$ serves as the extreme value index, gauging the heaviness of the left tail of $F$. Introducing location ($\mu$) and scale ($\sigma$) parameters allows for the generalization of EVm$_\gamma$ through $\bar F_{\mathrm{EVm}_\gamma}(x;\mu,\sigma) = \bar F_{\mathrm{EVm}_\gamma}\left((x-\mu)/\sigma\right)$. Moreover, this distribution holds paramount importance in Extreme Value Theory (EVT), as per the Extreme Value Theorem (Fisher-Tippett-Gnedenko): if the suitably normalized minimum of $n$ i.i.d. random variables converges to a non-degenerate distribution as $n$ increases to infinity, then the limit must be an EVm$_\gamma$ distribution.
All results pertaining to the minima of a sequence of i.i.d. continuous r.v. can be similarly applied to the maxima, due to the relationships $Y_{1:n} = -X_{n:n}$ and $Y_{n:n} = -X_{1:n}$ when $Y = -X$. Therefore, a r.v. $X$ with a d.f. $F$ is stable for maxima, or max-stable (maxS), if there exist normalizing sequences $\{\alpha_n \in \mathbb{R}^+\}$ and $\{\beta_n \in \mathbb{R}\}$ such that the equality in distribution $X_{n:n} \stackrel{d}{=} \alpha_n X + \beta_n$ holds for all $n \in \mathbb{N}$, meaning the d.f. $F$ satisfies
$$F_{X_{n:n}}(x) = F^{\,n}(x) = F\!\left(\frac{x-\beta_n}{\alpha_n}\right),$$
for all $x \in \mathbb{R}$, where $F_{X_{n:n}}$ denotes the d.f. of $X_{n:n}$. The only possible max-stable distribution is the Extreme Value Distribution for maxima (EVM$_\gamma$), with its d.f. given by $F_{\mathrm{EVM}_\gamma}(x) = \bar F_{\mathrm{EVm}_\gamma}(-x)$. EVM$_\gamma$ includes the Gumbel ($\gamma = 0$), Fréchet ($\gamma > 0$), and Weibull ($\gamma < 0$) maximum distributions, and can also incorporate location and scale parameters through $F_{\mathrm{EVM}_\gamma}(x;\mu,\sigma) = F_{\mathrm{EVM}_\gamma}\left((x-\mu)/\sigma\right)$.
Indeed, in many statistical applications, the focus
lies not on studying typical occurrences (events with
higher probability) but on modelling extreme events,
which tend to have lower probabilities. Therefore,
the primary objective of Extreme Value Theory is to
characterize the minimum and/or maximum of a set of
random variables. Fundamental concepts in this do-
main include order statistics, distributions stable for
extremes, and the Extreme Value Theorem. Key re-
sults and advancements in this theory are documented
in various sources (see, e.g., [15], [16], [17], [18],
[19], [20]). Presently, this theory finds numerous ap-
plications in fields like biostatistics, climatology, fi-
nance, hydrology, industry and insurance (see, e.g.,
[20], [21], [22], [23], [24]), and continues to be an
active area of research, as evidenced by works like,
[25], [26], [27], and their associated references.
2.2 Shape-extended Stable Distributions
The class of stable distributions can be expanded to accommodate variations in the shape parameter, cf. [13], [14]. Consequently, $F$ qualifies as a shape-extended min-stable (SEminS) distribution if there exist normalizing sequences $\{\alpha_n \in \mathbb{R}^+\}$, $\{\beta_n \in \mathbb{R}\}$, and $\{\gamma_n \in \mathbb{R}\}$ such that the equality in distribution $X_{1:n} \stackrel{d}{=} \alpha_n X + \beta_n$ holds for all $n \in \mathbb{N}$, where $X \sim F_{\gamma_n}$, and $F_{\gamma_n}$ signifies the same distribution as $F$ but with a modified shape parameter value ($\gamma_n$ denotes the new shape parameter value). Therefore, this equivalence in distribution can be expressed as
$$F_{X_{1:n}}(x) = 1 - \bar F^{\,n}(x) = F_{\gamma_n}\!\left(\frac{x-\beta_n}{\alpha_n}\right),$$
for all $x \in \mathbb{R}$. Apart from the EVm$_\gamma$ distribution, further examples of SEminS distributions encompass the generalized logistic type II (GL2) distribution and the Generalized Pareto (GP) distribution. For example, considering the sequence $X_1,\ldots,X_n$ of i.i.d. random variables with Generalized Pareto distribution,
GP$(\mu,\sigma,\gamma)$, where $\mu \in \mathbb{R}$, $\sigma,\gamma \in \mathbb{R}^+$, and
$$\bar F(x) = \left(1+\frac{x-\mu}{\gamma\sigma}\right)^{-\gamma}, \quad x > \mu,$$
then the s.f. of the minimum $X_{1:n}$ is given by
$$\bar F_{X_{1:n}}(x) = \left(1+\frac{nx+(1-n)\mu-\mu}{n\gamma\sigma}\right)^{-n\gamma}.$$
Thus, GP is a SEminS distribution with $\alpha_n = n^{-1}$, $\beta_n = n^{-1}(n-1)\mu$ and $\gamma_n = n\gamma$, or analogously $nX_{1:n} + (1-n)\mu \sim \mathrm{GP}(\mu,\sigma,n\gamma)$. The GP distribution indeed holds significance in EVT, particularly in modelling excesses, [28].
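The SEminS property of the GP distribution can also be checked by simulation. The sketch below (Python with NumPy/SciPy; an added illustration using the GP parametrization given above, with hypothetical parameter values) draws minima of $n$ i.i.d. GP variables and tests the rescaled minimum against the GP$(\mu,\sigma,n\gamma)$ distribution:

```python
import numpy as np
from scipy.stats import kstest

rng = np.random.default_rng(1)
mu, sigma, gamma, n = 2.0, 1.5, 0.8, 4

def rgp(size, mu, sigma, gamma):
    """Sample from the GP with s.f. (1 + (x - mu)/(gamma*sigma))^(-gamma), via inversion."""
    u = rng.uniform(size=size)
    return mu + gamma * sigma * (u ** (-1.0 / gamma) - 1.0)

# minimum of n i.i.d. GP(mu, sigma, gamma) variables, rescaled as in the text
x_min = rgp((100_000, n), mu, sigma, gamma).min(axis=1)
z = n * x_min + (1 - n) * mu

# z should follow GP(mu, sigma, n*gamma)
cdf = lambda x: 1.0 - (1.0 + (x - mu) / (n * gamma * sigma)) ** (-n * gamma)
print(kstest(z, cdf))   # a large p-value is consistent with GP(mu, sigma, n*gamma)
```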
Similarly, $F$ is regarded as a shape-extended max-stable (SEmaxS) distribution if there exist normalizing sequences $\{\alpha_n \in \mathbb{R}^+\}$, $\{\beta_n \in \mathbb{R}\}$, and $\{\gamma_n \in \mathbb{R}\}$ such that the equality in distribution $X_{n:n} \stackrel{d}{=} \alpha_n X + \beta_n$, with $X \sim F_{\gamma_n}$, holds for all $n \in \mathbb{N}$, i.e.,
$$F_{X_{n:n}}(x) = F^{\,n}(x) = F_{\gamma_n}\!\left(\frac{x-\beta_n}{\alpha_n}\right),$$
for all $x \in \mathbb{R}$. In addition to the EVM$_\gamma$ distribution, other examples of SEmaxS distributions include the Generalized Logistic (type I) distribution and the Power function distribution.
The shape-extended stable class of distributions
allows the generalization of stable distributions.
However, this shape-extended definition does not re-
tain the same properties. Another drawback is the ab-
sence of a precise definition of a shape parameter (un-
like the location and scale parameters that have pre-
cise meanings). Nevertheless, this generalization pro-
vides a richer family of distributions able to generate
the pseudo-convex mixtures (PCM).
2.3 PCM*enerated by6hape-extended
6table'istributions for(xtremes
Let $F$ be a SEminS distribution; then the r.v. $X_m$ with d.f. $F_{X_m}$ defined by
$$F_{X_m}(x) = (1+\omega)F(x) - \omega F_{X_{1:2}}(x),$$
with $\omega \in [-1,1]$, is a pseudo-convex mixture (PCM) generated by the SEminS distribution $F$. $F_{X_m}$ is a mixture between $F$ and $F_{X_{1:2}}$, which is convex for $\omega < 0$ and non-convex for $\omega > 0$. The same reasoning can be applied to the maximum. Let $F$ be a SEmaxS distribution; then the r.v. $X_M$ with d.f. $F_{X_M}$ defined by
$$F_{X_M}(x) = (1-\omega)F(x) + \omega F_{X_{2:2}}(x),$$
with $\omega \in [-1,1]$, is a PCM generated by the SEmaxS distribution $F$. Hence, $F_{X_M}$ is a mixture between $F$ and $F_{X_{2:2}}$, convex for $\omega > 0$ and non-convex for $\omega < 0$. The formulas of $F_{X_m}$ and $F_{X_M}$ can be simplified to
$$F_{X_m}(x) = F_{X_M}(x) = F(x)\left[1-\omega\,\bar F(x)\right], \qquad (1)$$
with $\omega \in [-1,1]$, which only depends on $F(x)$ and $\omega$.
Note that in generalized mixtures, when there is one negative weight, as in equation (1), $F_{X_m}$ is not guaranteed to be a d.f., [29]. Nevertheless, [13] proves that if $F$ is a shape-extended stable distribution for extremes, then $F_{X_m}$ defined by equation (1) is a d.f. Thus, a PCM has the same parameters as $F$ plus the $\omega$ parameter. Consequently, it is more flexible than the corresponding convex mixture without raising the estimation cost. Figure 1 and Figure 2 illustrate the remarkable flexibility inherent in this distribution family, showing the density function of the PCM generated by the standard Gumbel and the standard Logistic II distributions for different values of ω. The main properties of PCM generated by shape-extended stable distributions are provided in [14].
Fig.1: Density function of the PCM generated by the standard Gumbel distribution with ω = −1 + 0.25k, k = 0, 1, . . . , 8.
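As a quick numerical check (Python/NumPy; an added illustration, using the standard Gumbel distribution for minima as generator), equation (1) coincides with the weighted forms above and yields a valid d.f. for every $\omega \in [-1,1]$:

```python
import numpy as np

F = lambda x: 1.0 - np.exp(-np.exp(x))     # d.f. of the standard Gumbel for minima
x = np.linspace(-6.0, 4.0, 2001)

for w in np.arange(-1.0, 1.25, 0.25):
    mix = (1 + w) * F(x) - w * (1.0 - (1.0 - F(x)) ** 2)   # (1+w)F - w F_{X_{1:2}}
    eq1 = F(x) * (1.0 - w * (1.0 - F(x)))                  # F(x)[1 - w*Fbar(x)], eq. (1)
    assert np.allclose(mix, eq1)                           # the two forms agree
    assert np.all(np.diff(eq1) >= -1e-12)                  # non-decreasing
    assert 0.0 <= eq1.min() and eq1.max() <= 1.0           # stays within [0, 1]
print("equation (1) matches the weighted form and yields a valid d.f.")
```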
In this study, we confine our focus to a specific
scenario: PCM generated by the exponential distri-
bution. The exponential distribution, as a SEminS
distribution, serves as the foundation for our inves-
tigation. It’s worth noting that the exponential dis-
tribution represents a particular case of the Weibull
distribution and holds significance across various do-
mains of reliability analysis due to its flexibility and
simplicity, [30].
3 PCM*enerated by the(xponential
'istribution
Let $X$ be a r.v. with exponential (Exp) distribution with parameter $\lambda \in \mathbb{R}^+$ and d.f. $F(x) = 1 - e^{-\lambda x}$, $x \in \mathbb{R}^+$, which is a SEminS distribution as $X_{1:n} \sim \mathrm{Exp}(n\lambda)$. The density function and the d.f. of the PCM generated by the exponential distribution (PCMExp), $X_m$, are given by
$$F_{X_m}(x) = 1 - \left[1+\omega\left(1-e^{-\lambda x}\right)\right]e^{-\lambda x}$$
and
$$f_{X_m}(x) = (1+\omega)\lambda e^{-\lambda x} - 2\omega\lambda e^{-2\lambda x}.$$
Fig.2: Density function of the PCM generated by the standard Logistic II distribution with ω = −1 + 0.25k, k = 0, 1, . . . , 8.
Figure 3 shows the shape of the density function of the PCM generated by the standard exponential distribution for different values of ω, with ω = −1 + 0.25k, k = 0, 1, . . . , 8.
Fig.3: Density function of the PCMExp with ω = −1 + 0.25k, k = 0, 1, . . . , 8.
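For reference, a minimal sketch (Python/NumPy; the function names are ours, not from the paper) of the PCMExp density, d.f., and a sampler based on the two convex-mixture representations that are exploited later by the EM algorithm:

```python
import numpy as np

def pcm_exp_pdf(x, lam, w):
    """PCMExp density: (1+w)*lam*exp(-lam*x) - 2*w*lam*exp(-2*lam*x)."""
    return (1 + w) * lam * np.exp(-lam * x) - 2 * w * lam * np.exp(-2 * lam * x)

def pcm_exp_cdf(x, lam, w):
    """PCMExp d.f.: 1 - [1 + w*(1 - exp(-lam*x))]*exp(-lam*x)."""
    return 1.0 - (1.0 + w * (1.0 - np.exp(-lam * x))) * np.exp(-lam * x)

def pcm_exp_rvs(size, lam, w, rng):
    """Draw from the PCMExp using its two convex-mixture representations."""
    e1 = rng.exponential(1.0 / lam, size)
    e2 = rng.exponential(1.0 / lam, size)
    if w <= 0:
        # with prob. 1+w keep Exp(lam); with prob. -w take the minimum, which is Exp(2*lam)
        pick = rng.uniform(size=size) < (1 + w)
        return np.where(pick, e1, np.minimum(e1, e2))
    # with prob. 1-w keep Exp(lam); with prob. w take the maximum of two Exp(lam)
    pick = rng.uniform(size=size) < (1 - w)
    return np.where(pick, e1, np.maximum(e1, e2))

rng = np.random.default_rng(0)
lam, w = 1.0, -0.75
x = pcm_exp_rvs(200_000, lam, w, rng)
print(x.mean(), (1 + w / 2) / lam)                   # sample mean vs. E[X_m] = (1 + w/2)/lam
print(np.mean(x <= 1.0), pcm_exp_cdf(1.0, lam, w))   # empirical vs. theoretical d.f. at x = 1
```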
The hazard rate, $r_X(x) := f_X(x)/\bar F_X(x)$, of the PCMExp is given by
$$r_{X_m}(x) = \lambda\,\frac{1+\omega-2\omega e^{-\lambda x}}{1+\omega-\omega e^{-\lambda x}} = r(x)\,\frac{1-\omega\bar F(x)+\omega F(x)}{1+\omega F(x)}.$$
Additionally, when $\omega = -1$, the PCM hazard rate becomes equal to $2r(x)$, where $r(x) = \lambda$ represents the hazard rate of an exponential distribution. It's important to note that when $\omega = -1$, $X_m$ equals $X_{1:2}$ in distribution, and consequently $r_{X_{1:2}}(x) = 2r(x)$. Conversely, if $\omega$ is not equal to $-1$, then the PCM hazard rate converges to $r(x) = \lambda$ as $x$ approaches infinity. Figure 4 illustrates the variations in the shape of the hazard rate function of the PCM generated by the standard exponential distribution across different values of ω, with ω = −1 + 0.25k, k = 0, 1, . . . , 8.
Fig.4: Hazard rate of the PCMExp with ω = −1 + 0.25k, k = 0, 1, . . . , 8.
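A brief numerical illustration of the two regimes of the hazard rate described above (constant and equal to $2\lambda$ when $\omega = -1$; converging to $\lambda$ otherwise); again an added sketch, not part of the original paper:

```python
import numpy as np

def pcm_exp_hazard(x, lam, w):
    """PCMExp hazard rate: lam*(1 + w - 2*w*exp(-lam*x)) / (1 + w - w*exp(-lam*x))."""
    e = np.exp(-lam * x)
    return lam * (1 + w - 2 * w * e) / (1 + w - w * e)

x = np.array([0.5, 2.0, 10.0])
print(pcm_exp_hazard(x, 1.0, -1.0))   # constant and equal to 2*lam when w = -1
print(pcm_exp_hazard(x, 1.0, 0.5))    # approaches lam = 1 as x grows
```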
3.1 Method of0oments(stimation
The $k$-th order raw moment of $X_m$, with $k \in \mathbb{N}$, is given by
$$E\!\left[X_m^k\right] = \frac{k!}{\lambda^k}\left[1+\omega\left(1-\frac{1}{2^k}\right)\right].$$
Thus, the method of moments (MM) estimators can be given by
$$\tilde\omega = 2\left(\tilde\lambda\,\bar X - 1\right)$$
and
$$\tilde\lambda = \frac{3\bar X + \sqrt{9\bar X^2 - 4m_2}}{2m_2},$$
with
$$\bar X = \frac{1}{n}\sum_{i=1}^n X_i$$
and
$$m_2 = \frac{1}{n}\sum_{i=1}^n X_i^2.$$
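The MM estimators can be coded directly from the expressions above. The sketch below (Python/NumPy; an added illustration) includes two practical guards, flagged in the comments, that are not discussed in the text:

```python
import numpy as np

def mm_estimates(x):
    """Method-of-moments estimates: lambda from the quadratic root, then omega."""
    xbar, m2 = x.mean(), np.mean(x ** 2)
    disc = max(9 * xbar ** 2 - 4 * m2, 0.0)       # guard: the discriminant may be slightly negative in samples
    lam = (3 * xbar + np.sqrt(disc)) / (2 * m2)
    w = 2 * (lam * xbar - 1)
    return lam, float(np.clip(w, -1.0, 1.0))      # guard: keep omega inside its support [-1, 1]

rng = np.random.default_rng(1)
e1, e2 = rng.exponential(1.0, (2, 50_000))
pick = rng.uniform(size=50_000) < 0.5             # PCMExp sample with lam = 1 and w = -0.5
sample = np.where(pick, e1, np.minimum(e1, e2))
print(mm_estimates(sample))                       # roughly (1, -0.5)
```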
3.2 Maximum/ikelihood(stimation
The log-likelihood function of $\lambda$ and $\omega$ given the random sample $\mathbf{X} = (X_1,\cdots,X_n)$ is
$$\ell(\lambda,\omega\,|\,\mathbf{X}) = \ln L(\lambda,\omega\,|\,\mathbf{X}) = n\ln(\lambda) - n\lambda\bar X + \sum_{i=1}^n \ln\left(1+\omega-2\omega\exp(-\lambda X_i)\right),$$
and its first partial derivatives are
$$\frac{\partial \ell(\lambda,\omega\,|\,\mathbf{X})}{\partial\lambda} = \frac{n}{\lambda} - n\bar X + \sum_{i=1}^n \frac{2\omega X_i\exp(-\lambda X_i)}{1+\omega-2\omega\exp(-\lambda X_i)}$$
and
$$\frac{\partial \ell(\lambda,\omega\,|\,\mathbf{X})}{\partial\omega} = \sum_{i=1}^n \frac{1-2\exp(-\lambda X_i)}{1+\omega-2\omega\exp(-\lambda X_i)}.$$
Hence, it is not straightforward to find the vector $\left(\hat\lambda_{\mathrm{ML}},\hat\omega_{\mathrm{ML}}\right)$ that maximizes the likelihood function. Nevertheless, iterative methods for numerical approximation can be applied in order to obtain (an approximate value of) the maximum likelihood (ML) estimates.
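Since the likelihood equations have no closed-form solution, the ML estimates must be obtained numerically. The simulation study below uses the maxLik package in R with the Newton-Raphson algorithm; the sketch that follows is only an illustrative alternative (Python/SciPy, derivative-free optimizer) with the same starting point $\left(\bar x^{-1}, 0\right)$ adopted in Section 4:

```python
import numpy as np
from scipy.optimize import minimize

def neg_loglik(theta, x):
    """Negative log-likelihood of the PCMExp; theta = (lam, w)."""
    lam, w = theta
    if lam <= 0 or not (-1.0 <= w <= 1.0):
        return np.inf
    g = 1 + w - 2 * w * np.exp(-lam * x)
    if np.any(g <= 0):
        return np.inf
    return -(len(x) * np.log(lam) - lam * x.sum() + np.log(g).sum())

def ml_estimates(x):
    """Numerical ML with starting point (1/xbar, 0), cf. the simulation setup."""
    start = np.array([1.0 / x.mean(), 0.0])
    res = minimize(neg_loglik, start, args=(x,), method="Nelder-Mead")
    return res.x
```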
3.3 Expectation-Maximization Algorithm
The expectation-maximization (EM) algorithm, [31], can be applied to estimate the unknown parameter $\theta = (\omega,\lambda) \in [-1,1]\times\,]0,+\infty[$ in the PCMExp. In this case, for $\omega \le 0$,
$$f_{X_m}(x) = (1+\omega)\lambda e^{-\lambda x} - 2\omega\lambda e^{-2\lambda x}$$
is a convex mixture between $\lambda e^{-\lambda x}$ (Exp($\lambda$) distribution) and $2\lambda e^{-2\lambda x}$ (Exp($2\lambda$) distribution). Thus, the expectation step (E-step) in the $k$-th iteration can be obtained by
$$\gamma_0\!\left(x_i,\theta^{(k)}\right) = \frac{\left(1+\hat\omega^{(k)}\right)\exp\left(-\hat\lambda^{(k)}x_i\right)}{\left(1+\hat\omega^{(k)}\right)\exp\left(-\hat\lambda^{(k)}x_i\right) - 2\hat\omega^{(k)}\exp\left(-2\hat\lambda^{(k)}x_i\right)},$$
where $\hat\theta^{(k)} = \left(\hat\omega^{(k)},\hat\lambda^{(k)}\right)$. For the maximization step (M-step) in the $k$-th iteration we get
$$Q\!\left(\theta,\theta^{(k)}\right) = \sum_{i=1}^n \gamma_0\!\left(x_i,\theta^{(k)}\right)\left[\ln(1+\omega)+\ln(\lambda)-\lambda x_i\right] + \sum_{i=1}^n \left[1-\gamma_0\!\left(x_i,\theta^{(k)}\right)\right]\left[\ln(-\omega)+\ln(2\lambda)-2\lambda x_i\right],$$
which is maximized by
$$\hat\omega^{(k+1)} = \frac{1}{n}\sum_{i=1}^n \gamma_0\!\left(x_i,\theta^{(k)}\right) - 1$$
and
$$\hat\lambda^{(k+1)} = \frac{n}{\sum_{i=1}^n x_i\left[2-\gamma_0\!\left(x_i,\theta^{(k)}\right)\right]}.$$
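One iteration of this branch of the algorithm can be written compactly as follows (Python/NumPy; an added sketch that mirrors the two update formulas above):

```python
import numpy as np

def em_step_neg(x, lam, w):
    """One EM iteration for the PCMExp on the omega <= 0 branch (mixture of Exp(lam) and Exp(2*lam))."""
    # E-step: responsibility of the Exp(lam) component
    a = (1 + w) * np.exp(-lam * x)
    b = -2 * w * np.exp(-2 * lam * x)
    g0 = a / (a + b)
    # M-step: closed-form updates
    w_new = g0.mean() - 1.0
    lam_new = len(x) / np.sum(x * (2.0 - g0))
    return lam_new, w_new
```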
However, the EM algorithm does not converge with negative weights, [32], as happens when $\omega > 0$ in the PCMExp. Therefore, whenever $\hat\omega^{(k)} > 0$, the density is rewritten as the following convex mixture:
$$f_{X_m}(x) = \omega\,2\lambda e^{-\lambda x}\left(1-e^{-\lambda x}\right) + (1-\omega)\lambda e^{-\lambda x}.$$
Thus, for positive values of $\omega$, $f_{X_m}$ can also be seen as a convex mixture between $\lambda e^{-\lambda x}$ (Exp($\lambda$) distribution) and $2\lambda e^{-\lambda x}\left(1-e^{-\lambda x}\right)$ (density of the maximum of two independent Exp($\lambda$) r.v.). Therefore, in these cases ($\hat\omega^{(k)} > 0$), the E-step in the $k$-th iteration is given by
$$\gamma_0^*\!\left(x_i,\theta^{(k)}\right) = \frac{2\hat\omega^{(k)}\left(1-\exp\left(-\hat\lambda^{(k)}x_i\right)\right)}{2\hat\omega^{(k)}\left(1-\exp\left(-\hat\lambda^{(k)}x_i\right)\right) + 1 - \hat\omega^{(k)}},$$
and for the M-step in the $k$-th iteration
$$Q^*\!\left(\theta,\theta^{(k)}\right) = \sum_{i=1}^n \gamma_0^*\!\left(x_i,\theta^{(k)}\right)\left[\ln(2\omega\lambda) - \lambda x_i + \ln\left(1-\exp(-\lambda x_i)\right)\right] + \sum_{i=1}^n \left[1-\gamma_0^*\!\left(x_i,\theta^{(k)}\right)\right]\left[\ln(1-\omega)+\ln(\lambda)-\lambda x_i\right],$$
which is maximized by
$$\hat\omega^{(k+1)} = \frac{1}{n}\sum_{i=1}^n \gamma_0^*\!\left(x_i,\theta^{(k)}\right)$$
and
$$\hat\lambda^{(k+1)} = \left[\bar x - \frac{1}{n}\sum_{i=1}^n \gamma_0^*\!\left(x_i,\theta^{(k)}\right)x_i\left(\exp(\lambda x_i)-1\right)^{-1}\right]^{-1}.$$
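Analogously, one iteration of the $\omega > 0$ branch can be sketched as follows (Python/NumPy; here the $\lambda$ on the right-hand side of the update above is evaluated at the current iterate, which is one natural reading of the formula):

```python
import numpy as np

def em_step_pos(x, lam, w):
    """One EM iteration for the PCMExp on the omega > 0 branch (Exp(lam) and the maximum of two Exp(lam))."""
    # E-step: responsibility of the maximum component
    p = 1.0 - np.exp(-lam * x)
    g0 = 2 * w * p / (2 * w * p + 1 - w)
    # M-step (the lambda update uses the current lambda on the right-hand side)
    w_new = g0.mean()
    lam_new = 1.0 / (x.mean() - np.mean(g0 * x / (np.exp(lam * x) - 1.0)))
    return lam_new, w_new
```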
The EM algorithm repeats the E-step and the M-step until a fixed point is reached, i.e., until
$$\left|\hat\theta_i^{(k+1)} - \hat\theta_i^{(k)}\right| < \varepsilon$$
for each component $i$ of $\theta$ and some fixed, small enough $\varepsilon > 0$.
The EM algorithm's sensitivity to initial values is a well-known phenomenon, [31]. In this scenario, where the PCM is split into two different convex mixtures, the problem is even worse, as the sign of the initial ω value will almost surely determine the sign of the final ω estimate. Hence, to address this issue, two estimates were computed, each initiated with a different ω value: one with $\omega_0 = -0.5$ and the other with $\omega_0 = 0.5$. Regarding the initial $\lambda$ value, as $\tilde\omega = 2\left(\lambda\bar X - 1\right)$ by the MM, it follows that
$$\lambda = \frac{1+0.5\,\tilde\omega}{\bar X}.$$
Thus, the chosen initial values $(\lambda_0,\omega_0)$ are $\left(0.75\,\bar x^{-1}, -0.5\right)$ and $\left(1.25\,\bar x^{-1}, 0.5\right)$. Ultimately, the two resulting estimates are compared using the Akaike Information Criterion (AIC), [33]. The estimate yielding the best fit (lowest AIC value) is designated as the final EM estimate.
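Putting the pieces together, the dual-start strategy with AIC selection can be sketched as follows (Python; it relies on the em_step_neg and em_step_pos functions from the previous sketches, and the iteration cap is a safety guard not mentioned in the text):

```python
import numpy as np

def aic(x, lam, w):
    """AIC = 2k - 2*loglik with k = 2 parameters for the PCMExp."""
    loglik = np.sum(np.log((1 + w) * lam * np.exp(-lam * x) - 2 * w * lam * np.exp(-2 * lam * x)))
    return 4.0 - 2.0 * loglik

def em_fit(x, lam0, w0, tol=1e-6, max_iter=5000):
    """Iterate the EM, switching branch according to the sign of the current omega estimate."""
    lam, w = lam0, w0
    for _ in range(max_iter):
        lam_new, w_new = em_step_pos(x, lam, w) if w > 0 else em_step_neg(x, lam, w)
        if max(abs(lam_new - lam), abs(w_new - w)) < tol:
            return lam_new, w_new
        lam, w = lam_new, w_new
    return lam, w

def em_estimates(x):
    """Two runs started at (0.75/xbar, -0.5) and (1.25/xbar, 0.5); keep the fit with the lower AIC."""
    xbar = x.mean()
    fits = [em_fit(x, 0.75 / xbar, -0.5), em_fit(x, 1.25 / xbar, 0.5)]
    return min(fits, key=lambda f: aic(x, *f))
```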
4 Simulations
In this section, the performance of parametric estimators for the PCMExp is analysed through Monte Carlo simulation ($10^4$ replicas). This evaluation was carried out in the software R, version 4.3.1, a language and environment for statistical computing, [34]. To this end, PCMExp samples were simulated with $\lambda \in \{1, 10\}$, $\omega \in \{-.75, -.50, -.25, 0, .25, .50, .75\}$ and $n \in \{100, 1000\}$. The parameters were estimated using the MM, the ML based on numerical iterative methods using the package maxLik, [35], in R (Newton-Raphson algorithm) with starting point $(\lambda_0, \omega_0) = \left(\bar x^{-1}, 0\right)$, and the EM algorithm using as starting points $(\lambda_0, \omega_0) = \left(0.75\,\bar x^{-1}, -0.5\right)$ and $(\lambda_0, \omega_0) = \left(1.25\,\bar x^{-1}, 0.5\right)$, cf. Section 3.3. The EM algorithm stops when $\left|\hat\theta_i^{(k+1)} - \hat\theta_i^{(k)}\right| < 10^{-6}$. To assess the performance of the estimators, the bias (Bias), the absolute relative bias (ARB) and the mean square error (MSE) were used. The results obtained are presented in Table 1, Table 2, and Figure 5.
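For completeness, a minimal Monte Carlo driver of the kind used to build such tables is sketched below (Python/NumPy; it relies on pcm_exp_rvs from Section 3 and on any of the estimators sketched above; the ARB is omitted here because its exact definition is not spelled out in the text):

```python
import numpy as np

def evaluate(estimator, lam, w, n, replicas=10_000, seed=0):
    """Monte Carlo bias and MSE of a (lam, w) estimator for the PCMExp."""
    rng = np.random.default_rng(seed)
    truth = np.array([lam, w])
    est = np.empty((replicas, 2))
    for r in range(replicas):
        est[r] = estimator(pcm_exp_rvs(n, lam, w, rng))   # sampler from the Section 3 sketch
    return est.mean(axis=0) - truth, np.mean((est - truth) ** 2, axis=0)

# e.g. evaluate(mm_estimates, lam=1.0, w=-0.5, n=100) gives bias and MSE for each parameter
```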
Table 1: λ estimation in the PCMExp with $10^4$ replicas
ω     −.75    −.50    −.25    .00    .25    .50    .75
MM, with λ= 1,n= 100
Bias .3683 .1291 .0557 .0332 .0306 .0222 .0235
ARB .4249 .2741 .2235 .1768 .1438 .1232 .1088
MSE .3005 .1170 .0739 .0493 .0338 .0244 .0191
ML, with λ= 1,n= 100
Bias .4017 .1523 .0509 .0096 .0047 .0047 .0059
ARB .4219 .2380 .1936 .1602 .1272 .1008 .0819
MSE .2896 .0973 .0578 .0404 .0273 .0169 .0107
EM, with λ= 1,n= 100
Bias .3649 .1366 .0331 .0054 .0027 .0033 .0056
ARB .3950 .2352 .1936 .1584 .1310 .1037 .0812
MSE .2662 .0970 .0574 .0405 .0293 .0189 .0108
MM, with λ= 1,n= 1000
Bias .1953 .0150 .0035 .0030 .0024 .0023 .0026
ARB .2031 .1199 .0833 .0567 .0453 .0387 .0342
MSE .0693 .0196 .0131 .0052 .0033 .0024 .0018
ML, with λ= 1,n= 1000
Bias .1708 .0198 .0127 .0012 .0002 .0009 .0007
ARB .1973 .1125 .0862 .0516 .0381 .0306 .0253
MSE .0705 .0190 .0135 .0046 .0023 .0015 .0010
EM, with λ= 1,n= 1000
Bias .1507 .0145 .0119 .0026 .0002 .0008 .0007
ARB .1846 .1122 .0852 .0515 .0379 .0299 .0250
MSE .0635 .0188 .0131 .0047 .0023 .0014 .0010
MM, with λ= 10,n= 1000
Bias 1.957 .1473 .0160 .0288 .0343 .0287 .0194
ARB .2038 .1208 .0830 .0571 .0454 .0388 .0342
MSE 6.946 1.971 1.130 .5196 .3253 .2371 .1854
ML, with λ= 10,n= 1000
Bias 1.776 .2268 .1001 .0098 .0108 .0056 .0035
ARB .2014 .1114 .0841 .0518 .0383 .0305 .0250
MSE 7.247 1.852 1.263 .4506 .2310 .1463 .0984
EM, with λ= 10,n= 1000
Bias 1.516 .1652 .1157 .0028 .006 .0108 .0004
ARB .1857 .1112 .0858 .0516 .0386 .0304 .0248
MSE 6.408 1.842 1.324 .4662 .2380 .1448 .0973
The accuracy of estimating the parameter λis intri-
cately tied to the precision of estimating ω; when one
achieves precision, so does the other. In smaller sam-
ples (n= 100), MM notably demonstrates the poorest
performance, evidenced by higher MSE. Moreover,
EM outperforms ML when ω < 0, although ML and
EM display similar performances whenever ω > 0.
As anticipated, increasing the sample size to n = 1000 enhances estimation quality across all estimators, resulting in more comparable performances. Nonetheless, MM continues to exhibit inferior performance compared to ML and EM, albeit showing similarities when ω is negative (mainly with ML). The performances of ML and EM continue to show no significant differences for n = 1000 when ω > 0, but some differences between these estimators remain for ω < 0. Additionally, altering the parameter value (to λ = 10) appears to have minimal relative impact on estimation quality across all estimators.
Table 2: ω estimation in the PCMExp with $10^4$ replicas (the ARB rows omit the ω = .00 column, where the relative bias is undefined)
ω     −.75    −.50    −.25    .00    .25    .50    .75
MM, with λ= 1,n= 100
Bias .4312 .1705 .0733 .0434 .0496 .0300 .0103
ARB .6291 .7406 1.386 1.098 .4898 .2618
MSE .3675 .1921 .1675 .1451 .1211 .0936 .0591
ML, with λ= 1,n= 100
Bias .4621 .1948 .0580 .0055 .0071 .0066 .0008
ARB .6329 .6148 1.179 .9350 .3698 .1827
MSE .3527 .1532 .1246 .1183 .0973 .0601 .0305
EM, with λ= 1,n= 100
Bias .4191 .1698 .0274 .0091 .0228 .0011 .0037
ARB .6261 .6175 1.155 .9172 .3821 .1856
MSE .3261 .1521 .1205 .1171 .0936 .0582 .0310
MM, with λ= 1,n= 1000
Bias .2410 .0202 .0082 .0050 .0042 .0041 .0051
ARB .3265 .3422 .5408 .3496 .1608 .0991
MSE .0997 .0385 .0304 .0167 .0121 .0102 .0086
ML, with λ= 1,n= 1000
Bias .2076 .0242 .0254 .0035 .0007 .0007 .0003
ARB .3180 .3230 .5624 .2751 .1127 .0573
MSE .1018 .0382 .0375 .0146 .0075 .0050 .0029
EM, with λ= 1,n= 1000
Bias .1832 .0155 .0233 .0070 .0004 .0005 .0002
ARB .2992 .3206 .5541 .2781 .1111 .0577
MSE .0924 .0377 .0359 .0149 .0079 .0049 .0030
MM, with λ= 10,n= 1000
Bias .2405 .0193 .0060 .0043 .0065 .0049 .0034
ARB .3260 .3437 .5374 .3533 .1599 .1008
MSE .0995 .0385 .0301 .0165 .0124 .0101 .0090
ML, with λ= 10,n= 1000
Bias .2151 .0281 .0215 .0034 .0013 .0007 .0007
ARB .3242 .3178 .5456 .2782 .1117 .0575
MSE .1044 .0368 .0346 .0138 .0076 .0049 .0029
EM, with λ= 10,n= 1000
Bias .1833 .0187 .0237 .0029 .0020 .0004 .0004
ARB .3002 .3201 .5574 .2802 .1113 .0568
MSE .0922 .0372 .0364 .0143 .0079 .0048 .0027
Moreover, results tend to improve for higher values of ω, i.e., when dealing with non-convex mixtures. Conversely, for low values of ω, such as ω = −0.75, all methods tend to overestimate ω, though this overestimation diminishes with larger sample sizes (albeit remaining significant even with n = 1000). Consequently, for these ω values, estimates may still lack precision.
Fig.5: λ and ω estimation in the PCMExp with $10^4$ replicas and λ = 1 for MM (top), ML (middle) and EM (bottom).
The boxplots depicted in Figure 5 clearly illustrate that estimation precision notably increases when ω is positive. Additionally, the bias tends towards zero or its proximity, a trend notably absent when ω = −0.75. Noteworthy is the presence of outliers, indicating significantly lower estimation precision. Even employing EM, instances arise, particularly evident when ω = 0.25, where the estimate of ω nears −1 (the furthest value within the support of ω), resulting in similarly inaccurate estimates for λ (approximately 4.5 when λ = 10). It's worth noting that when (ω, λ) = (0.25, 10), E(X) = 0.1125, and likewise, when (ω, λ) = (−1, 4.(4)), E(X) remains 0.1125. Equivalently, the same expected value for X is obtained when (ω, λ) = (0, 10) and (ω, λ) = (−1, 5), or when (ω, λ) = (−0.25, 10) and (ω, λ) ≈ (−1, 5.7143); these represent some of the less precise scenarios observed in the simulations. Despite clear differences in the distribution functions in these cases, it appears that AIC occasionally struggles to select the optimal solution. Hence, it becomes pertinent to employ alternative measures of model selection or a combination of different metrics.
However, it's crucial to acknowledge that such instances of very low precision in estimation, while impacting the overall metrics presented in Tables 1 and 2, are infrequent (less than 0.5%) and predominantly occur when the estimate of ω approaches −1. Consequently, in practical applications, exercising caution and employing a broader range of initial values is advisable when encountering such cases (ω̂ ≈ −1) to ascertain the presence of significantly disparate es-
timates.
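The mean-matching pairs mentioned above follow directly from $E[X_m] = (1+\omega/2)/\lambda$, as the following one-line check confirms (Python; illustrative only):

```python
mean = lambda lam, w: (1 + w / 2) / lam                # E[X_m] for the PCMExp
pairs = [((0.25, 10.0), (-1.0, 40 / 9)),               # 4.(4) = 40/9
         ((0.00, 10.0), (-1.0, 5.0)),
         ((-0.25, 10.0), (-1.0, 40 / 7))]              # 5.7143 ~= 40/7
for (w1, l1), (w2, l2) in pairs:
    print(mean(l1, w1), mean(l2, w2))                  # each pair shares the same expected value
```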
It's worth noting that different sample sizes (n) and λ parameter values were evaluated, and the results remained consistent with those reported, although there is a slight decrease in the number of cases where the estimate becomes less precise as the sample size increases.
Additionally, while variations in the initial values were examined for ML, no noticeable differences were observed, although these results are not detailed in the tables provided. Furthermore, the EM estimator was also assessed using different starting points, such as the MM estimates, i.e., considering $(\lambda_0, \omega_0) = (\tilde\lambda_{\mathrm{MM}}, \tilde\omega_{\mathrm{MM}})$, as these are straightforward to compute. In this case, only one estimate was evaluated and, therefore, the results were slightly worse. Nevertheless, the probable reason for this proximity is that the sign of the MM estimate of ω ($\tilde\omega_{\mathrm{MM}}$) matches the true sign of ω with high probability, namely whenever |ω| ≥ 0.25. Although the probability of an opposite sign is low in those cases, when it occurs the λ estimate can be quite different. For ω values in the neighbourhood of zero, the percentage of opposite signs is higher, but in these scenarios the density functions are quite similar, so the difference in the λ estimates is not so significant. Furthermore, this percentage clearly decreases when the sample size increases, being considerably lower for n = 1000 than for n = 100.
5 Conclusion
Any PCMExp can be conceptualized as two separate
convex mixtures, delineated for positive and negative
values of ω. Hence, the final EM estimate for PCMExp
will be the best of these EM estimates obtained under
these two scenarios. Thus, this structure allows the
application of the EM algorithm to be carried out only
under convex mixtures, wherein the algorithm typi-
cally yields favourable outcomes. However, although it yields superior estimates compared to other methods previously used, such as the maximum likelihood estimator, this algorithm does not appear to yield precise estimates across the entire support of (ω, λ). Hence, we plan to incorporate additional fit measures alongside AIC to evaluate potential disparities in the obtained results and to explore alternative parameter estimation methods for cases requiring enhanced precision. In addition to other information criteria, goodness-of-fit statistics can be used to determine the best estimate among the EM estimates, such as the Kolmogorov-Smirnov, Anderson-Darling or Cramér-von Mises statistics, cf. [36], [37]. Further-
more, we aim to adopt a similar methodology to anal-
yse other PCM generated by shape-extended stable
distributions for extremes, with the goal of assessing
the suitability of this approach.
References:
[1] O’Connor, A.N., Modarres, M. Mosleh, A., Prob-
ability Distributions Used in Reliability Engi-
neering, Center for Risk and Reliability, Univer-
sity of Maryland, 2016.
[2] Elsayed, A.E., Reliability Engineering, John Wi-
ley & Sons, Inc, 2021.
[3] Hosmer, D.W., Lemeshow, S, and May, S., Ap-
plied Survival Analysis: Regression Modeling of
Time to Event Data, John Wiley & Sons, Inc.,
2008.
[4] Zeng, K., Xu, X., Zhou, P. et al., Financing the
newsvendor with vendor credit line, Oper Manag
Res, 2024. Published online. https://doi.org/
10.1007/s12063-024-00475-3
[5] Bartholomew, D., Sufficient conditions for a
mixture of exponentials to be a probability
density function, Ann. Math. Stat. 40, 1969,
pp. 2189–2194. https://doi.org/10.1214/
AOMS%2F1177697296
[6] Steutel, F., Note on the infinite divisibility of ex-
ponential mixtures, Ann. Math. Stat., 38, 1967,
pp. 1303–1305. https://doi.org/10.1214/
AOMS%2F1177698806
[7] Steutel, F., Preservation of infinite divisibil-
ity under mixing and related topics, Mathe-
matical Center Tracts 33, Mathematisch Cen-
trum, Amsterdam, 1970. https://doi.org/
10.2307/2556202
[8] Zhang, B., Zhang, C., Finite mixture mod-
els with negative components, MLDM 2005,
2005, pp. 31–41. https://link.springer.
com/chapter/10.1007/11510888_4
[9] McLachlan, G, Peel, D., Finite Mixture Models,
Wiley Series in Probability and Statistics, John
Wiley & Sons, Inc., 2000.
[10] Murphy, K.P., Machine Learning A Probabilis-
tic Perspective, Massachusetts Institute of Tech-
nology, 2012.
[11] Klüppelberg, C., Seifert, M.I., Explicit results
on conditional distributions of generalized ex-
ponential mixtures, Journal of Applied Probabil-
ity, 2020, pp. 760–774. https://doi.org/10.
1017/jpr.2020.26
[12] Yang Y., Tian W., Tong T., Generalized mix-
tures of exponential distribution and associated
inference, Mathematics, Vol.9, No.12, 2021.
https://doi.org/10.3390/math9121371
[13] Felgueiras, M., Martins, J.P., and Santos, R.,
Pseudo-convex Mixtures, AIP Conf. Proc., 1479,
2012, pp. 1125-1128. https://doi.org/10.
1063/1.4756346
[14] Santos, R., Felgueiras, M., and Martins, J.P.,
Pseudo-convex Mixtures Generated by Shape-
extended Stable Distributions for Extremes, Jour-
nal of Statistical Theory and Practice, Vol.10,
No.2, 2016, pp. 357–374. https://doi.org/
10.1080/15598608.2016.1146929
[15] Beirlant, J., Goegebeur, Y., Segers, J., Teugels,
J., Statistics of Extremes: Theory and Applica-
tions, Wiley, England, 2004.
[16] David, H.A., Nagaraja, H.N., Order Statistics,
John Wiley & Sons, New Jersey, 2003.
[17] De Haan, L., Ferreira, A., Extreme Value The-
ory: An Introduction, Springer, New York, 2006.
[18] Kotz, S., Nadarajah, S., Extreme Value Distribu-
tions: Theory and Applications, Imperial College
Press, London, 2000.
[19] Pickands, J., Statistical inference using ex-
treme order statistics, Ann. Stat. 3, 1975, pp.
119–131. https://doi.org/10.1214/AOS%
2F1176343003
[20] Reiss, R.D., Thomas, M., Statistical Analysis of
Extreme Values, with Application to Insurance,
Finance, Hydrology and Other Fields 3rd edition,
Birkhäuser Verlag, 2007.
[21] Beirlant, J., Caeiro, F., Gomes, M.I., An
overview and open research topics in statistics
of univariate extremes, Revstat 10, 2012, pp.
1–31. https://doi.org/10.57805/revstat.
v10i1.109
[22] Castillo, E., Hadi, A.S., Balakrishnan, N., Sara-
bia, J.M., Extreme Value and Related Models with
Applications in Engineering and Science, John
Wiley & Sons, Hoboken, New Jersey, 2005.
[23] Embrechts, P., Klüppelberg, C., Mikosch, T.,
Modelling Extremal Events for Insurance and Fi-
nance, Springer, Berlin, 2001.
[24] Ferreira, M., Clustering of extreme values:
estimation and application, AStA Advances
in Statistical Analysis, Vol.108, No.1, 2024,
pp. 101–125. https://doi.org/10.1007/
s10182-023-00474-y
[25] Albrecher, H., Beirlant, J., Statistics of Extremes
for the Insurance Industry, In book:Handbook of
Statistics of Extremes, Chapman & Hall, 2024.
[26] Dey, D.K., Yan, J., Extreme Value Modeling and
Risk Analysis, Chapman and Hall/CRC, 2016.
[27] Longin, F., Extreme Events in Finance: A Hand-
book of Extreme Value Theory and its Applica-
tions, John Wiley & Sons, Inc, 2016.
[28] Zhao X., Zhang Z., Cheng W., Zhang P., A
New Parameter Estimator for the Generalized
Pareto Distribution under the Peaks over Thresh-
old Framework. Mathematics, Vol.7, No.5, 2019.
https://doi.org/10.3390/math7050406
[29] Wu, J.W., Characterizations of generalized mix-
tures of geometric and exponential distribu-
tions based on upper record values, Stat. Pap.,
Vol42, 2001, pp. 123–133. https://doi.org/
10.1007/s003620000045
[30] Das, S., Kundu, D., On Weighted Ex-
ponential Distribution and Its Length
Biased Version, J Indian Soc Probab
Stat, Vol17, 2016, pp. 57–77. https:
//doi.org/10.1007/s41096-016-0001-9
[31] Dempster, A.P., Laird, N.M., Rubin, D.B., Max-
imum Likelihood from Incomplete Data via the
EM Algorithm, Journal of the Royal Statistical
Society. Series B (Methodological), Vol.39, No.1,
1977, pp. 1–38. https://doi.org/10.1111/
j.2517-6161.1977.tb01600.x
[32] Frühwirth-Schnatter, S., Finite Mixture and
Markov Switching Models, Springer, New York,
2012.
[33] Akaike, H., Factor analysis and AIC, Psychome-
trika 52, 1987, pp. 317–332. https://doi.org/
10.1007/BF02294359
[34] R Core Team, R: A Language and Environ-
ment for Statistical Computing, R Foundation
for Statistical Computing, 2023. https://www.
R-project.org/
[35] Toomet, O., Henningsen, A., Graves, S.,
Croissant, Y., Hugh-Jones, D., Scrucca, L.,
Package 'maxLik': Maximum Likelihood Esti-
mation and Related Tools, 2022. https://cran.
r-project.org/web/packages/maxLik/
[36] Dickhaus, T., Goodness-of-Fit Tests. In: Theory
of Nonparametric Tests, Springer, 2018. https:
//doi.org/10.1007/978-3-319-76315-6_3
[37] Darling, D.A., The Kolmogorov-Smirnov,
Cramer-von Mises Tests, The Annals of Math-
ematical Statistics, Vol. 28, No. 4, 1957, pp.
823–38. https://doi.org/10.1214/AOMS%
2F1177706788
Contribution of Individual Authors to the
Creation of a Scientific Article (Ghostwriting
Policy)
The authors equally contributed in the present re-
search, at all stages from the formulation of the prob-
lem to the final findings and solution.
Sources of Funding for Research Presented in a
Scientific Article or Scientific Article Itself
This work is partially financed by national funds
through FCT Fundação para a Ciência e a Tec-
nologia under the project UIDB/00006/2020.
DOI: 10.54499/UIDB/00006/2020 (https:
//doi.org/10.54499/UIDB/00006/2020)
Conflicts of Interest
The authors have no conflicts of interest to
declare that are relevant to the content of this
article.
Creative Commons Attribution License 4.0
(Attribution 4.0 International , CC BY 4.0)
This article is published under the terms of the
Creative Commons Attribution License 4.0
https://creativecommons.org/licenses/by/4.0/deed.en
_US