Parametric landmark estimation of the transition probabilities in

survival data with multiple events

GUSTAVO SOUTINHO

EPIUnit

University of Porto

Rua das Taipas 135, 4050-600 Porto

PORTUGAL

LUÍS MEIRA-MACHADO

University of Minho

Centre of Mathematics

Campus de Azurém, 4800-058 Guimarães

PORTUGAL

Abstract: Multi-state models are a useful tool for analyzing survival data with multiple events. The transition

probabilities play an important role in these models since they allow for long-term predictions of the process in

a simple and summarized manner. Recent papers have used the idea of subsampling to estimate these quantities,

providing estimators with superior performance in the case of strong violations of the Markov condition. Sub-

sampling, also referred to as landmarking, leads to small sample sizes and usually heavily censored data, which

leads to estimators with higher variability. Here, we use the flexibility of the generalized gamma distribution com-

bined with the same idea of subsampling to obtain estimators free of the Markov condition with less variability.

Simulation studies show the good small sample properties of the proposed estimators. The proposed methods are

illustrated using real data.

Key-Words: Multi-state models, Transition probabilities, Generalized gamma distribution, Landmark approach

Received: June 6, 2021. Revised: March 12, 2022. Accepted: April 2, 2022. Published: May 3, 2022.

1 Introduction

Multi-state models are models for a stochastic pro-

cess that at any time occupies one of a set of discrete

states [1, 2, 3, 4]. These models provide a relevant

modeling framework to deal with complex survival

data in which individuals may experience more than

one event. A multi-state model can be represented

schematically by diagrams with boxes representing

the states and arrows representing the possible transi-

tions. The complexity of a multi-state model greatly

depends on the number of states and the transitions

allowed between these states.

Among the examples, the simplest case is the mor-

tality model for survival data, which involves only

two states and one transition. The competing risks

model [5, 6] can be seen as an extension of the mor-

tality model, considering that a subject may reach the

ultimate state due to any of several causes. The ir-

reversible illness-death or disability model is a spe-

cial case of multi-state model, commonly used in the

literature to introduce the theoretical background of

multi-state models, in which the individuals may pass

from the initial state (State 1) to the intermediate state

(State 2) and then to the absorbing state (State 3) (Fig-

ure 1). Individuals are at risk of death in each transient

state (States 1 and 2).

One important goal in the analysis of multi-state

survival data is to assess the relationship of explana-

tory variables to the several event times. Another pri-

mary goal is to obtain predictions of the clinical prog-

nosis of a patient at a certain point in his or her illness

process. The main assumption implied in both cases is

the Markov assumption, which claims that given the

present state, the future evolution of the process is in-

dependent of the states previously visited and the tran-

sition times among them. Traditionally, the transition

probabilities are estimated using the Aalen-Johansen

estimator [7] that gives consistent estimators of the

transition probabilities when the multi-state model is

Markov. When the multi-state model is non-Markov,

this is no longer the case. Recently, [8] and [9] in-

troduced alternative estimators based on subsampling

(also known as landmarking) that are consistent re-

gardless of the Markov assumption. Both methods

are derived from a subset of the data consisting of

all subjects observed to be in the given state at the

given time. The first paper, by [8], uses a procedure

based on (differences between) Kaplan–Meier esti-

mators, whereas the second, by [9] used the Aalen-

Johansen estimate of the state occupation probabili-

ties derived from the same subset. Since the compu-

tation of the transition probabilities of the two meth-

ods is based on reduced data, they will lead to esti-

mators with higher variability. To overcome this is-

sue, parametric estimation based on the same specific

portions of data may be used to provide estimators

with less variability that are free of the Markov con-

dition. In some situations, fully-parametric modeling

can be more convenient to model complex data struc-

tures and processes [10, 11]. In this paper, we use

WSEAS TRANSACTIONS on MATHEMATICS

DOI: 10.37394/23206.2022.21.27

Gustavo Soutinho, Luís Meira-Μachado

E-ISSN: 2224-2880

207

Volume 21, 2022

the flexibility of the generalized gamma distribution

combined with the same idea of subsampling to ob-

tain estimators that are free of the Markov condition

with less variability.

The following section provides some notation and

introduces the theoretical background associated with

the landmark estimators and the proposed estimators

based on the generalized gamma distribution. In sec-

tion 3, these methods are compared through simula-

tions. The application to a real data set on colon can-

cer is presented in Section 4. Finally, the main con-

clusions of this paper are reported in Section 5.

2 Models and methods

2.1 Notation

In the following, we will devote our attention to the

particular case of the illness–death model that in-

volves three different states and three possible tran-

sitions among them, since, as mentioned previously,

this model plays an important role in the theory of

multi-state models. The process of the illness-death

model is fully characterized by the vector of times

(Z, T ), where Zdenotes the sojourn time in the ini-

tial state (State 1) and Tthe total survival time (i.e.,

the absorption time). Note that for those individu-

als undergoing a direct transition from State 1 to the

absorbing State 3 we have Z=T. On the other

hand, Z < T indicates that the individual visits the

intermediate State 2 at some time. We assume that

right-censoring issues may appear due to time lim-

itations in the follow-up, lost follow-up cases, etc.

This censoring is modeled by considering a censor-

ing variable, C, which we assume to be indepen-

dent of the process. Under censoring, only the cen-

sored versions of Zand T, along with their corre-

sponding censoring indicators, are available. Thus,

we define e

Z=min(Z, C)and e

T=min(T, C)

for the censored versions of Zand Tand introduce

∆1=I(Z≤C)and ∆ = I(T≤C)for the respec-

tive censoring indicators of Zand T. The variables

Zand e

T23 =e

T−e

Zare the observed sojourn times

in states 1 and 2, respectively. Finally, the available

data are (e

Zi,e

Ti,∆1i,∆i),1≤i≤n, i.i.d. copies of

Z, e

T , ∆1,∆).

2.2 Nonparametric landmark estimators of

the transition probabilities

The target is each of the five transition probabili-

ties phj (s, t) = P(X(t) = j|X(s) = h), where

1≤h≤j≤3and sand tare two pre-specified

time points with s < t. As noted by [8], these transi-

tion probabilities are functions involving Zand Tas

follow:



State1State2

State3

Figure 1: The progressive illness-death model.

p11(s, t) = P(Z > t |Z > s),

p12(s, t) = P(Z≤t, T > t |Z > s),

p13(s, t) = P(T≤t|Z > s),

p22(s, t) = P(Z≤t, T > t |Z≤s, T > s),

p23(s, t) = P(T≤t|Z≤s, T > s).

Two obvious relations stand out: p11(s, t) +

p12(s, t) +p13(s, t) = 1 and p22(s, t)+p23(s, t) = 1,

revealing that in practice one only need to estimate

three of these quantities.

When the multi-state model is Markovian, the

Aalen-Johansen estimator [7] gives consistent estima-

tors of the transition probabilities. Explicit formulae

of the Aalen-Johansen estimator for the illness-death

model are available [13]. When the multi-state model

is non-Markov, this is no longer the case. In these

cases, the Aalen-Johansen estimator may introduce

some bias, and therefore, they may be inappropriate.

Estimators that do not rely on the Markov assump-

tion were first introduced by [14]. The proposed esti-

mators were defined in terms of multivariate Kaplan-

Meier integrals and proved to be more efficient than

the Aalen-Johansen estimators in the case of strong

violations of the Markov assumption. However, this

proposal has the drawback of requiring that the sup-

port of the censoring distribution contain the support

of the lifetime distribution, otherwise they only re-

port valid estimators for truncated transition proba-

bilities. Recently, [8] recovered the work by [14]

and proposed alternative estimators that are consis-

tent regardless of the Markov condition and the re-

ferred assumption on the censoring support. The idea

of the new methods is to use a procedure based on

differences between Kaplan-Meier estimators derived

from a subset of the data consisting of all subjects ob-

served to be in a given state at a given time. Fol-

lowing equations above, given the time point s, to

estimate p1j(s, t)for j= 1,2,3the landmark anal-

ysis is restricted to the individuals observed in State

1 at time s. For the subpopulation Z > s, the cen-

soring time Cis still independent of the pair (Z, T )

WSEAS TRANSACTIONS on MATHEMATICS

DOI: 10.37394/23206.2022.21.27

Gustavo Soutinho, Luís Meira-Μachado

E-ISSN: 2224-2880

208

Volume 21, 2022

and, therefore, Kaplan-Meier-based estimation will

be consistent. Similarly, to estimate p2j(s, t),j=

2,3, the landmark analysis proceeds from the sample

restricted to the individuals observed in State 2 at time

s. Then, we may formally introduce the landmark es-

timators as follows:

bpLM

11 (s, t) = b

SKM(s)

1(t),

bpLM

12 (s, t) = b

SKM(s)(t)−b

SKM(s)

1(t),

bpLM

13 (s, t) = 1 −b

SKM(s)(t),

bpLM

22 (s, t) = b

SKM[s](t),

bpLM

23 (s, t) = 1 −b

SKM[s](t)

where b

SKM(s)

1and b

SKM(s)are the Kaplan-Meier es-

timators for the distributions of Zand T, respec-

tively, but computed from the subsample S1=

ni:e

Zi> so; whereas b

SKM[s]is the Kaplan-Meier es-

timator of the distribution of Tbut computed from the

subsample S2=ni:e

Zi≤s < e

Tio.

2.3 Parametric estimation

Nonparametric estimators of the transition probabili-

ties based on landmarking were introduced in the pre-

vious section. However, methods based on a fully-

specified model can be more convenient for repre-

senting complex data structures and processes [10].

This is the case for hazard functions that may vary

predictably due to interval censoring, frailties, or mul-

tiple responses. In fact, the use of parametric mod-

els is an attractive option because standard methods,

such as maximum likelihood, are available for param-

eter estimation and testing, and a complete descrip-

tion of the hazard function can be obtained [15]. [16]

also refers an example in biomedical application for

which the mean survival needs the survival function

S(t)to be fully-specified for all times t, and paramet-

ric models that combine data from multiple time peri-

ods should be used. Nevertheless, since the standard

parametric distribution has hazard functions that can

totally change shape according to the values of the pa-

rameters, the choice of the most suitable model from

the available families of distributions can be problem-

atic, and the results of the inference may be affected

[17]. The use of parametric models may also improve

the accuracy of the transition probabilities by reduc-

ing the high variability of the landmark estimators for

cases with small sample sizes or high censoring lev-

els.

The generalized gamma (GG) distribution is a very

flexible distribution that generalizes several impor-

tant and common parametric models that have wide

applications in lifetime data analysis. Among these

are the exponential, Weibull, log-normal, and two-

parameter gamma distributions. In fact, due to the

flexible variety of shapes of its hazard function, which

includes all four of the most common types of haz-

ard function (monotonically increasing and decreas-

ing, as well as bathtub and arc-shaped hazards), the

GG distribution can easily be used in different contexts

for modelling lifetime models [18]. Due to its impor-

tance, the following survival and density functions for

the GG distribution are introduced. Further details on

other special distributions can be found in [18].

2.3.1 The generalized gamma distribution

The generalized gamma distribution (GG), introduced

by [19], is a three-parameter distribution with the fol-

lowing probability density function

f(t|β > 0, α > 0, k > 0) =

=β

Γ (k)·tβk−1

αβk ·exp h−(t/α)βi, t > 0,(1)

This distribution is a generalization of the two-

parameter gamma distribution when β= 1. The

widely used exponential (β=k= 1) and weibull

(k= 1), are also special cases of the GG distribution.

[20] considers a new reparametrization of the GG dis-

tribution with location (µ=log(α)), scale (σ=β−1)

and shape (λ= 1/k2) that allows to extend the origi-

nal one to include a further class of models with λ <

0. Specifically, let Γ (t;γ) = Rt

0xγ−1e−xdx/Γ(γ)

the gamma distribution with mean and variance equal

to γ > 0. The survival function for GG(µ, σ, λ), if

λ > 0, is given by

S(t) = 1 −Γλ−2exp (λ[log(t)−µ] /σ) ; λ−2(2)

= 1 −Γhλ−2e−µtλ/σ;λ−2i(3)

and, if λ < 0, replacing z= (log(t)−µ)/σby −z,

take the following form

S(t) = Γ hλ−2e−µtλ/σ;λ−2i(4)

Finally, the probability density function is

f(t) = |λ|

σtΓ (λ−2)×

×hλ−2e−µtλ/σiλ−2

exp h−λ−2e−µtλ/σi(5)

and the hazard function, for λ > 0, is given by

WSEAS TRANSACTIONS on MATHEMATICS

DOI: 10.37394/23206.2022.21.27

Gustavo Soutinho, Luís Meira-Μachado

E-ISSN: 2224-2880

209

Volume 21, 2022

h(t) = f(t)

S(t)=|λ|

σtΓ (λ−2)hλ−2e−µtλ/σiλ−2

exp h−λ−2e−µtλ/σi×

1−Γhλ−2e−µtλ/σ;λ−2i−1

(6)

and for λ < 0, has the form

h(t) = |λ|

σtΓ (λ−2)hλ−2e−µtλ/σiλ−2

exp h−λ−2e−µtλ/σi×

Γhλ−2e−µtλ/σ;λ−2i−1

(7)

Figures 2, 3 and 4 plot the GG density, survival and

hazard functions for different choices of the parame-

ters µ,σand λ.

In case of λ=σthe GG distribution becomes the

two-parameter gamma distribution G(µ, σ). The den-

sity for the weibull distribution can be obtained with

λ= 1 and the exponential distribution corresponds to

the case λ=σ= 1. Another special case is when

λ= 0 which correspond to the log-normal distribu-

tion, with S(t) = 1 −Φlog (e−µt)1/σ, where Φ(·)

is the standard normal cumulative density function.

Therefore, the GG family contains almost all the most

commonly used parametric distributions [16, 17].

2.3.2 Maximum likelihood estimation

Let’s consider the probability density function of a

general parameter model, for an event at time t,

given by f(t|µ, α),t≥0, where µcorrespond

to the mean or location parameter and the vector

α= (α1,· · · , αR)represents other parameters such

as shape, variance or higher moments. A standard

approach for the estimation of parameters of a fully

parametric model is to use the likelihood function.

Thus, let ti,i= 1,…, n be a sample of times from

individuals i, with ci= 1 if tiis an observed time-to-

event, or ci= 0 if this is censored. Also let sibe cor-

responding left-truncation (or delayed-entry) times,

meaning that the ith survival time is only observed

conditionally on the individual’s having survived up

to si, thus si= 0 if there is no left-truncation. In the

case of right-censoring, the likelihood of the proba-

bility of density function for the previous parameters

θ= (µ, α)is [16]:

l(θ|t, c, s) = Qi:ci=1 fi(ti|µ, α)Qi:ci=0 Si(ti|µ, α)

QiSi(si)(8)

0.0 0.5 1.0 1.5 2.0 2.5 3.0

0123456

f(t)

µ = 0.2 ,σ = 0.5 ,λ = 2

µ = 0.2 ,σ = 2 ,λ = 2

µ = 0.4 ,σ = 0.3 ,λ = 0.5

µ = 0.4 ,σ = 0.3 ,λ = −2

Figure 2: Plots of the GG density for some values of

µ,σand λparameters.

In the case of the GG distribution, the likelihood func-

tion is obtained by replacing, respectively, the density

and survival functions by the expressions (5) and (3

or 4, according to the value of λ). The log-likelihood

function is given from (6) as follow [16]:

log l(θ|t, c, s) = X

i:ci=1

{log (hi(ti)−Hi(ti))} −

−X

i:ci=0

Hi(ti) + X

Hi(si)(9)

where Hdenote the cumulative hazard function.

The parameters can be then estimated by maximiz-

ing the log-likelihood function with respect to θusing

the maximum likelihood estimation (MLE) method.

If the likelihood function is differentiable, the deriva-

tive test for determining the maximum can be applied.

In some cases, the first-order conditions of the like-

lihood function can be solved explicitly. However,

in most cases, numerical methods will be necessary

to find the maximum of the likelihood function [18].

In this paper, the maximization was done using the

flexsurvreg package, a software application for R,

through the optimization methods available in the op-

tim function.

2.3.3 Parametric estimators of the transition

probabilities

The parametric landmark estimators are based on the

landmark ideas as used in (1). This estimator can be

WSEAS TRANSACTIONS on MATHEMATICS

DOI: 10.37394/23206.2022.21.27

Gustavo Soutinho, Luís Meira-Μachado

E-ISSN: 2224-2880

210

Volume 21, 2022

0.0 0.5 1.0 1.5 2.0 2.5 3.0

0.0 0.2 0.4 0.6 0.8 1.0

S(t)

µ = 0.2 ,σ = 0.5 ,λ = 2

µ = 0.2 ,σ = 2 ,λ = 2

µ = 0.4 ,σ = 0.3 ,λ = 0.5

µ = 0.4 ,σ = 0.3 ,λ = −2

Figure 3: Plots of the GG survival functions for some

values of µ,σand λparameters.

obtained by applying the procedure based on (differ-

ences between) parametric survival estimators intro-

duced in Section 2.3.1 to the same specific portions

of data.

3 Simulation studies

In this section, we report the results from simula-

tion in order to evaluate the performance of the pro-

posed parametric estimators. In particular, we com-

pare the estimator described in Section 2.3, using the

GG distribution (labeled GG, to the Landmark estima-

tor introduced by [8], labeled LM). Data were simu-

lated to represent a progressive illness-death model,

assuming that all individuals begin in the initial state

(State 1) at time t= 0. The movement of the

individuals was defined according to the scenario

described by [21] and [22] in which the subjects

may pass through the intermediate state (State 2)

at some time or go directly to the absorbing state

(State 3). For the individuals who visit the inter-

mediate state, we generated replicates of (Z, T −

Z)through the bivariate distribution F12(x, y) =

F1(x)F2(y) [1 + {1−F1(x)} {1−F2(y)}]with ex-

ponential marginal distribution functions with rate pa-

rameter 1. For individuals that experience a direct

transition to the absorbing state (Z=T), the value of

Zis simulated according to an exponential with rate

parameter 1. Random censoring times were indepen-

dently generated from uniform distributions U[0,3]

and U[0,4] to introduce heavier or lighter censor-

ing. To be more specific, in the case of the first sce-

nario, the censoring proportion in the first gap time

0.0 0.5 1.0 1.5 2.0 2.5 3.0

0 10 20 30 40

f(t)

µ = 0.2 ,σ = 0.5 ,λ = 2

µ = 0.2 ,σ = 2 ,λ = 2

µ = 0.4 ,σ = 0.3 ,λ = 0.5

µ = 0.4 ,σ = 0.3 ,λ = −2

Figure 4: Plots of the GG hazard functions for some

values of µ,σand λparameters.

is 32% while for the second gap time, this percentage

increases to 57%. The second model decreases these

censoring levels to 24% and 47%, respectively. For

each simulated scenario, we fixed the values of (s, t)

pairs for four different points, corresponding to com-

binations of the percentiles of 20%, 40%, 60% and

80% of the exponential marginal distribution func-

tions with rate parameter 1 to summarize the results of

the transition probabilities. In each simulation, 1000

samples were generated with sample sizes of 50, 100,

and 250. From these samples, we obtained the mean

for all the generated data sets. As a measure of effi-

ciency, we took the Mean Squared Error (MSE) but

we also computed the standard deviations (SD) and

the Bias for each point (s, t).

Tables 1 and 2 show the results of the LM and

GG estimators for the transition probabilities bp12(s, t),

bp13(s, t)and bp23(s, t)from the simulation studies de-

scribed previously. The parametric estimators, la-

beled as GG, were implemented by fitting a general-

ized gamma distribution to the multi-state model un-

der the landmark approach through the flexsurvreg

function of the flexsurv of the Rpackage [16]. Re-

sults reveal that the accuracy of the estimators for all

transition probabilities decreases as sand tgrow due

to the censoring effects that are higher at the right tail

of the lifetime distribution. As would be expected,

the SD decreases as the sample sizes increase, and

with a decrease in the censoring percentage. From

the results we can observe that the performance of

the proposed parametric estimators is better than the

nonparametric counterpart for cases with higher lag

WSEAS TRANSACTIONS on MATHEMATICS

DOI: 10.37394/23206.2022.21.27

Gustavo Soutinho, Luís Meira-Μachado

E-ISSN: 2224-2880

211

Volume 21, 2022

times t−s, and consequently can be an alternative in

case with small sample sizes when comparing with

the LM estimator. This can be observed by the rel-

ative efficiency that is represented by the ratio be-

tween the corresponding MSEs of the nonparametric

and parametric estimators. For the transition proba-

bilities bp12(s, t)and bp23(s, t)the results are quite sim-

ilar, revealing the better accuracy of the parametric

estimator as the difference between times increases.

In the case of bp13(s, t), for almost all transition times

the MSE ratio is above 1, confirming the advantages

of the parametric estimator compared to the LM esti-

mator. For completeness purposes, we show in Fig-

ures 5 and 6 the boxplots of the estimates of the tran-

sition probabilities based on the 1000 Monte Carlo

replicates for the two sets of estimators, with differ-

ent sample sizes. The boxplots shown in these figures

reveal some results which agree with our findings re-

ported in Tables 1 and 2. From these plots, it can be

seen that all methods have small biases and confirm

the lower variability of the parametric estimators in

some cases. As expected, the lower variability of the

parametric estimator comes at the cost of a small in-

crease in the bias. This is more clear when the dif-

ference t−sis small or for small sample sizes, and

consequently, in these cases, the LM nonparametric es-

timator may be preferable.

4 Example of application

In this section, we analyse the results of the applica-

tion of the proposed methods using data from a colon

cancer cancer study. This is a study from a large clin-

ical trial on Duke’s stage III patients [24] that focuses

on 929 patients who were followed after suffering cu-

rative surgery for colorectal cancer until death or cen-

soring. Of the initial sample, 423 remained alive dur-

ing the course of the follow-up period. 468 of the pa-

tients had a recurrence, and among these, 414 ended

up dying. Finally, 38 individuals died without expe-

riencing a recurrence.

Figure 7 reports, for each row, the estimated tran-

sition probabilities, bp11(s, t),bp12(s, t),bp13(s, t)and

bp23(s, t), for fixed values of s= 365 (left hand side)

and s= 1095 (right hand side). Each plot shows the

estimated curves based on the parametric GG (black

line) and the LM estimators (red line). Plots reported

in the first row report the estimated transition prob-

abilities for p11(s, t), for fixed values s= 365 and

s= 1095 (days), along with time t. The estimated

curves report the survival fraction over time among

the individuals ‘alive and without recurrence’. Plots

in the second row allow for an inspection over time of

the probability of being alive with recurrence for in-

dividuals who are disease free 1 year (left hand side)

and 3 years (right hand side) after surgery. Since the

recurrence state is transient, this curve first increases

and then decreases. It is also evident that the probabil-

ities for s= 1095 are smaller than those for s= 365.

Finally, plots shown in the third and fourth rows report

the estimated transition probabilities for p13(s, t)and

p23(s, t). They report one minus the survival fraction

along time, among the individuals ‘alive and without

recurrence’ and those ‘alive and with recurrence’, re-

spectively. Curves depicted in Figure 7 reveal that the

parametric landmark estimators based on the general-

ized gamma distribution provide, in all cases, curves

with the expected behavior, similar to those obtained

from the nonparametric landmark estimators but with

less variability.

5 Conclusion

In this article, the relative performance of the pro-

posed parametric landmark estimators for the tran-

sition probabilities was investigated through simula-

tions. Attained results suggest that the use of the gen-

eralized gamma distribution combined with the idea

of subsampling can lead to competitive estimators

that may outperform the original landmark estimators

in some cases, providing estimators with less variabil-

ity. It is worth mentioning that, in some cases, para-

metric estimation can introduce some bias in estima-

tion while reducing the variance. The risk of intro-

ducing a large bias can be controlled in practice by

comparing the obtained estimated curves with those

obtained by the nonparametric estimator.

Acknowledgements: This research was financed by

Portuguese Funds through FCT “Fundação para a

Ciência e a Tecnologia”, within the research grants

PD/BD/142887/2018 and UID/BIA/04050/2019.

References:

[1] Andersen PK, Borgan O, Gill RD, Keiding N.

Statistical Models Based on Counting Processes.

Springer-Verlag, New York; 1993.

[2] Hougaard P. Analysis of multivariate survival

data. Springer-Verlag. New York; 2000.

[3] Meira-Machado L, de Uña-Álvarez J, Cadarso-

Suárez C, Andersen PK. Multi-state models

for the analysis of time to event data. Statis-

tical Methods in Medical Research; 2009; 18:

195–222.

[4] Meira-Machado L, Sestelo M. Estimation in

the progressive illness-death model: a nonex-

haustive review. Biometrical Journal; 2019; 61:

245–263.

[5] Andersen PK, Keiding N. Multi-state models

for event history analysis. Statistical Methods in

Medical Research; 2002; 11: 91–115.

WSEAS TRANSACTIONS on MATHEMATICS

DOI: 10.37394/23206.2022.21.27

Gustavo Soutinho, Luís Meira-Μachado

E-ISSN: 2224-2880

212

Volume 21, 2022

[6] Putter H, Fiocco M, Geskus RB. Tutorial in

biostatistics: Competing risks and multi-state

models. Statistics in Medicine; 2007: 26,

2389–2430.

[7] Aalen OO, Johansen S. An empirical transition

matrix for non homogeneous Markov and chains

based on censored observations. Scandinavian

Journal of Statistics; 1978; 5: 141–150.

[8] de Uña-Álvarez J, Meira-Machado L. Nonpara-

metric estimation of transition probabilities in

the non-Markov illness-death model: A compar-

ative study. Biometrics; 2015; 71: 141–150

[9] Putter H, Spitoni C. Non-parametric estimation

of transition probabilities in non-Markov multi-

state models: The landmark Aalen-Johansen

estimator. Statistical Methods in Medical Re-

search; 2018; 27(7): 2081–2092.

[10] Aalen O, Borgan O, Gjessing H. Survival and

Event History Analysis: A Process Point of

View. Statistics. Springer-Verlag, New York.

2008.

[11] Benaglia T, Jackson C, Sharples L. Survival Ex-

trapolation in the Presence of Cause Specific

Hazards. Statistics in Medicine; 2015. 34 (5):

796–811.

[12] Datta S, Satten GA. Validity of the Aalen-

Johansen estimators of stage occupation prob-

abilities and Nelson Aalen integrated transition

hazards for non-Markov models. Statistics &

Probability Letters; 2001; 55: 403–411.

[13] Borgan O. Aalen-Johansen Estimator. Encyclo-

pedia of Biostatistics; 1988; 1: 5–10.

[14] Meira-Machado L,; de Uña-Álvarez J, Cadarso-

Suárez C. Nonparametric estimation of tran-

sition probabilities in a non-Markov illness-

death model. Lifetime Data Analysis; 2006; 12:

325–344.

[15] Wei LJ. The accelerated failure time model: a

useful alternative to the Cox regression model in

survival analysis. Statistics in Medicine; 1992;

11: 1871–1879.

[16] Jackson C. flexsurv: A Platform for Paramet-

ric Survival Modeling in R. Journal of Statistical

Software; 2016; 70 (8): 1–33.

[17] Cox C, Chu H, Schneider MF, Muñoz A. Para-

metric survival analysis and taxonomy of haz-

ard functions for the generalized gamma dis-

tribution. Statistics in Medicine; 2007; 26:

4352–4374.

[18] Kiche J, Oscar N, Orwa G. On Generalized

Gamma Distribution and Its Application to Sur-

vival Data. International Journal of Statistics and

Probability; 2019; 8 (5).

[19] Stacy AM. A Generalization of the Gamma

Distribution. Annals of Mathematical Statistics;

1962; 33 (3): 1187–1192.

[20] Prentice RL. A Log Gamma Model and Its Max-

imum Likelihood Estimation. Biometrika; 1974;

61 (3): 539–544.

[21] Moreira A, de Uña-Álvarez J, Meira-

Machado L. Presmoothing the Aalen-Johansen

estimator in the illness-death model. Electronic

Journal of Statistics; 2013; 7: 1491–1516.

[22] Araújo A, Meira-Machado L, Roca-Pardiñas J.

TPmsm: Estimation of the transition probabili-

ties in 3-state models. Journal of Statistical Soft-

ware; 2014; 62(4), 1–29.

[23] Soutinho G, Meira-Machado L. Estimation of

the Transition Probabilities in Multi-state Sur-

vival Data: New Developments and Practical

Recommendations. WSEAS TRANSACTIONS

on MATHEMATICS; 2020; 19, 353–366.

[24] Moertel CG, Fleming TR, Macdonald JS,

Haller DG, Laurie JA, Tangen CM, Ungerlei-

der JS, Emerson WA, Tormey DC, Glick JH,

Veeder MH, Mailliard JA. Fluorouracil Plus

Levamisole as Effective Adjuvant Therapy af-

ter Resection of Stage III. Colon Carcinoma: A

Final Report, The Annals of Internal Medicine;

1995; 122 (5): 321–326.

Contribution of individual authors to

the creation of a scientific article

(Ghostwriting Policy)

Gustavo Soutinho and Luís Meira-Machado con-

tributed equally to this work in terms of writing of

the paper, methodological developments, and imple-

mentation of computer codes to obtain results from

simulation studies and a real data set.

Creative Commons Attribution

License 4.0 (Attribution 4.0

International , CC BY 4.0)

This article is published under the terms of the

Creative Commons Attribution License 4.0

https://creativecommons.org/li-

censes/by/4.0/deed.en_US

WSEAS TRANSACTIONS on MATHEMATICS

DOI: 10.37394/23206.2022.21.27

Gustavo Soutinho, Luís Meira-Μachado

E-ISSN: 2224-2880

213

Volume 21, 2022

Table 2: Bias and standard deviation (SD) for estimators of p23(s, t).

The relative MSEs are also given. Scenario 1: illness-death model with correlated

exponential gap times.

bpLM

23(s, t)bpGG

23(s, t)Relative MSE bpLM

23(s, t)bpGG

23(s, t)Relative MSE

bias (SD) bias (SD) LM/GG bias (SD) bias (SD) LM/GG

(s,t)=(0.2231,0.5108) (s,t)=(0.5108,0.9163)

n C

50 U[0,4] -0.0085 (0.2378) -0.0340 (0.2640) 0.7991 -0.0053 (0.1969) -0.0093 (0.2238) 0.7734

U[0,3] -0.0130 (0.2509) -0.0320 (0.2782) 0.8051 -0.0010 (0.1961) 0.0065 (0.2248) 0.7607

100 U[0,4] 0.0029 (0.1678) -0.0022 (0.2080) 0.6507 -0.0034 (0.1343) 0.0053 (0.1468) 0.8367

U[0,3] -0.0114 (0.1653) -0.0253 (0.1932) 0.7237 0.0013 (0.1363) 0.0053 (0.1418) 0.9223

250 U[0,4] 0.0020 (0.0976) 0.0018 (0.1091) 0.7997 -0.0019 (0.0829) 0.0177 (0.0813) 0.9926

U[0,3] 0.0012 (0.1036) 0.0042 (0.1107) 0.8740 0.0015 (0.0835) 0.0199 (0.0812) 0.9967

(s,t)=(0.2231,1.6094) (s,t)=(0.5108,1.6094)

n C

50 U[0,4] -0.0112 (0.1676) 0.0486 (0.1189) 1.7101 -0.0022 (0.1835) 0.0622 (0.1790) 0.9376

U[0,3] -0.0395 (0.1920) 0.0402 (0.1478) 1.6382 -0.0090 (0.1912) 0.0700 (0.1780) 1.0021

100 U[0,4] -0.0064 (0.1190) 0.0328 (0.0878) 1.6177 0.0023 (0.1188) 0.0482 (0.1151) 0.9070

U[0,3] -0.0104 0.1290 0.0282 (0.0993) 1.5693 0.0085 (0.1284) 0.0544 (0.1208) 0.9426

250 U[0,4] 0.0025 (0.0723) 0.0016 (0.0578) 1.5665 -0.0010 (0.0738) 0.0136 (0.0619) 1.3597

U[0,3] -0-0038 (0.0790) -0.0008 (0.0626) 1.5985 0.0016 (0.0798) 0.0094 (0.0713) 1.2329

Table 1: Bias and standard deviation (SD) for estimators of p12(s, t)and p13(s, t).

The relative MSEs are also given. Scenario 1: illness-death model with

correlated exponential gap times.

bpLM

12(s, t)bpGG

12(s, t)Relative MSE bpLM

13(s, t)bpGG

13(s, t)Relative MSE

bias (SD) bias (SD) LM/GG bias (SD) bias (SD) LM/GG

(s,t)=(0.2231,0.5108)

n C

50 U[0,4] 0.0023 (0.0593) 0.0021 (0.0632) 0.8812 -0.0009 (0.0507) -0.0150 (0.0437) 1.2049

U[0,3] 0.0022 (0.0582) 0.0032 (NA) 0.7368 -0.0005 (0.0529) -0.0105 (0.0470) 1.2055

100 U[0,4] 0.0013 (0.0404) 0.0009 (0.0407) 0.9884 0.0011 (0.0352) -0.0122 (0.0287) 1.2781

U[0,3] 0.0000 (0.0406) 0.0032 (0.0455) 0.7925 -0.0001 (0.0380) -0.0095 (0.0330) 1.2191

250 U[0,4] -0.0004 (0.0256) -0.0016 (0.0244) 1.0939 -0.0009 (0.0223) -0.0132 (0.0168) 1.0874

U[0,3] 0.0012 (0.0272) 0.0049 (0.0300) 0.8014 0.0006 (0.0226) -0.0088 (0.0194) 1.1235

(s,t)=(0.2231,1.6094)

n C

50 U[0,4] 0.0002 (0.0823) -0.0025 (0.0701) 1.3766 0.0009 (0.0940) 0.0164 (0.0838) 1.2110

U[0,3] 0.0018 (0.0908) -0.0055 (NA) 1.2650 -0.0012 (0.0981) 0.0112 (0,0881) 1.2197

100 U[0,4] 0.0010 (0.0583) -0.0024 (0.0480) 1.4762 0.0001 (0,0659) 0.0171 (0.0582) 1.1783

U[0,3] -0.0015 (0.0660) -0.0101 (0.0539) 1.4504 -0.0006 (0.0741) 0.0109 (0.0627) 1.3551

250 U[0,4] -0.0013 (0.0385) -0.0051 (0.0282) 1.8054 0.0010 (0.0408) 0.0184 (0.0348) 1.0729

U[0,3] -0.0003 (0.0406) -0.0110 (0.0325) 1.3962 0.0007 (0.0427) 0.0138 (0.0362) 1.2130

(s,t)=(0.5108,0.9163)

n C

50 U[0,4] 0.0005 (0.0837) 0.0102 (0.0987) 0.7124 0.0001 (0.0733) -0.0139 (0.0700) 1.0559

U[0,3] 0.0013 (0.0859) 0.0097 (NA) 0.7640 0.0033 (0.0775) -0.0086 (0.0691) 1.2400

100 U[0,4] 0.0000 (0.0558) 0.0172 (0.0681) 0.6310 -0.0015 (0.0519) -0.0123 (0.0483) 1.0840

U[0,3] 0.0023 (0.0581) 0.0185 (0.0653) 0.7330 -0.0037 (0.0514) -0.0056 (0.0516) 0.9844

250 U[0,4] -0.0005 (0.0357) 0.02970.0484) 0.3967 -0.0003 (0.0334) -0.0123 (0.0318) 0.9580

U[0,3] 0.0003 (0.0386) 0.0264 (0.0478) 0.4994 -0.0003 (0.0332) -0.0035 (0.0352) 0.8779

(s,t)=(0.5108,1.6094)

n C

50 U[0,4] 0.0018 (0.1025) 0.0127 (0.1053) 0.9340 0.0019 (0.1103) 0.0156 (0.1061) 1.0591

U[0,3] -0.0001 (0,1142) 0.0119 (NA) 1.0281 -0.0020 (0.1201) 0.0085 (0.1063) 1.2683

100 U[0,4] -0.0014 (0.0707) 0.0039 (0.0654) 1.1671 0.0031 (0.0743) 0.0179 (0.0691) 1.0846

U[0,3] 0,0027 0,0800 0-0001 (0.0686) 1-3597 -0.0025 (0.0825) 0.0131 (0.0731) 1.2356

250 U[0,4] 0.0032 (0.0437) 0.0034 (0.0406) 1.1583 -0.0030 (0.0472) 0-0194 (0.0460) 0.8972

U[0,3] 0.0002 (0.0477) -0.0112 (0.0435) 1.1324 0.0009 (0.0508) 0.0188 (0.0455) 1.0635

WSEAS TRANSACTIONS on MATHEMATICS

DOI: 10.37394/23206.2022.21.27

Gustavo Soutinho, Luís Meira-Μachado

E-ISSN: 2224-2880

214

Volume 21, 2022

LM GGam LM GGam LM GGam

0.5 0.6 0.7 0.8 0.9 1.0

^11(0.2231, 0.5108)

n = 50 n = 100 n = 250

LM GGam LM GGam LM GGam

−0.1 0.0 0.1 0.2 0.3 0.4

^12(0.2231, 0.5108)

n = 50 n = 100 n = 250

LM GGam LM GGam LM GGam

0.0 0.2 0.4 0.6 0.8 1.0

^23(0.2231, 0.5108)

n = 50 n = 100 n = 250

LM GGam LM GGam LM GGam

0.0 0.1 0.2 0.3 0.4 0.5

^11(0.2231, 1.6094)

n = 50 n = 100 n = 250

LM GGam LM GGam LM GGam

0.0 0.1 0.2 0.3 0.4 0.5 0.6

^12(0.2231, 1.6094)

n = 50 n = 100 n = 250

LM GGam LM GGam LM GGam

0.0 0.2 0.4 0.6 0.8 1.0

^23(0.2231, 1.6094)

n = 50 n = 100 n = 250

LM GGam LM GGam LM GGam

0.4 0.5 0.6 0.7 0.8 0.9 1.0

^11(0.5108, 0.9163)

n = 50 n = 100 n = 250

LM GGam LM GGam LM GGam

−0.2 0.0 0.2 0.4

^12(0.5108, 0.9163)

n = 50 n = 100 n = 250

LM GGam LM GGam LM GGam

0.0 0.2 0.4 0.6 0.8 1.0

^23(0.5108, 0.9163)

n = 50 n = 100 n = 250

LM GGam LM GGam LM GGam

0.0 0.2 0.4 0.6 0.8

^11(0.5108, 1.6094)

n = 50 n = 100 n = 250

LM GGam LM GGam LM GGam

0.0 0.2 0.4 0.6 0.8

^12(0.5108, 1.6094)

n = 50 n = 100 n = 250

LM GGam LM GGam LM GGam

0.0 0.2 0.4 0.6 0.8 1.0

^23(0.5108, 1.6094)

n = 50 n = 100 n = 250

LM GGam LM GGam LM GGam

0.0 0.2 0.4 0.6 0.8 1.0

^11(0.9163, 1.6094)

n = 50 n = 100 n = 250

LM GGam LM GGam LM GGam

−0.2 0.0 0.2 0.4 0.6 0.8

^12(0.9163, 1.6094)

n = 50 n = 100 n = 250

LM GGam LM GGam LM GGam

0.0 0.2 0.4 0.6 0.8 1.0

^23(0.9163, 1.6094)

n = 50 n = 100 n = 250

Figure 5: Boxplot of the M= 1000 estimates of the transition probabilities. Horizontal solid red line correspond

to the true value of the transition probability. Censoring times uniformly distributed between 0 and 3.

WSEAS TRANSACTIONS on MATHEMATICS

DOI: 10.37394/23206.2022.21.27

Gustavo Soutinho, Luís Meira-Μachado

E-ISSN: 2224-2880

215

Volume 21, 2022

LM GGam LM GGam LM GGam

0.5 0.6 0.7 0.8 0.9 1.0

^11(0.2231, 0.5108)

n = 50 n = 100 n = 250

LM GGam LM GGam LM GGam

−0.1 0.0 0.1 0.2 0.3 0.4

^12(0.2231, 0.5108)

n = 50 n = 100 n = 250

LM GGam LM GGam LM GGam

0.0 0.2 0.4 0.6 0.8 1.0

^23(0.2231, 0.5108)

n = 50 n = 100 n = 250

LM GGam LM GGam LM GGam

0.0 0.1 0.2 0.3 0.4 0.5

^11(0.2231, 1.6094)

n = 50 n = 100 n = 250

LM GGam LM GGam LM GGam

0.0 0.1 0.2 0.3 0.4 0.5

^12(0.2231, 1.6094)

n = 50 n = 100 n = 250

LM GGam LM GGam LM GGam

0.0 0.2 0.4 0.6 0.8 1.0

^23(0.2231, 1.6094)

n = 50 n = 100 n = 250

LM GGam LM GGam LM GGam

0.4 0.5 0.6 0.7 0.8 0.9 1.0

^11(0.5108, 0.9163)

n = 50 n = 100 n = 250

LM GGam LM GGam LM GGam

0.0 0.2 0.4 0.6

^12(0.5108, 0.9163)

n = 50 n = 100 n = 250

LM GGam LM GGam LM GGam

0.0 0.2 0.4 0.6 0.8 1.0

^23(0.5108, 0.9163)

n = 50 n = 100 n = 250

LM GGam LM GGam LM GGam

0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7

^11(0.5108, 1.6094)

n = 50 n = 100 n = 250

LM GGam LM GGam LM GGam

0.0 0.2 0.4 0.6

^12(0.5108, 1.6094)

n = 50 n = 100 n = 250

LM GGam LM GGam LM GGam

0.0 0.2 0.4 0.6 0.8 1.0

^23(0.5108, 1.6094)

n = 50 n = 100 n = 250

LM GGam LM GGam LM GGam

0.0 0.2 0.4 0.6 0.8 1.0

^11(0.9163, 1.6094)

n = 50 n = 100 n = 250

LM GGam LM GGam LM GGam

−0.2 0.0 0.2 0.4 0.6 0.8

^12(0.9163, 1.6094)

n = 50 n = 100 n = 250

LM GGam LM GGam LM GGam

0.0 0.2 0.4 0.6 0.8 1.0

^23(0.9163, 1.6094)

n = 50 n = 100 n = 250

Figure 6: Boxplot of the M= 1000 estimates of the transition probabilities. Horizontal solid red line correspond

to the true value of the transition probability. Censoring times uniformly distributed between 0 and 4.

WSEAS TRANSACTIONS on MATHEMATICS

DOI: 10.37394/23206.2022.21.27

Gustavo Soutinho, Luís Meira-Μachado

E-ISSN: 2224-2880

216

Volume 21, 2022

500 1000 1500 2000 2500 3000

0.6 0.7 0.8 0.9 1.0

Time (days)

P11(s,t)

Generalized Gamma

1500 2000 2500

0.80 0.85 0.90 0.95 1.00

Time (days)

P11(s,t)

Generalized Gamma

500 1000 1500 2000 2500 3000

0.00 0.05 0.10 0.15

Time (days)

P12(s,t)

Generalized Gamma

1500 2000 2500

0.00 0.02 0.04 0.06 0.08

Time (days)

P12(s,t)

Generalized Gamma

500 1000 1500 2000 2500 3000

0.0 0.1 0.2 0.3 0.4

Time (days)

P13(s,t)

Generalized Gamma

1500 2000 2500

0.00 0.05 0.10 0.15 0.20

Time (days)

P13(s,t)

Generalized Gamma

500 1000 1500 2000 2500 3000

0.0 0.2 0.4 0.6 0.8

Time (days)

P23(s,t)

Generalized Gamma

1500 2000 2500

0.0 0.2 0.4 0.6 0.8 1.0

Time (days)

P23(s,t)

Generalized Gamma

Figure 7: Estimated transition probabilities for s= 365 and s= 1095. ColonIDM data.

WSEAS TRANSACTIONS on MATHEMATICS

DOI: 10.37394/23206.2022.21.27

Gustavo Soutinho, Luís Meira-Μachado

E-ISSN: 2224-2880

217

Volume 21, 2022