Parametric landmark estimation of the transition probabilities in
survival data with multiple events
GUSTAVO SOUTINHO
EPIUnit
University of Porto
Rua das Taipas 135, 4050-600 Porto
PORTUGAL
LUÍS MEIRA-MACHADO
University of Minho
Centre of Mathematics
Campus de Azurém, 4800-058 Guimarães
PORTUGAL
Abstract: Multi-state models are a useful tool for analyzing survival data with multiple events. The transition
probabilities play an important role in these models since they allow for long-term predictions of the process in
a simple and summarized manner. Recent papers have used the idea of subsampling to estimate these quantities,
providing estimators with superior performance in the case of strong violations of the Markov condition. Sub-
sampling, also referred to as landmarking, leads to small sample sizes and usually heavily censored data, which
leads to estimators with higher variability. Here, we use the flexibility of the generalized gamma distribution com-
bined with the same idea of subsampling to obtain estimators free of the Markov condition with less variability.
Simulation studies show the good small sample properties of the proposed estimators. The proposed methods are
illustrated using real data.
Key-Words: Multi-state models, Transition probabilities, Generalized gamma distribution, Landmark approach
Received: June 6, 2021. Revised: March 12, 2022. Accepted: April 2, 2022. Published: May 3, 2022.
1 Introduction
Multi-state models are models for a stochastic pro-
cess that at any time occupies one of a set of discrete
states [1, 2, 3, 4]. These models provide a relevant
modeling framework to deal with complex survival
data in which individuals may experience more than
one event. A multi-state model can be represented
schematically by diagrams with boxes representing
the states and arrows representing the possible transi-
tions. The complexity of a multi-state model greatly
depends on the number of states and the transitions
allowed between these states.
Among the examples, the simplest case is the mor-
tality model for survival data, which involves only
two states and one transition. The competing risks
model [5, 6] can be seen as an extension of the mor-
tality model, considering that a subject may reach the
ultimate state due to any of several causes. The ir-
reversible illness-death or disability model is a spe-
cial case of multi-state model, commonly used in the
literature to introduce the theoretical background of
multi-state models, in which the individuals may pass
from the initial state (State 1) to the intermediate state
(State 2) and then to the absorbing state (State 3) (Fig-
ure 1). Individuals are at risk of death in each transient
state (States 1 and 2).
One important goal in the analysis of multi-state
survival data is to assess the relationship of explana-
tory variables to the several event times. Another pri-
mary goal is to obtain predictions of the clinical prog-
nosis of a patient at a certain point in his or her illness
process. The main assumption implied in both cases is
the Markov assumption, which claims that given the
present state, the future evolution of the process is in-
dependent of the states previously visited and the tran-
sition times among them. Traditionally, the transition
probabilities are estimated using the Aalen-Johansen
estimator [7] that gives consistent estimators of the
transition probabilities when the multi-state model is
Markov. When the multi-state model is non-Markov,
this is no longer the case. Recently, [8] and [9] in-
troduced alternative estimators based on subsampling
(also known as landmarking) that are consistent re-
gardless of the Markov assumption. Both methods
are derived from a subset of the data consisting of
all subjects observed to be in the given state at the
given time. The first paper, by [8], uses a procedure
based on (differences between) Kaplan–Meier esti-
mators, whereas the second, by [9] used the Aalen-
Johansen estimate of the state occupation probabili-
ties derived from the same subset. Since the compu-
tation of the transition probabilities of the two meth-
ods is based on reduced data, they will lead to esti-
mators with higher variability. To overcome this is-
sue, parametric estimation based on the same specific
portions of data may be used to provide estimators
with less variability that are free of the Markov con-
dition. In some situations, fully-parametric modeling
can be more convenient to model complex data struc-
tures and processes [10, 11]. In this paper, we use
WSEAS TRANSACTIONS on MATHEMATICS
DOI: 10.37394/23206.2022.21.27
Gustavo Soutinho, Luís Meira-Μachado
E-ISSN: 2224-2880
207
Volume 21, 2022
the flexibility of the generalized gamma distribution
combined with the same idea of subsampling to ob-
tain estimators that are free of the Markov condition
with less variability.
The following section provides some notation and
introduces the theoretical background associated with
the landmark estimators and the proposed estimators
based on the generalized gamma distribution. In sec-
tion 3, these methods are compared through simula-
tions. The application to a real data set on colon can-
cer is presented in Section 4. Finally, the main con-
clusions of this paper are reported in Section 5.
2 Models and methods
2.1 Notation
In the following, we will devote our attention to the
particular case of the illness–death model that in-
volves three different states and three possible tran-
sitions among them, since, as mentioned previously,
this model plays an important role in the theory of
multi-state models. The process of the illness-death
model is fully characterized by the vector of times
(Z, T ), where Zdenotes the sojourn time in the ini-
tial state (State 1) and Tthe total survival time (i.e.,
the absorption time). Note that for those individu-
als undergoing a direct transition from State 1 to the
absorbing State 3 we have Z=T. On the other
hand, Z < T indicates that the individual visits the
intermediate State 2 at some time. We assume that
right-censoring issues may appear due to time lim-
itations in the follow-up, lost follow-up cases, etc.
This censoring is modeled by considering a censor-
ing variable, C, which we assume to be indepen-
dent of the process. Under censoring, only the cen-
sored versions of Zand T, along with their corre-
sponding censoring indicators, are available. Thus,
we define e
Z=min(Z, C)and e
T=min(T, C)
for the censored versions of Zand Tand introduce
1=I(ZC)and = I(TC)for the respec-
tive censoring indicators of Zand T. The variables
e
Zand e
T23 =e
Te
Zare the observed sojourn times
in states 1 and 2, respectively. Finally, the available
data are (e
Zi,e
Ti,1i,i),1in, i.i.d. copies of
(e
Z, e
T , 1,∆).
2.2 Nonparametric landmark estimators of
the transition probabilities
The target is each of the five transition probabili-
ties phj (s, t) = P(X(t) = j|X(s) = h), where
1hj3and sand tare two pre-specified
time points with s < t. As noted by [8], these transi-
tion probabilities are functions involving Zand Tas
follow:
State1State2
State3
Figure 1: The progressive illness-death model.
p11(s, t) = P(Z > t |Z > s),
p12(s, t) = P(Zt, T > t |Z > s),
p13(s, t) = P(Tt|Z > s),
p22(s, t) = P(Zt, T > t |Zs, T > s),
p23(s, t) = P(Tt|Zs, T > s).
Two obvious relations stand out: p11(s, t) +
p12(s, t) +p13(s, t) = 1 and p22(s, t)+p23(s, t) = 1,
revealing that in practice one only need to estimate
three of these quantities.
When the multi-state model is Markovian, the
Aalen-Johansen estimator [7] gives consistent estima-
tors of the transition probabilities. Explicit formulae
of the Aalen-Johansen estimator for the illness-death
model are available [13]. When the multi-state model
is non-Markov, this is no longer the case. In these
cases, the Aalen-Johansen estimator may introduce
some bias, and therefore, they may be inappropriate.
Estimators that do not rely on the Markov assump-
tion were first introduced by [14]. The proposed esti-
mators were defined in terms of multivariate Kaplan-
Meier integrals and proved to be more efficient than
the Aalen-Johansen estimators in the case of strong
violations of the Markov assumption. However, this
proposal has the drawback of requiring that the sup-
port of the censoring distribution contain the support
of the lifetime distribution, otherwise they only re-
port valid estimators for truncated transition proba-
bilities. Recently, [8] recovered the work by [14]
and proposed alternative estimators that are consis-
tent regardless of the Markov condition and the re-
ferred assumption on the censoring support. The idea
of the new methods is to use a procedure based on
differences between Kaplan-Meier estimators derived
from a subset of the data consisting of all subjects ob-
served to be in a given state at a given time. Fol-
lowing equations above, given the time point s, to
estimate p1j(s, t)for j= 1,2,3the landmark anal-
ysis is restricted to the individuals observed in State
1 at time s. For the subpopulation Z > s, the cen-
soring time Cis still independent of the pair (Z, T )
WSEAS TRANSACTIONS on MATHEMATICS
DOI: 10.37394/23206.2022.21.27
Gustavo Soutinho, Luís Meira-Μachado
E-ISSN: 2224-2880
208
Volume 21, 2022
and, therefore, Kaplan-Meier-based estimation will
be consistent. Similarly, to estimate p2j(s, t),j=
2,3, the landmark analysis proceeds from the sample
restricted to the individuals observed in State 2 at time
s. Then, we may formally introduce the landmark es-
timators as follows:
bpLM
11 (s, t) = b
SKM(s)
1(t),
bpLM
12 (s, t) = b
SKM(s)(t)b
SKM(s)
1(t),
bpLM
13 (s, t) = 1 b
SKM(s)(t),
bpLM
22 (s, t) = b
SKM[s](t),
bpLM
23 (s, t) = 1 b
SKM[s](t)
where b
SKM(s)
1and b
SKM(s)are the Kaplan-Meier es-
timators for the distributions of Zand T, respec-
tively, but computed from the subsample S1=
ni:e
Zi> so; whereas b
SKM[s]is the Kaplan-Meier es-
timator of the distribution of Tbut computed from the
subsample S2=ni:e
Zis < e
Tio.
2.3 Parametric estimation
Nonparametric estimators of the transition probabili-
ties based on landmarking were introduced in the pre-
vious section. However, methods based on a fully-
specified model can be more convenient for repre-
senting complex data structures and processes [10].
This is the case for hazard functions that may vary
predictably due to interval censoring, frailties, or mul-
tiple responses. In fact, the use of parametric mod-
els is an attractive option because standard methods,
such as maximum likelihood, are available for param-
eter estimation and testing, and a complete descrip-
tion of the hazard function can be obtained [15]. [16]
also refers an example in biomedical application for
which the mean survival needs the survival function
S(t)to be fully-specified for all times t, and paramet-
ric models that combine data from multiple time peri-
ods should be used. Nevertheless, since the standard
parametric distribution has hazard functions that can
totally change shape according to the values of the pa-
rameters, the choice of the most suitable model from
the available families of distributions can be problem-
atic, and the results of the inference may be affected
[17]. The use of parametric models may also improve
the accuracy of the transition probabilities by reduc-
ing the high variability of the landmark estimators for
cases with small sample sizes or high censoring lev-
els.
The generalized gamma (GG) distribution is a very
flexible distribution that generalizes several impor-
tant and common parametric models that have wide
applications in lifetime data analysis. Among these
are the exponential, Weibull, log-normal, and two-
parameter gamma distributions. In fact, due to the
flexible variety of shapes of its hazard function, which
includes all four of the most common types of haz-
ard function (monotonically increasing and decreas-
ing, as well as bathtub and arc-shaped hazards), the
GG distribution can easily be used in different contexts
for modelling lifetime models [18]. Due to its impor-
tance, the following survival and density functions for
the GG distribution are introduced. Further details on
other special distributions can be found in [18].
2.3.1 The generalized gamma distribution
The generalized gamma distribution (GG), introduced
by [19], is a three-parameter distribution with the fol-
lowing probability density function
f(t|β > 0, α > 0, k > 0) =
=β
Γ (k)·tβk1
αβk ·exp h(t/α)βi, t > 0,(1)
This distribution is a generalization of the two-
parameter gamma distribution when β= 1. The
widely used exponential (β=k= 1) and weibull
(k= 1), are also special cases of the GG distribution.
[20] considers a new reparametrization of the GG dis-
tribution with location (µ=log(α)), scale (σ=β1)
and shape (λ= 1/k2) that allows to extend the origi-
nal one to include a further class of models with λ <
0. Specifically, let Γ (t;γ) = Rt
0xγ1exdx/Γ(γ)
the gamma distribution with mean and variance equal
to γ > 0. The survival function for GG(µ, σ, λ), if
λ > 0, is given by
S(t) = 1 Γλ2exp (λ[log(t)µ] /σ) ; λ2(2)
= 1 Γhλ2eµtλ/σ;λ2i(3)
and, if λ < 0, replacing z= (log(t)µ)/σby z,
take the following form
S(t) = Γ hλ2eµtλ/σ;λ2i(4)
Finally, the probability density function is
f(t) = |λ|
σtΓ (λ2)×
×hλ2eµtλ/σiλ2
exp hλ2eµtλ/σi(5)
and the hazard function, for λ > 0, is given by
WSEAS TRANSACTIONS on MATHEMATICS
DOI: 10.37394/23206.2022.21.27
Gustavo Soutinho, Luís Meira-Μachado
E-ISSN: 2224-2880
209
Volume 21, 2022
h(t) = f(t)
S(t)=|λ|
σtΓ (λ2)hλ2eµtλ/σiλ2
×
exp hλ2eµtλ/σi×
1Γhλ2eµtλ/σ;λ2i1
(6)
and for λ < 0, has the form
h(t) = |λ|
σtΓ (λ2)hλ2eµtλ/σiλ2
×
exp hλ2eµtλ/σi×
Γhλ2eµtλ/σ;λ2i1
(7)
Figures 2, 3 and 4 plot the GG density, survival and
hazard functions for different choices of the parame-
ters µ,σand λ.
In case of λ=σthe GG distribution becomes the
two-parameter gamma distribution G(µ, σ). The den-
sity for the weibull distribution can be obtained with
λ= 1 and the exponential distribution corresponds to
the case λ=σ= 1. Another special case is when
λ= 0 which correspond to the log-normal distribu-
tion, with S(t) = 1 Φlog (eµt)1/σ, where Φ(·)
is the standard normal cumulative density function.
Therefore, the GG family contains almost all the most
commonly used parametric distributions [16, 17].
2.3.2 Maximum likelihood estimation
Let’s consider the probability density function of a
general parameter model, for an event at time t,
given by f(t|µ, α),t0, where µcorrespond
to the mean or location parameter and the vector
α= (α1,· · · , αR)represents other parameters such
as shape, variance or higher moments. A standard
approach for the estimation of parameters of a fully
parametric model is to use the likelihood function.
Thus, let ti,i= 1,, n be a sample of times from
individuals i, with ci= 1 if tiis an observed time-to-
event, or ci= 0 if this is censored. Also let sibe cor-
responding left-truncation (or delayed-entry) times,
meaning that the ith survival time is only observed
conditionally on the individual’s having survived up
to si, thus si= 0 if there is no left-truncation. In the
case of right-censoring, the likelihood of the proba-
bility of density function for the previous parameters
θ= (µ, α)is [16]:
l(θ|t, c, s) = Qi:ci=1 fi(ti|µ, α)Qi:ci=0 Si(ti|µ, α)
QiSi(si)(8)
0.0 0.5 1.0 1.5 2.0 2.5 3.0
0123456
x
f(t)
µ = 0.2 ,σ = 0.5 ,λ = 2
µ = 0.2 ,σ = 2 ,λ = 2
µ = 0.4 ,σ = 0.3 ,λ = 0.5
µ = 0.4 ,σ = 0.3 ,λ = 2
Figure 2: Plots of the GG density for some values of
µ,σand λparameters.
In the case of the GG distribution, the likelihood func-
tion is obtained by replacing, respectively, the density
and survival functions by the expressions (5) and (3
or 4, according to the value of λ). The log-likelihood
function is given from (6) as follow [16]:
log l(θ|t, c, s) = X
i:ci=1
{log (hi(ti)Hi(ti))}
X
i:ci=0
Hi(ti) + X
i
Hi(si)(9)
where Hdenote the cumulative hazard function.
The parameters can be then estimated by maximiz-
ing the log-likelihood function with respect to θusing
the maximum likelihood estimation (MLE) method.
If the likelihood function is differentiable, the deriva-
tive test for determining the maximum can be applied.
In some cases, the first-order conditions of the like-
lihood function can be solved explicitly. However,
in most cases, numerical methods will be necessary
to find the maximum of the likelihood function [18].
In this paper, the maximization was done using the
flexsurvreg package, a software application for R,
through the optimization methods available in the op-
tim function.
2.3.3 Parametric estimators of the transition
probabilities
The parametric landmark estimators are based on the
landmark ideas as used in (1). This estimator can be
WSEAS TRANSACTIONS on MATHEMATICS
DOI: 10.37394/23206.2022.21.27
Gustavo Soutinho, Luís Meira-Μachado
E-ISSN: 2224-2880
210
Volume 21, 2022
0.0 0.5 1.0 1.5 2.0 2.5 3.0
0.0 0.2 0.4 0.6 0.8 1.0
x
S(t)
µ = 0.2 ,σ = 0.5 ,λ = 2
µ = 0.2 ,σ = 2 ,λ = 2
µ = 0.4 ,σ = 0.3 ,λ = 0.5
µ = 0.4 ,σ = 0.3 ,λ = 2
Figure 3: Plots of the GG survival functions for some
values of µ,σand λparameters.
obtained by applying the procedure based on (differ-
ences between) parametric survival estimators intro-
duced in Section 2.3.1 to the same specific portions
of data.
3 Simulation studies
In this section, we report the results from simula-
tion in order to evaluate the performance of the pro-
posed parametric estimators. In particular, we com-
pare the estimator described in Section 2.3, using the
GG distribution (labeled GG, to the Landmark estima-
tor introduced by [8], labeled LM). Data were simu-
lated to represent a progressive illness-death model,
assuming that all individuals begin in the initial state
(State 1) at time t= 0. The movement of the
individuals was defined according to the scenario
described by [21] and [22] in which the subjects
may pass through the intermediate state (State 2)
at some time or go directly to the absorbing state
(State 3). For the individuals who visit the inter-
mediate state, we generated replicates of (Z, T
Z)through the bivariate distribution F12(x, y) =
F1(x)F2(y) [1 + {1F1(x)} {1F2(y)}]with ex-
ponential marginal distribution functions with rate pa-
rameter 1. For individuals that experience a direct
transition to the absorbing state (Z=T), the value of
Zis simulated according to an exponential with rate
parameter 1. Random censoring times were indepen-
dently generated from uniform distributions U[0,3]
and U[0,4] to introduce heavier or lighter censor-
ing. To be more specific, in the case of the first sce-
nario, the censoring proportion in the first gap time
0.0 0.5 1.0 1.5 2.0 2.5 3.0
0 10 20 30 40
x
f(t)
µ = 0.2 ,σ = 0.5 ,λ = 2
µ = 0.2 ,σ = 2 ,λ = 2
µ = 0.4 ,σ = 0.3 ,λ = 0.5
µ = 0.4 ,σ = 0.3 ,λ = 2
Figure 4: Plots of the GG hazard functions for some
values of µ,σand λparameters.
is 32% while for the second gap time, this percentage
increases to 57%. The second model decreases these
censoring levels to 24% and 47%, respectively. For
each simulated scenario, we fixed the values of (s, t)
pairs for four different points, corresponding to com-
binations of the percentiles of 20%, 40%, 60% and
80% of the exponential marginal distribution func-
tions with rate parameter 1 to summarize the results of
the transition probabilities. In each simulation, 1000
samples were generated with sample sizes of 50, 100,
and 250. From these samples, we obtained the mean
for all the generated data sets. As a measure of effi-
ciency, we took the Mean Squared Error (MSE) but
we also computed the standard deviations (SD) and
the Bias for each point (s, t).
Tables 1 and 2 show the results of the LM and
GG estimators for the transition probabilities bp12(s, t),
bp13(s, t)and bp23(s, t)from the simulation studies de-
scribed previously. The parametric estimators, la-
beled as GG, were implemented by fitting a general-
ized gamma distribution to the multi-state model un-
der the landmark approach through the flexsurvreg
function of the flexsurv of the Rpackage [16]. Re-
sults reveal that the accuracy of the estimators for all
transition probabilities decreases as sand tgrow due
to the censoring effects that are higher at the right tail
of the lifetime distribution. As would be expected,
the SD decreases as the sample sizes increase, and
with a decrease in the censoring percentage. From
the results we can observe that the performance of
the proposed parametric estimators is better than the
nonparametric counterpart for cases with higher lag
WSEAS TRANSACTIONS on MATHEMATICS
DOI: 10.37394/23206.2022.21.27
Gustavo Soutinho, Luís Meira-Μachado
E-ISSN: 2224-2880
211
Volume 21, 2022
times ts, and consequently can be an alternative in
case with small sample sizes when comparing with
the LM estimator. This can be observed by the rel-
ative efficiency that is represented by the ratio be-
tween the corresponding MSEs of the nonparametric
and parametric estimators. For the transition proba-
bilities bp12(s, t)and bp23(s, t)the results are quite sim-
ilar, revealing the better accuracy of the parametric
estimator as the difference between times increases.
In the case of bp13(s, t), for almost all transition times
the MSE ratio is above 1, confirming the advantages
of the parametric estimator compared to the LM esti-
mator. For completeness purposes, we show in Fig-
ures 5 and 6 the boxplots of the estimates of the tran-
sition probabilities based on the 1000 Monte Carlo
replicates for the two sets of estimators, with differ-
ent sample sizes. The boxplots shown in these figures
reveal some results which agree with our findings re-
ported in Tables 1 and 2. From these plots, it can be
seen that all methods have small biases and confirm
the lower variability of the parametric estimators in
some cases. As expected, the lower variability of the
parametric estimator comes at the cost of a small in-
crease in the bias. This is more clear when the dif-
ference tsis small or for small sample sizes, and
consequently, in these cases, the LM nonparametric es-
timator may be preferable.
4 Example of application
In this section, we analyse the results of the applica-
tion of the proposed methods using data from a colon
cancer cancer study. This is a study from a large clin-
ical trial on Duke’s stage III patients [24] that focuses
on 929 patients who were followed after suffering cu-
rative surgery for colorectal cancer until death or cen-
soring. Of the initial sample, 423 remained alive dur-
ing the course of the follow-up period. 468 of the pa-
tients had a recurrence, and among these, 414 ended
up dying. Finally, 38 individuals died without expe-
riencing a recurrence.
Figure 7 reports, for each row, the estimated tran-
sition probabilities, bp11(s, t),bp12(s, t),bp13(s, t)and
bp23(s, t), for fixed values of s= 365 (left hand side)
and s= 1095 (right hand side). Each plot shows the
estimated curves based on the parametric GG (black
line) and the LM estimators (red line). Plots reported
in the first row report the estimated transition prob-
abilities for p11(s, t), for fixed values s= 365 and
s= 1095 (days), along with time t. The estimated
curves report the survival fraction over time among
the individuals ‘alive and without recurrence’. Plots
in the second row allow for an inspection over time of
the probability of being alive with recurrence for in-
dividuals who are disease free 1 year (left hand side)
and 3 years (right hand side) after surgery. Since the
recurrence state is transient, this curve first increases
and then decreases. It is also evident that the probabil-
ities for s= 1095 are smaller than those for s= 365.
Finally, plots shown in the third and fourth rows report
the estimated transition probabilities for p13(s, t)and
p23(s, t). They report one minus the survival fraction
along time, among the individuals ‘alive and without
recurrence’ and those ‘alive and with recurrence’, re-
spectively. Curves depicted in Figure 7 reveal that the
parametric landmark estimators based on the general-
ized gamma distribution provide, in all cases, curves
with the expected behavior, similar to those obtained
from the nonparametric landmark estimators but with
less variability.
5 Conclusion
In this article, the relative performance of the pro-
posed parametric landmark estimators for the tran-
sition probabilities was investigated through simula-
tions. Attained results suggest that the use of the gen-
eralized gamma distribution combined with the idea
of subsampling can lead to competitive estimators
that may outperform the original landmark estimators
in some cases, providing estimators with less variabil-
ity. It is worth mentioning that, in some cases, para-
metric estimation can introduce some bias in estima-
tion while reducing the variance. The risk of intro-
ducing a large bias can be controlled in practice by
comparing the obtained estimated curves with those
obtained by the nonparametric estimator.
Acknowledgements: This research was financed by
Portuguese Funds through FCT “Fundação para a
Ciência e a Tecnologia”, within the research grants
PD/BD/142887/2018 and UID/BIA/04050/2019.
References:
[1] Andersen PK, Borgan O, Gill RD, Keiding N.
Statistical Models Based on Counting Processes.
Springer-Verlag, New York; 1993.
[2] Hougaard P. Analysis of multivariate survival
data. Springer-Verlag. New York; 2000.
[3] Meira-Machado L, de Uña-Álvarez J, Cadarso-
Suárez C, Andersen PK. Multi-state models
for the analysis of time to event data. Statis-
tical Methods in Medical Research; 2009; 18:
195–222.
[4] Meira-Machado L, Sestelo M. Estimation in
the progressive illness-death model: a nonex-
haustive review. Biometrical Journal; 2019; 61:
245–263.
[5] Andersen PK, Keiding N. Multi-state models
for event history analysis. Statistical Methods in
Medical Research; 2002; 11: 91–115.
WSEAS TRANSACTIONS on MATHEMATICS
DOI: 10.37394/23206.2022.21.27
Gustavo Soutinho, Luís Meira-Μachado
E-ISSN: 2224-2880
212
Volume 21, 2022
[6] Putter H, Fiocco M, Geskus RB. Tutorial in
biostatistics: Competing risks and multi-state
models. Statistics in Medicine; 2007: 26,
2389–2430.
[7] Aalen OO, Johansen S. An empirical transition
matrix for non homogeneous Markov and chains
based on censored observations. Scandinavian
Journal of Statistics; 1978; 5: 141–150.
[8] de Uña-Álvarez J, Meira-Machado L. Nonpara-
metric estimation of transition probabilities in
the non-Markov illness-death model: A compar-
ative study. Biometrics; 2015; 71: 141–150
[9] Putter H, Spitoni C. Non-parametric estimation
of transition probabilities in non-Markov multi-
state models: The landmark Aalen-Johansen
estimator. Statistical Methods in Medical Re-
search; 2018; 27(7): 2081–2092.
[10] Aalen O, Borgan O, Gjessing H. Survival and
Event History Analysis: A Process Point of
View. Statistics. Springer-Verlag, New York.
2008.
[11] Benaglia T, Jackson C, Sharples L. Survival Ex-
trapolation in the Presence of Cause Specific
Hazards. Statistics in Medicine; 2015. 34 (5):
796–811.
[12] Datta S, Satten GA. Validity of the Aalen-
Johansen estimators of stage occupation prob-
abilities and Nelson Aalen integrated transition
hazards for non-Markov models. Statistics &
Probability Letters; 2001; 55: 403–411.
[13] Borgan O. Aalen-Johansen Estimator. Encyclo-
pedia of Biostatistics; 1988; 1: 5–10.
[14] Meira-Machado L,; de Uña-Álvarez J, Cadarso-
Suárez C. Nonparametric estimation of tran-
sition probabilities in a non-Markov illness-
death model. Lifetime Data Analysis; 2006; 12:
325–344.
[15] Wei LJ. The accelerated failure time model: a
useful alternative to the Cox regression model in
survival analysis. Statistics in Medicine; 1992;
11: 1871–1879.
[16] Jackson C. flexsurv: A Platform for Paramet-
ric Survival Modeling in R. Journal of Statistical
Software; 2016; 70 (8): 1–33.
[17] Cox C, Chu H, Schneider MF, Muñoz A. Para-
metric survival analysis and taxonomy of haz-
ard functions for the generalized gamma dis-
tribution. Statistics in Medicine; 2007; 26:
4352–4374.
[18] Kiche J, Oscar N, Orwa G. On Generalized
Gamma Distribution and Its Application to Sur-
vival Data. International Journal of Statistics and
Probability; 2019; 8 (5).
[19] Stacy AM. A Generalization of the Gamma
Distribution. Annals of Mathematical Statistics;
1962; 33 (3): 1187–1192.
[20] Prentice RL. A Log Gamma Model and Its Max-
imum Likelihood Estimation. Biometrika; 1974;
61 (3): 539–544.
[21] Moreira A, de Uña-Álvarez J, Meira-
Machado L. Presmoothing the Aalen-Johansen
estimator in the illness-death model. Electronic
Journal of Statistics; 2013; 7: 1491–1516.
[22] Araújo A, Meira-Machado L, Roca-Pardiñas J.
TPmsm: Estimation of the transition probabili-
ties in 3-state models. Journal of Statistical Soft-
ware; 2014; 62(4), 1–29.
[23] Soutinho G, Meira-Machado L. Estimation of
the Transition Probabilities in Multi-state Sur-
vival Data: New Developments and Practical
Recommendations. WSEAS TRANSACTIONS
on MATHEMATICS; 2020; 19, 353–366.
[24] Moertel CG, Fleming TR, Macdonald JS,
Haller DG, Laurie JA, Tangen CM, Ungerlei-
der JS, Emerson WA, Tormey DC, Glick JH,
Veeder MH, Mailliard JA. Fluorouracil Plus
Levamisole as Effective Adjuvant Therapy af-
ter Resection of Stage III. Colon Carcinoma: A
Final Report, The Annals of Internal Medicine;
1995; 122 (5): 321–326.
Contribution of individual authors to
the creation of a scientific article
(Ghostwriting Policy)
Gustavo Soutinho and Luís Meira-Machado con-
tributed equally to this work in terms of writing of
the paper, methodological developments, and imple-
mentation of computer codes to obtain results from
simulation studies and a real data set.
Creative Commons Attribution
License 4.0 (Attribution 4.0
International , CC BY 4.0)
This article is published under the terms of the
Creative Commons Attribution License 4.0
https://creativecommons.org/li-
censes/by/4.0/deed.en_US
WSEAS TRANSACTIONS on MATHEMATICS
DOI: 10.37394/23206.2022.21.27
Gustavo Soutinho, Luís Meira-Μachado
E-ISSN: 2224-2880
213
Volume 21, 2022
Table 2: Bias and standard deviation (SD) for estimators of p23(s, t).
The relative MSEs are also given. Scenario 1: illness-death model with correlated
exponential gap times.
bpLM
23(s, t)bpGG
23(s, t)Relative MSE bpLM
23(s, t)bpGG
23(s, t)Relative MSE
bias (SD) bias (SD) LM/GG bias (SD) bias (SD) LM/GG
(s,t)=(0.2231,0.5108) (s,t)=(0.5108,0.9163)
n C
50 U[0,4] -0.0085 (0.2378) -0.0340 (0.2640) 0.7991 -0.0053 (0.1969) -0.0093 (0.2238) 0.7734
U[0,3] -0.0130 (0.2509) -0.0320 (0.2782) 0.8051 -0.0010 (0.1961) 0.0065 (0.2248) 0.7607
100 U[0,4] 0.0029 (0.1678) -0.0022 (0.2080) 0.6507 -0.0034 (0.1343) 0.0053 (0.1468) 0.8367
U[0,3] -0.0114 (0.1653) -0.0253 (0.1932) 0.7237 0.0013 (0.1363) 0.0053 (0.1418) 0.9223
250 U[0,4] 0.0020 (0.0976) 0.0018 (0.1091) 0.7997 -0.0019 (0.0829) 0.0177 (0.0813) 0.9926
U[0,3] 0.0012 (0.1036) 0.0042 (0.1107) 0.8740 0.0015 (0.0835) 0.0199 (0.0812) 0.9967
(s,t)=(0.2231,1.6094) (s,t)=(0.5108,1.6094)
n C
50 U[0,4] -0.0112 (0.1676) 0.0486 (0.1189) 1.7101 -0.0022 (0.1835) 0.0622 (0.1790) 0.9376
U[0,3] -0.0395 (0.1920) 0.0402 (0.1478) 1.6382 -0.0090 (0.1912) 0.0700 (0.1780) 1.0021
100 U[0,4] -0.0064 (0.1190) 0.0328 (0.0878) 1.6177 0.0023 (0.1188) 0.0482 (0.1151) 0.9070
U[0,3] -0.0104 0.1290 0.0282 (0.0993) 1.5693 0.0085 (0.1284) 0.0544 (0.1208) 0.9426
250 U[0,4] 0.0025 (0.0723) 0.0016 (0.0578) 1.5665 -0.0010 (0.0738) 0.0136 (0.0619) 1.3597
U[0,3] -0-0038 (0.0790) -0.0008 (0.0626) 1.5985 0.0016 (0.0798) 0.0094 (0.0713) 1.2329
Table 1: Bias and standard deviation (SD) for estimators of p12(s, t)and p13(s, t).
The relative MSEs are also given. Scenario 1: illness-death model with
correlated exponential gap times.
bpLM
12(s, t)bpGG
12(s, t)Relative MSE bpLM
13(s, t)bpGG
13(s, t)Relative MSE
bias (SD) bias (SD) LM/GG bias (SD) bias (SD) LM/GG
(s,t)=(0.2231,0.5108)
n C
50 U[0,4] 0.0023 (0.0593) 0.0021 (0.0632) 0.8812 -0.0009 (0.0507) -0.0150 (0.0437) 1.2049
U[0,3] 0.0022 (0.0582) 0.0032 (NA) 0.7368 -0.0005 (0.0529) -0.0105 (0.0470) 1.2055
100 U[0,4] 0.0013 (0.0404) 0.0009 (0.0407) 0.9884 0.0011 (0.0352) -0.0122 (0.0287) 1.2781
U[0,3] 0.0000 (0.0406) 0.0032 (0.0455) 0.7925 -0.0001 (0.0380) -0.0095 (0.0330) 1.2191
250 U[0,4] -0.0004 (0.0256) -0.0016 (0.0244) 1.0939 -0.0009 (0.0223) -0.0132 (0.0168) 1.0874
U[0,3] 0.0012 (0.0272) 0.0049 (0.0300) 0.8014 0.0006 (0.0226) -0.0088 (0.0194) 1.1235
(s,t)=(0.2231,1.6094)
n C
50 U[0,4] 0.0002 (0.0823) -0.0025 (0.0701) 1.3766 0.0009 (0.0940) 0.0164 (0.0838) 1.2110
U[0,3] 0.0018 (0.0908) -0.0055 (NA) 1.2650 -0.0012 (0.0981) 0.0112 (0,0881) 1.2197
100 U[0,4] 0.0010 (0.0583) -0.0024 (0.0480) 1.4762 0.0001 (0,0659) 0.0171 (0.0582) 1.1783
U[0,3] -0.0015 (0.0660) -0.0101 (0.0539) 1.4504 -0.0006 (0.0741) 0.0109 (0.0627) 1.3551
250 U[0,4] -0.0013 (0.0385) -0.0051 (0.0282) 1.8054 0.0010 (0.0408) 0.0184 (0.0348) 1.0729
U[0,3] -0.0003 (0.0406) -0.0110 (0.0325) 1.3962 0.0007 (0.0427) 0.0138 (0.0362) 1.2130
(s,t)=(0.5108,0.9163)
n C
50 U[0,4] 0.0005 (0.0837) 0.0102 (0.0987) 0.7124 0.0001 (0.0733) -0.0139 (0.0700) 1.0559
U[0,3] 0.0013 (0.0859) 0.0097 (NA) 0.7640 0.0033 (0.0775) -0.0086 (0.0691) 1.2400
100 U[0,4] 0.0000 (0.0558) 0.0172 (0.0681) 0.6310 -0.0015 (0.0519) -0.0123 (0.0483) 1.0840
U[0,3] 0.0023 (0.0581) 0.0185 (0.0653) 0.7330 -0.0037 (0.0514) -0.0056 (0.0516) 0.9844
250 U[0,4] -0.0005 (0.0357) 0.02970.0484) 0.3967 -0.0003 (0.0334) -0.0123 (0.0318) 0.9580
U[0,3] 0.0003 (0.0386) 0.0264 (0.0478) 0.4994 -0.0003 (0.0332) -0.0035 (0.0352) 0.8779
(s,t)=(0.5108,1.6094)
n C
50 U[0,4] 0.0018 (0.1025) 0.0127 (0.1053) 0.9340 0.0019 (0.1103) 0.0156 (0.1061) 1.0591
U[0,3] -0.0001 (0,1142) 0.0119 (NA) 1.0281 -0.0020 (0.1201) 0.0085 (0.1063) 1.2683
100 U[0,4] -0.0014 (0.0707) 0.0039 (0.0654) 1.1671 0.0031 (0.0743) 0.0179 (0.0691) 1.0846
U[0,3] 0,0027 0,0800 0-0001 (0.0686) 1-3597 -0.0025 (0.0825) 0.0131 (0.0731) 1.2356
250 U[0,4] 0.0032 (0.0437) 0.0034 (0.0406) 1.1583 -0.0030 (0.0472) 0-0194 (0.0460) 0.8972
U[0,3] 0.0002 (0.0477) -0.0112 (0.0435) 1.1324 0.0009 (0.0508) 0.0188 (0.0455) 1.0635
WSEAS TRANSACTIONS on MATHEMATICS
DOI: 10.37394/23206.2022.21.27
Gustavo Soutinho, Luís Meira-Μachado
E-ISSN: 2224-2880
214
Volume 21, 2022
LM GGam LM GGam LM GGam
−0.1 0.0 0.1 0.2 0.3 0.4
p
^12(0.2231, 0.5108)
n = 50 n = 100 n = 250
LM GGam LM GGam LM GGam
0.0 0.2 0.4 0.6 0.8 1.0
p
^23(0.2231, 0.5108)
n = 50 n = 100 n = 250
LM GGam LM GGam LM GGam
0.0 0.1 0.2 0.3 0.4 0.5
p
^11(0.2231, 1.6094)
n = 50 n = 100 n = 250
LM GGam LM GGam LM GGam
0.0 0.1 0.2 0.3 0.4 0.5 0.6
p
^12(0.2231, 1.6094)
n = 50 n = 100 n = 250
LM GGam LM GGam LM GGam
0.0 0.2 0.4 0.6 0.8 1.0
p
^23(0.2231, 1.6094)
n = 50 n = 100 n = 250
LM GGam LM GGam LM GGam
0.4 0.5 0.6 0.7 0.8 0.9 1.0
p
^11(0.5108, 0.9163)
n = 50 n = 100 n = 250
LM GGam LM GGam LM GGam
−0.2 0.0 0.2 0.4
p
^12(0.5108, 0.9163)
n = 50 n = 100 n = 250
LM GGam LM GGam LM GGam
0.0 0.2 0.4 0.6 0.8 1.0
p
^23(0.5108, 0.9163)
n = 50 n = 100 n = 250
LM GGam LM GGam LM GGam
0.0 0.2 0.4 0.6 0.8
p
^11(0.5108, 1.6094)
n = 50 n = 100 n = 250
LM GGam LM GGam LM GGam
0.0 0.2 0.4 0.6 0.8
p
^12(0.5108, 1.6094)
n = 50 n = 100 n = 250
LM GGam LM GGam LM GGam
0.0 0.2 0.4 0.6 0.8 1.0
p
^23(0.5108, 1.6094)
n = 50 n = 100 n = 250
LM GGam LM GGam LM GGam
0.0 0.2 0.4 0.6 0.8 1.0
p
^11(0.9163, 1.6094)
n = 50 n = 100 n = 250
LM GGam LM GGam LM GGam
−0.2 0.0 0.2 0.4 0.6 0.8
p
^12(0.9163, 1.6094)
n = 50 n = 100 n = 250
LM GGam LM GGam LM GGam
0.0 0.2 0.4 0.6 0.8 1.0
p
^23(0.9163, 1.6094)
n = 50 n = 100 n = 250
Figure 5: Boxplot of the M= 1000 estimates of the transition probabilities. Horizontal solid red line correspond
to the true value of the transition probability. Censoring times uniformly distributed between 0 and 3.
WSEAS TRANSACTIONS on MATHEMATICS
DOI: 10.37394/23206.2022.21.27
Gustavo Soutinho, Luís Meira-Μachado
E-ISSN: 2224-2880
215
Volume 21, 2022
LM GGam LM GGam LM GGam
−0.1 0.0 0.1 0.2 0.3 0.4
p
^12(0.2231, 0.5108)
n = 50 n = 100 n = 250
LM GGam LM GGam LM GGam
0.0 0.2 0.4 0.6 0.8 1.0
p
^23(0.2231, 0.5108)
n = 50 n = 100 n = 250
LM GGam LM GGam LM GGam
0.0 0.1 0.2 0.3 0.4 0.5
p
^11(0.2231, 1.6094)
n = 50 n = 100 n = 250
LM GGam LM GGam LM GGam
0.0 0.1 0.2 0.3 0.4 0.5
p
^12(0.2231, 1.6094)
n = 50 n = 100 n = 250
LM GGam LM GGam LM GGam
0.0 0.2 0.4 0.6 0.8 1.0
p
^23(0.2231, 1.6094)
n = 50 n = 100 n = 250
LM GGam LM GGam LM GGam
0.4 0.5 0.6 0.7 0.8 0.9 1.0
p
^11(0.5108, 0.9163)
n = 50 n = 100 n = 250
LM GGam LM GGam LM GGam
0.0 0.2 0.4 0.6
p
^12(0.5108, 0.9163)
n = 50 n = 100 n = 250
LM GGam LM GGam LM GGam
0.0 0.2 0.4 0.6 0.8 1.0
p
^23(0.5108, 0.9163)
n = 50 n = 100 n = 250
LM GGam LM GGam LM GGam
0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7
p
^11(0.5108, 1.6094)
n = 50 n = 100 n = 250
LM GGam LM GGam LM GGam
0.0 0.2 0.4 0.6
p
^12(0.5108, 1.6094)
n = 50 n = 100 n = 250
LM GGam LM GGam LM GGam
0.0 0.2 0.4 0.6 0.8 1.0
p
^23(0.5108, 1.6094)
n = 50 n = 100 n = 250
LM GGam LM GGam LM GGam
0.0 0.2 0.4 0.6 0.8 1.0
p
^11(0.9163, 1.6094)
n = 50 n = 100 n = 250
LM GGam LM GGam LM GGam
−0.2 0.0 0.2 0.4 0.6 0.8
p
^12(0.9163, 1.6094)
n = 50 n = 100 n = 250
LM GGam LM GGam LM GGam
0.0 0.2 0.4 0.6 0.8 1.0
p
^23(0.9163, 1.6094)
n = 50 n = 100 n = 250
Figure 6: Boxplot of the M= 1000 estimates of the transition probabilities. Horizontal solid red line correspond
to the true value of the transition probability. Censoring times uniformly distributed between 0 and 4.
WSEAS TRANSACTIONS on MATHEMATICS
DOI: 10.37394/23206.2022.21.27
Gustavo Soutinho, Luís Meira-Μachado
E-ISSN: 2224-2880
216
Volume 21, 2022
500 1000 1500 2000 2500 3000
0.6 0.7 0.8 0.9 1.0
Time (days)
P11(s,t)
Generalized Gamma
LM
1500 2000 2500
0.80 0.85 0.90 0.95 1.00
Time (days)
P11(s,t)
Generalized Gamma
LM
500 1000 1500 2000 2500 3000
0.00 0.05 0.10 0.15
Time (days)
P12(s,t)
Generalized Gamma
LM
1500 2000 2500
0.00 0.02 0.04 0.06 0.08
Time (days)
P12(s,t)
Generalized Gamma
LM
500 1000 1500 2000 2500 3000
0.0 0.1 0.2 0.3 0.4
Time (days)
P13(s,t)
Generalized Gamma
LM
1500 2000 2500
0.00 0.05 0.10 0.15 0.20
Time (days)
P13(s,t)
Generalized Gamma
LM
500 1000 1500 2000 2500 3000
0.0 0.2 0.4 0.6 0.8
Time (days)
P23(s,t)
Generalized Gamma
LM
1500 2000 2500
0.0 0.2 0.4 0.6 0.8 1.0
Time (days)
P23(s,t)
Generalized Gamma
LM
Figure 7: Estimated transition probabilities for s= 365 and s= 1095. ColonIDM data.
WSEAS TRANSACTIONS on MATHEMATICS
DOI: 10.37394/23206.2022.21.27
Gustavo Soutinho, Luís Meira-Μachado
E-ISSN: 2224-2880
217
Volume 21, 2022