Bayesian Inference for a New Negative Binomial-Samade Model for
Time Series Data Counts with Its Properties and Applications
SIRINAPA ARYUYUEN, ISSARAPORN THAIMSORN, UNCHALEE TONGGUMNEAD*
Department of Mathematics and Computer Science,
Rajamangala University of Technology Thanyaburi,
39 Moo 1, Klong 6, Khlong Luang, Pathum Thani 12110,
THAILAND
*Corresponding Author
Abstract: - A new distribution was developed that mixed the negative binomial (NB) and Samade distributions,
called the negative binomial-Samade (NB-SA) distribution. The properties of this distribution were studied, and
the newly created distribution was applied using the framework of generalized linear models to build a time
series data count model. The characteristics of overdispersion and heavy-tailed distribution of the count
response variables were applied in the actual dataset modeling. Distribution parameters and the regression
coefficient were estimated using a Bayesian approach. Results showed that the NB-SA model had significantly
the highest efficiency compared with the classical NB and Poisson models for analyzing factors influencing the
daily number of COVID-19 deaths in Thailand.
Key-Words: Mixed NB model, Time series data, Bayesian inference, Count data, regression model, COVID-19.
Received: December 27, 2022. Revised: June 9, 2023. Accepted: June 28, 2023. Published: July 27, 2023.
1 Introduction
Time series data counts occur as daily occurrences
whenever several events are observed over time.
Increased understanding of such data and extraction
of information requires statistical analysis or
modeling. Different data counts may possess
characteristics that cannot be used with particular
models. During the past decades, researchers have
developed models to derive better conclusions from
the data. An important framework to handle time
series data counts involves generalized linear
models (GLM). When serial dependence is
incorporated through the so-called link function, this
leads to a more flexible class of models where
covariates are easily included, enabling the models
to better explain the dynamics of available
information and provide trustworthy predictions. A
famous special case of the GLM is integer-valued
generalized autoregressive conditional
heteroscedastic (INGARCH) models, also known as
a linear order model, [1], [2], [3]. Later, researchers
developed log-linear order models
(,)pq
, [4]. In
most traditions, the distribution assumption
t
Y
given
1t
assumes a Poisson distribution where the mean
is equal to the variance, called equidispersion.
However, this distribution usually leads to the
problem that the variance of random variables is
greater than the mean, called overdispersion or
underdispersion if the variance of the random
variable is less than the mean. The researchers
solved the problem using the negative binomial
(NB) distribution for overdispersion and
underdispersion, respectively, [5], [6]. For cases of
overdispersion, the NB distribution may solve the
problem but some cases have a high probability of
no event of interest, resulting in a higher frequency
of zero data values. As a result, excessive
overdispersion is aggravated, making the NB
distribution unsuitable for this data type. Various
methods have been employed to develop a new class
of discrete distributions, such as mixed Poisson, [7],
generalized Poisson, [8], zero-inflated generalized
Poisson, [9], mixed Poisson-inverse-Gaussian, [10],
and mixed NB distributions. Many mixed NB
distributions have also been proposed, such as the
NB-Lindley (NB-L), [11], NB-generalized
exponential, [12], NB-gamma, [13], NB-Sushila,
[14], NB-generalized Lindley, [15], NB-Quasi
Lindley, [16], and NB-modified Quasi Lindley, [17]
distributions. Mixed NB distributions are applied to
statistical model events for data counts in real life,
such as actuarial and insurance models, [11], [13],
[18], medical or industrial models, [16], [17], [18],
and ecology and biodiversity, [19].
WSEAS TRANSACTIONS on MATHEMATICS
DOI: 10.37394/23206.2023.22.65
Sirinapa Aryuyuen, Issaraporn Thaimsorn,
Unchalee Tonggumnead
E-ISSN: 2224-2880
586
Volume 22, 2023
This paper developed a new mixed negative
binomial distribution that was then applied as the
GLM for time series data counts. When developing
the GLM for a proposed distribution, one important
aspect is the methods used for parameter estimation.
One method for modeling the GLM is the maximum
likelihood (ML) method which estimates an
assumed probability distribution given observed
data. The ML estimation is achieved by maximizing
a likelihood function so that the observed data is
most probable under the assumed statistical model.
The GLM has been developed for mixed NB
distribution using the ML method to estimate
parameters, such as the NB-beta Weibull, [20], and
NB-inverse Gaussian, [21], regression models.
However, the ML method provides only a point
estimate which may need to be more robust or may
even fail to converge when the sample size is small
or when the dispersion parameter is much larger
than the mean.
The GLM does not consider prior information,
which may be helpful in the case of missing
observations. As an alternative, Bayesian inference
can account for prior expert knowledge on variables
of interest, especially in a small sample set, by
providing a sample of estimators that may be helpful
for uncertainty analysis, [22]. Practical advantages
of the Bayesian approach are its flexibility and
generality to better cope with complex problems,
[23], [25]. Recently, some researchers have favored
the Bayesian approach over the ML method, [22],
[25], while the GLM for mixed NB distribution
using the Bayesian method has been developed such
as the NB, [24], NB-Quasi Lindley, [16], NB-
modified Quasi Lindley, [17], and NB-Sushila, [25],
regression models. In the past, the most commonly
used GLM model in regression analysis was the
Poisson and NB regression model. However, the
Poisson regression model has a limitation: if the
data is overdispersion, it will result in such a model
being inappropriate. Although the NB regression
model was developed to solve the problem of
overdispersion, in some events, there may be a very
high probability that the event of interest will not
occur at all, and this makes the data value zero has a
higher frequency. As a result, the problem of
overdistribution is more serious. These situations
make the Pois and NB distributions unsuitable for
data of this nature. At the same time, if the
numerical data is collected continuously over time
in the form of a time series, there are better
approaches than GLM modeling based on regression
analysis. Because the limitations of regression
analysis are that the observations of each dependent
variable must be independent, this study extended
the GLM model to the GLM for time series count
data with a new mixed negative binomial. This
study proposed a new mixed NB distribution for
time series data counts as a flexible alternative for
analyzing heavy-tailed data with overdispersion.
The Bayesian approach was used to estimate model
parameters with the GLM framework applied to
build the time series data counts using both internal
and external covariates. This proposed model was
constructed for time series data counts of COVID-
19 deaths in Thailand from 1 January 2021 to 30
September 2022, [26].
2 Materials and Methods
2.1 Preliminary about Some Distribution
The Poisson distribution: Let
Y
be a random
variable with a Poisson distribution with a
parameter
,
µ
denoted by
Y
Its
probability mass function (pmf) as follows:
exp( )
(; ) !
=
y
fy y
µµ
µ
or
0,1, 2,...=y
and
0.>
µ
(1)
Its mean and variance are, respectively
E( ) =Y
µ
and
Var( ) .=Y
µ
(2)
The NB distribution: Let
Y
be a random variable
as distributed the NB distribution with parameters
r
and
,m
denoted by
Y
NB( , ).rm
Its pmf is given
by:
1
( ; , ) (1 )
+−

=


ry
yr
f yrm m m
y
for
0,1, 2,...=y
, (3)
where
0r>
and
0 1.m<<
Its mean and variance
are respectively:
1
E( ) m
Yrm

=

and
2
1
Var( ) .
m
Yr
m

=

(4)
The Samade distribution: In 2021, the Samade
(SA) distribution was proposed by [27], for
analyzing lifetime data. The SA distribution is
derived from a mixture of the gamma (Gam) and
exponential (Exp) distributions, i.e.,
Gam( , )ab
and
Exp( ).b
The SA probability density function (pdf)
is obtained as:
4
Exp Gam
44
6
(;,) () ()
66
ba
g ab g g
ba ba
λ λλ
 
= +
 
++


(5)
WSEAS TRANSACTIONS on MATHEMATICS
DOI: 10.37394/23206.2023.22.65
Sirinapa Aryuyuen, Issaraporn Thaimsorn,
Unchalee Tonggumnead
E-ISSN: 2224-2880
587
Volume 22, 2023
where
0,
λ
>
Exp
()g
λ
and
Gam ()g
λ
are the pdf of
the Exp and Gam distributions. Its pdf can be
written as:
( )
4
3
4
(;,) 6
b
b
g ab b a e
ba
= +
+
λ
λλ
(6)
where
0,>
λ
a
and
b
are the shape and scale
parameters, respectively. When
1a=
and
0,=a
the
SA distribution reduces to the one-parameter Pranav
and Exp distributions, respectively, [27].
2.2 Notations and Model Specifications for
Time Series Data Counts
Time series data counts occur whenever several
events are observed over time. The GLM works out
the appropriate linear predictor link functions (mean
function) and inverse link functions, [28], to
describe the covariate effects attached to the
dependence observations. If
t
Y
is a time series data
count, the GLM of the time series data will be
composed of random components, system
components, and link functions. A random
component is the conditional distribution of the
response given the past belongs to the exponential
family of distributions having canonical form as
follows:
1
()
( | ; , ) exp ( ; ),
()

= +


tt t
tt t t
t
yw
f y cy
θθ
θφ φ
αφ
(7)
where
()
t
w
θ
denotes the function of
,
t
θ
( )
=
t
αφ
( )
,
t
w
φθ
φ
is a dispersion parameter and
( ;)
t
cy
φ
is a prior weight. The systematic component is a
monotone function that can be represented as:
()
1
()
=
=
k
p
t j ti
k
gY
µβ
for
0,1,2,..., ,=jp
(8)
where
k
i
represents a positive integer,
12
0<<<ii
,< <∞
p
i
and
( )
t
g
µ
is a link
function, which represents the linear predictor of the
model. The GLM equation can be written as
follows:
1
( | ; , ) 1.
−∞
=
tt t t
f y dy
θφ
(9)
The mean and variance of
1
|
tt
Y
are as follows:
[ ]
1
E | ()
= =
t tt t
Yw
µθ
(10)
[ ]
1
Var | ( ) ( ) ( )Var( ),
′′
= =
tt t t t t
Yw
αφ θ αφ µ
where
()
t
w
θ
and
( )
′′ t
w
θ
are the first and second
derivative of
()
t
w
θ
respectively. Because
1
Var( | ) 0,
>
tt
Y
it follows that
()
t
w
θ
is
monotone. Therefore,
( )
1
=
tt
w
θµ
and
()
t
w
θ
is
the monotone function of
.
t
µ
Thus, the link
function can be defined as Eq. (8).
The general form for time series data counts
follows a generalized linear model as:
0
11
() ( ) ( )
−−
= =
=+++
∑∑
X
pq
T
t k t ik l t jl t
kl
g gY g
µ β β αµ η
(11)
where the response variable
t
Y
represents a time
series data count,
t
µ
represents the mean process,
1t
represents history up to time
,t
and
t
θθ
=
represents the regression coefficients:
( )
1
01 1
, ,..., , ,..., , T
T pqs
pq
θ ββ βα αη
+++
= ∈Θ
(12)
where
()g
represents the link function,
g : +→ℝ,
g
represents the transformation
function, and
η
represents the parameter vector,
( )
1
,,=
s
ηη η
corresponding to the effects of
covariates. A popular special case from Eq. (11) is
known as INGARCH models of order
p
and
,q
abbreviated as the INGARCH
( , ).pq
In this model,
the distribution assumption on
t
Y
given
1t
is
distributed as Poisson, NB, etc., Eq. (11) with the
logarithmic link function
1
()
=
t
gy
1
log( ),
t
y
and
( ) log( 1),
−−
= +
tk tk
gy y
can be written as [29], i.e.,
( ) log
tt
g
µµ
=
:
0
11
( ) log( 1) .
pq
t k tk l tl
kl
gY
µ β β αµ
−−
= =
= + ++
∑∑
(13)
From the above equation, the internal and external
regressor effects of covariates
η
can be determined
by adding internal regressors
()
,X
I
t
such as
intervention,
cos(.)
and
sin(.)
functions to represent
seasonal cycles. The conditional mean of Eq. (13)
can then be expressed as:
()
0
11
( ) log( 1) .
−−
= =
= + ++ +
∑∑
X
pq
TI
t k tk l tl t
kl
gY
µ β β αµ η
(14)
To further include the external regressor
()
,
E
t
X
the
mean of Eq. (14) can be expressed as:
0
11
( ) log( 1)
−−
= =
= + ++
∑∑
pq
t k tk l tl
kl
gY
µ β β αν
() ( )
.++
TI TE
tt
ξη
XX
(15)
WSEAS TRANSACTIONS on MATHEMATICS
DOI: 10.37394/23206.2023.22.65
Sirinapa Aryuyuen, Issaraporn Thaimsorn,
Unchalee Tonggumnead
E-ISSN: 2224-2880
588
Volume 22, 2023
2.3 The GLM for Time Series Data Counts
Suppose there are discrete time series count data
t
Y
with t
t
. The conditional mean
[ ]
1
E|
tt
Y
from
time series counts data, for example,
t
µ
. Then the
general GLM for time series count data is as
follows:
0
11
() ( ) ( ) .
pq
T
t k t ik l t jl t
kl
g gY g
µ β β αµ η
−−
= =
=+++
∑∑
X
With: g: ℝ→ + is the link function, g : 0 →ℝ is a
transformation function, and a vector parameter is
( )
1
,,=
s
ηη η
. In GLM
()
t
g
µ
called the linear
predictor, The regression can be used for the past
time response variable, defined as
12
{ , ,..., }
p
P ii i=
and
i
is an integer where
12
0 ,...,
p
ii i< < < <∞
. In
the GLM for time series count data, it is possible to
regressor observed lag
12
−−
p
ti ti ti
Y ,Y ,...,Y
. The same
analogy with lag in observation defined
Q
where
12
{ , ,..., }
q
Q jj j=
and
j
is an integer where
12
0 ,...,
p
ii i< < < <∞
. For the regressor variable on
the lag for the conditional mean
12
−−
q
tl tl tl
, ,...,
µµ µ
Assume the time series data count
t
Y
to be a
random variable with the Poisson distribution
denoted by
1
|tt
Y
~
Pois( ).
t
µ
The Poisson model
for the time series data count can then be written as:
1
exp( )
(| ;) ,
!
=t
y
tt
tt t
t
fy y
µµ
µ
for
0,1, 2,...=
t
y
(16)
where
1
( ).
=
tt
g
µµ
The mean and variance of
{ }
1
|;
tt t
Y
µ
are, respectively
1
E[ | ; ]
=
tt t t
Y
µµ
and
1
Var[ | ; ] .
=
tt t t
Y
µµ
(17)
Hence in the case of a conditional Poisson
response model, the conditional mean is identical to
the conditional variance of the observed process.
However, if the response variable occurs the
overdispersion problem arises and the Poisson
response is unsuitable for modeling.
The NB distribution is an alternative when
dependent variable problems arise, such as
overdispersion. Let
Y
be a random variable with the
NB distribution and pmf as shown in Eq. (3). We
can parameterize
m
in the term of
r
as
( )
= +mr r
µ
for
µ
as the mean response variable
Y
and
r
as the reciprocal (or inverse of a
dispersion parameter
:1=r
φφ
). Therefore, the pmf
of the NB distribution can be rewritten as:
()
(;, ) ( ) ( 1)

Γ+
=
ΓΓ+ + +

ry
ry r
f yr ry r r
µ
µµµ
(18)
where
()Γ⋅
as a complete gamma function, denoted
by
Y
~
NB( , ).r
µ
Let
t
Y
be a response random variable in the NB
distribution, denoted as
1
|
tt
Y
~
NB( , ).
t
r
µ
The NB
model for time series data counts then
become
1
()
( | ;, ) ( ) ( 1)

Γ+
=
ΓΓ + + +

t
ry
tt
tt t
tt t
ry r
fy r ry r r
µ
µµµ
(19)
where
0,1,2,...,=
t
y
and
1
( ).
=
tt
g
µµ
The mean
and variance of
{ }
1
| ;,
tt t
Yr
µ
are:
1
E[ | ; , ]
=
tt t t
Yr
µµ
(20)
and
2
1
Var[ | ; , ] .
= +
t
tt t t
Yt
r
µ
µµ
However, even if the NB distribution can solve
the problem of overdispersion, in some cases there
are problems of heavy-tailed distribution and the
NB response is not suitable for modeling.
2.4 Criteria for Model Evaluation
Three criteria were used to compare the
performance of model suitability as the deviance,
,
D
p
and the deviance information criterion (DIC).
The DIC is a hierarchical modeling generalization
of the Akaike information criterion, which is often
and widely used as a goodness-of-fit measure using
the Bayesian approach. The DIC is beneficial to
Bayesian model comparison problems when
posterior distributions have been obtained by the
Markov chain Monte Carlo (MCMC) simulation.
The model has the smallest value of DIC, and
,
D
p
is the best model, [30], [31], [32]. Let
D( ) 2log L( | )y= ΩΩ
be the deviance, where
L( | )Ωy
are the likelihood function and the
conditional joint pdf of observations are given by
unknown parameters. We have
DIC ( ) ,= +
D
DpΩ
[ ]
( ) E 2 log L( | )= ΩΩDy
and
=
D
p
[ ]
Var D( ) 2.Ω
The probability integral transform (PIT) is a tool
for assessing the probability calibration of the
predictive distribution. This follows a uniform
distribution if the predictive distribution is correct.
WSEAS TRANSACTIONS on MATHEMATICS
DOI: 10.37394/23206.2023.22.65
Sirinapa Aryuyuen, Issaraporn Thaimsorn,
Unchalee Tonggumnead
E-ISSN: 2224-2880
589
Volume 22, 2023
The shape of the PIT histograms suggests the
calibration accuracy of the predictive distribution. A
convex shape indicates an underdispersed predictive
distribution, whereas concave histograms refer to
overdispersed predictive distributions, [14].
3 Results and Discussion
3.1 A New Mixed NB Distribution
Definition 1: Let
Y
be a random variable
distributed as the NB distribution with parameters
0r>
and
exp( )= m
λ
where
λ
is distributed as
the SA distribution with parameters
0a>
and
0,b>
i.e.,
Y
NB( , exp( ))= rm
λ
and
λ
SA( , ).ab
Then a random variable
|Y
λ
follows
a negative binomial-Samade (NB-SA) distribution
with parameters
,ra
and
,b
denoted by
Y
NB-SA( , , ).rab
Theorem 1: Let
Y
NB-SA( , , ),rab
then its pmf is
0
1
(;,,) (1)
=
+− 

=




y
j
j
yr y
f yrab yj
5 34
44
()
,
( )( )
++ +
×+ ++
b b r j ab
b ab b r j
(21)
where
0,1,2,...,y=
0, 0ra>>
and
0.b>
Proof: Let
Y
NB( , )rm e
λ
=
and
λ
SA( , ),ab
then the marginal pmf of
Y
can be obtained using
the expression:
0
(;,,) ( |, )(;,) ,
=
f yrab f y r g abd
λλ λ
where
(;,)g ab
λ
is the pdf of the SA distribution as
Eq. (5), and
( |,)f yr
λ
is the NB’s pmf as Eq. (3),
( )
1
( |,) 1 ,
y
r
yr
f yr e e
y
λλ
λ
−−
+−

=


where
( )
0
1 ( 1) ,
y
yjj
j
y
ee
j
λλ
−−
=

−=


and
[ ]
()
00
0
(;,,)
1( 1) ( ; , )
1( 1) M ( ) ,
−λ +
=
λ
=
+− 

= λλ




+− 

= −+




y
j rj
j
y
j
j
f yrab
yr y e g abd
yj
yr y rj
yj
where
[ ]
M( )−+rj
λ
is the moment generating
function (mgf) of the SA distribution, which is
shown in Lemma 1.
Lemma 1: Let
λ
SA( , ),ab
then its mgf is:
5 34
44
()
M(; , ) ( 6 )( )
−+
=+−
b b s ab
sab b ab s
λ
for
0,1,2,...,=s
(22)
0>a
and
0.b>
Proof: If
λ
SA( , )ab
, then its mgf is
[ ]
4
() 3
4
0
4
() 3
()
4
00
43
4
4 44 4
M( ) ( )
6
6
() .
6 () (6)()
−−
∞∞
−− −−
−+ = +
+

= +

+

−+


= +=

+− +−

∫∫
bs
bs bs
b
r j e ba d
ba
bbedaed
ba
b bb s a
bb a
b abs bs b abs
λ
λ
λλ
λλ
λλ λ
Thus, the pmf of the NB-SA distribution can be
written as Eq. (21). The NB-SA distribution sub-
models follow the NB-Pranav (for
1)=a
and the
NB-Exp (for
0)=a
distributions. The pmf’s shapes
for the NB-SA distributions are provided in Figure
1. Some basic properties of the proposed
distribution are obtained from the factorial moment
as follows.
Theorem 2: Let
Y
NB-SA( , , ),rab
then its
th
k
factorial moment is
5 34
[] 44
0
() ( )
( 1) , (23)
( ) ( )( )
k
j
k
j
k
r k b b k j ab
j
r b ab b k j
µ
=

Γ+ −+ +
=

Γ + −+

where
1, 2,3,...=k
and
, , 0.>rab
Proof: Let
Y
NB( , ),rm
then its factorial moment
is
[ ]
[]
(1 ) ( )
E ( 1) ( 1) ,
()
Γ+
= −+ = Γ
k
kk
m rk
YY Y k mr
µ
where
1, 2,3,...=k
. For
|Y
λ
NB( , )rm e
=
λ
and
λ
SA( , )ab
, we have
[]
()(1 ) ()
E E ( 1) ,
() ()
k
k
kk
rk e rk e
re r
λλ
λλ
λ
µ

Γ+ Γ+

= =


ΓΓ

Using a binomial expansion for the term
( 1) ,
k
e
λ
the
[]k
µ
can be written as:
()
[]
0
() ( 1) E
()
k
j kj
k
j
k
rk e
j
r
=

Γ+

=
 
Γ
λ
λ
µ
()
00
() ( 1) ( ; , )
()
k
j kj
j
k
rk e g abd
j
r
=

Γ+
=

Γ
λ
λλ
[ ]
0
() ( 1) M
()
k
j
j
k
rk kj
j
r=

Γ+
= −−

Γ
λ
.
From
[ ]
Mkj
λ
is the mgf of the SA distribution in
(22) for
sk j=
, we have the
th
k
factorial moment
as follows
WSEAS TRANSACTIONS on MATHEMATICS
DOI: 10.37394/23206.2023.22.65
Sirinapa Aryuyuen, Issaraporn Thaimsorn,
Unchalee Tonggumnead
E-ISSN: 2224-2880
590
Volume 22, 2023
5 34
[] 44
0
() ( )
( 1)
( ) ( )( )
k
j
k
j
k
r k b b k j ab
j
r b ab b k j
=

Γ+ −+ +
=

Γ + −+

µ
.
Based on the
th
k
factorial moment in Theorem 2,
the first two factorial moments can be written as:
( )
[1] 1 0
E( ) ,
= = Yr
µ δδ
( )
2
[2] [1] 2 1 0
E( ) ( 1) 2 ,
′′
= =+ −+Y rr
µ µ δ δδ
where
5 34
44
()
( )( )
−+
=+−
j
b b j ab
b ab b j
δ
for
0,1, 2.=j
Its mean and variance are:
( )
10
E( )Yr
δδ
=
and
[ ]
2
2 11 1 0 0 2 1
Var ( ) (2 ) (1 2 ) ( ).= + ++ + Yr r
δ δδ δ δδ δ δ
The proposed distribution can fit overdispersed
and underdispersed data or both and the index of
dispersion (ID) needs to be found. The ID is
[ ]
( )
2 11 1 0 0 2 1
10
(2 ) (1 2 )
ID .
+ ++ +
=
r
δ δδ δ δδ δ δ
δδ
3.2 A New Mixed NB Distribution
Let
t
Y
NB-SA( , , )rab
then the GLM for the time
series data count is developed with
{ }
1
|;
tt
YΩ
where
(, ,, )=
rab
θ
Ω
as:
( )
1
P| ;
Ω
tt
Y
0
43
4
0
()
( ; ,)(;,) ( 1) ( )
() ,
6
Γ+
= = Γ


+
×

++ +


t
t
tt
t
y
r
b
t
tt
yr
f y rg abd yr
r bb a ed
r rb
λ
λµ λ λ
µλλ
λµ λµ
where
0,>r
1()0
= >
tt
g
µµ
and
θ
defines as Eq.
(12). Its mean and variance are
( )
1
E| ;
=Ω
tt
Y
E( ),
t
µλ
(24)
and
( ) ( )
22
1
Var | ; E( ) (1 ) E( )
tt t t
Y rr
µλµ λ
= ++Ω
[ ]
2
E( ) ,
t
µλ
where
E( )
λ
and
2
E( )
λ
are the first
and second moments of the original SA random
variable, [27],
4
5
24
E( ) 6
+
=+
ab
ab b
λ
and
4
2
36
120 2
E( ) .
6
+
=+
ab
ab b
λ
(25)
A flow diagram of the steps creating the time
series model for the NB-SA distribution is shown in
Figure 2 (Appendix). The vector of unknown
parameters
Ω
is customarily estimated using the
Bayesian approach, which allows consideration of
previous information for parameter estimation.
3.3 Bayesian Inference of the NB-SA Model
for Time Series Data Counts
The Bayesian framework for the NB-SA model was
proposed for time series data counts, and created
based on the likelihood function, prior distribution,
and posterior distribution, denoted respectively by
1
L( | ; ),
Ω
tt
Y
( )
,
π
Ω
and
( )
|.pyΩ
The Bayesian
NB-SA model is
|
tt
y
µ
~
NB-SA( , , , ),
rab
θ
where
1()
=
tt
g
µµ
and
()
t
g
µ
denoted in Figure 2
(Appendix).
The likelihood function of
{ }
1
|;
Ω
tt
Y
is:
( )
1
10
()
L| ; ( 1) ( )
=

Γ+
=
Γ +

Ω
r
n
t
tt
ttt
yr r
Yyr r
γµ
( )
43
4.
6
t
y
b
t
t
bba ed
r ba
λ
λ
µλ
γµ

+


×
++



(26)
This function can be executed using the
representation of the hierarchical model implicit in
the integral and the definition of the SA distribution.
The SA distribution is a mixture between the
Exp( )b
and
Gam( , )ab
distributions with the pdf as
Eq. (5), while the NB-SA distribution is conditional
on the unobserved site-specific frailty term
,
λ
which describes the additional heterogeneity, [33].
Consequently, the hierarchical framework can
be represented as:
( )
1 NB
P | ; , | ( | ,)
=
tt t t t
Y r fy r
µ λ λµ
,
λ
~
4
Exp Gam
44
6
() ().
66
ba
gg
ba ba
λλ
 
+
 
++


In Bayesian inference, the prior distribution
plays a defining role in estimating the unknown
parameters in any distribution. This model contains
all unknown parameters. Accordingly, under the
squared error loss function, the Bayesian estimator
of
Ω
will be
( )
E|.
t
yΩ
In the GLM context, the
most frequently used informative prior distribution
is the normal distribution, [34]. We define the prior
distribution of parameters
,ra
and
b
as the gamma
distribution,
r
Gam( , ),
rr
τγ
a
Gam( , ),
aa
τγ
and
b
Gam( , ).
bb
τγ
WSEAS TRANSACTIONS on MATHEMATICS
DOI: 10.37394/23206.2023.22.65
Sirinapa Aryuyuen, Issaraporn Thaimsorn,
Unchalee Tonggumnead
E-ISSN: 2224-2880
591
Volume 22, 2023
Fig. 1: The pmf plots of the NB-SA distribution with specified parameter values
Let prior distribution for
θ
be the normal (N)
distribution, denoted by
θ
2
N( , )

θθ
µσ
for the
positive real values of
,
r
τ
,
r
γ
,
a
τ
,
a
γ
,
b
τ
,
b
γ
,
θ
µ
and
2
θ
σ
are known or fixed. Suppose that
θ
µ
is a
( 1)×k
hyper-parameter vector and
2
θ
σ
is a
()×kk
known
non-negative specific matrix, while
k
is the number
of the regression coefficient. Each parameter is
supposed to be independently distributed, and the
joint prior distribution of all unknown parameters
can be written as:
( ) ()()()().=
rab
π π π π πθ
Ω
(27)
The posterior distribution will integrate the
sample information from the likelihood function as
Eq. (26), with accessible parameter information
from the prior distribution as Eq. (27). The posterior
distribution can then be derived as follows:
( )
1
( |) | ; ()()()().
tt
p LY r a b
π π π πθ
Ωy Ω
(28)
Since the posterior distribution does not have an
explicit form, a computational method called the
Gibbs sampler is used in this study. The best known
MCMC sampling algorithm was applied to find
( )
E |.Ωy
By setting some initial value, the Gibbs
sampler algorithm will randomly walk through
parameter space. The basic scheme Gibbs sampler is
given as follows, [35], [36].
Step 0. Choose an arbitrary starting point
(0).Ω
Step 1. Generate
( j 1)+
Ω
as follows:
Generate
( 1)+j
r
() () () () ()
12
(|,,,,,,);
jjjj j
k
pr a b
θθ θ
y
Generate
( 1)+j
a
() () () () ()
12
(|,,,,,,);
jjjj j
k
pa r b
θθ θ
y
Generate
( 1)+j
b
() () () () ()
12
(|,,,,,,);
jjjj j
k
pb r a
θθ θ
y
Generate
( 1)
1
+j
θ
() () () () ()
12
(|,,,,,,);
jjj j j
k
p rab
θ θθ
y
Generate
( 1)+j
k
θ
() () () () ()
11
(|,,,,,,);
y
jjj j j
kk
p rab
θ θθ
• Step 2. Set
1= +jj
and go to step 1.
In this study, the model parameters of
Ω
can be
estimated from the Bayesian method using the
JAGS (Just Another Gibbs Sampler) as an
implementation of an MCMC algorithm called
Gibbs sampling to sample the posterior distribution
of a Bayesian model. In this paper, the expected
posterior of parameters is calculated using the JAGS
function in the R2jags package of the R language,
[32], [37]. Factors influencing the number of
COVID-19 cases resulting in death included the
daily number of COVID-19 cases in Thailand from
WSEAS TRANSACTIONS on MATHEMATICS
DOI: 10.37394/23206.2023.22.65
Sirinapa Aryuyuen, Issaraporn Thaimsorn,
Unchalee Tonggumnead
E-ISSN: 2224-2880
592
Volume 22, 2023
1 January 2021 to 30 September 2022, comprising
607 days, [26]. Figure 3 shows the number of daily
COVID-19 deaths in Thailand.
In Figure 3(a) the time series motion is in the form
of sine and cosine waves. When the time series
reaches its peak, the value of each cycle decreases in
the form of a transient shift. Therefore, this study
considered internal covariates as sine and cosine
functions and transient shift interventions. The
external covariate was the number of new COVID-
19 infection cases. Random variables used in this
study were as follows:
t
Y
is time series data of the number of daily
COVID-19 deaths (unit: people) where
1,...,607.t=
t
X
is the number of new COVID-19 infection
cases (unit: people), with the mean and standard
deviation as 7638.90 and 7548.74, respectively
(minimum, median, and maximum as 26, 4563, and
28379). From the actual data, the mean and variance
of the number of COVID-19 deaths (unit: people)
were 52.88 and 3,646.61, respectively (minimum,
median, maximum, and standard deviation as 0, 31,
312, and 60.39, respectively). Since the variance of
COVID-19 deaths was greater than the mean, this
dataset had an overdispersion problem. The
histogram (Figure 3 (b)) and normal Q-Q plot
(Figure 3(c)) also showed that the data had a heavy-
tailed distribution. Therefore, the new NB-SA
distribution was developed to solve these problems
and a GLM was created for the NB-SA distribution.
The procedure for creating the model is shown in
Figure 2.
Fig. 3: Empirical data: (a) Plot of the daily number
of COVID-19 deaths in Thailand, (b) Bar chart of
the daily number of COVID-19 deaths in Thailand,
and (c) The normal Q-Q plot of the number of
COVID-19 deaths in Thailand.
3.4 Data Analysis Results
From the time series data of the number of daily
COVID-19 deaths,
P
and
Q
initial values were
defined by forming an ACF and PACF plot
specifying the optimal combination based on the
smallest AIC value. The results showed that NB-
SA, NB, and Poisson gave the minimum AIC when
P=1 and Q=1. Figure 4(c) shows the optimal
combination of P and Q of the NA-SA. The result
shows that when P=1, Q=1 yields an AIC of 322.30.
These
P
and
Q
values were used for modeling,
using the NB-SA, Poisson, and NB distributions by
adding internal and external covariate effects and
estimating model parameters with the Bayesian
approach by defining the model as:
( )
{
0 1 1 11 1
ˆˆ ˆ
ˆˆ
exp log 1
−−
= + ++ +
t t tt
Y YX
β β αυ ξ
}
11223142
ˆˆ ˆ ˆ
++ + +I I FF
ηη η η
(29)
where
1
I
is the intervention with transient shift
for
the time
230t
for
230
1 230
0.8 ,
=t
II
for
230
1=I
if
(t 230)
, otherwise
230 0.=I
WSEAS TRANSACTIONS on MATHEMATICS
DOI: 10.37394/23206.2023.22.65
Sirinapa Aryuyuen, Issaraporn Thaimsorn,
Unchalee Tonggumnead
E-ISSN: 2224-2880
593
Volume 22, 2023
2
I
is the intervention with transient shift
for
the time
483t
where
483
2 483
0.8 ,
t
II
=
for
483
1=I
if
(t 483)
, otherwise
483
0.=I
1
F
is the internal covariate
( )
sin 2 365 .t
π
2
F
is the internal covariate
( )
cos 2 365 .t
π
After developing a new distribution, the NB-SA
distribution was studied. Its GLM framework was
applied with time series data counts of the daily
number of COVID-19 deaths to build the GLM
derived from the NB-SA distribution. The time
series data count of
t
Y
was provided in the NB-SA
model. The Bayesian approach estimated the model
parameters as the posterior means (estimates),
standard error (s.e.), 95% credible intervals (Cr.I.)
of each parameter, and statistics for comparing
model performance, such as the deviance, DIC, and
D
p
of the Poisson, NB, and NB-SA models, as
shown in Table 1. The study variable
t
X
was
standardized to a standard score,
i.e.,
7638.90 7548.74,()
tt
Z X=
represents the
external covariate influenced by the daily number of
COVID-19 deaths. Transient intervention shifts for
time
230t
and
483.t
The
1
F
and
2
F
are
internal covariate effects due to the data because
there was a surge in the number of daily COVID-19
deaths in Thailand (
t
Y
).
Based on these prior distributions, three parallel
independent MCMC chains were generated with
300,000 iterations for each parameter, discarding the
first 150,000 iterations as a burn-in for computation.
Results indicated that the deviance, DIC and
D
p
values of the NB-SA model were smallest
compared with the Poisson and NB models. The
density plots and trace plots of the three MCMC
chains with the MCMC plots package in R, [37],
from the NB-SA model are shown in Figure 5 and
Figure 6, respectively. Results showed that the
density plots of all parameters in three parallel
chains overlapped well after the burn-in period,
while the trace plots showed that graphs of the
values of simulated parameters against the drawn
lines were almost vertical and dense. The motion of
the trace plots revealed characteristics of a
converged manner, and the sequence was stable.
Therefore, the NB-SA model could be fitted for this
dataset. Results of the GLM for time series data
count models are shown in Table 1, with the
estimated parameters
,ra
and
b
for the NB-SA
model
ˆ(s.e. 0.149),31.846r= =
ˆ1.678a=
0.2626)(s.e. =
and
.
ˆ(s.e ).11.183 0.802b= =
From Eq. (25), we have
E( ) 0.0896.=
λ
The daily number of COVID-19
deaths in Thailand with NB-SA distribution can be
represented as:
{
,NB-SA 1
ˆexp 1.619 0.940log( 1)0.0896
= ++
tt
YY
1 21
0.001 0.0 0.076 0 0 08.01
tt
I IZ
υ
+−++
12
0.059 0.074 }.FF−−
Fig. 4: (a) Partial Autocorrelation (b)
Autocorrelation (c)The optimal combination based
on the smallest AIC value for the NB-SA model.
WSEAS TRANSACTIONS on MATHEMATICS
DOI: 10.37394/23206.2023.22.65
Sirinapa Aryuyuen, Issaraporn Thaimsorn,
Unchalee Tonggumnead
E-ISSN: 2224-2880
594
Volume 22, 2023
Table 1. Parameter estimates and important statistics of the Poisson, NB and NB-SA models
Models
Estimate (s.e.)
95% Cr.I.
Poisson
0
β
0.450 (0.0022) (0.344, 0.556)
1
β
0.852 (0.0006) (0.823, 0.882)
1
α
0.001 (0.0001) (0.000, 0.002)
1
ξ
0.120 (0.0005) (0.097, 0.142)
1
η
0.047 (0.0017) (-0.032, 0.130)
2
η
0.123 (0.0026) (-0.003, 0.241)
3
η
-0.096 (0.0005) (-0.120, -0.072)
4
η
-0.100 (0.0006) (-0.129, -0.071)
Deviance = 4,566.2, DIC = 9,054.7,
=
D
p
4,488.6
NB
0
β
0.219 (0.0038) (-0.047, 0.322)
1
β
0.940 (0.0010) (0.887, 0.987)
1
α
0.001 (0.00004) (0.000, 0.002)
1
ξ
0.075 (0.0009) (0.035, 0.116)
1
η
-0.013 (0.0049) (-0.247, 0.230)
2
η
0.011 (0.0054) (-0.244, 0.282)
3
η
-0.059 (0.0009) (-0.105, -0.016)
4
η
-0.074 (0.0009) (-0.118, -0.032)
Deviance = 4,127.7, DIC = 4,137.4,
=
D
p
9.7
NB-SA
0
β
1.619 (0.0562) (-0.868, 4.403)
1
β
0.940 (0.0010) (0.893, 0.988)
1
α
0.001 (0.00004) (0.000, 0.002)
1
ξ
0.076 (0.0008) (0.036, 0.115)
1
η
-0.010 (0.0049) (-0.243, 0.231)
2
η
0.008 (0.0054) (-0.257, 0.264)
3
η
-0.059 (0.0009) (-0.104, -0.014
4
η
-0.074 (0.0009) (-0.116, -0.029)
Deviance = 4,127.6, DIC = 4,136.4,
=
D
p
8.7
The deviance, DIC, and
D
p
of the NB-SA
model were 4127.6, 4136.4, and 8.7, respectively
(Table 1). Results indicated that the average of
t
Y
was influenced by the number of
t
Y
on the previous
day. The average daily number of COVID-19 deaths
in Thailand was also influenced by the number of
1t
υ
on the previous day. To consider the traditional
NB distribution, we have
)
ˆ(s. .31.909 0e .152r= =
and
1
E( | ) .
=
it t
Y
µ
Consequently, the GLM for the
time series data count approach with the NB
distribution can be represented as:
,NB 1 1
ˆexp{0.129 0.940log(1 ) 0.001
t tt
YY
υ
−−
= + ++
2
12
1
0.011
0.059 0.074 }.
0.075 0.013
t
I
FF
ZI+−+
−−
WSEAS TRANSACTIONS on MATHEMATICS
DOI: 10.37394/23206.2023.22.65
Sirinapa Aryuyuen, Issaraporn Thaimsorn,
Unchalee Tonggumnead
E-ISSN: 2224-2880
595
Volume 22, 2023
Fig. 5: Density plots of three MCMC chains for values of
,,,rab
deviance,
0
=
β
beta[1],
1
=
β
beta[2],
1=
α
beta[3],
1
=
ξ
beta[4],
1
=
η
beta[5],
2=
η
beta[6],
3=
η
beta[7], and
4=
η
beta[8] from the NB-SA model.
Fig. 6: Trace plots of three MCMC chains for values of
,,,rab
deviance,
0
=
β
beta[1],
1=
β
beta[2],
1
=
α
beta[3],
1
=
ξ
beta[4],
1
=
η
beta[5],
2
=
η
beta[6],
3
=
η
beta[7], and
4
=
η
beta[8] from the NB-SA model.
WSEAS TRANSACTIONS on MATHEMATICS
DOI: 10.37394/23206.2023.22.65
Sirinapa Aryuyuen, Issaraporn Thaimsorn,
Unchalee Tonggumnead
E-ISSN: 2224-2880
596
Volume 22, 2023
Fig. 7: PIT histograms for the NB-SA, NB, and
Poisson models
Fig. 8: Comparison of actual data and the NB-SA
model
The deviance, DIC, and
D
p
of the generalized
linear models of the NB model were 4127.7, 4137.4,
and 9.7, respectively (Table 1). In the same way,
the GLM for the time series data count approach
with the Poisson distribution can be represented as:
,Pois 1
212
1
1
ˆ
.
exp{0.450 0.852log(1 ) 0.001
0.120 10 2. 00 1. 3 0.096 0.47 10 }
−−
= +
++
+++
t tt
t
YY
ZII FF
υ
The deviance, DIC, and
D
p
of the generalized
linear models of the Poisson model were 4566.2,
9054.7, and 4488.6, respectively (Table 1).
3.5 Comparison of Model Performance
When comparing model performance based on
deviance, DIC, and
D
p
the NB-SA model had the
highest efficiency compared with the NB and
Poisson models. PIT histograms for the NB-SA,
Poisson, and NB distributions are shown in Figure
7. The Poisson model did not have a uniform
distribution, while the NB-SA and NB models
approached uniformly, and the NB-SA model was
significantly more uniform than the NB model.
Therefore, the NB-SA model performed better
compared to the Poisson and NB models. Figure 8
compares the data plot and the NB-SA model.
Results showed that the proposed model had
reasonably good values that followed the actual data
pattern. The NB-SA model had a value closer to the
actual data plot, with the smallest BIC, DIC, and
D
p
values, while the PIT histogram approached
uniform distribution.
4 Conclusion
A new mixed NB distribution was proposed called
the negative binomial-Samade (NB-SA)
distribution. Its properties were studied using a
GLM framework to build the regression model.
Data used in modeling comprised actual datasets of
daily numbers of COVID-19 deaths in Thailand
from 1 January 2021 to 30 September 2022 covering
607 days with data overdispersion and heavy-tailed.
Comparing the accuracy of the proposed distribution
with the Poisson and an NB model showed that the
NB-SA model had the highest efficiency. One
reason why the NB-SA model was more effective
than NB was that the NB-SA distribution was
developed from the NB distribution. The NB-SA
model described the data better than the NB model
for time series data counts with overdispersed or
heavy-tailed distribution since the NB-SA model
was more flexible with two shape parameters. By
contrast, the NB model had one shape parameter.
The deviance, DIC, and
D
p
of the new model were
lower than the Poisson model at 10.6240, 118.9029,
and 51,493.1035%, respectively. When comparing
the new model and the NB model, the deviance,
DIC, and
D
p
values of the NB-SA model were
lower than the NB model at 0.0010, 0.0242, and
11.49%, respectively. When considering model
performance using the PIT histogram, results
showed that the NB-SA model approached uniform
distribution. Based on deviance, DIC,
D
p
and the
PIT histogram the proposed model was also suitable
for forecasting the number of daily COVID-19
deaths in Thailand. Results indicated that the NB-
SA model for time series data count was an efficient
alternative to overcome overdispersion and heavy-
tailed problems. Therefore, the NB-SA model was a
WSEAS TRANSACTIONS on MATHEMATICS
DOI: 10.37394/23206.2023.22.65
Sirinapa Aryuyuen, Issaraporn Thaimsorn,
Unchalee Tonggumnead
E-ISSN: 2224-2880
597
Volume 22, 2023
suitable alternative to create or develop a model
related to overdispersion and heavy-tailed
distribution of time series data counts. An advantage
of constructing the GLM for time series count data
with a new mixed NB is that the developed
distribution can fit with the actual data. It is robust
to overdispersion problems and data with heavy-
tailed distributions. As a result, the GLM for time
series count data with a new mixed NB model is
efficient and accurate in forecasting. But the
challenge that may become a limitation of the
development of new mixed NB distributions is the
complexity of forming the moment-generating
function of the distribution combined with the NB
distribution into close form. When considering an
application, the developed model can be applied to
the actual count data in many fields such as medical,
insurance and financial, industrial ecology and
biodiversity, etc. Especially when the data have
overdispersion and are heavy-tailed, the GLM for
time series count data with a new mixed NB model
is more efficient and accurate in forecasting. Further
future work worth considering refers to a new mixed
NB distribution for the multivariate time series
analysis with more than one time-dependent
variable. Each variable depends not only on its past
values but also has some dependency on other
variables. A study on Vector Auto Regression (VAR)
describes the relationships between variables based
on their past values.
Acknowledgment:
This research was supported by The Science,
Research, and Innovation Promotion Funding
(TSRI) (Grant no. FRB660012/0168). This research
block grant was managed under Rajamangala
University of Technology Thanyaburi
(FRB66E0641N.2). The authors gratefully
acknowledge this funding. We are also thankful to
those who could not be mentioned here for their
kindness and encouragement. And finally, the
authors would like to thank the anonymous
reviewers for their comments and suggestions.
References:
[1] Heinen, A. Modelling time series count data:
an autoregressive conditional Poisson model.
Available at SSRN 1117187, 2003.
[2] Ferland, R., Latour, A., and Oraichi, D.
Integer-valued GARCH process. Journal of
time series analysis, Vol.27, No.6, 2006,
pp. 923-942.
[3] Fokianos, K., Rahbek, A., and Tjøstheim, D.
Poisson autoregression. Journal of the
American Statistical Association, Vol. 104,
No.488, 2009, pp.1430-1439.
[4] Fokianos, K., and Tjøstheim, D. Log-Linear
Poisson autoregression. Journal of
Multivariate Analysis, Vol.102,
No.3,2011, pp. 563-578.
[5] Genest, C., amd Nešlehová, J. A primer on
copulas for count data. ASTIN Bulletin: The
Journal of the IAA, Vol.37, No.2, 2007, pp.
475-515.
[6] Lawless, J. F. Negative binomial and mixed
Poisson regression. The Canadian Journal
of Statistics/La Revue Canadienne de
Statistique, 1987, pp. 209-225.
[7] Karlis, D., and Xekalaki, E. Mixed
poisson distributions. International Statistical
Review/Revue Internationale de Statistique,
2005, pp. 35-58.
[8] Joe, H., and Zhu, R. Generalized Poisson
distribution: the property of mixture of
Poisson and comparison with negative
binomial distribution. Biometrical
Journal: Journal of Mathematical
Methods in Biosciences, Vol.47, No.2,
2005, pp. 219-229.
[9] Wagh, Y. S., and Kamalja, K. K. Zero-
inflated models and estimation in zero-
inflated Poisson distribution.
Communications in Statistics- Simulation
and Computation, Vol.47, No.8, 2018, pp.
2248-2265.
[10] Dean, C., Lawless, J. F., and Willmot, G. E. A
mixed poisson–inverse‐gaussian regression
model. Canadian Journal of Statistics,
Vol.17, No.2, 1989, pp. 171-181.
[11] Zamani, H., and Ismail, N. Negative
binomial- Lindley distribution and its
application. Journal of mathematics and
statistics, Vol.6, No.1, 2010, pp. 4-9.
[12] Aryuyuen, S., and Bodhisuwan, W. The
negative binomial-generalized exponential
(NB-GE) distribution. Applied Mathematical
Sciences, Vol.7, No.22, 2013, pp. 1093-1105.
[13] Gençtürk, Y., and Yiğiter, A. Modelling claim
number using a new mixture model:negative
binomial gamma distribution. Journal of
Statistical Computation and Simulation,
Vol.86, No.10, 2016, pp. 1829-1839.
[14] Yamrubboon, D., Bodhisuwan, W.,
Pudprommarat, C., and Saothayanun, L.
The negative binomial-Sushila distribution
with application in count data analysis.
WSEAS TRANSACTIONS on MATHEMATICS
DOI: 10.37394/23206.2023.22.65
Sirinapa Aryuyuen, Issaraporn Thaimsorn,
Unchalee Tonggumnead
E-ISSN: 2224-2880
598
Volume 22, 2023
Thailand Statistician, Vol.15, No.1,
2017, pp. 69-77.
[15] Aryuyuen, S. Bayesian inference for the
negative binomial-generalized Lindley
regression model: properties and applications.
Communications in Statistics-Theory and
Methods, 2012, pp.1-19.
[16] Aryuyuen, S., and Tonggumnead, U.
Bayesian Inference for the Negative
Binomial- Quasi Lindley Model for Time
Series Count Data on the COVID-19
Pandemic. Trends in Sciences, Vol.19,
No.21, 2022, pp.3171-3171.
[17] Aryuyuen, S., and Tonggumnead, U. A new
mixed negative binomial regression model to
analyze factors influencing the number of
patients with respiratory disease and long-
term effects of lung cancer.
Communications in Mathematical
Biology and Neuroscience., 2022, Article-ID.
[18] Aryuyuen, S. The negative binomial-new
generalized Lindley distribution for count
data: properties and application. Pakistan
Journal of Statistics and Operation Research,
2022, pp.167-177.
[19] Stoklosa, J., Blakey, R. V., and Hui, F. K.
An overview of modern applications of
negative binomial modelling in ecology and
biodiversity. Diversity, Vol.14, No. 5, 2022,
pp.320.
[20] Ortega, E. M., Cordeiro, G. M., and Kattan,
M. W. The negative binomialbeta weibull
regression model to predict the cure of
prostate cancer. Journal of Applied Statistics,
Vol.39 No.6, 2012, pp.1191-1210.
[21] Tzougas, G., Hoon, W. L., and Lim, J. M.
The negative binomial-inverse Gaussian
regression model with an application to
insurance ratemaking. European Actuarial
Journal, NO.9, 2019, pp. 323-344.
[22] Fu, S. A hierarchical Bayesian approach to
negative binomial regression. Methods and
Applications of Analysis, Vol.22, No.4, 2015,
pp. 409-428.
[23] Gelman, A., Carlin, J. B., Stern, H. S.,
Dunson, D. B., Vehtari, A., and Rubin, D. B.
Bayesian data analysis. CRC press, 2013.
[24] Fu, S. Hierarchical Bayesian LASSO for a
negative binomial regression. Journal of
Statistical Computation and Simulation, Vol.
86, No,11, 2016, pp.2182-2203.
[25] Yamrubboon, D., Thongteeraparp, A.,
Bodhisuwan, W., Jampachaisri, K., and
Volodin, A. Bayesian inference for the
negative binomial-Sushila linear model.
Lobachevskii Journal of Mathematics, No. 40,
2019, pp. 42-54.
[26] Department of Disease Control. Daily covid-
19 report, Thailand information. Daily
COVID-19 Report, Available at:
https://data.go.th/dataset/covid-19-daily.
[accessed January 2023].
[27] Aderoju, S. Samade probability distribution:
its properties and application to real lifetime
data. Asian Journal of Probability and
Statistics, Vol.14, No.1, 2021, pp. 1-11.
[28] Cameron, A. C., and Trivedi, P. K.
Regression analysis of count data (Vol. 53).
Cambridge university press, 2013.
[29] Liboschik, T., Fokianos, K., and Fried, R.
tscount: An R package for analysis of count
time series following generalized linear
models. Journal of Statistical Software,
No.82, 2017, pp. 1-51.
[30] Lunn, D., Jackson, C., Best, N., Thomas, A.,
and Spiegelhalter, D. The BUGS book: A
practical introduction to Bayesian analysis.
CRC press, 2013.
[31] Spiegelhalter, D. J., Best, N. G., and Carlin,
B. P. Linde, A. Bayesian measures of model
complexity and fit. Journal of the Royal
Statistical Society: Series B (Statistical
Methodology), Vol.64, No.4, 2002, pp.583-
639.
[32] Su, Y. S., and Yajima, M. R2jags: Using R to
run ‘JAGS’. R package version 0.5-7, 2015.
[33] Geedipally, S. R., Lord, D., and Dhavala, S.
S. The negative binomial-Lindley generalized
linear model: Characteristics and application
using crash data. Accident Analysis &
Prevention, No.45, 2012, pp. 258-265.
[34] Dey, D. K., Ghosh, S. K., and Mallick, B. K.
Generalized linear models: A Bayesian
perspective. CRC Press, 2000.
[35] Wongrin, W., Srianomai, S., and Klomwises,
Y. Bayesian Unit-Lindley Model:
Applications to Gasoline Yield and Risk
Assessment Data. Naresuan University
Journal: Science and Technology (NUJST),
Vol.28, No.2, 2000, pp. 41-51.
[36] Bar-Joseph, Z., Gifford, D. K., and Jaakkola,
T. S. Fast optimal leaf ordering for
hierarchical clustering. Bioinformatics,
17(suppl_1), 2001, S22-S2.
[37] R Core Team. R: a language and
environment for statistical computing.
Vienna: R Foundation for Statistical
Computing, 2021.
WSEAS TRANSACTIONS on MATHEMATICS
DOI: 10.37394/23206.2023.22.65
Sirinapa Aryuyuen, Issaraporn Thaimsorn,
Unchalee Tonggumnead
E-ISSN: 2224-2880
599
Volume 22, 2023
Appendix
Fig. 2: Flow diagram creating the time series model for
{ }
1
|;
tt
YΩ
Contribution of Individual Authors to the
Creation of a Scientific Article (Ghostwriting
Policy)
The authors equally contributed in the present
research, at all stages from the formulation of the
problem to the final findings and solution.
Sources of Funding for Research Presented in a
Scientific Article or Scientific Article Itself
No funding was received for conducting this study.
Conflict of Interest
The authors declare that there is no conflict of
interests.
Creative Commons Attribution License 4.0
(Attribution 4.0 International, CC BY 4.0)
This article is published under the terms of the
Creative Commons Attribution License 4.0
https://creativecommons.org/licenses/by/4.0/deed.en
_US
WSEAS TRANSACTIONS on MATHEMATICS
DOI: 10.37394/23206.2023.22.65
Sirinapa Aryuyuen, Issaraporn Thaimsorn,
Unchalee Tonggumnead
E-ISSN: 2224-2880
600
Volume 22, 2023