A new discrete distribution with actuarial applications

Emilio Gómez-Déniz a,∗, José María Sarabia b, Enrique Calderín-Ojeda a
a Department of Quantitative Methods, University of Las Palmas de Gran Canaria, Gran Canaria, Spain
b Department of Economics, University of Cantabria, Santander, Spain
∗ Corresponding author. Tel.: +34 928451803; fax: +34 928451829.
E-mail address: egomez@dmc.ulpgc.es (E. Gómez-Déniz).

Insurance: Mathematics and Economics 48 (2011) 406–412
journal homepage: www.elsevier.com/locate/ime
doi:10.1016/j.insmatheco.2011.01.007
0167-6687/$ – see front matter © 2011 Elsevier B.V. All rights reserved.

Article history:
Received September 2010
Received in revised form January 2011
Accepted 16 January 2011

MSC: 60E05; 62E99; 60J80

Keywords: Claim; Compound; Geometric distribution; Overdispersion; q-series; Recursion; Unimodality

Abstract
A new discrete distribution depending on two parameters, α < 1, α ≠ 0 and 0 < θ < 1, is introduced in
this paper. The new distribution is unimodal with a zero vertex, and both overdispersion (variance larger
than the mean) and underdispersion (variance smaller than the mean) are encountered depending on the
values of its parameters. Besides, an equation for the probability mass function of the compound version,
when the claim severities are discrete, is derived. The particular case obtained when α tends to zero reduces
to the geometric distribution. Thus, the geometric distribution can be considered as a limiting case of
the new distribution. After reviewing some of its properties, we investigate the problem of parameter
estimation. Expected frequencies were calculated for numerous examples, including short and long tailed
count data, providing a very satisfactory fit.

1. Introduction
The development of parametric families of discrete distributions, which describe count phenomena,
and the study of their properties have been a persistent theme of statistical literature in recent
years, due perhaps to advances in computational methods which enable us to compute,
straightforwardly, the numerical value of special functions such as hypergeometric series. Count
data occur in many practical problems, for example the number of events such as insurance claims,
the number of kinds of species in ecology, etc.
Most frequencies of event occurrence can be described initially by a Poisson distribution.
Nevertheless, a major drawback of this distribution is the fact that the variance is restricted to
be equal to the mean, a situation that may not be consistent with observation. In view of this,
alternative probability distributions, such as the negative binomial and generalized Poisson, among
others, are preferred for modeling the phenomena under study.
Many of these phenomena, such as individual automobile insurance claims, are characterized by two
features: (1) overdispersion, i.e., the variance is greater than the mean; (2) zero inflation, i.e.,
the presence of a high percentage of zero values in the empirical distribution. In view of this,
many attempts have been made in statistical literature, and particularly in the actuarial field, to
find a probabilistic model for the distribution of the number of counts.
In this paper, a new discrete distribution is introduced and obtained from the cumulative
distribution function (cdf) defined as follows. For a random variable N taking values in the
non-negative integers {0, 1, . . .}, the expression
$$P_n = \Pr(N \le n) = 1 - \frac{\log(1-\alpha\theta^{n+1})}{\log(1-\alpha)}, \quad n = 0, 1, \ldots \tag{1}$$
is a genuine cdf, depending on two parameters, α < 1, α ≠ 0 and 0 < θ < 1, since it is simple to see
that it is non-negative and strictly increasing and goes to one when n goes to infinity.
This new distribution can be considered as an alternative to the negative binomial, Poisson-inverse
Gaussian, hyper-Poisson and generalized Poisson distributions, among others. As we will see later,
the new distribution is unimodal with a zero vertex.
By taking the limit in (1) when the parameter α tends to zero and applying L'Hospital's rule, it is
easy to derive that (1) reduces to $P_n = 1 - \theta^{n+1}$, i.e. the cdf of the geometric
distribution with parameter 0 < θ < 1 and probability mass function (henceforth, pmf)
$\Pr(N = n) = (1-\theta)\theta^{n}$. Therefore, since the geometric distribution is obtained as a
limiting case, it constitutes another distribution related to the geometric distribution; see, for
example, Jain and Consul (1971), Philippou et al. (1983), Tripathi et al. (1987), Kemp (2008),
Mačutek (2008) and Gómez-Déniz (2010).
The new discrete distribution proposed here presents the twofold
characteristic stated above: it has a zero vertex and can be
overdispersed; therefore, it is a candidate for fitting phenomena of
this nature. Both the examples of real data provided here and the
comparison with other distributions in the literature show that the
distribution fits such data well.
One of the advantages of the new distribution is its simplicity
(its pmf contains no special function) and flexibility.
To the best of our knowledge, the discrete distribution
presented here has not been previously addressed in statistical and
actuarial literature.
The structure of the paper is as follows. In Section 2, we present
the new discrete distribution and some of its properties. Also, some
results obtained for the particular case 0 < α < 1 are presented,
including an equation for the pmf of the compound version, when
the claim severities are discrete. Section 3 addresses the estimation
of the two parameters of the new distribution. In Section 4 the
expected frequencies are calculated for numerous examples, and
the distribution is found to provide a very satisfactory fit. In the
final section, some conclusions are drawn.
2. The new distribution and its properties
2.1. The general case
In this section we define the new probability mass function and
study its properties. For a random variable N taking values in the
non-negative integers {0, 1, . . .} with cdf as in (1), the pmf is given by
$$p_n = \Pr(N = n) = \frac{\log(1-\alpha\theta^{n}) - \log(1-\alpha\theta^{n+1})}{\log(1-\alpha)}, \quad n = 0, 1, \ldots \tag{2}$$
From (1) we have that the survival function of a random variable
N with pmf (2) is given by
$$\bar{P}_n = \Pr(N \ge n) = 1 - P_{n-1} = \frac{\log(1-\alpha\theta^{n})}{\log(1-\alpha)}. \tag{3}$$
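The cdf, pmf and survival function above involve only logarithms, so they can be evaluated directly. The following is a minimal Python sketch, not the authors' code; the function names and the use of `math.log1p` (chosen for accuracy when αθ^n is small) are assumptions of this illustration.

```python
import math


def _check(alpha, theta):
    # Parameter space stated in the text: alpha < 1, alpha != 0, 0 < theta < 1.
    if not (alpha < 1 and alpha != 0 and 0 < theta < 1):
        raise ValueError("need alpha < 1, alpha != 0 and 0 < theta < 1")


def cdf(n, alpha, theta):
    """Eq. (1): Pr(N <= n)."""
    _check(alpha, theta)
    return 1.0 - math.log1p(-alpha * theta ** (n + 1)) / math.log1p(-alpha)


def pmf(n, alpha, theta):
    """Eq. (2): Pr(N = n)."""
    _check(alpha, theta)
    num = math.log1p(-alpha * theta ** n) - math.log1p(-alpha * theta ** (n + 1))
    return num / math.log1p(-alpha)


def survival(n, alpha, theta):
    """Eq. (3): Pr(N >= n)."""
    _check(alpha, theta)
    return math.log1p(-alpha * theta ** n) / math.log1p(-alpha)
```

For a very small |α| the values of `pmf` approach the geometric probabilities (1 − θ)θ^n, in line with the limiting case discussed in the Introduction.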
It is well known that the unimodality property is a significant
feature in many statistical distributions. The following result
shows that the new discrete distribution is unimodal with a zero
vertex.
Proposition 1. The pmf given in (2) is unimodal with a modal value
at n = 0.
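Proof. Letting (2) define $p_n$ also for non-integer values of n, we
obtain that for n ≥ 0
$$\begin{aligned}
p'_n \log(1-\alpha) &= \frac{d}{dn}\left(\log(1-\alpha\theta^{n}) - \log(1-\alpha\theta^{n+1})\right) \\
&= -\frac{\alpha\theta^{n}\log\theta}{1-\alpha\theta^{n}} + \frac{\alpha\theta^{n+1}\log\theta}{1-\alpha\theta^{n+1}} \\
&= -\alpha\theta^{n}\log\theta\left(\frac{1}{1-\alpha\theta^{n}} - \frac{\theta}{1-\alpha\theta^{n+1}}\right) \\
&= -\alpha\theta^{n}\log\theta\,\frac{1-\theta}{(1-\alpha\theta^{n+1})(1-\alpha\theta^{n})},
\end{aligned}$$
that is,
$$p'_n = \frac{-\alpha}{\log(1-\alpha)}\,\frac{(1-\theta)\theta^{n}\log\theta}{(1-\alpha\theta^{n+1})(1-\alpha\theta^{n})} < 0,$$
since $-\alpha/\log(1-\alpha) > 0$ for every α < 1, α ≠ 0, and log θ < 0.
Hence, $p_n$ is decreasing.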
For further results concerning the unimodality of discrete
distributions, see also Keilson and Gerber (1971), Medgyessy
(1972) and Abouammoh (1987).
The mean value can be written as
$$\mu(\alpha,\theta) = E(N) = \frac{1}{\log(1-\alpha)}\sum_{n=1}^{\infty}\log(1-\alpha\theta^{n}). \tag{4}$$
Observe that (4) can be rewritten as
$$E(N) = \frac{\log\left((\alpha\theta;\theta)_\infty\right)}{\log(1-\alpha)},$$
where $(a;q)_\infty$, the q-Pochhammer symbol (see Askey, 1980, p. 347
and Andrews et al., 1999, p. 488), is defined as
$$(a;q)_\infty = \prod_{k=0}^{\infty}(1-aq^{k}), \quad 0 < q < 1. \tag{5}$$
The name reflects the fact that it is a q-analog of the usual
Pochhammer symbol. The q-Pochhammer symbol is quickly evaluated and
readily available in standard software such as Mathematica 7.0.
The second moment around the origin is
$$E(N^{2}) = \frac{1}{\log(1-\alpha)}\sum_{n=1}^{\infty}(2n-1)\log(1-\alpha\theta^{n}). \tag{6}$$
Therefore, the variance can be written as
$$\mathrm{Var}(N) = \frac{1}{\log(1-\alpha)}\sum_{n=1}^{\infty}(2n-1)\log(1-\alpha\theta^{n}) - \left(\mu(\alpha,\theta)\right)^{2}.$$
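Since the terms of the series in (4) and (6) decay geometrically, the mean and variance can be approximated by truncating the sums. The sketch below is one possible way to do this; the function name, the relative stopping tolerance and the cap on the number of terms are assumptions of this illustration, not part of the paper.

```python
import math


def mean_var(alpha, theta, rel_tol=1e-15, max_terms=100000):
    """Approximate E(N) and Var(N) by truncating the series in Eqs. (4) and (6)."""
    log_norm = math.log1p(-alpha)
    s1 = 0.0  # running sum of log(1 - alpha * theta**n)
    s2 = 0.0  # running sum of (2n - 1) * log(1 - alpha * theta**n)
    for n in range(1, max_terms + 1):
        term = math.log1p(-alpha * theta ** n)
        s1 += term
        s2 += (2 * n - 1) * term
        # Stop once the latest weighted term is negligible relative to the sum.
        if abs((2 * n - 1) * term) < rel_tol * abs(s2):
            break
    mean = s1 / log_norm
    second_moment = s2 / log_norm
    return mean, second_moment - mean ** 2


if __name__ == "__main__":
    for a in (-5.0, 0.5):
        print(a, mean_var(a, 0.5))
```

For α close to zero the function returns values close to θ/(1 − θ) and θ/(1 − θ)², the mean and variance of the geometric distribution.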
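Now, we have that
$$\frac{\partial\mu(\alpha,\theta)}{\partial\theta} = \frac{-\alpha}{\log(1-\alpha)}\sum_{n=0}^{\infty}\frac{(n+1)\theta^{n}}{1-\alpha\theta^{n+1}} > 0. \tag{7}$$
On the other hand,
$$\begin{aligned}
\frac{\partial\mu(\alpha,\theta)}{\partial\alpha}
&= \frac{1}{(1-\alpha)\log(1-\alpha)}\left[\mu(\alpha,\theta) - (1-\alpha)\sum_{n=0}^{\infty}\frac{\theta^{n+1}}{1-\alpha\theta^{n+1}}\right] \\
&= \frac{1}{(1-\alpha)\log^{2}(1-\alpha)}\sum_{n=0}^{\infty}\frac{(1-\alpha\theta^{n+1})\log(1-\alpha\theta^{n+1}) - (1-\alpha)\theta^{n+1}\log(1-\alpha)}{1-\alpha\theta^{n+1}}.
\end{aligned} \tag{8}$$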
Let us now consider the function
$g(\alpha) = (1-\alpha\theta^{n+1})\log(1-\alpha\theta^{n+1}) - (1-\alpha)\theta^{n+1}\log(1-\alpha)$
for fixed 0 < θ < 1 and n ≥ 0, i.e. the numerator of the general term of the sum in (8). It
is easy to see that
$$\frac{dg(\alpha)}{d\alpha} = \theta^{n+1}\left(\log(1-\alpha) - \log(1-\alpha\theta^{n+1})\right),$$
and this expression is positive for α < 0 and negative for 0 < α < 1.
Therefore, g(α) is an increasing function for α < 0 and a decreasing
function for 0 < α < 1. Now, taking into account that
$g(0^{+}) = g(0^{-}) = 0$, $g(1^{-}) = (1-\theta^{n+1})\log(1-\theta^{n+1}) < 0$ and
$g(-\infty) = -\infty$, as is easily verified, every term of the sum in (8) is negative
while the factor in front of the sum is positive, and we conclude that (8) is
always negative.
In conclusion, we have that $\partial\mu(\alpha,\theta)/\partial\alpha < 0$ for α < 1, α ≠ 0.
Therefore, the mean increases with θ and decreases with α.
Furthermore, the mean value tends to 0 when α tends to 1 from the
left and to θ/(1 − θ), the mean of the geometric distribution with
parameter 0 < θ < 1, when α tends to zero.
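The quantile $n_\gamma$ of (2) can be obtained from (1) and it is given
by
$$n_\gamma = \left[\frac{1}{\log\theta}\log\left(\frac{1-(1-\alpha)^{1-\gamma}}{\alpha}\right) - 1\right],$$
where [·] denotes the integer part. In particular, the median is
$$n_{0.5} = \left[\frac{1}{\log\theta}\log\left(\frac{1-\sqrt{1-\alpha}}{\alpha}\right) - 1\right].$$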
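The quantile expression comes from solving $P_x = \gamma$ in (1) for a real-valued x and then rounding to an integer. As a hedged illustration, the sketch below returns the smallest non-negative integer n with $P_n \ge \gamma$; this ceiling-and-clamp convention is an assumption of the sketch and may differ by one from the integer-part bracket used in the text (for instance when the real-valued solution is negative). The function names are likewise hypothetical.

```python
import math


def cdf(n, alpha, theta):
    """Eq. (1): Pr(N <= n)."""
    return 1.0 - math.log1p(-alpha * theta ** (n + 1)) / math.log1p(-alpha)


def quantile(gamma, alpha, theta):
    """Smallest n >= 0 with cdf(n) >= gamma, for 0 < gamma < 1."""
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must lie strictly between 0 and 1")
    ratio = (1.0 - (1.0 - alpha) ** (1.0 - gamma)) / alpha
    # Real-valued solution x of P_x = gamma, as in the closed-form expression.
    x = math.log(ratio) / math.log(theta) - 1.0
    n = max(0, math.ceil(x))
    # Correct a possible off-by-one caused by floating-point rounding.
    while n > 0 and cdf(n - 1, alpha, theta) >= gamma:
        n -= 1
    while cdf(n, alpha, theta) < gamma:
        n += 1
    return n
```

With α = 0.5 and θ = 0.5 the median under this convention is 0, consistent with the modal value at n = 0 carrying more than half of the probability mass.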
To study the behavior of the distribution with varying values of α
and θ, the mean and the variance of the new distribution have been
calculated. These values are shown in Table 1. The mean of
the new distribution can be smaller or greater than
the variance, depending on the values of α and θ. It seems that for
small θ (θ ≤ 0.10 in Table 1) and strongly negative α (α ≤ −25) the
distribution is underdispersed, since in that region the mean increases
faster than the variance as α decreases.

Table 1
Mean (first line) and variance (second line) of the new probability distribution for different
values of α and θ. Observe that for α = 0 we obtain the geometric distribution.

θ      α = −50    −25        −5         −1         −0.1       0          0.1        0.5        0.9
0.10   0.572      0.461      0.256      0.257      0.116      0.111      0.105      0.082      0.082
       0.509      0.421      0.258      0.165      0.128      0.123      0.118      0.093      0.053
0.25   1.230      1.036      0.660      0.439      0.346      0.333      0.319      0.253      0.144
       1.437      1.202      0.791      0.560      0.459      0.444      0.428      0.351      0.211
0.50   2.918      2.521      1.740      1.253      1.032      1.000      0.965      0.791      0.471
       5.796      4.869      3.282      2.424      2.053      2.000      1.943      1.650      1.075
0.75   7.707      6.747      4.844      3.641      3.082      3.000      2.910      2.454      1.544
       33.721     28.362     19.227     14.360     12.296     12.000     11.682     10.087     6.935
0.90   21.896     19.270     14.067     10.768     9.228      9.000      8.753      7.485      4.890
       251.498    211.574    143.580    107.442    92.182     90.000     87.661     75.955     53.157

In order to study the behavior of the distribution for different
values of α and θ, the pmf has been calculated. Furthermore, the
histograms have been plotted (see Fig. 1). From the graphs it can be
inferred that all of them seem to be positively skewed. Moreover,
the modal value is always at n = 0.

Fig. 1. Some examples of probability mass function (2) for different values of α and θ.

2.2. The special case 0 < α < 1
In this subsection, we study the particular case 0 < α < 1.
Mixtures of distributions are usually used in actuarial statistics
for modeling the number of claims incurred during a given
period for an insurance portfolio. In this regard, in automobile
insurance it is usually assumed that the number of claims of a
policyholder follows a Poisson distribution with parameter λ > 0.
Nevertheless, in practice, this assumption is usually rejected, since
the behavior of policyholders is heterogeneous. This means that
the Poisson parameter varies between the policyholders, reflecting
the different underlying risks, and hence its value cannot be the
same for each insured. Therefore, it is natural to treat this
parameter, which represents the individual risk characteristics of
each policyholder, as a random variable following a
certain structure distribution. The unconditional distribution then
becomes a mixed Poisson distribution.
The new pmf presented in this paper is, for the special case
0 < α < 1, useful in these settings. In fact the pmf (2) is
not only a mixed Poisson distribution but also a mixed geometric
distribution. Some particular results can also be derived, and they
are shown below.
The pmf (2) represents a general family of distributions whose
particular case 0 < α < 1 can be obtained by mixing the geometric
distribution with parameter $1-\theta^{\beta}$ and allowing the parameter β to
follow a logarithmic series distribution with parameter 0 < α < 1.
Proposition 2. Let N be a random variable which follows the
geometric distribution with parameter $1-\theta^{\beta}$, with 0 < θ < 1
and β = 1, 2, . . ., and let us also assume that β is a random variable
following the logarithmic series distribution with parameter 0 < α < 1.
Then the unconditional distribution of N has the pmf given in (2).
Proof. Since the survival function of the geometric distribution
with parameter $1-\theta^{\beta}$ is given by $\theta^{\beta n}$, we have that
$$\sum_{\beta=1}^{\infty}\theta^{\beta n}\,\frac{-1}{\log(1-\alpha)}\,\frac{\alpha^{\beta}}{\beta}
= \frac{-1}{\log(1-\alpha)}\sum_{\beta=1}^{\infty}\frac{(\alpha\theta^{n})^{\beta}}{\beta}
= \frac{\log(1-\alpha\theta^{n})}{\log(1-\alpha)},$$
which coincides with (3).
It is well known that the geometric distribution can be obtained
from the mixture of a Poisson distribution with the exponential
distribution. Indeed, if N follows
the Poisson distribution with parameter λ > 0 and λ follows an
exponential distribution with parameter $(1-\theta^{\beta})/\theta^{\beta}$, then
the unconditional distribution of N is obtained from the
mixture
$$p_n = \int_{0}^{\infty} e^{-\lambda}\frac{\lambda^{n}}{n!}\,\frac{1-\theta^{\beta}}{\theta^{\beta}}\exp\left(-\frac{1-\theta^{\beta}}{\theta^{\beta}}\lambda\right)d\lambda = (1-\theta^{\beta})\theta^{\beta n},$$
which is the geometric distribution with parameter $1-\theta^{\beta}$.
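The mixture representation in Proposition 2 can be checked numerically: weighting geometric pmfs with parameter 1 − θ^β by logarithmic series probabilities should reproduce (2). The sketch below does this for 0 < α < 1; the function names and the truncation rule for the sum over β are assumptions of this illustration.

```python
import math


def pmf_direct(n, alpha, theta):
    """Eq. (2) evaluated directly."""
    num = math.log1p(-alpha * theta ** n) - math.log1p(-alpha * theta ** (n + 1))
    return num / math.log1p(-alpha)


def logseries_weight(beta, alpha):
    """Logarithmic series probability Pr(beta), for 0 < alpha < 1 and beta = 1, 2, ..."""
    return -(alpha ** beta) / (beta * math.log1p(-alpha))


def pmf_mixture(n, alpha, theta, tol=1e-18, max_beta=100000):
    """Logarithmic-series mixture of geometric pmfs (1 - theta^beta) * theta^(beta * n)."""
    total = 0.0
    for beta in range(1, max_beta + 1):
        w = logseries_weight(beta, alpha)
        if w < tol:  # remaining weights are negligible
            break
        q = theta ** beta
        total += w * (1.0 - q) * q ** n
    return total


if __name__ == "__main__":
    for n in range(4):
        print(n, pmf_direct(n, 0.9, 0.5), pmf_mixture(n, 0.9, 0.5))
```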
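Therefore, the pmf (2) can also be viewed as a Poisson mixture.
That is, if the random variable N follows a Poisson distribution with
parameter λ > 0, then
$$p_n = \int_{0}^{\infty} e^{-\lambda}\frac{\lambda^{n}}{n!}\sum_{\beta=1}^{\infty}\frac{1-\theta^{\beta}}{\theta^{\beta}}\,e^{-\frac{1-\theta^{\beta}}{\theta^{\beta}}\lambda}\,\frac{-1}{\log(1-\alpha)}\,\frac{\alpha^{\beta}}{\beta}\,d\lambda.$$
Since every mixed Poisson distribution is overdispersed, for this
particular case the distribution is overdispersed (see Karlis and
Xekalaki, 2005; Sundt and Vernic, 2009, p. 66).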
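Unfortunately, it seems that no closed form exists for the
continuous probability density function (pdf) f(λ) obtained by the
mixture
$$\begin{aligned}
f(\lambda) &= \sum_{\beta=1}^{\infty}\frac{1-\theta^{\beta}}{\theta^{\beta}}\,e^{-\frac{1-\theta^{\beta}}{\theta^{\beta}}\lambda}\,\frac{-1}{\log(1-\alpha)}\,\frac{\alpha^{\beta}}{\beta} \\
&= \frac{-e^{\lambda}}{\log(1-\alpha)}\sum_{j=1}^{\infty}\frac{1-\theta^{j}}{j}\left(\frac{\alpha}{\theta}\right)^{j}e^{-\lambda\theta^{-j}}, \quad \lambda > 0.
\end{aligned} \tag{9}$$
Observe that (9) is a convex combination of exponential densities
whose weights are the probabilities of a logarithmic series
distribution.
Now, the pmf (2) is again obtained easily by mixing the Poisson
distribution with parameter λ > 0 and the continuous
distribution (9).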
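Although (9) has no closed form, it is straightforward to evaluate as a truncated sum, and a simple consistency check is available: for a mixed Poisson distribution the mean of N equals the mean of the mixing variable λ. The sketch below, with hypothetical function names and truncation tolerances, evaluates the first form of (9) (which avoids the large factor e^λ of the second form) and compares E(λ) with the series (4).

```python
import math


def logseries_weight(beta, alpha):
    """Logarithmic series probability, for 0 < alpha < 1 and beta = 1, 2, ..."""
    return -(alpha ** beta) / (beta * math.log1p(-alpha))


def mixing_density(lam, alpha, theta, tol=1e-18, max_beta=100000):
    """First form of Eq. (9): logarithmic-series mixture of exponential densities."""
    total = 0.0
    for beta in range(1, max_beta + 1):
        w = logseries_weight(beta, alpha)
        q = theta ** beta
        if w < tol or q == 0.0:  # negligible weight, or theta**beta underflowed
            break
        rate = (1.0 - q) / q
        total += w * rate * math.exp(-rate * lam)
    return total


def mixing_mean(alpha, theta, tol=1e-18, max_beta=100000):
    """E(lambda): each exponential component has mean theta^beta / (1 - theta^beta)."""
    total = 0.0
    for beta in range(1, max_beta + 1):
        w = logseries_weight(beta, alpha)
        if w < tol:
            break
        q = theta ** beta
        total += w * q / (1.0 - q)
    return total


def count_mean(alpha, theta, terms=5000):
    """E(N) from the series in Eq. (4), truncated after `terms` terms."""
    s = sum(math.log1p(-alpha * theta ** n) for n in range(1, terms + 1))
    return s / math.log1p(-alpha)
```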
Let us consider now the following actuarial model. Let N be
the number of claims in a portfolio of policies in a time period.
Let $X_i$, i = 1, 2, . . ., be the amount of the ith claim and
$S = X_1 + X_2 + \cdots + X_N$ the aggregate claims generated by the portfolio
in the period under consideration. As usual, two fundamental
assumptions are made in risk theory: (1) the random variables
$X_1, X_2, \ldots$ are independent and identically distributed and follow
a discrete (continuous) distribution with pmf (pdf) h(x), and (2)
the random variables $N, X_1, X_2, \ldots$ are mutually independent. The
distribution of the aggregate claims S is called the compound
distribution and, assuming that $X_i$, i = 1, 2, . . . , N, are discrete
random variables, the pmf of S is $f_S(x) = \sum_{n=0}^{\infty} p_n h^{*n}(x)$, where
$h^{*n}(\cdot)$ denotes the n-fold convolution of h(·) and $p_n$ is given in (2).
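As an illustration of the compound pmf $f_S(x) = \sum_n p_n h^{*n}(x)$, the sketch below evaluates it by direct convolution. This is a brute-force alternative, not the recursive algorithm discussed in the literature on compound distributions. It assumes strictly positive integer claim sizes (h(0) = 0), so that $h^{*n}(x) = 0$ for x < n and the sum over n is finite for each x; the function names and the list representation of h are hypothetical.

```python
import math


def pmf(n, alpha, theta):
    """Eq. (2): Pr(N = n)."""
    num = math.log1p(-alpha * theta ** n) - math.log1p(-alpha * theta ** (n + 1))
    return num / math.log1p(-alpha)


def compound_pmf(x_max, alpha, theta, h):
    """Return [f_S(0), ..., f_S(x_max)] for severities with h[y] = Pr(X = y).

    Assumes h[0] == 0, so h^{*n}(x) = 0 for x < n and n only runs up to x_max.
    """
    if h[0] != 0:
        raise ValueError("this sketch assumes strictly positive claim sizes (h[0] == 0)")
    f = [0.0] * (x_max + 1)
    conv = [1.0] + [0.0] * x_max  # h^{*0}: point mass at 0
    for n in range(x_max + 1):
        p_n = pmf(n, alpha, theta)
        for x in range(x_max + 1):
            f[x] += p_n * conv[x]
        # Next convolution power h^{*(n+1)} = h^{*n} * h, truncated at x_max.
        nxt = [0.0] * (x_max + 1)
        for x, c in enumerate(conv):
            if c == 0.0:
                continue
            for y in range(1, min(len(h) - 1, x_max - x) + 1):
                nxt[x + y] += c * h[y]
        conv = nxt
    return f
```

When every claim has size 1 (h = [0, 1]) the aggregate claim amount equals the claim count, so `compound_pmf` reproduces the pmf (2) itself, which gives a quick sanity check.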
There exists an extensive literature dealing with compound
mixed Poisson distributions (Willmot, 1986, 1993; Antzoulakos
and Chadjiconstantinidis, 2004). An extensive review of the topic
can be found in Sundt and Vernic (2009). Based on the recursion
provided by Panjer (1981) for the Poisson distribution, Sundt and
Vernic (2009, chapter 3, p. 68) developed a simple algorithm
to provide the probabilities of the random variable S when the
amount of the single claim follows a discrete distribution with pmf
h(x).
The Panjer (1981) recursion for the total claim amount,
when the Poisson distribution is assumed as the
distribution of the number of claims, is given by
$$f_S(x\mid\lambda) = \frac{\lambda}{x}\sum_{y=1}^{x} y\,h(y)\,f_S(x-y\mid\lambda), \quad x = 1, 2, \ldots \tag{10}$$
while $f_S(0\mid\lambda) = e^{-\lambda}$. Following Sundt and Vernic (2009, p. 68),
after multiplying both sides of (10) by $\lambda^{i} f(\lambda)$ and integrating
we have that
$$f_S^{i}(x) = \frac{1}{x}\sum_{y=1}^{x} y\,h(y)\,f_S^{i+1}(x-y), \quad i = 0, 1, \ldots;\ x = 1, 2, \ldots$$
where $f_S^{i}(x) = \int_{0}^{\infty}\lambda^{i} f_S(x\mid\lambda) f(\lambda)\,d\lambda$. Now, starting with
$f_S(0) = p_0 = 1 - \log(1-\alpha\theta)/\log(1-\alpha)$, the probabilities $f_S(1), f_S(2), \ldots$
can be evaluated by the algorithm described in Sundt and Vernic
(2009, p. 68).
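Let us prove that the new discrete distribution presented in this
paper is infinitely divisible for 0 < α < 1. In order to verify that
result, the following lemma is needed first.
Lemma 1. It is satisfied that
$$\theta(1-\alpha)\log\left(\frac{1-\alpha}{1-\alpha\theta}\right) - (1-\alpha\theta^{2})\log\left(\frac{1-\alpha\theta}{1-\alpha\theta^{2}}\right) > 0. \tag{11}$$
Proof. Firstly, we have that
$\log\left(\frac{1-\alpha}{1-\alpha\theta}\right) = \log\left(1+\frac{\alpha(\theta-1)}{1-\alpha\theta}\right)$ and
$\log\left(\frac{1-\alpha\theta}{1-\alpha\theta^{2}}\right) = \log\left(1+\frac{\alpha\theta(\theta-1)}{1-\alpha\theta^{2}}\right)$.
Now, using the chain of inequalities $\frac{x}{x+1} < \log(1+x) < x$, valid for x > −1, x ≠ 0, we have that
$$\frac{\alpha(\theta-1)}{1-\alpha} < \log\left(\frac{1-\alpha}{1-\alpha\theta}\right) \quad\text{and}\quad \log\left(\frac{1-\alpha\theta}{1-\alpha\theta^{2}}\right) < \frac{\alpha\theta(\theta-1)}{1-\alpha\theta^{2}},$$
from which
$$\theta(1-\alpha)\log\left(\frac{1-\alpha}{1-\alpha\theta}\right) - (1-\alpha\theta^{2})\log\left(\frac{1-\alpha\theta}{1-\alpha\theta^{2}}\right) > \alpha\theta(\theta-1) - \alpha\theta(\theta-1) = 0.$$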
Now, infinite divisibility is established in the following result.
Proposition 3. For 0 < α < 1 the discrete distribution with pmf
given in (2) is infinitely divisible.
Proof. Firstly, we have that $p_0 \ne 0$, $p_1 \ne 0$. Then, we must
prove that $\{p_j/p_{j-1}\}$, j = 1, 2, . . ., forms a monotone increasing
sequence. If we define $p_n$ also for non-integer values of n, we have
that for n ≥ 1
$$\frac{d}{dn}\left(\frac{p_n}{p_{n-1}}\right) = \frac{\alpha(1-\theta)\theta^{n-1}\log\theta}{(p_{n-1})^{2}\log(1-\alpha)(1-\alpha\theta^{n-1})(1-\alpha\theta^{n})(1-\alpha\theta^{n+1})}
\times\left((1-\alpha\theta^{n+1})p_n - \theta(1-\alpha\theta^{n-1})p_{n-1}\right),$$
which is positive in (1, ∞) if the function
$\varphi_n = (1-\alpha\theta^{n+1})p_n - \theta(1-\alpha\theta^{n-1})p_{n-1}$ is positive. Now, from (11) it is satisfied that
$\varphi_1 = (1-\alpha\theta^{2})p_1 - \theta(1-\alpha)p_0 > 0$. On the other hand, $\varphi_\infty = 0$
and since $(1-\alpha\theta^{n+1})p'_n - \theta(1-\alpha\theta^{n-1})p'_{n-1} = 0$, as can be
easily verified, we have that $\varphi'_n = \alpha\theta^{n}(p_{n-1}-\theta p_n)\log\theta < 0$.
Consequently, $\varphi_n$ is a decreasing function in (1, ∞) and, finally, $\varphi_n$
is positive.
Now, the result follows by applying Theorem 2.1 in Warde and
Katti (1971).
The fact that $\{p_j/p_{j-1}\}$, j = 1, 2, . . ., forms a monotone increasing
sequence requires that $\{p_n\}$ be a decreasing sequence (see Johnson
and Kotz, 1982, p. 75), which is congruent with the zero vertex
of the new distribution. Moreover, as any infinitely divisible
distribution defined on non-negative integers is a compound Poisson
distribution (see Proposition 9 in Karlis and Xekalaki, 2005), we
conclude that the new pmf presented in this paper is a compound
Poisson distribution.
Furthermore, infinitely divisible distributions play an
important role in many areas of statistics, for example, in stochastic
processes and in actuarial statistics. When a distribution G is
infinitely divisible then for any integer n ≥ 2, there exists a
distribution $G_n$ such that G is the n-fold convolution of $G_n$, namely,
$G = G_n^{*n}$.
Since the new distribution is infinitely divisible, a lower bound
for the variance can be obtained when 0 < α < 1 (see Johnson and
Kotz, 1982, p. 75), which is given by
$$\mathrm{Var}(N) \ge \frac{p_1}{p_0} = \frac{\log(1-\alpha\theta) - \log(1-\alpha\theta^{2})}{\log(1-\alpha) - \log(1-\alpha\theta)}.$$
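The condition used in Proposition 3, that the ratios $p_j/p_{j-1}$ increase with j when 0 < α < 1, and the variance bound $p_1/p_0$ can both be inspected numerically. The sketch below is only a spot check for given parameter values, not a proof; the function names and the truncation point of the moment sums are assumptions of this illustration.

```python
import math


def pmf(n, alpha, theta):
    """Eq. (2): Pr(N = n)."""
    num = math.log1p(-alpha * theta ** n) - math.log1p(-alpha * theta ** (n + 1))
    return num / math.log1p(-alpha)


def ratios(alpha, theta, j_max=20):
    """Successive ratios p_j / p_{j-1} for j = 1, ..., j_max."""
    return [pmf(j, alpha, theta) / pmf(j - 1, alpha, theta) for j in range(1, j_max + 1)]


def variance(alpha, theta, n_max=2000):
    """Variance from truncated moment sums over n = 0, ..., n_max - 1."""
    probs = [pmf(n, alpha, theta) for n in range(n_max)]
    mean = sum(n * p for n, p in enumerate(probs))
    second = sum(n * n * p for n, p in enumerate(probs))
    return second - mean ** 2


if __name__ == "__main__":
    r = ratios(0.5, 0.5)
    print("first ratios:", r[:3])
    print("variance:", variance(0.5, 0.5), "bound p1/p0:", r[0])
```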
3. Estimation
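Let us consider the method of maximum likelihood estimation.
To do so, assume that $n = (n_1, \ldots, n_t)$ is a random sample of size
t from the distribution (2). The log-likelihood function is
$$\ell = \sum_{i=1}^{t}\log p_{n_i} = \sum_{i=1}^{t}\log\left[\frac{1}{\log(1-\alpha)}\log\left(\frac{1-\alpha\theta^{n_i}}{1-\alpha\theta^{n_i+1}}\right)\right],$$
which for α < 0, where both logarithms inside the bracket are positive, can be written as
$$\ell = -t\log(\log(1-\alpha)) + \sum_{i=1}^{t}\log\left[\log\left(\frac{1-\alpha\theta^{n_i}}{1-\alpha\theta^{n_i+1}}\right)\right].$$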
After simplifying, the likelihood equations become
$$\frac{t}{1-\alpha} - (1-\theta)\log(1-\alpha)\sum_{i=1}^{t}\frac{\theta^{n_i}}{(1-\alpha\theta^{n_i})(1-\alpha\theta^{n_i+1})\log\left(\frac{1-\alpha\theta^{n_i}}{1-\alpha\theta^{n_i+1}}\right)} = 0, \tag{12}$$
$$\sum_{i=1}^{t}\frac{\theta^{n_i}\left[\theta(1-\alpha\theta^{n_i}) - n_i(1-\theta)\right]}{(1-\alpha\theta^{n_i})(1-\alpha\theta^{n_i+1})\log\left(\frac{1-\alpha\theta^{n_i}}{1-\alpha\theta^{n_i+1}}\right)} = 0. \tag{13}$$
The solutions of the two likelihood Eqs. (12) and (13)
provide the maximum likelihood estimators of α and θ, which can
be obtained easily by a numerical method or by direct numerical
search for the global maximum of the log-likelihood surface.
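A direct numerical search of the log-likelihood surface can be sketched as follows. This is a plain grid search with successive zooming, written for illustration only; the search box, grid sizes and number of rounds are arbitrary assumptions, and a general-purpose optimizer could be used instead. The data are the hospitalization frequencies of Table 5, and the estimates reported there (α̂ = −0.341, θ̂ = 0.079) are used only as a reference point.

```python
import math

# Observed frequencies from Table 5: {count: number of family members}.
FREQ = {0: 2659, 1: 244, 2: 19, 3: 2}


def log_pmf(n, alpha, theta):
    """Logarithm of the pmf (2)."""
    num = math.log1p(-alpha * theta ** n) - math.log1p(-alpha * theta ** (n + 1))
    return math.log(num / math.log1p(-alpha))


def log_likelihood(alpha, theta, freq):
    """Log-likelihood of grouped count data."""
    return sum(c * log_pmf(n, alpha, theta) for n, c in freq.items())


def grid_search(freq, rounds=4):
    """Coarse grid over (alpha, theta), then repeated zooming around the best point."""
    a_lo, a_hi, t_lo, t_hi = -3.0, 0.95, 0.002, 0.998  # hypothetical search box
    na, nt = 80, 499
    best = (-math.inf, None, None)
    for _ in range(rounds):
        a_step = (a_hi - a_lo) / na
        t_step = (t_hi - t_lo) / nt
        for i in range(na + 1):
            a = a_lo + i * a_step
            if abs(a) < 1e-9:  # alpha = 0 is excluded from the parameter space
                continue
            for j in range(nt + 1):
                t = t_lo + j * t_step
                ll = log_likelihood(a, t, freq)
                if ll > best[0]:
                    best = (ll, a, t)
        _, a0, t0 = best
        a_lo, a_hi = a0 - a_step, min(a0 + a_step, 0.999)
        t_lo, t_hi = max(t0 - t_step, 1e-6), min(t0 + t_step, 1.0 - 1e-6)
        na, nt = 20, 20
    return best


if __name__ == "__main__":
    print("log-likelihood at reported estimates:", log_likelihood(-0.341, 0.079, FREQ))
    print("grid search (loglik, alpha, theta):", grid_search(FREQ))
```

Because the log-likelihood surface may be rather flat along some directions, the maximizer found by such a search can differ noticeably from the reported estimates while attaining almost the same log-likelihood.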
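The second partial derivatives follow by differentiating the log-likelihood twice. Writing,
for each observation, $A_i = 1-\alpha\theta^{n_i}$, $B_i = 1-\alpha\theta^{n_i+1}$,
$D_i = \log A_i - \log B_i$, $N_i = \theta A_i - n_i(1-\theta)$ and
$M_i = n_i B_i + (n_i+1)\theta A_i$, they are given by
$$\frac{\partial^{2}\ell}{\partial\alpha^{2}} = \frac{t(1+\log(1-\alpha))}{(1-\alpha)^{2}\log^{2}(1-\alpha)} - (1-\theta)\sum_{i=1}^{t}\frac{\theta^{2n_i}\left[(1-\theta) + (1+\theta-2\alpha\theta^{n_i+1})D_i\right]}{A_i^{2}B_i^{2}D_i^{2}},$$
$$\frac{\partial^{2}\ell}{\partial\theta^{2}} = \frac{\alpha}{\theta^{2}}\sum_{i=1}^{t}\frac{\theta^{n_i}}{A_i^{2}B_i^{2}D_i^{2}}\left\{D_i\left[\left((n_i-1)N_i + (n_i+1)\theta A_i\right)A_iB_i + \alpha\theta^{n_i}N_iM_i\right] - \alpha\theta^{n_i}N_i^{2}\right\},$$
$$\frac{\partial^{2}\ell}{\partial\alpha\,\partial\theta} = -\frac{1}{\theta}\sum_{i=1}^{t}\frac{\theta^{n_i}}{A_i^{2}B_i^{2}D_i^{2}}\left\{D_i\left[\left(n_i-(n_i+1)\theta\right)A_iB_i + \alpha(1-\theta)\theta^{n_i}M_i\right] - \alpha(1-\theta)\theta^{n_i}N_i\right\},$$
from which Fisher's information matrix can be computed by
using the approximations
$$E\left(-\frac{\partial^{2}\ell}{\partial\alpha^{2}}\right) \approx -\left.\frac{\partial^{2}\ell}{\partial\alpha^{2}}\right|_{(\hat\alpha,\hat\theta)}, \quad
E\left(-\frac{\partial^{2}\ell}{\partial\alpha\,\partial\theta}\right) \approx -\left.\frac{\partial^{2}\ell}{\partial\alpha\,\partial\theta}\right|_{(\hat\alpha,\hat\theta)}, \quad
E\left(-\frac{\partial^{2}\ell}{\partial\theta^{2}}\right) \approx -\left.\frac{\partial^{2}\ell}{\partial\theta^{2}}\right|_{(\hat\alpha,\hat\theta)},$$
where $\hat\alpha$ and $\hat\theta$ are the
maximum likelihood estimators of α and θ, respectively.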
4. Applications with real data
Data sets shown in Tables 2–5 are used in this section to
illustrate the use of the proposed distribution (ND). We provide
the observed frequencies and the expected frequencies under ND, and
compare the latter values with those obtained from the negative
binomial (NB), whose pmf is given by
$$\Pr(N = n) = \binom{r+n-1}{n}p^{r}(1-p)^{n}, \quad n = 0, 1, \ldots;\ r > 0;\ 0 < p < 1,$$
and the Poisson-inverse Gaussian distribution (PIG) (Willmot, 1987),
with pmf
$$\Pr(N = n) = \frac{1}{n!}\sqrt{\frac{2\phi}{\pi}}\,e^{\phi/\mu}\,\phi^{-\frac{1}{4}+\frac{n}{2}}\left(2+\frac{\phi}{\mu^{2}}\right)^{\frac{1}{4}(1-2n)} K_{\frac{1}{2}-n}\left(\sqrt{2\phi+\frac{\phi^{2}}{\mu^{2}}}\right),$$
where n = 0, 1, . . ., φ > 0, µ > 0 and $K_a(\cdot)$ is the modified Bessel
function of the third kind.

Table 2
Fit of automobile claim data in Great Britain, 1968 (Willmot, 1987).

No. of claims   Observed   Fitted ND     Fitted NB     Fitted PIG
0               370412     370413.00     370438.99     370435.00
1               46545      46538.30      46451.28      46476.40
2               3935       3942.39       4030.50       3995.76
3               317        318.57        297.82        307.671
4               28         25.64         20.09         23.12
5               3          2.06          1.28          1.74
Total           421240     421240        421240        421240
Parameters                 α̂ = −1.349    r̂ = 0.131     φ̂ = 0.338
                           θ̂ = 0.080     p̂ = 0.338     µ̂ = 0.131
χ²                         3.11          7.94          2.74
d.f.                       2             2             2
p-value                    0.841         0.018         0.254
Lmax                       −171133.0     −171136.9     −171134.4

Table 3
Fit of automobile claim data in Zaire, 1974 (Willmot, 1987).

No. of claims   Observed   Fitted ND     Fitted NB     Fitted PIG
0               3719       3719.06       3719.22       3718.58
1               232        228.65        229.90        234.54
2               38         41.85         39.91         34.86
3               7          8.32          8.42          8.32
4               3          1.68          1.93          2.45
5               1          0.40          0.46          0.80
Total           4000       4000          4000          4000
Parameters                 α̂ = 0.952     r̂ = 0.216     φ̂ = 0.017
                           θ̂ = 0.202     p̂ = 0.714     µ̂ = 0.086
χ²                         3.11          1.17          0.54
d.f.                       2             2             2
p-value                    0.2111        0.557         0.762
Lmax                       −1183.97      −1183.55      −1183.52
By using Eqs. (12) and (13), we have estimated the two
parameters of the distribution proposed in this paper. Also, the
parameters of the negative binomial and Poisson-inverse Gaussian
distributions were estimated by using the maximum likelihood
method. All tables include the estimators (obtained by the maximum
likelihood method), the χ² statistics, the p-values and the
log-likelihood function values (Lmax in the tables). The χ² statistic was
computed according to the expression $\chi^{2} = \sum_{i=1}^{l}(O_i - E_i)^{2}/E_i$,
where $O_i$ and $E_i$ denote the observed and expected frequencies of
$n_i$, respectively, and l is the number of classes into which the sample
was divided.
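The χ² statistic is a one-line computation once the classes are fixed. The sketch below applies it to the ND column of Table 5 with the last two classes pooled into "3 or more"; this pooling is an assumption of the illustration (the paper does not state how classes were grouped), so the value need not coincide exactly with the statistic reported in the table.

```python
def chi_square(observed, expected):
    """Pearson chi-square statistic: sum over classes of (O_i - E_i)^2 / E_i."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same number of classes")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Table 5, ND fit, with classes 0, 1, 2 and ">= 3" (pooling is an assumption).
observed = [2659, 244, 19, 2 + 0]
expected = [2659.02, 243.79, 19.52, 1.54 + 0.11]

if __name__ == "__main__":
    print(chi_square(observed, expected))
```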
The first two data sets, shown in Tables 2 and 3, are taken
from Willmot (1987) and they concern the number of automobile
insurance claims per policy in two portfolios from Great Britain
and Zaire, respectively. These data present overdispersion, i.e., the
variance is greater than the mean, and they are heavily skewed.
Furthermore, in these sets there is a high proportion of zero values
(the observed distribution has a zero vertex). The Poisson distribution is
not suitable to fit these data sets because there is overdispersion.
The negative binomial and Poisson-inverse Gaussian distributions
were used to fit these data by Willmot (1987), who concluded that
the Poisson-inverse Gaussian distribution works better than the
negative binomial.

Table 4
Distribution of a number of claims of automobile liability policies (Gómez-Déniz
et al., 2008; Klugman et al., 1998).

Count        Observed   Fitted ND     Fitted NB     Fitted PIG
0            99         96.56         93.07         88.04
1            65         73.26         75.27         81.93
2            57         50.79         50.43         52.65
3            35         32.48         31.46         30.49
4            20         19.54         18.90         17.31
5            10         11.27         11.09         9.91
6            4          6.33          6.41          5.77
7            0          3.50          3.65          3.42
8            3          1.92          2.07          2.06
9            4          1.05          1.16          1.26
10           0          0.57          0.65          0.78
11           1          0.31          0.36          0.48
12           0          0.17          0.20          0.30
Total        298        298           298           298
Parameters              α̂ = −2.203    r̂ = 1.522     φ̂ = 2.443
                        θ̂ = 0.543     p̂ = 0.468     µ̂ = 1.725
χ²                      3.36          3.63          6.61
d.f.                    4             4             4
p-value                 0.4994        0.4583        0.1579
Lmax                    −528.395      −525.337      −526.496

Table 5
Hospitalizations, per family member, per year (Klugman et al., 1998).

No. of hospitalizations
per family member        Observed   Fitted ND     Fitted NB     Fitted PIG
0                        2659       2659.02       2659.06       2658.97
1                        244        243.79        243.64        244.02
2                        19         19.52         19.65         19.24
3                        2          1.54          1.51          1.61
≥ 4                      0          0.11          0.12          0.14
Total                    2924       2924          2924          2924
Parameters                          α̂ = −0.341    r̂ = 1.314     φ̂ = 0.127
                                    θ̂ = 0.079     p̂ = 0.93      µ̂ = 0.098
χ²                                  0.08          0.09          1.02
d.f.                                1             1             1
p-value                             0.7773        0.7641        0.3125
Lmax                                −969.060      −969.064      −969.067
As mentioned above, the distribution proposed in this paper is
overdispersed and has a zero vertex; therefore it is a candidate for
fitting these data. The new distribution provides a fairly good fit,
being competitive with the Poisson-inverse Gaussian and negative
binomial distributions (see Willmot, 1987).
The following example, shown in Table 4, deals with the
number of claims of automobile liability policies (Gómez-Déniz
et al., 2008; Klugman et al., 1998, p. 244). Observations are
displayed in the first and second columns. These data are also
overdispersed and right skewed. This sample distribution has a
thicker tail than the ones presented above.
By comparing these results with the fit obtained in Gómez-Déniz
et al. (2008) and Klugman et al. (1998, p. 244), we observe
that the new distribution provides an improvement over
the Poisson, negative binomial and negative binomial-inverse
Gaussian distributions as judged by its low χ² value.
Our last example is related to another line of insurance and it
concerns the number of hospitalizations per family member and
year taken from Klugman et al. (1998, p. 340). They are shown in
Table 5 together with the fit obtained from the new, the negative
binomial and the Poisson-inverse Gaussian distribution. Now, the
new distribution provides the best result if we choose the χ2
test and the value of the log-likelihood function as criteria of
comparison.
This set of data and the distribution chosen can be used to
determine the pricing of a group hospitalization policy given the
expected payment made by the insurer.
It can be seen that the present two-parameter distribution
seems to give a satisfactory fit in all cases, on the basis of the
χ² statistics and the corresponding p-values. In conclusion, and
taking into account the simple expressions given in Eqs. (12) and
(13) from which the estimates of the parameters are obtained, we
believe the proposed distribution is a useful tool for
fitting an empirical distribution that presents zero inflation and
overdispersion.
5. Concluding comments
This paper offers a new two-parameter family of univariate discrete
distributions as a possible alternative to the negative binomial,
hyper-Poisson, generalized Poisson and Poisson-inverse Gaussian
distributions, and of course, to the different generalizations
of the geometric distribution that have been discussed in statistical
literature. We have studied several properties of the distribution
and observed that many of its properties are similar to those of
the generalized geometric distribution in Gómez-Déniz (2010). The
new distribution has proved to be very useful for modeling count
data which present zero inflation and/or overdispersion, and both short
and long tailed count data.
Acknowledgements
We are indebted to the referee for helpful comments which
improved an earlier version of the work.
The authors thank Ministerio de Ciencia e Innovación (projects
SEJ2006-12685 356 and ECO2009-14152, MICINN (EGD) and
SEJ2007-65818 and ECO2010-15455 (JMS)) for partial support of
this work.
References
Abouammoh, A.M., 1987. On discrete α-unimodality. Statistica Neerlandica 41,
239–244.
Andrews, G.E., Askey, R., Roy, R., 1999. Special Functions (Encyclopedia of
Mathematics and its Applications). Cambridge University Press.
Antzoulakos, D., Chadjiconstantinidis, S., 2004. On mixed and compound mixed
Poisson distributions. Scandinavian Actuarial Journal 3, 161–188.
Askey, R., 1980. Ramanujan's extensions of the gamma and beta functions.
The American Mathematical Monthly 87 (5), 346–359.
Gómez-Déniz, E., 2010. Another generalization of the geometric distribution. Test
19, 399–415.
Gómez-Déniz, E., Sarabia, J.M., Calderín, E., 2008. Univariate and multivariate
versions of the negative binomial-inverse Gaussian distributions with applications.
Insurance: Mathematics and Economics 42, 39–49.
Jain, G.C., Consul, P.C., 1971. A generalized negative binomial distribution. SIAM
Journal on Applied Mathematics 21, 501–513.
Johnson, N.L., Kotz, S., 1982. Developments in discrete distribution, 1969–1980.
International Statistical Review 50, 71–101.
Karlis, D., Xekalaki, E., 2005. Mixed Poisson distributions. International Statistical
Review 73, 35–58.
Keilson, J., Gerber, H., 1971. Some results for discrete unimodality. Journal of the
American Statistical Association 66 (334), 386–389.
Kemp, A.W., 2008. The discrete half-normal distribution. In: Advances in
Mathematical and Statistical Modeling. Birkhäuser, pp. 353–365.
Klugman, S., Panjer, H., Willmot, G., 1998. Loss Models. From Data to Decisions. John
Wiley and Sons, New York.
Mačutek, J., 2008. A generalization of the geometric distribution and its application
in quantitative linguistics. Romanian Reports in Physics 60 (3), 501–509.
Medgyessy, P., 1972. On the unimodality of discrete distributions. Periodica
Mathematica Hungarica 2 (1–4), 245–257.
Panjer, H.H., 1981. Recursive evaluation of a family of compound distributions. Astin
Bulletin 12, 22–26.
Philippou, A.N., Georghiou, C., Philippou, G.N., 1983. A generalized geometric
distribution and some of its properties. Statistics & Probability Letters 1,
171–175.
Sundt, B., Vernic, R., 2009. Recursions for Convolutions and Compound Distributions
with Insurance Applications. Springer-Verlag, New York.
Tripathi, R.C., Gupta, R.C., White, T.J., 1987. Some generalizations of the geometric
distribution. Sankhyā, Series B 49 (3), 218–223.
Warde, W.D., Katti, S.K., 1971. Infinite divisibility of discrete distributions II. The
Annals of Mathematical Statistics 42 (3), 1088–1090.
Willmot, G.E., 1986. Mixed compound Poisson distributions. Astin Bulletin 16,
S59–S79.
Willmot, G.E., 1987. The Poisson-inverse Gaussian distribution as an alternative to
the negative binomial. Scandinavian Actuarial Journal 113–127.
Willmot, G.E., 1993. On recursive evaluation of mixed Poisson probabilities and
related quantities. Scandinavian Actuarial Journal 114–133.