
homework #3
OLAT Learning Platform
Sat 30/10/2021 06:02
3 attachments (2 MB)
The Poisson Inverse Gaussian distribution as an alternative to the negative binomial.pdf; The negative binomial-inverse Gaussian regression model.pdf; A new discrete distribution with actuarial applications.pdf
Dear all,
Ready for homework #3? Let's make it due November 11th, at midnight.

Before reading below, please note: this is how assignments might be posed in industry, or while
doing your PhD, so I want to accustom you to it. I try to write as clearly as possible, but if you
have questions, resist the temptation to ask me immediately; instead, spend some time and
figure it out yourself.

Here is the plan: We will fit some distributions to count data, as used for modeling, for
example, the distribution of insurance claims.
First, if needed, quickly review the geometric, negative binomial, and Poisson distributions.
You can use my class notes; see the 4th entry, "LECTURE Slides", on https://www.marc-paolella.com/fundamental-probability
Next, read this short page to learn what "over-dispersion" means. In particular, the Poisson
has equal mean and variance, but for claim counts we typically want a distribution whose
variance is greater than its mean. The negative binomial is very popular for this.
http://sherrytowers.com/2018/04/10/negative-binomial-likelihood-fits-for-overdispersed-count-data/
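As a quick illustration (a sketch in Python, with made-up frequencies rather than one of the tables from the paper), the check is simply whether the sample variance exceeds the sample mean:

    # Over-dispersion check on grouped count data (hypothetical frequencies).
    import numpy as np
    counts = np.array([0, 1, 2, 3, 4, 5])        # number of claims
    freqs  = np.array([100, 60, 25, 10, 4, 1])   # hypothetical observed frequencies
    mean = np.average(counts, weights=freqs)
    var  = np.average((counts - mean) ** 2, weights=freqs)
    print(mean, var, var / mean)                 # dispersion index > 1 means over-dispersed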
Next, we will need some data. Let's use the data sets in tables 2-5 in the following paper, "A
new discrete distribution with actuarial applications", Gómez-Déniz, Sarabia, Calderín-Ojeda,
which is attached to this email. Note that the data set is just the first two columns. In their
tables 2-5, "NB" denotes Negative Binomial, "ND" is their New Distribution, and "PIG" is
Poisson-Inverse Gaussian.
Skim their paper, skipping most of the technical material --- we do not need it. You need the
PMF, which is given (as a difference of the CDF).
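If it helps, the recipe is generic: for a count distribution on k = 0, 1, 2, ..., the PMF is the first difference of the CDF. A sketch in Python, where cdf_nd is only a placeholder for their CDF (not their actual formula):

    # PMF as a first difference of the CDF for a count distribution on 0, 1, 2, ...
    # 'cdf_nd' and its parameters are placeholders, to be replaced by the CDF in the paper.
    def pmf_from_cdf(k, cdf_nd, *params):
        if k == 0:
            return cdf_nd(0, *params)
        return cdf_nd(k, *params) - cdf_nd(k - 1, *params)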
Check your favorite programming language for a multivariate optimizer: We want to maximize
the likelihood (in two parameters). In Matlab, their optimization functions only minimize, so
you minimize the negative of the (possibly log) likelihood. Read ahead in my book, chapter 4,
for codes to do this (in Matlab), along with imposing simple parameter constraints. You should
use constraints to keep the algorithm from trying invalid parameter values. In Matlab, you can
use fmincon, or use my way in chapter 4.
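To fix ideas, here is a sketch in Python with scipy, fitting the negative binomial by minimizing the negative log-likelihood under box constraints; the (r, p) parameterization is scipy's, which is not necessarily the one used in the paper:

    # MLE by minimizing the negative log-likelihood, with simple bounds on the parameters.
    import numpy as np
    from scipy.optimize import minimize
    from scipy.stats import nbinom

    counts = np.array([0, 1, 2, 3, 4, 5])        # number of claims (hypothetical data)
    freqs  = np.array([100, 60, 25, 10, 4, 1])   # observed frequencies (hypothetical data)

    def negloglik(theta):
        r, p = theta
        return -np.sum(freqs * nbinom.logpmf(counts, r, p))

    res = minimize(negloglik, x0=[1.0, 0.5], method="L-BFGS-B",
                   bounds=[(1e-6, None), (1e-6, 1 - 1e-6)])  # keep r > 0 and 0 < p < 1
    print(res.x, res.fun)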
For the PIG, I attach the article: Willmot, 1987, "The Poisson-inverse Gaussian distribution as
an alternative to the negative binomial", Scandinavian Actuarial Journal. Again, you can skip
most of it. Note that the PMF is given in his equation 8. You do not need to read how he does
MLE --- we just send the PMF function to the general optimizer.
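If you prefer to postpone his formula, a brute-force sketch (in Python) is to compute the PMF as a Poisson mixed numerically over an inverse Gaussian; the (mu, lam) parameterization of the IG density below is my own choice, so match it to Willmot's before comparing numbers:

    # Poisson-inverse Gaussian PMF by numerically mixing a Poisson over an inverse Gaussian.
    import numpy as np
    from scipy.integrate import quad
    from scipy.stats import poisson

    def ig_pdf(x, mu, lam):
        # inverse Gaussian density with mean mu and shape lam (one common parameterization)
        if x <= 0:
            return 0.0
        return np.sqrt(lam / (2 * np.pi * x**3)) * np.exp(-lam * (x - mu)**2 / (2 * mu**2 * x))

    def pig_pmf(k, mu, lam):
        val, _ = quad(lambda x: poisson.pmf(k, x) * ig_pdf(x, mu, lam), 0, np.inf)
        return val

    print(sum(pig_pmf(k, 1.0, 2.0) for k in range(50)))  # sanity check: should be close to 1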
So, program the MLE for NB, ND, and PIG, and try to get the same MLE parameter values
reported in tables 2-5 of Gómez-Déniz et al. See if you can also get the chi-square statistics;
they look very simple to calculate. For the NB (negative binomial), there are different
parameterizations in use, usually called NegBin1 and NegBin2. Figure out which one those authors use. See also
Winkelmann's book mentioned below -- he discusses them both, and others.
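For what it is worth, the Pearson statistic is just the sum over cells of (observed - expected)^2 / expected, with expected = n times the fitted PMF; a sketch in Python (check in the paper whether and how they merge small cells before comparing values):

    # Pearson chi-square goodness-of-fit statistic for grouped count data.
    import numpy as np

    def chi_square_stat(freqs, fitted_pmf_values):
        n = np.sum(freqs)
        expected = n * np.asarray(fitted_pmf_values)   # expected frequencies under the fitted model
        return np.sum((freqs - expected) ** 2 / expected)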
Next, see the attached paper "The negative binomial inverse Gaussian regression model with
an application to insurance ratemaking", by Tzougas, Hoon, Lim. They give an EM algorithm,
which is a super example of EM, and avoids numeric integration to get the PMF. But, let's not
do that until we discuss it later in the class. For now, their equation 5 is the integral expression
for the PMF. Program it. Matlab handles an infinite upper limit in its numeric integration
routines -- I hope R and Python do too; otherwise you need to do it manually, i.e., write code to
increase the upper limit until the integral changes by less than some tolerance, e.g., 1e-9
(see the sketch below). This obviously requires
monotonicity at some point in the tail, which is surely the case for a density.
Program also the MLE for the "NBIG" distribution, and use it also for the above data sets.
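In case your integration routine does not accept an infinite upper limit, the manual approach mentioned above might look like this sketch (Python; the integrand is a placeholder for the one in their equation 5, and scipy's quad does in fact accept np.inf, so in Python this is optional):

    # Increase the upper limit until the integral stabilizes to within a tolerance.
    from scipy.integrate import quad

    def integrate_to_infinity(integrand, start=1.0, tol=1e-9, factor=2.0, max_doublings=60):
        upper = start
        value, _ = quad(integrand, 0, upper)
        for _ in range(max_doublings):
            upper *= factor
            new_value, _ = quad(integrand, 0, upper)
            if abs(new_value - value) < tol:   # tail contribution has become negligible
                return new_value
            value = new_value
        return value   # relies on the integrand eventually decaying, as a density does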
Obviously, you make a nice report, with tables similar to those in the above paper, but now with
5 distributions. The fifth is the Poisson. We do not expect it to perform well at all.
Notice we are in the IID framework. If you wanted to have the parameters depend on
covariates, we are in the realm of the generalized linear model. Classic references are:
Dobson, A. J. An Introduction to Generalized Linear Models. New York: Chapman & Hall, 1990.
McCullagh, P., and J. A. Nelder. Generalized Linear Models. New York: Chapman & Hall, 1990.
and see also "Econometric Analysis Of Count Data", by Zurich colleague Rainer Winkelmann,
obtainable here,
https://1lib.ch/book/781779/8f07d5
see notably section 3.2 on MLE.
The GLM is built into R, and Matlab has it too, via glmfit and fitglm.

So, as BONUS, optional material, get your favorite program (R or Matlab, I did not check if
Python has GLM) to do the Poisson and whatever else it can. We do not have covariates, just a
constant term, so we do not need the GLM machinery, but it is good to know about it, and the
basics of using these canned routines. Matlab, pathetically, does not have NB, but it does have
inverse Gaussian. See here,

https://www.mathworks.com/help/stats/glmfit.html
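For what it is worth, Python does have the GLM, via the statsmodels package; a minimal intercept-only Poisson fit on hypothetical counts would look roughly like:

    # Intercept-only Poisson GLM with statsmodels (hypothetical data);
    # with just a constant term, exp(intercept) simply recovers the sample mean.
    import numpy as np
    import statsmodels.api as sm

    y = np.repeat([0, 1, 2, 3], [100, 60, 25, 10])   # hypothetical claim counts
    X = np.ones((len(y), 1))                          # constant term only
    fit = sm.GLM(y, X, family=sm.families.Poisson()).fit()
    print(fit.params, np.exp(fit.params[0]), y.mean())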

I presume the function in R is more versatile -- they have had GLM since the beginning, and were
famous for it. SAS has it too, so if you want EVEN MORE BONUS (don't be childish and ask me how
much...), see my book Time Series, the large chapter on using SAS, and do it. You have access
to SAS via the university.

Go nuts, and have fun!
Marc

Sender: Marc Paolella (marc.paolella@bf.uzh.ch), uzh.ch
This message has been sent via the learning platform OLAT: https://lms.uzh.ch/url/RepositoryEntry/17073866381
