QRM 9 - Laptop session
Introduction
The objective of the session is
(a). To understand how to use the ghyp package to undertake parametric estimation,
and to obtain risk measures, in both the case of a single asset portfolio, and a
multiple asset portfolio, for non-Normal risk factors.
The ghyp package
As before, to load the package, so that we can use the functions and data it supplies,
we type
library(ghyp)
In session 8 we saw how to use the package to facilitate working with random vari-
ables, in particular Normally distributed ones, and we also saw how to estimate risk
measures for the case of Normal return. Today we look at relaxing the restriction
of Normality. The focus is on estimating risk measures for a more general class of
distributions.
Estimation
In addition to the Normal, we can fit six possible parametric models to the data. They
are called:
(a). The Generalised Hyperbolic (GH)
(b). The Hyperbolic (H)
(c). The Normal Inverse Gaussian (NIG)
(d). The Variance Gamma (VG)
(e). The t distribution (t)
Each one has a symmetric and an asymmetric (skewed) variant. We can fit univariate
or multivariate data in each case. The models are also hierarchically connected, in
that they share the same set of parameters, but for some of the models certain of
the parameters are fixed, and for others they can vary. The parameters are:
Parameter Role
lambda Shape parameter. Free in GH. Constrained to be
positive for VG. Constrained to be less than -1
for t. Fixed for H. Fixed at -0.5 for NIG.
alpha.bar Shape parameter. In all cases, constrained to
be non-negative. Otherwise free for GH, H and
NIG. Fixed at zero for VG and t.
mu Location parameter. Free in all cases.
sigma Dispersion parameter. Free in all cases.
gamma Skew parameter. Zero for symmetric, free for
asymmetric distributions.
The point about all this is not that one needs to learn the details, but to point out the
connections among these different members of the same family. For some parameter
values, the GH and the t are the same. For other parameter values, the GH and the
VG are the same. This is important because other things equal we would prefer a
simple model (one with fewer free parameters) over a more complex model, if the
fit of the former was not much worse than the fit of the latter. In econometrics this
is known as efficiency. This gives a reason for wanting to fit several alternatives and
select the best according to competing criteria of goodness of fit and parsimony.
We now explore how to fit the parameters of a model using returns data. ghyp
comes with some data on Swiss stock returns. They are stored in the data set called
smi.stocks, which we first have to load into the workspace with the data command,
see below. Then I pick out columns 4 and 5 (Nestle and Swisscom) and store in d.
Then I pick out the first column of the result and store in d1 for Univariate estimation.
are already
data(smi.stocks)
d <- smi.stocks[,c(4,5)]
d1 <- d[,1]
Univariate estimation
Possibly the most useful approach is to estimate all possible models (at least, within
a specific class of models), and then select the one which ”fits best”. This judgement
can be made using a statistic called Akaike’s Information Criterion (AIC). Lower is
better. AIC falls, when the goodness of fit, as measured by the Likelihood, rises1. It
rises, when the model has more free parameters, so in this way models with fewer
free parameters are favoured.
The function which does this estimation is stepAIC.ghyp.
stepAIC.ghyp(d1)
After running this you will get something like below, though I have rounded and cutoff
the last two columns
1AIC is calculated as AIC = 2(k − LL), where k is the number of parameters, and LL is the logarithm
of the likelihood.
model symmetric lambda alpha.bar mu sigma gamma aic llh
8 NIG TRUE -0.50 0.65 0.00032 0.013 0.00000 -10658 5332
6 ghyp TRUE -0.84 0.60 0.00032 0.013 0.00000 -10656 5332
3 NIG FALSE -0.50 0.65 0.00048 0.013 -0.00024 -10656 5332
1 ghyp FALSE -0.84 0.60 0.00045 0.013 -0.00021 -10655 5332
10 t TRUE -1.67 0.00 0.00030 0.014 0.00000 -10654 5330
5 t FALSE -1.67 0.00 0.00039 0.014 -0.00016 -10652 5330
7 hyp TRUE 1.00 0.39 0.00035 0.013 0.00000 -10645 5325
2 hyp FALSE 1.00 0.39 0.00059 0.013 -0.00035 -10643 5326
9 VG TRUE 1.23 0.00 0.00029 0.013 0.00000 -10639 5322
4 VG FALSE 1.23 0.00 0.00064 0.013 -0.00040 -10637 5323
11 gauss TRUE NA Inf 0.00024 0.013 0.00000 -10362 5183
This shows that the symmetric NIG is "best" according to AIC. Although the likelihoods
are similar to the next three, AIC ranks it higher as it has more fixed parameters.
Having identified the best model, we can fit it and store the results of the fitting for
diagnostics and for obtaining risk measures.
nf <- fit.NIGuv(d1,symmetric=TRUE)
The fit.NIGuv command fits the data to the parameters and stores the results in nf.
We specify that we want a symmetric distribution to be fitted. For the other fitting
commands, see ?fit.ghypuv.
Some commands to confirm the suitability of the fitted model include
nf
plot(nf)
hist(nf)
qqghyp(nf,gaussian=FALSE)
which should be self explanatory, though note that the last command is a QQ plot of
the data against the NIG. The option is so we do not get a Normal QQ plot as well, as
it clutters the graph in my view.
We should also check the adequacy of the fit - the Kolmogorov-Smirnov test is a
sensible choice for univariate models. The Peacock test is useful for bivariate models.
In each case the cdf of the data is compared to that of a reference distribution. The
statistic is the largest difference between the two cdfs. "Large" values indicate an
inadequate model. We can represent the reference distribution by using a large
random sample from the fitted distribution. Implementing this for our best fitting
model:
s1 <- rghyp(10000, nf)
ks.test(s1,d1)
A p-value greater than 0.05 indicates an adequate fit at the 5% significance level.
So we now have a fitted object which seems to be a satisfactory representation of
the returns to Nestle.
We can easily use this as we did above using transform to get the loss distribution
and the risk measures. As before, I suppose the portfolio value is 1000
pv <- 1000
ld <- transform(nf,multiplier=-pv)
ld
ESghyp(0.95,ld,distr="loss")
qghyp(0.95,ld)
Exercise: Find out what would VaR and ES have been if we had fitted a Normal
distribution to the returns? By how much do they differ, in absolute and precentage
terms. Does this strike you as negligible? What kind of error would you call this?
Bivariate Estimation
In the case of Bivariate returns, the process is similar. We can use stepAIC.ghyp
to identify the best fitting model, this time using the full dataset d containing Nestle
and Swisscom. It takes some time, so maybe you should do that later, and take if
from me that the winner is the bivariate symmetric NIG. We can fit that using (see
?fit.ghypmv) the following:
mf <- fit.NIGmv(d)
mf
Now we have out fitted object mf. Plots are similar as before, but we have to select
which variable we want the plot of, 1 for Nestle, 2 for Swisscom. Eg
plot(mf[1], type = 'l')
plot(mf[2], type = 'l')
hist(mf[1])
qqghyp(mf[1])
hist(mf[2])
qqghyp(mf[2])
and so on.
There is also a bivariate plot which we can see using the pairs function.
pairs(mf)
To check if the fit is adquate, we can use the Peackock test, comparing a sample
drawn from the fitted model, with the actual returns data.
library(Peacock.test) # install first if necessary
s <- rghyp(1000, mf) #should use 10000 but too slow for class
peacock2(d,s)
The model is adequate (at 5% significance level) if the test statistic is less than than
1.83, for effective sample size greater than 50. To check the effective sample size:
n1 <- dim(d)[1]
n2 <- dim(s)[1]
eff <- (n1*n2)/(n1+n2)
eff
So the effective sample size is greater than 50
Now we are happy with the fitted model, let us finish by obtaining the risk measures.
Suppose again the portfolio value is 1000, and the weights are 75% Nestle and 25%
Swisscom. As in the previous session, we transform the Bivariate object into the loss
distribution.
pv <- 1000
wa <- 0.75
wb <- 0.25
mult <- -pv*c(wa,wb)
ld <- transform(mf,multiplier=mult)
ld
and finally to the risk measures:
ESghyp(0.95,ld,distr="loss")
qghyp(0.95,ld)
Exercise: Compare with what you would have found for VaR and ES had you used
the Normal distribution.
Thanks for taking this tutorial.
Resources
NB these are quite complete, and therefore quite technical, references.
The ghyp manual: http://cran.r-project.org/web/packages/ghyp/ghyp.pdf