MATH5340 Risk Management
2023/24
Lecture notes - early draft!!!
Jan Palczewski
University of Leeds
April 19, 2024
Contents
1 Introduction
1.1 Module Information
1.2 Introduction to Risk Management
1.3 Modelling Risks and Losses
2 Risk measures
2.1 Modeling Risks and Losses
2.2 Example: The case of a portfolio of stocks
2.3 Example: The case of a portfolio of one option
2.4 Risk measures
2.4.1 Value at Risk
2.4.2 Expected Shortfall (or Average Value at Risk)
2.5 Transformations of random variables
2.6 Approaches for computing VaR and ES in practice
2.6.1 Variance-covariance method
2.6.2 Historical estimation
2.6.3 Monte Carlo simulation
2.7 Risk measures in practice: Scaling
2.8 Back-testing and stress-testing
3 Copulas
3.1 Modelling dependence with Copulas
3.1.1 Multi-variate distributions - revision
3.1.2 Copulas: what are they?
3.1.3 2-dimensional copulas
3.1.4 d-dimensional copulas
3.1.5 Sklar’s theorem
3.2 Properties of copulas
3.2.1 Invariance properties
3.2.2 Fréchet-Hoeffding bounds
3.3 Further examples of copulas
3.3.1 Archimedean copulas
3.3.2 Implicit Copulas
3.4 Meta distributions
3.5 How to "extract" a copula from a given joint distribution?
3.6 Simulation from a meta distribution
3.6.1 Simulation from meta distributions with an implicit copula
3.6.2 Simulation from an Archimedean copula
3.7 Measuring dependence
3.7.1 Linear correlation
3.7.2 Rank correlation coefficients
3.7.3 Tail dependence
3.8 Calibration of copulas
4 Credit risk
4.0.1 Types of credit risk models
4.1 Mixture models
4.1.1 Bernoulli mixture model
4.2 Structural models
4.2.1 Threshold models
4.2.2 Merton’s model
Lecture 1
Introduction
1.1 Module Information
• Lecturer: Dr. Jan Palczewski
• E-mail: J.Palczewski@leeds.ac.uk
• Position: Associate Professor in Financial Mathematics
• Affiliation: School of Mathematics, University of Leeds
• Research Areas: stochastic analysis, stochastic control, optimal stopping, game theory,
financial mathematics
• Office hours: Thursdays 10-11 am (1 Feb–21 Mar, 25 Apr–9 May) in MATH 8.15
Main text book:
• McNeil, Frey & Embrechts (2005 or 2015), Quantitative Risk Management, Princeton
University Press
Additional Reading:
• Meucci (2007), Risk and Asset Allocation, Springer.
• cf. also reading list on Minerva
Assessment:
• Written Exam (70% of the module mark) and
• 1 assignment (project) during the course (30% of the module mark).
For the assignment:
– Individual assignment
– Deadline: 1 May 2024, 2 pm (extension up to 14 calendar days)
– Feedback within 3 weeks from submission.
Resit: Written Exam (100% of the module mark)
Structure and feedback
Structure:
• 10 x 2-hour lectures
• 10 x 1-hour seminars (sometimes longer)
• 3 x 2-hour practicals in computer cluster
Feedback:
• Individual assignment in response to student feedback
• The delivery of practicals shaped by students (you can influence it this year too!)
• Responsive to feedback during semester (preferably through a student rep and reflective
of views of more than 1-2 people)
• If something unclear in a class please ASK!
Main Topics Covered
• Risk factors and loss distributions
• Risk measures: Value at Risk, Expected Shortfall, ...
• Methods for measurement of Market Risk: Variance-Covariance, Historical Simulation,
Monte Carlo and more...
• Multivariate models
• Copulas: definition, applications, simulation methods...
• Credit Risk: types of models, structure, commercial models...
• Liquidity and Operational Risk (if time allows)
1.2 Introduction to Risk Management
What is Risk Management?
• Process of identifying sources of risk, quantifying risk and developing strategies to man-
age risk.
• The focus is on risks that result in negative consequences for economic activity
In this module, we will learn about quantitative risk management, or the mathematics
of risk management for finance and insurance.
Some examples:
• an investor holding a stock with uncertain future price
• an insurance company selling an insurance policy
• a variable-rate mortgage
Overview of types of risks
• Market Risk
• Credit Risk
• Operational Risk
• Other types of risk in current research agenda include Liquidity Risk and Model Risk.
Market Risk
Market risk is the risk of loss in a financial position due to asset price and interest rate
fluctuations, foreign exchange rate changes, and changing markets. We talk about:
• Equity risk: risk that one’s investments will depreciate because of the stock market
dynamics.
• Interest rate risk: risk that the value of a security, especially a bond, will decrease due
to an increase in the interest rate.
• Currency risk: the returns on foreign investments depend on exchange rates.
Credit Risk
Credit risk is the risk carried by the lender that a debtor may not repay the debt, or, more
generally, that a counterparty in a financial agreement may not fulfil the commitments.
• One of the underlying causes of the 2007/2009 financial crisis.
• Complex derivative products.
• Dependence structure modelling is important.
Operational Risk
Operational risk is the risk of losses resulting from inadequate or failed internal processes,
people, and systems, or from external events. This includes people risks such as incompetence
and fraud, and technology risk such as system failure or programming errors.
Operational Risk Events (according to Basel II regulatory framework):
• Internal Fraud - misappropriation of assets, tax evasion, intentional mismarking of posi-
tions, bribery
• External Fraud - theft of information, hacking damage, third-party theft and forgery
• Employment Practices and Workplace Safety - discrimination, workers compensation,
employee health and safety
Operational Risk (continued)
• Clients, Products, & Business Practice - market manipulation, antitrust, improper
trade, product defects, fiduciary breaches, account churning
• Damage to Physical Assets - natural disasters, terrorism, vandalism
• Business Disruption & Systems Failures - utility disruptions, software failures, hard-
ware failures
• Execution, Delivery & Process Management - data entry errors, accounting errors,
failed mandatory reporting, negligent loss of client assets
Operational risk excludes strategic risk, arising from poor business decisions. Operational
risk excludes reputational risk, damage due to loss of reputation.
Liquidity risk
Liquidity risk is the risk that an order cannot be executed by the market quickly enough (to
prevent/minimise losses)
• Market failure: a party interested in selling an asset cannot do it because it cannot find
anyone that would be interested in buying the asset.
• Liquidity problems may stem from falling credit ratings or large cash-out flow.
• Compounds with other risks (credit risk)
Model Risk
Model risk is the risk of using an inappropriate model. It is always present to some extent.
Our task in this module: measure and manage risk
• Measure risk: includes investigating the (probability) distribution of risky positions
• Manage risk:
– determine the capital to hold in order to absorb future losses
– ensure portfolios are well diversified
– manage counterparty risk exposure to important trading partners
– ... and more
A Brief history of Risk Management...
• First examples date back to Babylon
• Before 1950s and after 1950s: Harry Markowitz (1952) and his theory of portfolio selec-
tion.
• After Markowitz: rapid development in the theory of risk management
• 1990s: academic innovations, information technology, financial crises...
Some well known financial disasters
• Barings bank, in 1995, was bankrupted by $1.4 billion of unauthorized trading losses
by a single trader, Leeson, failing to simultaneously make his arbitrage trades on Nikkei
futures between 2 exchanges.
• Long-term Capital Management (LTCM) was a hedge fund founded and assembled by
a star team of traders and academics. In 1998 as Russia defaulted on its government
bonds, a massive switch from Japanese and European to US Treasury bonds caused
prices to diverge and resulted in a loss of $1.85 billion in a few weeks.
• Lehman Brothers filed for bankruptcy in September 2008 as a result of its huge expo-
sure to the sub-prime mortgage market. In the financial crisis, governments had to bail
companies out by injecting capital or acquiring their distressed assets (e.g. US Troubled
Asset Relief Program).
Regulators
• Regulation aims to ensure that financial institutions have enough capital to remain sol-
vent.
• Financial institutions put aside regulatory capital to be able to cover most financial
losses.
• The amount of regulatory capital needed is related to the amount of risk taken.
• International standards and methods for computing regulatory capital: Basel Committee
on Banking Supervision (established in 1974). The Basel Committee formulates supervisory
standards and guidelines (the Basel accords) that individual national authorities may adopt
according to national rules, to compute regulatory capital.
• For the UK: Financial Conduct Authority and Prudential Regulation Authority (former
FSA).
Basel Accords
• Basel I, 1988:
– International minimum capital standard
– focus on credit risk
• Basel II, 1999: 3 Pillar system of regulation
– Pillar 1: quantification of regulatory capital (standardized model for market risk,
internal VaR models, and capital requirements for operational risk)
– Pillar 2: imposes regulatory oversight of modeling process
– Pillar 3: defines a comprehensive set of disclosure requirements
• Basel 2.5, 2004: to deal with arbitrage by transferring risk
• Basel III, 2011 with amendments up to 2016 (targeted end date of implementation 2022):
Capital, leverage and liquidity requirements.
• Comprehensive range of reports: http://www.bis.org/bcbs/publications.htm
Solvency Accords
Parallel developments in insurance regulation.
• Solvency I: simple and crude rules (minimum guaranteed capital proportional to volume
of transactions).
• Solvency II: more risk-sensitive rules. Solvency II was adopted by the Council of the
European Union and the European Parliament in 2009 and implemented by 2016
• 2020 Review of Solvency II
Why is Risk Management necessary?
• to protect investors and shareholders’ values
• to guarantee the stability of the financial system (regulation)
• to protect against extreme losses
• to be used as a management tool, ...
Challenges
• address unexpected, abnormal or extreme outcomes
• risk has a multivariate nature: interconnected markets help strengthen stability but
also spread contagion (e.g., subprime mortgage crisis in 2007-2009, Euro-zone debt crisis).
• Increase in trade volume and complexity in financial markets.
• Increased complexity of financial products,...
1.3 Modelling Risks and Losses
The Model
• Let (Ω,F ,P) be the underlying probability triplet, where
– Ω is the set of all possible scenarios
– F is a sigma-algebra
– P is a probability measure.
• There is a finite number of instants of time (which are equidistant).
• The distance between two consecutive time instants is fixed, given by ∆, where ∆ > 0.
(Vocabulary: ∆ is called the fixed time step) −→ In the applications, ∆ is equal to: one
day (∆= 1/365 or ∆≈ 1/250), ten days, one month, or one year.
The Model
Consider a portfolio such as, e.g.,
• a collection of stocks or bonds
• a book of derivatives
• a collection of risky loans
• a financial institution’s overall position in risky assets.
Let Vt denote the value of the portfolio at time t. We assume that Vt is a random variable
on (Ω,F ,P). We assume that Vt is observable at time t.
The Model
• Vt+∆ is the value of the portfolio at time t+∆. → Vt+∆ is a random variable (r.v.). The
value of Vt+∆ is known at time t+∆. But, it is typically not known at time t.
• The difference Vt+∆−Vt corresponds to the profit over the time period from t to t+∆.
• The probability distribution of the random variable Vt+∆−Vt is called the profit-and-loss
distribution (abridged as P&L distribution).
Loss and Loss distribution
• We set Lt+∆ :=−(Vt+∆−Vt).
• Note that Lt+∆ is a r.v. on (Ω,F ,P).
The r.v. Lt+∆ corresponds to the loss over the time period from t to t+∆. This loss is
unknown at time t (and known at time t+∆).
• With this convention, positive values of Lt+∆ correspond to losses, and negative values
of Lt+∆ correspond to negative losses (that is, to profits).
• The probability distribution of the r.v. Lt+∆ is called the loss distribution.
Risk factors
• We assume that there are d risk factors, with d ∈ {1,2, . . .}, having an influence on the
portfolio value.
• The value of risk factor i at time t is modelled by a random variable, denoted by Zit. −→
Hence, Z1t, Z2t, . . . , Zdt are the values of the d risk factors at time t.
Examples of risk factors: (logarithmic) prices of financial assets; yields; (logarithmic)
exchange rates;
Notation: Recall that, if we are given a vector, we denote by ′ the transposed vector.
We denote by Zt the random vector Zt = (Z1t ,Z2t , . . . ,Zdt )′. Note that Zt is a random vector
of d random variables.
• The model assumes that the value Vt of the portfolio at time t can be expressed as a
deterministic function f of time and of the risk factors (at time t).
With mathematical notation,
Vt = f(t,Zt) = f(t, Z1t, Z2t, . . . , Zdt),
where f is a given deterministic (that is, non-stochastic) function from R+×Rd to R.
Risk factors
Assumption 1.3.1. • f is a known differentiable function,
• Zt is a d-dimensional random vector observable at time t, but Zt+∆ is unknown at time
t.
• We set: Xt+∆ := Zt+∆−Zt . Note that Xt+∆ is a random vector.
The random vector Xt+∆ corresponds to the change in the risk factors between time
instants t and t+∆.
With the above notation, the portfolio loss Lt+∆ at time t+∆ can be expressed as follows:
Lt+∆ = −(Vt+∆ − Vt)
     = −( f(t+∆, Zt+∆) − f(t, Zt) )
     = −( f(t+∆, Zt + Xt+∆) − f(t, Zt) ).
We will now explain how to approximate Lt+∆ by using linear approximation (that is, Taylor
approximation of order 1).
To this purpose, we will introduce the so-called loss operator.
Loss operator
We define the loss operator at time t by:
l[t](δ,x) := −( f(t+δ, Zt + x) − f(t, Zt) ), for δ ∈ R+, for x ∈ Rd.
Note that for a fixed value Zt = z, the loss operator l[t] is a function from R+×Rd to R.
We can express the portfolio loss Lt+∆ by using the loss operator l[t]. We have
Lt+∆ =−( f (t+∆,Zt +Xt+∆)− f (t,Zt)) = l[t](∆,Xt+∆).
−→ As f is by assumption differentiable (cf. Assumption), we can use Taylor expansion (of
order 1) to approximate f .
Linear approximation of portfolio loss
• Recall the definition of the loss operator:
l[t](δ,x) :=−( f (t+δ,Zt +x)− f (t,Zt)) , for δ ∈ R+, for x ∈ Rd .
• We approximate l[t](δ,x) for small δ and small x (1st -order Taylor approximation). We
get
l[t](δ,x) ≈ −[ (∂f/∂t)(t,Zt)δ + ∑di=1 (∂f/∂zi)(t,Zt)xi ].    (1.1)
• From this and from the expression of the loss Lt+∆ as Lt+∆ = l[t](∆,Xt+∆) (cf. above),
we get an approximation for Lt+∆:
Lt+∆ ≈ −( (∂f/∂t)(t,Zt)∆ + ∑di=1 (∂f/∂zi)(t,Zt)Xit+∆ ).
Linear approximation of portfolio loss
• For notational convenience, we will introduce notation for the linear approximation of the
loss, by setting:
Llint+∆ := −( (∂f/∂t)(t,Zt)∆ + ∑di=1 (∂f/∂zi)(t,Zt)Xit+∆ ).
Note that Llint+∆ is a random variable, whose realization becomes known at time t +∆.
Terminology: We will call Llint+∆ the linearized loss.
• Remember: The loss Lt+∆ is approximated by the linearized loss Llint+∆.
Linearisation
• Why do we use linearisation? −→ Linearisation can be convenient because linear func-
tions of the risk factor changes may be easier to handle analytically.
• We note that linearisation is important to the so-called variance-covariance method (cf.
later in the lectures).
• When to use linearisation? −→ To use this linear approximation, the time step ∆ has to
be small and the changes in the risk factors have to be small as well.
Particular case: No explicit time-dependence in function f
We consider the particular case where the function f (which connects the value of the
portfolio to the risk factors) does not depend explicitly on the time t. Hence,
Vt = f (Zt)
and
Vt+∆ = f (Zt+∆).
In this case, the expression for the linearized loss is simpler.
More precisely, we have :
• The random vector Xt+∆ of risk-factor changes: Xt+∆ := Zt+∆−Zt (same definition as
before).
• The portfolio loss Lt+∆ at t+∆:
Lt+∆ =−(Vt+∆−Vt) =−( f (Zt +Xt+∆)− f (Zt)).
Particular case: No explicit time-dependence in function f
• The loss operator becomes: l[t] : Rd → R
l[t](x) =−( f (Zt +x)− f (Zt)), for x ∈ Rd .
• We note that Lt+∆ = l[t](Xt+∆).
• First order approximation of l[t](x) :
l[t](x) ≈ −∑di=1 (∂f/∂zi)(Zt)xi.    (1.2)
Linear approximation of portfolio loss
The linearized loss is then given by
Llint+∆ = −∑di=1 (∂f/∂zi)(Zt)Xit+∆.
Lecture 2
Risk measures
2.1 Modeling Risks and Losses
The Model
Consider a portfolio.
• Let Vt denote the value of the portfolio at time t. −→ We assume that Vt is a random
variable on (Ω,F ,P).
• Vt+∆ is the value of the portfolio at time t+∆. −→ Vt+∆ is also a random variable.
• The difference Vt+∆−Vt corresponds to the profit over the time period from t to t+∆.
• We set Lt+∆ :=−(Vt+∆−Vt).
−→ The r.v. Lt+∆ corresponds to the loss over the time period from t to t+∆.
• Terminology: The probability distribution of the r.v. Lt+∆ is called the loss distribution.
Risk factors
• We assume that there are d risk factors (where d ∈ {1,2, . . .}) having an influence on
the portfolio value (hence, on the loss).
• The value of risk factor i at time t is modelled by a random variable, denoted by Zit .
• We denote by Zt the random vector Zt = (Z1t ,Z2t , . . . ,Zdt )′.
Examples of risk factors: (logarithmic) prices of financial assets; yields; (logarithmic)
exchange rates;...
• The model assumes: The value Vt of the portfolio at time t can be expressed as a
deterministic function f of time and of the risk factors (at time t).
With mathematical notation,
Vt = f(t,Zt) = f(t, Z1t, Z2t, . . . , Zdt),
where f is a given deterministic (that is, non-stochastic) function from R+×Rd to R.
Risk factors
Assumption 2.1.1. • f is a known differentiable function,
• Zt is a d-dimensional random vector observable at time t, but Zt+∆ is unknown at time
t.
• We set: Xt+∆ := Zt+∆−Zt .
The random vector Xt+∆ corresponds to the change in the risk factors (or risk factors
change) between time instants t and t+∆.
With the above notation, the portfolio loss Lt+∆ at time t+∆ can be expressed as follows:
Lt+∆ = −(Vt+∆ − Vt)
     = −( f(t+∆, Zt+∆) − f(t, Zt) )
     = −( f(t+∆, Zt + Xt+∆) − f(t, Zt) )
     = l[t](∆, Xt+∆).
Remark: The first equality comes from the definition of the loss, the second comes from
the model, and the third comes from the definition of the risk factors change. The last equality
recalls the loss operator.
Linear approximation of portfolio loss
• Using a first-order Taylor expansion, we obtained the following linear approximation of the
portfolio loss:
Llint+∆ := −( (∂f/∂t)(t,Zt)∆ + ∑di=1 (∂f/∂zi)(t,Zt)Xit+∆ ).
Note that Llint+∆ is a random variable, whose realization becomes known at time t +∆.
Terminology: We will call Llint+∆ the linearised loss.
• Remember: The loss Lt+∆ is approximated by the linearised loss Llint+∆.
2.2 Example: The case of a portfolio of stocks
Example: The case of a portfolio of stocks
• We consider a portfolio of d stocks.
• We denote by λi the number of shares of stock i (where i= 1, . . . ,d).
• We denote by Sit the price of stock i at time t.
• Then, we have Vt = ∑di=1 λiSit and Vt+∆ = ∑di=1 λiSit+∆.
• Hence, the loss Lt+∆ can be expressed as
Lt+∆ = −(Vt+∆ − Vt) = −( ∑di=1 λiSit+∆ − ∑di=1 λiSit ) = −∑di=1 λi(Sit+∆ − Sit).
• It is standard practice in risk management to use the logarithmic stock prices as risk
factors. So, in this example, Zit := ln Sit.
• The value of the portfolio at t in this example is thus
Vt = ∑di=1 λiSit = ∑di=1 λi exp(ln Sit) = ∑di=1 λi exp(Zit).
• The value of the portfolio at t+∆ in this example satisfies
Vt+∆ = ∑di=1 λiSit+∆ = ∑di=1 λi exp(ln Sit+∆) = ∑di=1 λi exp(Zit+∆).
• The change of risk factor i between t and t+∆:
Xit+∆ = Zit+∆ − Zit = ln Sit+∆ − ln Sit.
−→ Hence, in this example, X it+∆ corresponds to the log-return of stock i.
• Portfolio loss Lt+∆ expressed in terms of the factor changes
Lt+∆ = −∑di=1 λi exp(Zit)(exp(Xit+∆) − 1) = −∑di=1 λiSit (exp(Xit+∆) − 1).
• Question: What is the expression for the function f in this example?
• Recall that the value of the portfolio at t in this example is:
Vt = ∑di=1 λiSit = ∑di=1 λi exp(ln Sit) = ∑di=1 λi exp(Zit).
• The function f in this example is equal to
f(z1, z2, . . . , zd) = ∑di=1 λi exp(zi).
• Question: What is the expression for the linearised loss Llint+∆ in this example?
For this, we first need to compute the partial derivatives of the function f (cf. whiteboard). We obtain
Llint+∆ = −∑di=1 λi exp(Zit) Xit+∆ = −∑di=1 λiSit Xit+∆.
• The linearised loss Llint+∆ is an approximation of the loss Lt+∆.
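As a small numerical illustration of the two formulas above (not part of the original notes; the share counts, prices and log-returns below are hypothetical), the following Python sketch compares the exact loss with the linearised loss for a three-stock portfolio.

import numpy as np

# Hypothetical portfolio: number of shares and current prices (illustrative values)
lam = np.array([100.0, 50.0, 200.0])   # lambda_i: shares held in each stock
S_t = np.array([30.0, 120.0, 15.0])    # S_t^i: current stock prices

# Hypothetical one-period log-returns X_{t+Delta}^i (risk-factor changes)
X = np.array([-0.02, 0.01, -0.035])

# Exact loss: L = -sum_i lambda_i * S_t^i * (exp(X^i) - 1)
L_exact = -np.sum(lam * S_t * (np.exp(X) - 1.0))

# Linearised loss: L^lin = -sum_i lambda_i * S_t^i * X^i
L_lin = -np.sum(lam * S_t * X)

print(f"exact loss      : {L_exact:.2f}")
print(f"linearised loss : {L_lin:.2f}")

For small log-returns the two numbers are close, which is exactly the regime in which linearisation is justified.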
2.3 Example: The case of a portfolio of one option
Example: A portfolio of one call option
Portfolio consisting of one call option on a stock S with maturity date T and exercise price
K
• It can be shown that the price of the call option at time t is c(t,St,rt,σt;K,T),
where c is a deterministic function (given by the Black-Scholes formula).
• Risk factors: log-price of the stock, interest rate and volatility, that is Zt = (lnSt ,rt ,σt)′
• Vt+∆ = c(t+∆,St+∆,rt+∆,σt+∆;K,T )
Example: A portfolio of one call option
• Risk-factor changes:
Xt+∆ = (X1t+∆, X2t+∆, X3t+∆)′ = (ln St+∆ − ln St, rt+∆ − rt, σt+∆ − σt)′
• Loss:
Lt+∆ =−(c(t+∆,St+∆,rt+∆,σt+∆;K,T )− c(t,St ,rt ,σt ;K,T ))
• Here, the function f equals the function c.
• Linearised loss:
Llint+∆ =?
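The expression for Llint+∆ is left as a question above. Purely as an illustrative sketch (not the derivation intended in the notes), one can approximate the partial derivatives of the Black-Scholes function c numerically by finite differences and plug them into the general formula Llint+∆ = −( (∂f/∂t)(t,Zt)∆ + ∑i (∂f/∂zi)(t,Zt)Xit+∆ ); all parameter values below are hypothetical.

import numpy as np
from scipy.stats import norm

def bs_call(t, S, r, sigma, K, T):
    """Black-Scholes price of a European call at time t."""
    tau = T - t
    d1 = (np.log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * np.sqrt(tau))
    d2 = d1 - sigma * np.sqrt(tau)
    return S * norm.cdf(d1) - K * np.exp(-r * tau) * norm.cdf(d2)

# Hypothetical market data and contract terms
t, S, r, sigma, K, T = 0.0, 100.0, 0.02, 0.25, 100.0, 1.0
delta_t = 1.0 / 250.0                      # one trading day
X = np.array([-0.01, 0.001, 0.02])         # hypothetical changes in (ln S, r, sigma)

# f(t, z1, z2, z3) = c(t, exp(z1), z2, z3; K, T) with z = (ln S, r, sigma)
def f(t_, z):
    return bs_call(t_, np.exp(z[0]), z[1], z[2], K, T)

z = np.array([np.log(S), r, sigma])
h = 1e-5
df_dt = (f(t + h, z) - f(t, z)) / h                       # partial derivative in t
grad = np.array([(f(t, z + h * e) - f(t, z)) / h          # partials in z_i
                 for e in np.eye(3)])

L_lin = -(df_dt * delta_t + grad @ X)                     # linearised loss
L_exact = -(f(t + delta_t, z + X) - f(t, z))              # exact loss
print(L_lin, L_exact)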
Where do we get the loss distribution from?
There are different approaches (or methods) used in practice.
• Analytical approach: choose a model for the risk factor change Xt+∆ and a mapping
function f , so that Lt+∆ can be expressed analytically. Example: the so-called variance-
covariance method
• Historical estimation: estimating the loss distribution by using the empirical distribution
of past risk factor changes. Hypotheses?
• Monte Carlo approach: simulation of an explicit parametric model for risk factor changes.
We will have a closer look at these approaches in the later lectures.
2.4 Risk measures
Risk measures
Idea: A risk measure is a mapping which associates to a loss L a real number (which
indicates its riskiness).
Why are risk measures useful?
• Determination of risk capital: determine the amount of capital a financial institution needs
to cover unexpected losses.
• Management tool: risk measures are used by management to limit the amount of risk a
unit within the firm may take.
• Insurance premiums: premiums paid by the insured to the insurance company are
based on a measure of risk of the insured claims.
We will now introduce the notions of risk measure, monetary risk measure, convex risk
measure and coherent risk measure. An important theoretical contribution is that of Artzner,
Delbaen, Eber, Heath (1999), Mathematical Finance, 9: 203-228
We consider a one-period framework. We thus have two dates: today (which we denote t)
and a future date t+∆.
Recall that the loss between t and t+∆ is denoted by Lt+∆.
We will simply write L.
Recall that this is a random variable on (Ω,F ,P).
We denote by M the set of all random variables.
Note that M is a linear vector space.
Risk measure
Definition 2.4.1. A mapping ρ : M → R is called a risk measure if it satisfies:
• (A1) (Monotonicity) If L1 ≤ L2, then ρ(L1)≤ ρ(L2).
What is the financial interpretation of the property of monotonicity?
Monetary Risk measure
Definition 2.4.2. A mapping ρ :M → R is called a monetary risk measure if it satisfies:
• (A1) (Monotonicity) If L1 ≤ L2, then ρ(L1)≤ ρ(L2).
• (A2) (Translation invariance) For every L ∈M , for every m ∈ R, ρ(L+m) = ρ(L)+m.
Terminology: The property of translation invariance is also known as cash invariance.
What is the financial interpretation of the property of translation invariance?
Convex risk measure
Definition 2.4.3. A mapping ρ : M → R is called a convex risk measure if it satisfies:
• (A1) (Monotonicity) If L1 ≤ L2, then ρ(L1)≤ ρ(L2).
• (A2) (Translation invariance) For every L ∈M , for every m ∈ R, ρ(L+m) = ρ(L)+m.
• (A3) (Convexity) For every L1,L2 ∈M and every λ ∈ [0,1],
ρ(λL1+(1−λ)L2)≤ λρ(L1)+(1−λ)ρ(L2).
What is the financial interpretation of the property of convexity?
Coherent risk measure
Definition 2.4.4. A mapping ρ : M → R is called a coherent risk measure if it satisfies:
• (A1) (Monotonicity) If L1 ≤ L2, then ρ(L1)≤ ρ(L2).
• (A2) (Translation invariance) For every L ∈M , m ∈ R, ρ(L+m) = ρ(L)+m.
• (A3) (Convexity) For every L1,L2 ∈M and every λ ∈ [0,1],
ρ(λL1+(1−λ)L2)≤ λρ(L1)+(1−λ)ρ(L2).
• (A4) (Positive homogeneity) For every L ∈M , for every positive number c,
ρ(cL) = cρ(L).
What is the financial interpretation of positive homogeneity?
First (very easy) example
Let us consider ρ :M → R defined by
ρ(L) = E(L).
In order for ρ to be well-defined in this example, we restrict ourselves to random variables
L having a finite expectation.
Questions: Is the mapping ρ in this example a risk measure? Is ρ a monetary risk measure?
Is ρ convex? Is ρ coherent?
2.4.1 Value at Risk
Value at Risk (VaR)
Let us fix a given level α ∈ (0,1) (known as the confidence level).
In practice, α = 95%, α = 99%, α = 99.9%. The Value at Risk at confidence level α
(denoted by VaRα) is the smallest number l such that the probability that the loss L exceeds l
is no larger than (1−α). More precisely, we have the following definition:
VaRα(L) = inf{l ∈ R : P(L> l)≤ 1−α}.
We will now provide a connexion between VaR of L and the quantile function of L. For
this, let us recall the definition of the quantile function and some properties.
Quantile function (Revision)
Let X be a random variable. Let FX be the cumulative distribution function of the r.v. X.
Definition
The quantile function qα(X) of X is defined by:
qα(X) = inf{x ∈ R : FX (x)≥ α}, for all α ∈ (0,1).
Terminology 1: The function α ↦ qα(X) is also known as the LOWER quantile function
of X , or the LEFT-Continuous quantile function of X .
Terminology 2: Fix α ∈ (0,1). The number qα(X) is known as the quantile of X at level
α, or the α-quantile of X .
Terminology 3: We will sometimes write qX (α) instead of qα(X) to emphasise that the
quantile function is a function of α.
Back to the VaR
Recall the definition of the VaR at level α:
VaRα(L) = inf{l ∈ R : P(L> l)≤ 1−α}.
We have
VaRα(L) = inf{l ∈ R : P(L> l)≤ 1−α}
= inf{l ∈ R : FL(l)≥ α}
= qα(L),
where FL denotes the CDF of L, that is, FL(x) = P(L≤ x), for all x ∈ R.
Conclusion: The VaR of the loss L at level α is equal to the α-quantile of the loss L.
−→ Practical advice: It is useful to know properties of quantile functions, as we can easily
deduce properties of the VaR.
Quantile function (Revision continued)
Some properties:
• The quantile function α ↦ qα(X) is non-decreasing.
• The quantile function α ↦ qα(X) is left-continuous with right limits.
Quantile function (Revision continued)
Proposition (The case where F is increasing and continuous)
If a CDF F is strictly increasing and continuous, then
• F is an invertible function.
• Moreover, if X is a random variable with CDF F, then
qα(X) = F−1(α), for all α ∈ (0,1),
where F−1 denotes the (usual) inverse function of F.
Question: Is VaRα a risk measure?
Theorem
Let α ∈ (0,1) be a fixed confidence level.
We have:
• (Monotonicity) If L1 ≤ L2, then VaRα(L1)≤VaRα(L2).
• (Translation invariance) For every r.v. L, for every m ∈ R, VaRα(L+m) =VaRα(L)+m.
• (Positive Homogeneity) For every r.v. L, for every positive number c,
VaRα(cL) = cVaRα(L).
Proof: on the board.
Caveat: VaRα is not convex! A counter-example will be given in the next lecture.
Remember: VaRα is a monetary risk measure which satisfies positive homogeneity. But,
VaRα is not a convex risk measure (as it does not satisfy the property of convexity).
• Advantage: VaRα is simple to use (it is just the α-quantile of the distribution of the loss).
• Drawback: VaR provides no information about the severity of the losses that exceed it,
which occur with probability at most 1−α.
• Drawback: Negative diversification effects can arise (as VaRα is not convex).
VaRα is widely used by practitioners. The EU insurance regulatory framework Solvency II
prescribes the use of Value at Risk with a confidence level α = 99.5% for the computation of
the so-called Solvency Capital Requirement (SCR).
Particular case: VaR in the case of normal distribution
Let m ∈ R and σ2 ≠ 0.
Property
If L∼ N(m,σ2), then
VaRα(L) = m+σΦ−1(α).
Here (as usual), Φ denotes the CDF of the standard normal N(0,1).
2.4.2 Expected Shortfall (or Average Value at Risk)
Expected shortfall (also Average VaR)
Let α∈ (0,1) be a fixed confidence level. Let L be a random variable such that E(|L|)<∞.
The expected shortfall of L at level α is defined by
ESα(L) = (1/(1−α)) ∫α1 VaRu(L) du.
Note that ESα is well-defined for random variables which have a finite first moment.
Remark: ESα is very popular among practitioners. ESα is recommended in the Swiss
Solvency Test (SST), which is the insurance regulatory framework in Switzerland.
Theorem
Let α ∈ (0,1) be a fixed confidence level.
We have:
• (Monotonicity) If L1 ≤ L2, then ESα(L1)≤ ESα(L2).
• (Translation invariance) For every r.v. L, for every m ∈ R, ESα(L+m) = ESα(L)+m.
• (Convexity) For every L1,L2 and every λ ∈ [0,1],
ESα(λL1+(1−λ)L2)≤ λESα(L1)+(1−λ)ESα(L2).
• (Positive Homogeneity) For every r.v. L, for every positive number c,
ESα(cL) = cESα(L).
It follows from the Theorem that ESα is a coherent risk measure.
Equivalent expression in the case where FL is continuous
If the CDF FL is a continuous function, then the expected shortfall at level α can be expressed
as follows:
ESα(L) = E(L|L≥ VaRα(L)).
In other words, when FL is continuous, the expected shortfall at level α is equal to the
expected loss conditioned on the event that VaR at level α is exceeded.
Particular case: ESα for normal distribution
If L ∼ N(µ,σ2), then
ESα(L) = µ + σ E[Z | Z ≥ Φ−1(α)] = µ + σ ϕ(Φ−1(α)) / (1−α),
where Z ∼ N(0,1), ϕ denotes the density function of the standard normal N(0,1), and Φ is
the CDF of N(0,1).
Recall: ϕ=Φ′.
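A quick numerical check of the two closed-form expressions above, with hypothetical values of µ, σ and α, against crude Monte Carlo estimates (a sketch, not part of the original notes):

import numpy as np
from scipy.stats import norm

mu, sigma, alpha = 0.0, 1000.0, 0.99   # hypothetical loss parameters and confidence level

# Closed-form VaR and ES for L ~ N(mu, sigma^2)
var_cf = mu + sigma * norm.ppf(alpha)
es_cf = mu + sigma * norm.pdf(norm.ppf(alpha)) / (1 - alpha)

# Monte Carlo check: empirical quantile and mean of losses beyond the VaR level
rng = np.random.default_rng(0)
L = rng.normal(mu, sigma, size=1_000_000)
var_mc = np.quantile(L, alpha)
es_mc = L[L >= var_mc].mean()

print(f"VaR  closed-form {var_cf:.1f}  Monte Carlo {var_mc:.1f}")
print(f"ES   closed-form {es_cf:.1f}  Monte Carlo {es_mc:.1f}")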
Use of VaR to compute regulatory capital
• Regulatory capital calculation for trading book of a bank
RC(t) = max{ VaR0.99t,10 ; k · (1/60) ∑60j=1 VaR0.99t−j+1,10 } + CSR,
where VaR0.99t,10 denotes the 10-day VaR at confidence level 0.99 computed on day t,
3 ≤ k ≤ 4 (determined by the regulator), and CSR is a component for specific risk.
VaR and ES
• ESα is an example of coherent risk measure.
• VaRα is not coherent (as it does not satisfy convexity).
An example through a picture: Figure 2.1.
2.5 Transformations of random variables
Quantile transformation of a uniform random variable
Proposition
Let U ∼U(0,1) (uniform distribution on the interval (0,1)). Let X be a random variable with
quantile function qX (α) (note the alternative notation). Set Y = qX (U). Then, the random
variable Y and the random variable X have the same distribution. That is,
FY (x) = FqX (U)(x) = FX (x), for all x ∈ R.
This proposition is very useful for simulating from a given distribution.
Transformation of a random variable by its CDF
Proposition
Let X be a random variable with the cumulative distribution function denoted by FX . Assume
moreover that the CDF FX is a continuous function. Set
U = FX (X)
Then the random variable U has the Uniform distribution U(0,1).
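A short sketch illustrating the two propositions of this section with a hypothetical exponential example: applying the quantile function to U(0,1) draws reproduces the target distribution, and applying the (continuous) CDF to draws from that distribution gives back U(0,1) variables.

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
U = rng.uniform(size=100_000)

# Proposition 1: Y = q_X(U) has the distribution of X (here X is exponential, scale 0.5)
Y = stats.expon.ppf(U, scale=0.5)                            # ppf = quantile function
print(stats.kstest(Y, stats.expon(scale=0.5).cdf).pvalue)    # large p-value expected

# Proposition 2: if F_X is continuous, F_X(X) ~ U(0,1)
X = stats.expon.rvs(scale=0.5, size=100_000, random_state=rng)
V = stats.expon.cdf(X, scale=0.5)
print(stats.kstest(V, "uniform").pvalue)                     # large p-value expected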
2.6 Approaches for computing VaR and ES in practice
Approaches for computing VaR and ES
Here, we will give some "recipes" used by practitioners.
• Variance-Covariance method (also known as normal method)
• Historical estimation
• Monte Carlo simulation
It is important that we understand the underlying assumptions, the main steps, the ad-
vantages and disadvantages of each method. We follow the presentation from the book by
Embrechts, Frey, McNeil.
2.6.1 Variance-covariance method
Variance-Covariance method
Assumption 2.6.1. • Xt+∆ ∼ N(m,Σ) (that is, the random vector of risk factor changes
follows a multivariate Normal distribution with mean vector m ∈ Rd and covariance matrix
Σ)
• The linearized loss Llin is a good approximation for the loss L.
The main steps of this method are:
1. Use historical data to estimate the mean m and the covariance matrix Σ.
Variance-Covariance method
2. Consider the linearized loss Llint+∆. It has the structure (cf. Lectures 1 and 2):
Llint+∆ = −( c + b′Xt+∆ ),
where b is a d-dimensional vector and c is a real constant.
3. As Xt+∆ follows a (multivariate) Normal distribution (by assumption), and as Llint+∆ is
written as a known linear function of Xt+∆, we get
Llint+∆ ∼ N(−c−b′m, b′Σb).
Remark: Note that −c−b′m is a real number and that b′Σb is a (non-negative) real number. Hence,
N(−c−b′m,b′Σb) denotes the (univariate) Normal with mean µ=−c−b′m and variance σ2 = b′Σb.
Variance-Covariance method
4. Compute VaRα and ESα by using the explicit formulas established in the Gaussian
framework.
Variance-Covariance method
• Advantages:
– analytic method (no simulation)
– easy to implement
• Disadvantages (weaknesses):
– the linearized loss may not offer good approximation for the loss
– normality of risk factors changes may not be a realistic assumption
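A compact sketch of the four steps for the stock-portfolio example of Section 2.2, where the linearised loss is −b′Xt+∆ with b having components λiSit and c = 0 (the "historical" data below are simulated, purely for illustration):

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)

# Hypothetical history of daily log-returns for d = 3 stocks (n = 500 days)
X_hist = rng.multivariate_normal(mean=[0.0002, 0.0001, 0.0],
                                 cov=[[1e-4, 2e-5, 0.0],
                                      [2e-5, 4e-4, 5e-5],
                                      [0.0, 5e-5, 9e-4]], size=500)

lam = np.array([100.0, 50.0, 200.0])   # shares held (hypothetical)
S_t = np.array([30.0, 120.0, 15.0])    # current prices (hypothetical)
b = lam * S_t                          # sensitivity of V to the log-prices; c = 0 here
alpha = 0.99

# Step 1: estimate the mean vector and covariance matrix from the history
m = X_hist.mean(axis=0)
Sigma = np.cov(X_hist, rowvar=False)

# Steps 2-3: L^lin ~ N(-b'm, b' Sigma b)
mu_L = -b @ m
sigma_L = np.sqrt(b @ Sigma @ b)

# Step 4: closed-form VaR and ES in the Gaussian framework
VaR = mu_L + sigma_L * norm.ppf(alpha)
ES = mu_L + sigma_L * norm.pdf(norm.ppf(alpha)) / (1 - alpha)
print(VaR, ES)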
2.6.2 Historical estimation
Historical Estimation
Idea: This method uses historical data for the risk factor changes and an estimate for the
probability distribution of the loss based on the empirical distribution.
No assumptions are made on the probability distribution of X: this is a non-parametric
method.
Historical Estimation
We place ourselves at t. The main steps of this method are:
• Collect a historical dataset Xt−(n+1)∆, . . . ,Xt−2∆,Xt−∆,Xt . Remark: Note that n cor-
responds to the size of the data set. Each data point in the data set is a vector of
dimension d.
• Use the above data to "get" a dataset of losses:
L˜t = l[t](Xt), L˜t−∆ = l[t](Xt−∆), . . . , L˜t−(n+1)∆ = l[t](Xt−(n+1)∆),
where l[t] is the loss operator at time t (cf. Lecture 1).
Remark: L˜t−∆ corresponds to the portfolio loss that we would have if the risk factor changes
observed between t−2∆ and t−∆ were to recur.
• Use this dataset to compute an empirical distribution estimate (e.g. "histogram estimate")
of the loss probability distribution.
Historical Estimation
• Compute an estimate for VaRα(Lt+∆) and ESα(Lt+∆).
An estimator for VaRα(Lt+∆) can be obtained as follows:
– Order the data points of the set from the smallest loss to the greatest loss: L˜(1) ≤
L˜(2) ≤ . . .≤ L˜(n).
– An estimator for VaRα(Lt+∆) is given by the empirical quantile qα(L̃), which can be
approximated by the order statistic L̃(⌈nα⌉), where ⌈nα⌉ denotes the smallest integer
greater than or equal to nα.
• How can an estimator for ESα(Lt+∆) be obtained?
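A sketch of the historical estimator for the stock portfolio of Section 2.2, whose loss operator is l[t](x) = −∑di=1 λiSit(exp(xi) − 1); the data are simulated here purely for illustration, and the ES estimate uses one common choice, the average of the ordered losses from position ⌈nα⌉ onwards.

import numpy as np

rng = np.random.default_rng(3)

# Hypothetical history of risk-factor changes (daily log-returns, n = 500, d = 3)
X_hist = rng.standard_t(df=5, size=(500, 3)) * 0.01

lam = np.array([100.0, 50.0, 200.0])   # shares held (hypothetical)
S_t = np.array([30.0, 120.0, 15.0])    # current prices (hypothetical)
alpha = 0.99

def loss_operator(x):
    """l[t](x) = -sum_i lambda_i S_t^i (exp(x_i) - 1) for the stock portfolio."""
    return -np.sum(lam * S_t * (np.exp(x) - 1.0), axis=-1)

L_tilde = loss_operator(X_hist)             # historically simulated losses
L_sorted = np.sort(L_tilde)                 # L_(1) <= ... <= L_(n)
n = len(L_sorted)
k = int(np.ceil(n * alpha))                 # index ceil(n * alpha)

VaR_hat = L_sorted[k - 1]                   # empirical alpha-quantile estimate
ES_hat = L_sorted[k - 1:].mean()            # average of losses at/above the VaR estimate
print(VaR_hat, ES_hat)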
Historical Estimation
• Advantages:
– Easy to implement.
– No assumption about distribution and dependence structure of risk-factors.
• Disadvantages:
– In this approach, "the worst case is never worse than what has happened in the
past".
– Need of a large sample of relevant historical data to get reliable estimates of VaRα
and ESα; missing data can cause problems.
2.6.3 Monte Carlo simulation
Monte Carlo simulation
Monte Carlo simulation (in risk management) is a general name for any approach which
involves simulation from an explicit parametric model for the risk factor changes.
Main steps of this approach:
• Choose a d-dimensional parametric model for the vector of risk factor changes X
• Use historical data Xt−(n+1)∆, . . . ,Xt−2∆,Xt−∆,Xt of past risk factor changes to calibrate
the d-dimensional model.
• Use the calibrated model to simulate M (possible) values of risk factor changes for the
next period (that is, the period from t to t+∆):
X̃1t+∆, X̃2t+∆, . . . , X̃Mt+∆.
• Use the simulated risk factor changes to simulate losses over the period from t to t+∆,
using the loss operator l[t].
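A sketch of these steps with one hypothetical modelling choice, a multivariate normal for the risk-factor changes, calibrated by its sample mean and covariance and pushed through the stock-portfolio loss operator of Section 2.2 (all data simulated for illustration):

import numpy as np

rng = np.random.default_rng(4)

# "Historical" risk-factor changes used for calibration (simulated here for illustration)
X_hist = rng.multivariate_normal([0.0, 0.0, 0.0],
                                 [[1e-4, 2e-5, 0.0],
                                  [2e-5, 4e-4, 5e-5],
                                  [0.0, 5e-5, 9e-4]], size=500)

lam = np.array([100.0, 50.0, 200.0])
S_t = np.array([30.0, 120.0, 15.0])
alpha, M = 0.99, 100_000

def loss_operator(x):
    return -np.sum(lam * S_t * (np.exp(x) - 1.0), axis=-1)

# Calibrate the chosen parametric model (here: multivariate normal) to the history
m_hat = X_hist.mean(axis=0)
Sigma_hat = np.cov(X_hist, rowvar=False)

# Simulate M scenarios for the next period and revalue the portfolio with l[t]
X_sim = rng.multivariate_normal(m_hat, Sigma_hat, size=M)
L_sim = loss_operator(X_sim)

VaR_mc = np.quantile(L_sim, alpha)
ES_mc = L_sim[L_sim >= VaR_mc].mean()
print(VaR_mc, ES_mc)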
Monte Carlo simulation
• Advantages: We can choose the number of simulations M ourselves (M can be chosen
much bigger than n).
• Disadvantages: simulation can be computationally intensive; "results are as good as is
the model used" (think of model risk).
2.7 Risk measures in practice: Scaling
Losses over several periods: "Scaling"
Up to now, we considered losses and risk measures on one period (from t to t+∆).
Question: How to express the loss over the next k periods in terms of the risk factors
changes over each individual time period?
The loss Lt+k∆ from time t till t+ k∆ (that is, over the next k periods) can be written
Lt+k∆ =−(Vt+k∆−Vt) (2.1)
=−( f (t+ k∆,Zt+k∆)− f (t,Zt)) (2.2)
=−( f (t+ k∆,Zt +Xt+k∆)− f (t,Zt)). (2.3)
Notice that
Xt+k∆ = Xt+∆ + X(t+∆)+∆ + X(t+2∆)+∆ + . . . + X(t+(k−1)∆)+∆,
where X(t+i∆)+∆ denotes the risk factor change from time t+ i∆ to time t+ i∆+∆.
A very specific case: square-root scaling
We make the following assumptions:
• The portfolio is time homogeneous (meaning that the function f does not depend explic-
itly on time t).
• The risk-factor change vectors Xt+∆, X(t+∆)+∆,X(t+2∆)+∆, . . . are independent and iden-
tically distributed.
• Each of these vectors has the multivariate normal distribution Nd(0,Σ).
Under these assumptions, we have
• ∑ki=1 Xt+i∆ ∼ Nd(0, kΣ) (due to the independence assumption)
• The linearized loss Llint+k∆ follows N(0, kσ2), where σ2 = b′Σb.
• VaRα(Lt+k∆) ≈ VaRα(Llint+k∆) = √k σ VaRα(Y) and ESα(Lt+k∆) ≈ ESα(Llint+k∆) = √k σ ESα(Y),
where Y ∼ N(0,1).
• Compare to VaRα(Llint+∆) = σ VaRα(Y) and ESα(Llint+∆) = σ ESα(Y).
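A one-line numerical illustration of the square-root scaling rule under the assumptions above (σ, α and k are hypothetical):

import numpy as np
from scipy.stats import norm

sigma, alpha, k = 1000.0, 0.99, 10               # hypothetical 1-day loss std dev, level, horizon
var_1day = sigma * norm.ppf(alpha)
var_kday = np.sqrt(k) * sigma * norm.ppf(alpha)
print(var_1day, var_kday, var_kday / var_1day)   # ratio equals sqrt(10)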
2.8 Back-testing and stress-testing
Back-testing of VaRα and ESα
Idea: Back-testing consists in checking whether the realized losses are consistent with
the corresponding VaRα or ESα produced by the model. Consider the following situation.
Consider a time horizon ∆ of one day. At time t, we "estimate" ("predict") VaRα(Lt+∆) by
using our favorite method (variance-covariance, historical, Monte Carlo simulation). We obtain
a number, call it vt+∆. At time t+∆, we know the realization lt+∆ of the loss Lt+∆.
So, at time t+∆, we have the opportunity of comparing our one period prediction with what
actually happened. More precisely, we can check whether the realized loss lt+∆ is greater than
or smaller than vt+∆. If lt+∆ > vt+∆, we call this an exception.
Back-testing of VaRα
• Recall that if the CDF of the loss is continuous (and increasing), then P(Lt+∆ > vt+∆) =
1−α.
• We expect that if our modelling assumptions and computation method are reasonable,
the daily loss should exceed the predicted VaRα number v on about (1−α)×100 out of
every 100 days, on average. If α = 99%, this means that, on average, the number of
"exceptions" in every 100 days should be about 1%×100 = 1.
• Statistical test... to come in your coursework.
Back-testing of ESα
• Estimate ESα(Lt+∆) and call this estimate et+∆.
• To back-test the predicted ESα, we look at the difference lt+∆ − et+∆ on days on which
vt+∆ was exceeded, that is, on those days where there is an exception. We expect this
difference to be close to zero, on average (think why?).
• Regulators require banks to perform back-test.
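A sketch of a simple back-test on simulated data (losses and predictions below are hypothetical): count the VaR exceptions over a window, compare with the expected number (1−α)·N (one natural statistical check is a binomial test on the number of exceptions), and, for ES, average the difference between the realised loss and the predicted ES on exception days.

import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
alpha, N = 0.99, 250                       # confidence level, number of back-test days

# Hypothetical realized daily losses and the model's VaR / ES predictions
losses = rng.normal(0.0, 1000.0, size=N)
var_pred = np.full(N, 1000.0 * stats.norm.ppf(alpha))
es_pred = np.full(N, 1000.0 * stats.norm.pdf(stats.norm.ppf(alpha)) / (1 - alpha))

exceptions = losses > var_pred
n_exc = exceptions.sum()
print(f"exceptions: {n_exc}, expected about {(1 - alpha) * N:.1f}")

# Are the exceptions consistent with Binomial(N, 1 - alpha)?
print("binomial test p-value:", stats.binomtest(int(n_exc), N, 1 - alpha).pvalue)

# ES back-test: mean of (loss - predicted ES) on exception days, expected close to zero
if n_exc > 0:
    print("mean ES shortfall error:", (losses[exceptions] - es_pred[exceptions]).mean())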
Scenario analysis and stress-testing
Models might not assign enough probability to scenarios in which many things go wrong at
the same time (e.g. the normal distribution can lead to an understatement of risk).
• Scenario analysis is a (necessary) complement to value-at-risk.
• Scenarios are based on combination of predetermined “stress shocks”.
Scenarios can be based on economic insight:
• Extreme historical events.
• Hypothetical (imagined) future events.
Lecture 3
Copulas
3.1 Modelling dependence with Copulas
Copulas: Motivation
We saw that the variance-covariance method for computation of VaRα and ESα assumes
that the risk factor change vector Xt+∆ follows a multivariate normal distribution.
In a number of cases, this assumption is not realistic.
How to specify a model of Xt+∆? How to describe (mathematically) dependence between
the components of the random vector Xt+∆?
Let us first do some revision from probability.
3.1.1 Multi-variate distributions - revision
Revision
Let X= (X1,X2, . . . ,Xd)′ be a d-dimensional random vector.
For instance, the vector of risk factor changes over one time period.
The joint CDF of (X1,X2, . . . ,Xd)′ is defined by: for x= (x1,x2, . . .xd)′ ∈ Rd ,
FX(x) = FX(x1,x2, . . . ,xd) = P(X1 ≤ x1,X2 ≤ x2, . . . ,Xd ≤ xd).
The marginal CDF of an individual component Xi is defined by: for x ∈ R,
FXi(x) = P(Xi ≤ x).
Revision (joint CDF)
Let us recall some properties of the joint CDF. For simplicity, we look at the bivariate case
(d = 2).
Let
F(X1,X2)(x1,x2) = P(X1 ≤ x1, X2 ≤ x2).
The joint CDF satisfies the following properties:
• FX1(x) = F(X1,X2)(x,∞), for any x ∈ R
• FX2(x) = F(X1,X2)(∞,x), for any x ∈ R
Revision (joint CDF)
The joint CDF satisfies the following properties:
• (rectangular inequality) For a1,a2,b1,b2, such that a1 ≤ b1, a2 ≤ b2, we have:
P(a1 < X1 ≤ b1,a2 < X2 ≤ b2) =
F(X1,X2)(b1,b2)−F(X1,X2)(a1,b2)−F(X1,X2)(b1,a2)+F(X1,X2)(a1,a2)
≥ 0
• X1 and X2 are independent if and only if
F(X1,X2)(x1,x2) = FX1(x1)FX2(x2), for all (x1,x2) ∈ R2
If we know the joint CDF , we can easily deduce all the marginals. More precisely, we have
the following property:
FXi(x) = F(∞, . . . ,∞, x,∞, . . . ,∞),
where the above denotes the limit when the components (with the exception of the ith compo-
nent) tend to ∞.
However, the knowledge of the marginals is in general not enough to obtain the joint CDF.
Question: What more do we need to know, besides the marginal CDFs, in order to characterize
the joint CDF?
Answer: a COPULA.
3.1.2 Copulas: what are they?
Copulas: general ideas
• In a joint CDF, dependence between the components is usually implicit.
• The role of copulas is to disentangle dependence and marginal distributions.
• A copula “describes” the dependence structure (a copula “couples” the joint distribution
function to its marginal distribution functions).
• Very useful in Risk Management for modelling the dependence among components of a
random vector of risk factor changes.
Copulas: general ideas
Why are copulas interesting? (Fischer, Encyclopedia of Statistics)
• they represent a way of studying (scale-free) measures of dependence
• they are a starting point for constructing families of multivariate distributions, sometimes
with a view to simulation.
Let us start with dimension 2, that is, d = 2.
3.1.3 2-dimensional copulas
Definition of 2-dimensional copula
Definition
A 2-dimensional copula is a joint CDF on [0,1]2 whose marginals are uniform distributions on
[0,1].
Characterization of 2-dimensional copula
A 2-dimensional copula is a function C : [0,1]2→ [0,1] with the following properties
• u1 ↦ C(u1,u2) is non-decreasing and u2 ↦ C(u1,u2) is non-decreasing.
• C(u1,1) = u1 and C(1,u2) = u2 (that is, C has standard uniform marginals).
• For u1,u2,v1,v2 ∈ [0,1] such that u1 ≤ v1 and u2 ≤ v2, it holds
C(v1,v2)−C(u1,v2)−C(v1,u2)+C(u1,u2)≥ 0
Conversely, to check that a given function C : [0,1]2→ [0,1] is a (2-dimensional) copula, it is
enough to check the three properties above.
Example 1
Let U1 and U2 be two random variables such that
• U1 and U2 are U(0,1), and
• U1 and U2 are independent.
Compute the (joint) CDF of (U1,U2). Answer:
F(U1,U2)(u1,u2) = u1u2.
This is a copula, which is known as the independence copula.
The independence copula is sometimes denoted by Π. So, Π(u1,u2) = u1u2.
Example 2
In this example we consider two random variables which are equal, and follow U(0,1).
Compute the (joint) CDF of (U,U), where U follows U(0,1).
Answer:
F(U,U)(u1,u2) =min(u1,u2).
This is a copula, which is known as the co-monotonicity copula.
The co-monotonicity copula is sometimes denoted by M. So, M(u1,u2) =min(u1,u2).
Example 3
In this example we consider two random variables (U1,U2) such that (U1,U2) = (U,1−U),
where U follows U(0,1).
Note that, if U follows U(0,1), then 1−U follows U(0,1). Hence, both marginals are
uniform.
Compute the (joint) CDF of (U,1−U), where U follows U(0,1).
Answer:
F(U,1−U)(u1,u2) =max(u1+u2−1,0).
This is a copula, which is known as the counter-monotonicity copula.
The counter-monotonicity copula is sometimes denoted by W . So, W (u1,u2) =max(u1+
u2−1,0).
3.1.4 d-dimensional copulas
Definition of d-dimensional copula
Definition
A d-dimensional copula is a joint CDF on [0,1]d whose marginals are uniform distributions
on [0,1].
Characterization of d-dimensional copula
A d-dimensional copula is a function C : [0,1]d → [0,1] with the following properties
• ui ↦ C(u1, . . . ,ud) is non-decreasing, for every i = 1, . . . ,d.
• C(1, . . . ,1, ui, 1, . . . ,1) = ui, for every ui ∈ [0,1], for every i = 1, . . . ,d.
• For all (a1, . . . ,ad),(b1, . . . ,bd) ∈ [0,1]d such that a1 ≤ b1, a2 ≤ b2, . . . ,ad ≤ bd , it holds
∑2i1=1 · · · ∑2id=1 (−1)i1+···+id C(u1(i1), · · · ,ud(id)) ≥ 0,
where u j(1) = a j and u j(2) = b j, for j ∈ {1, . . . ,d}.
Characterization of d-dimensional copula (continued)
Conversely, to check that a given function C : [0,1]d → [0,1] is a (d-dimensional) copula, it is
enough to check the three properties of the previous slide.
3.1.5 Sklar’s theorem
Abe Sklar’s Theorem
Sklar’s Theorem (1959)
1) Let F be a given joint CDF. Let F1, . . . ,Fd denote its marginal CDFs.
• Then, there exists a copula, denoted by C, such that, for all (x1, · · · ,xd) ∈ Rd ,
F(x1, · · · ,xd) =C(F1(x1), . . . ,Fd(xd)). (3.1)
• If the marginal CDFs are continuous functions, then C is unique. Otherwise, C is
uniquely determined (only) on Range(F1)×·· ·×Range(Fd).
2) Conversely, let now F1, . . . ,Fd be given univariate CDFs. And, let C be a given copula.
Then, the function F defined by (3.1) is a joint CDF, with marginal CDFs F1, . . . ,Fd .
Copulas are not unique in the discontinuous case (Example)
Consider a bivariate Bernoulli (X1,X2) with
P(X1 = 0,X2 = 0) = 1/8 P(X1 = 0,X2 = 1) = 2/8
P(X1 = 1,X2 = 0) = 2/8 P(X1 = 1,X2 = 1) = 3/8
Then
F1(x) = F2(x) = 0 for x < 0,   3/8 for 0 ≤ x < 1,   1 for x ≥ 1.
A copula C that joins F1, F2 and F needs to satisfy
C(0,0) =C(0,3/8) =C(0,1) =C(3/8,0) =C(1,0) = 0,
C(3/8,1) =C(1,3/8) = 3/8,
C(3/8,3/8) = 1/8.
Only the last condition is a constraint. Hence, there are infinitely many copulas that satisfy
it.
3.2 Properties of copulas
3.2.1 Invariance properties
First invariance property
Property (The "core" invariance property)
Let (X1, . . . ,Xd)′ be a random vector, with continuous marginals denoted by F1, . . . ,Fd . Then,
the vectors (X1, . . . ,Xd)′ and (F1(X1), . . . ,Fd(Xd))′ have the same copula.
Second invariance property
Copulas are invariant under increasing transformations. More precisely, we have
Property (Invariance under increasing transformations)
Let (X1, . . . ,Xd)′ be a random vector with continuous marginal CDFs and a copula C. Let
T1,T2, . . . ,Td be strictly increasing functions. Then, the random vector (T1(X1), . . . ,Td(Xd))′
also has the copula C.
Examples of Applications
Example 1
Let X1 and X2 be two random variables such that
• X1 and X2 have continuous CDF.
• X1 and X2 are independent.
Question: What is the copula of the random vector (X1,X2)′?
Answer: Let us first notice that there exists a copula, by the first part of Sklar’s theorem. The
copula is unique, by the second part of Sklar’s theorem (as the marginal CDFs are continuous).
We will show that the copula of (X1,X2) is the independence copula Π.
Examples of Applications
Example 2
Let X be a random variable with continuous CDF. Let T be a strictly increasing
function.
Show that the random vector (X ,T (X))′ has the co-monotonicity copula M.
Can there be more copulas for the vector (X ,T (X))′?
Examples of Applications
Example 3
Let X be a random variable with continuous CDF. Let T be a strictly decreasing
function.
Show that the random vector (X ,T (X))′ has the counter-monotonicity copula W.
Can there be more copulas for the vector (X ,T (X))′?
3.2.2 Fréchet-Hoeffding bounds
Fréchet-Hoeffding bounds (dimension d = 2)
Fréchet-Hoeffding bounds (dimension d = 2)
Let C be a two dimensional copula. Then, for all (u1,u2) ∈ [0,1]2,
max{u1+u2−1,0} ≤C(u1,u2)≤min{u1,u2} .
Remark: In dimension 2, the upper bound and the lower bound can be attained. In other
words, the upper bound and the lower bound are copulas.
Fréchet-Hoeffding bounds (dimension d ≥ 3)
Fréchet-Hoeffding bounds (dimension d ≥ 3)
Let d ≥ 3. Let C be a d-dimensional copula. Then, for all (u1,u2, . . . ,ud) ∈ [0,1]d ,
max{ ∑di=1 ui + 1 − d, 0 } ≤ C(u1, . . . ,ud) ≤ min{u1, . . . ,ud}.
Remark: For the case d ≥ 3, the upper bound can be attained. In other words, the upper
bound is a copula. However, the lower bound cannot be attained (in other words, the lower
bound is not a copula in d ≥ 3).
3.3 Further examples of copulas
Further examples
We have seen some examples of explicit copulas: independence copula, co-monotonicity
copula, counter-monotonicity copula. These are copulas which have an explicit expression
(an expression in closed form).
Let us now present some further examples of explicit copulas:
• Gumbel Copula and Clayton Copula
• Archimedean copulas
3.3.1 Archimedean copulas
Gumbel Copula
For simplicity, we will consider the case d = 2. Let θ be a given parameter in [1,∞).
Definition (Gumbel copula)
The (bivariate) Gumbel Copula (with parameter θ ∈ [1,∞)) is the function CGuθ : [0,1]2→ [0,1]
defined by
CGuθ(u1,u2) = exp{ −((−log u1)^θ + (−log u2)^θ)^(1/θ) }.
• For θ= 1, we obtain the independence copula.
• For θ→ ∞, we obtain the co-monotonicity copula.
Interpretation: For θ ∈ [1,∞), the Gumbel copulas "interpolate" between the independence
and the co-monotonicity copula, where θ is interpreted as "the strength of dependence".
Let now θ be a given parameter in (0,∞). Note that 0 is excluded.
Clayton Copula
Definition (Clayton copula)
The (bivariate) Clayton Copula (with parameter θ ∈ (0,∞)) is the function CClθ : [0,1]2→ [0,1]
defined by
CClθ(u1,u2) = (u1^(−θ) + u2^(−θ) − 1)^(−1/θ).
• For θ→ 0, we get the independence copula.
• For θ→ ∞, we get the co-monotonicity copula.
Interpretation: Clayton copulas "interpolate" between different copulas, namely, independence
and co-monotonicity copulas, where θ is interpreted as "the strength of dependence".
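A small sketch implementing the two bivariate families above (parameter values hypothetical), with quick checks of the limiting behaviour mentioned in the text:

import numpy as np

def gumbel_copula(u1, u2, theta):
    """Bivariate Gumbel copula, theta >= 1."""
    return np.exp(-((-np.log(u1)) ** theta + (-np.log(u2)) ** theta) ** (1.0 / theta))

def clayton_copula(u1, u2, theta):
    """Bivariate Clayton copula, theta > 0."""
    return (u1 ** (-theta) + u2 ** (-theta) - 1.0) ** (-1.0 / theta)

u1, u2 = 0.3, 0.7
print(gumbel_copula(u1, u2, 1.0), u1 * u2)             # theta = 1: independence copula
print(gumbel_copula(u1, u2, 50.0), min(u1, u2))        # large theta: close to co-monotonicity
print(clayton_copula(u1, u2, 50.0), min(u1, u2))       # large theta: close to co-monotonicity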
Gumbel and Clayton copulas belong to a bigger class (or family) of copulas, known as
Archimedean copulas.
Let us define this class of copulas. Again, we will restrict ourselves to the case d = 2.
(Bivariate) Archimedean copulas
Let ψ : [0,1]→ [0,∞] be a function such that ψ is decreasing, continuous, convex, with ψ(1) =
0. A function satisfying these properties will be called a generator.
Define a function C : [0,1]2→ [0,1] by setting
C(u1,u2) = ψ−1(ψ(u1)+ψ(u2)) if ψ(u1)+ψ(u2) ≤ ψ(0), and C(u1,u2) = 0 otherwise.
Theorem 3.3.1. The function C defined above is a copula.
Remark: If ψ(0) = ∞, then ψ is called a strict generator. In this case, C(u1,u2) =
ψ−1(ψ(u1)+ψ(u2)), for all (u1,u2) ∈ [0,1]2.
Gumbel copula as an example of Archimedean copula
Recall that the Gumbel copula (with parameter θ) is defined by
CGuθ(u1,u2) = exp{ −((−log u1)^θ + (−log u2)^θ)^(1/θ) },
where θ ∈ [1,∞). We set ψ(u) = (−log u)^θ, with the convention log(0) = −∞. Let us check
that ψ is a generator. We have
• ψ is non-negative.
• ψ is continuous.
• ψ′(u) = −(θ/u)(−log u)^(θ−1). Hence, ψ′ < 0. Hence, ψ is decreasing.
• ψ′′(u)≥ 0. Hence, ψ is convex.
• ψ(1) = 0
Hence, ψ is a generator.
Moreover, ψ(0) = ∞. Hence, ψ is a strict generator.
Let us compute the inverse ψ−1. We have: ψ−1(t) = exp(−t^(1/θ)).
Finally, we check that
CGuθ(u1,u2) = ψ−1(ψ(u1)+ψ(u2)).
We conclude that the Gumbel copula (with parameter θ) belongs to the family of Archimedean
copulas.
Clayton copula as an example of Archimedean copula
Recall that the Clayton copula (with parameter θ) is defined by
CClθ(u1,u2) = (u1^(−θ) + u2^(−θ) − 1)^(−1/θ),
where θ ∈ (0,∞). We set ψ(u) = (u^(−θ) − 1)/θ. Let us check that ψ is a generator. We have
• ψ is non-negative.
• ψ is continuous.
• ψ′(u) = −u^(−(θ+1)) < 0. Hence, ψ′ < 0. Hence, ψ is decreasing.
• ψ′′(u)≥ 0. Hence, ψ is convex.
• ψ(1) = 0
Hence, ψ is a generator.
Moreover, ψ(0) = ∞. Hence, ψ is a strict generator.
Let us compute the inverse ψ−1. We have: ψ−1(t) = (θt+1)^(−1/θ).
Finally, we check that
CClθ(u1,u2) = ψ−1(ψ(u1)+ψ(u2)).
We conclude that the Clayton copula (with parameter θ) belongs to the family of Archimedean
copulas.
Copulas for d > 2
Let ψ be a generator of an Archimedean copula and d ≥ 2. Then
C(u1,u2, . . . ,ud) = ψ−1( ∑di=1 ψ(ui) ) if ∑di=1 ψ(ui) ≤ ψ(0), and C(u1,u2, . . . ,ud) = 0 otherwise.
Gumbel copula: θ ∈ [1,∞)
CGuθ(u1, . . . ,ud) = exp{ −((−log u1)^θ + . . . + (−log ud)^θ)^(1/θ) }.
Clayton copula: θ ∈ (0,∞)
CClθ(u1, . . . ,ud) = (u1^(−θ) + . . . + ud^(−θ) − d + 1)^(−1/θ).
There are other well-known examples of copulas belonging to the Archimedean family (such
as Frank copula,...).
All examples of copulas seen up to now were examples of explicit copulas (that is, copulas
having an explicit expression in closed form).
Let us now discuss some examples of copulas which are derived (deduced) from well-
known multivariate distributions. Some of these copulas cannot be obtained in an explicit
closed form.
Two major examples of implicit copulas are: Gauss copulas and t-copulas.
3.3.2 Implicit Copulas
Gauss copula
We will give the general definition (in dimension d ≥ 2). Let R be a d× d correlation
matrix. Let (X1, . . . ,Xd)′ be a random vector following the multivariate normal Nd(0,R).
Definition (Gauss copula)
The Gauss copula CGaR : [0,1]d → [0,1] is defined by
CGaR(u1, . . . ,ud) = P(Φ(X1) ≤ u1, . . . ,Φ(Xd) ≤ ud) = ΦR(Φ−1(u1), . . . ,Φ−1(ud)),
where (as usual) Φ denotes the CDF of a standard (univariate) normal, and ΦR denotes the
CDF of the random vector (X1, . . . ,Xd)′.
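A sketch evaluating the Gauss copula numerically in dimension d = 2 (the correlation is hypothetical), using the relation CGaR(u1,u2) = ΦR(Φ−1(u1),Φ−1(u2)):

import numpy as np
from scipy.stats import norm, multivariate_normal

rho = 0.5                                    # hypothetical correlation
R = np.array([[1.0, rho], [rho, 1.0]])

def gauss_copula(u, R):
    """Gauss copula C(u) = Phi_R(Phi^{-1}(u_1), ..., Phi^{-1}(u_d))."""
    z = norm.ppf(u)
    return multivariate_normal(mean=np.zeros(len(u)), cov=R).cdf(z)

print(gauss_copula([0.3, 0.7], R))
# Sanity check: with R = identity the Gauss copula reduces to u1 * u2
print(gauss_copula([0.3, 0.7], np.eye(2)), 0.3 * 0.7)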
Some special Gauss Copulas
• If R is the identity matrix, then CGaR =Π (the independence copula)
• If R is the d×d matrix with every entry equal to 1, then CGaR = M (the co-monotonicity
copula).
• We place ourselves in dimension d = 2. If R is the 2×2 matrix with 1 on the diagonal and
−1 off the diagonal, then CGaR = W (the counter-monotonicity copula).
Remark: Note that the last statement cannot be generalised to dimensions greater than 2 (recall that W is not a copula for d > 2).
A second example of implicit copulas will be given by the so-called t-copulas (or Student’s
t-copulas). Before introducing the t-copulas, we need some revision on the multivariate t-
distribution (or multivariate Student’s t-distribution).
Multivariate t-distribution (Revision)
Let X = (X1,...,Xd)′ be a random vector. We say that X has a multivariate t-distribution with parameters ν (the degrees of freedom) and R, denoted by X ∼ t_d(ν,R), if the random vector X can be written as

X = Y / √(Z/ν),

where
• Y is a random vector following N_d(0,R),
• Z is a (univariate) random variable following the χ²(ν) distribution, and
• Y and Z are independent.
Remark: Note that R is the covariance matrix of Y, but not the covariance matrix of X (for ν > 2, the covariance matrix of X is (ν/(ν−2))R).
Student’s t-copula
Definition 3.3.2 (Student's t-copula). The Student's t-copula (with given parameters ν ∈ {1,2,...} and a correlation matrix R), C^St_{ν,R} : [0,1]^d → [0,1], is defined by

C^St_{ν,R}(u1,...,ud) = t_{ν,R}( t_ν^{−1}(u1), ..., t_ν^{−1}(ud) ),

where t_ν denotes the CDF of a (univariate) standard Student t distribution with ν degrees of freedom, and t_{ν,R} denotes the joint CDF of (X1,...,Xd)′ ∼ t_d(ν,R).
Remark: Student’s t distribution is often called t-distribution. Student’s t-copula is often
called t-copula.
Some special t-copulas
• If R is the matrix of all ones (r_{i,j} = 1 for all i, j), then C^St_{ν,R} = M.
• For d = 2, if

R = (  1  −1
      −1   1 ),

then C^St_{ν,R} = W.
• If R is the identity matrix, then C^St_{ν,R} ≠ Π, that is, we DO NOT get the independence copula.
3.4 Meta distributions
Before introducing the notion of meta-distribution, let us recall the second part of Sklar’s theo-
rem:
2) Let F1, F2,..., Fd be univariate CDFs and let C be a copula. Then the function F defined by

F(x1,...,xd) = C(F1(x1),...,Fd(xd))

is a joint CDF with marginal CDFs F1,...,Fd.
This second part of Sklar’s theorem gives us a way of constructing a joint CDF from given
marginal CDFs and a given copula C.
Meta-distribution
A meta-distribution is a joint distribution constructed from a given copula and given arbitrary
marginal CDFs, using the second part of Sklar’s theorem.
Examples
• meta-Gauss distribution: this is a joint distribution constructed from the Gauss copula
and arbitrary marginals.
• meta-t distribution: this is a joint distribution which has the Student's t-copula C^St_{ν,R} and arbitrary marginals.
• meta-Clayton distribution: This is a joint distribution which has Clayton copula with pa-
rameter θ and arbitrary marginals.
We know, by the first part of Sklar’s theorem, that every joint CDF "contains" a copula.
Question: How to constructively "extract" a copula from a given joint CDF?
Here, “constructively” means via a procedure that we can implement, so that, in particular, we can generate samples from a meta distribution.
To answer this question, let us first recall two important properties (in dimension 1).
Revision
Proposition 1 (Quantile transformation of a uniform random variable)
Let U ∼U(0,1) (uniform distribution on the interval (0,1)). Let X be a random variable with
quantile function qX . Set
Y = qX (U).
Then, the random variable Y and the random variable X have the same distribution. That is, F_Y(x) = F_{q_X(U)}(x) = F_X(x) for all x ∈ R.
Revision
Proposition 2
Let X be a (univariate) random variable. Let FX be the cumulative distribution function of the
r.v. X . Assume moreover that the CDF FX is a continuous function. Set
U = FX (X).
Then, the random variable U has the uniform distribution U(0,1).
Proposition 3
Assume that F is a continuous CDF with a quantile function qF . Then
• qF is strictly increasing,
• F(qF(u)) = u for u ∈ (0,1).
See Proposition A.3 (ii) and (viii) in McNeil et al. (on Minerva as “Properties of CDFs and
quantiles...”).
3.5 How to "extract" a copula from a given joint distribution?
In the following procedure, we assume that all the marginal CDFs F1, F2,..., Fd are continuous. We follow a two-step procedure:
• Step 1: Simulate a random vector (X1,...,Xd)′ from the given multivariate joint distribution.
• Step 2: Construct a random vector (U1,...,Ud)′ by setting U1 = F1(X1), U2 = F2(X2), ..., Ud = Fd(Xd).
Then, the joint CDF C of the random vector (U1,...,Ud)′ is the copula of the random vector (X1,...,Xd)′.
Example 1: Gauss Copula (in dimension d)
Let R be a given d×d correlation matrix. We apply the previous procedure to the case of
the d-dimensional Gauss copula.
• Step 1: Simulate a random vector (X1,...,Xd)′ from the d-dimensional multivariate normal distribution N_d(0,R).
• Step 2: Construct a random vector (U1,...,Ud)′ by setting U1 = Φ(X1), U2 = Φ(X2), ..., Ud = Φ(Xd).
Then, the joint CDF of the vector (U1,...,Ud)′ is C^Ga_R (by construction) and it is the copula of the vector (X1,...,Xd)′.
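A minimal Python sketch of this two-step extraction (added here as an illustration; the correlation matrix is an arbitrary choice): the rows of U below are, approximately, samples from C^Ga_R.

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
R = np.array([[1.0, 0.7],
              [0.7, 1.0]])                            # arbitrary correlation matrix

# Step 1: simulate X ~ N_d(0, R) using the Cholesky factor of R
X = rng.standard_normal((100_000, 2)) @ np.linalg.cholesky(R).T

# Step 2: transform each component by the standard normal CDF
U = norm.cdf(X)

# Sanity check: the marginals of U are (approximately) uniform on (0,1)
print(U.mean(axis=0))                                 # both entries close to 0.5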
Example 2: Student’s t-Copula
Let R be a given d×d correlation matrix.
• Step 1: Simulate a random vector (X1,...,Xd)′ from the d-dimensional Student's t-distribution with ν degrees of freedom and correlation matrix R.
• Step 2: Construct a random vector (U1,...,Ud)′ by setting U1 = t_ν(X1), U2 = t_ν(X2), ..., Ud = t_ν(Xd) (where, as usual, t_ν denotes the CDF of the univariate Student t distribution with ν degrees of freedom).
Then, the joint distribution of the vector (U1,...,Ud)′ is C^St_{ν,R} (by construction) and it is the copula of (X1,...,Xd)′.
3.6 Simulation from a meta distribution
Question: How to simulate from a meta-distribution?
• Implicit copulas: we can do this by adding a third step to the previous two-step procedure.
• Archimedean copulas: we can do it by considering conditional distributions.
3.6.1 Simulation from meta distributions with an implicit copula
How to simulate from the meta-Gauss distribution?
Let R be a given d×d correlation matrix. Let G1, G2,..., Gd be given univariate CDFs (for example, exponential CDFs with parameters λ1, λ2,..., λd).
We wish to simulate from the meta-Gauss distribution with copula C^Ga_R and marginal CDFs given by G1, G2,..., Gd.
We apply the previous two-step procedure to extract the d-dimensional Gauss copula C^Ga_R, and then we apply an additional third step which will allow us to obtain the desired marginals G1, G2,..., Gd.
The procedure becomes:
• Step 1: Simulate a random vector (X1,...,Xd)′ from the d-dimensional multivariate normal distribution N_d(0,R).
• Step 2: Construct a random vector (U1,...,Ud)′ by setting U1 = Φ(X1), U2 = Φ(X2), ..., Ud = Φ(Xd).
Then, the joint CDF of the vector (U1,...,Ud)′ is C^Ga_R.
• Step 3: Apply quantile transformations to get the vector

Y = (q1(U1),...,qd(Ud))′,

where q1,...,qd denote the quantile functions of G1,...,Gd.
Then, the vector Y has the desired copula C^Ga_R (this holds under the assumption that q1,...,qd are strictly increasing; think about why), and Y has the desired marginal CDFs G1,...,Gd. A Python sketch of this procedure is given below.
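The three steps (added here as an illustration; the correlation matrix, the exponential marginals and their parameters are arbitrary choices, following the example mentioned above):

import numpy as np
from scipy.stats import norm, expon

rng = np.random.default_rng(1)
R = np.array([[1.0, 0.6],
              [0.6, 1.0]])                            # arbitrary correlation matrix
rates = [0.5, 2.0]                                    # rates of the exponential marginals G_1, G_2

# Step 1: X ~ N_d(0, R)
X = rng.standard_normal((100_000, 2)) @ np.linalg.cholesky(R).T

# Step 2: U = (Phi(X_1), ..., Phi(X_d)) has joint CDF C^Ga_R
U = norm.cdf(X)

# Step 3: apply the quantile functions of the desired marginals
Y = np.column_stack([expon.ppf(U[:, i], scale=1.0 / rates[i]) for i in range(2)])

# Y has the Gauss copula C^Ga_R and Exp(rate_i) marginals
print(Y.mean(axis=0))                                 # close to (1/0.5, 1/2.0) = (2.0, 0.5)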
3.6.2 Simulation from an Archimedean copula
Algorithm based on a conditional distribution
A copula C is the CDF of a random vector (U1,U2) whose marginals are U(0,1). Hence, we have the following algorithm for simulating from this CDF:
• Step 1: Simulate u1 ∼U(0,1).
• Step 2: Simulate u2 with the CDF

F_{u1}(x2) = P(U2 ≤ x2 | U1 = u1).

By Proposition 1, Step 2 can be done in the following way:

u2 = q_{u1}(ũ2),

where ũ2 ∼ U(0,1) and q_{u1} is the quantile function of F_{u1}.
Conditional distribution
Assume that u1 ↦ C(u1,x2) is differentiable for each x2. Then, we have

F_{u1}(x2) = P(U2 ≤ x2 | U1 = u1)
           = lim_{h↓0} P(U2 ≤ x2 | u1 ≤ U1 ≤ u1 + h)
           = lim_{h↓0} P(U2 ≤ x2, u1 < U1 ≤ u1 + h) / P(u1 < U1 ≤ u1 + h)
           = lim_{h↓0} ( C(u1 + h, x2) − C(u1, x2) ) / h
           = ∂C/∂u1 (u1, x2),

where the second equality uses P(U1 = u1) = 0, and the third equality uses that U1 ∼ U(0,1) (so P(u1 < U1 ≤ u1 + h) = h) and that C is the joint CDF of (U1,U2) (so P(U2 ≤ x2, u1 < U1 ≤ u1 + h) = C(u1 + h, x2) − C(u1, x2)).
Conditional distribution for a Clayton copula
The Clayton copula is given by the formula

C(u1,u2) = (u1^{−θ} + u2^{−θ} − 1)^{−1/θ}.

Then

∂C/∂u1 (u1,u2) = (u1^{−θ} + u2^{−θ} − 1)^{−(1+θ)/θ} u1^{−θ−1},

and the solution in u2 of

y = (u1^{−θ} + u2^{−θ} − 1)^{−(1+θ)/θ} u1^{−θ−1}

is

u2 = ( u1^{−θ}( y^{−θ/(1+θ)} − 1 ) + 1 )^{−1/θ}.
Algorithm for sampling from the Clayton copula
• Step 1: Simulate u1 ∼ U(0,1).
• Step 2: Simulate y ∼ U(0,1) and set

u2 = ( u1^{−θ}( y^{−θ/(1+θ)} − 1 ) + 1 )^{−1/θ}.

• Output: (u1, u2).
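A minimal Python sketch of this algorithm (added here as an illustration; θ = 2 is an arbitrary choice):

import numpy as np

def sample_clayton(n, theta, rng=None):
    """Sample n pairs (u1, u2) from the Clayton copula by the conditional method."""
    rng = rng if rng is not None else np.random.default_rng()
    u1 = rng.uniform(size=n)                                          # Step 1
    y = rng.uniform(size=n)                                           # Step 2
    u2 = (u1 ** (-theta) * (y ** (-theta / (1.0 + theta)) - 1.0) + 1.0) ** (-1.0 / theta)
    return u1, u2

u1, u2 = sample_clayton(100_000, theta=2.0, rng=np.random.default_rng(2))
print(u1.mean(), u2.mean())       # both marginals are U(0,1), so the means are close to 0.5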
Algorithm for sampling from a meta-Clayton distribution
We want to sample from a distribution with marginals G1, G2 joined by a Clayton copula.
• Step 1: Simulate u1 ∼ U(0,1).
• Step 2: Simulate y ∼ U(0,1) and set

u2 = ( u1^{−θ}( y^{−θ/(1+θ)} − 1 ) + 1 )^{−1/θ}.

• Step 3: Set x1 = q1(u1) and x2 = q2(u2), where q1 is the quantile function of G1 and q2 is the quantile function of G2.
Compare this to the simulation from the meta-Gauss distribution.
Conditional distribution for an Archimedean copula
Consider an Archimedean copula with a strict generator ψ, i.e., ψ(0) = ∞. Then the copula is given by

C(u1,u2) = ψ^{-1}(ψ(u1) + ψ(u2)).

Assume further that ψ is differentiable. We have

∂C/∂u1 (u1,u2) = ψ′(u1) / ψ′(C(u1,u2)).
3.7 Measuring dependence
Dependence measures
There are various dependence "measures":
• Linear correlation (or Pearson’s correlation coefficient)
• Rank correlations (Spearman’s rho and Kendall’s tau)
• Tail dependence
We will see that each of the dependence measures has some advantages and disadvan-
tages.
3.7.1 Linear correlation
Linear correlation (Pearson’s correlation coefficient)
Definition
Let X1 and X2 be two (univariate) random variables such that var(X1) < ∞ and var(X2) < ∞.
The linear correlation of X1 and X2 is defined by
ρ(X1,X2) = cov(X1,X2) / √( var(X1) var(X2) ).
The linear correlation satisfies the following properties:
(1) −1≤ ρ(X1,X2)≤ 1.
(2) ρ(X1,X2) = ρ(X2,X1) (property of symmetry).
(3) If X2 = α + βX1 (with β ≠ 0), then ρ(X1,X2) = ±1 (the sign being that of β). Otherwise, −1 < ρ(X1,X2) < 1.
Properties of the linear correlation coefficient
(4) The linear correlation is invariant under increasing linear transformations: If β1,β2 > 0,
then ρ(α1+ β1X1,α2+ β2X2) = ρ(X1,X2). But, the linear correlation is not invariant
under general (non-linear) increasing transformations.
(5) If X1 and X2 are independent, then ρ(X1,X2) = 0. But, ρ(X1,X2) = 0 does not in general
imply that X1 and X2 are independent.
Properties of the linear correlation coefficient
Caveat: We have to pay attention to the following:
(6) Let the marginal CDFs be given and let a linear correlation coefficient ρ ∈ [−1,1] be given. This is not enough to determine the joint distribution uniquely.
(7) In general, not all correlation values in [−1,1] are attainable, given the marginal CDFs.
Illustration for statement (6)
Let us consider two bivariate models.
• Model 1: The vector (X1,X2)′ follows N2(0, I2) (the bivariate normal with the identity
covariance matrix).
• Model 2: (Y1,Y2) = (X1,VX1), where V is a random variable independent of X1 with
P(V = 1) = P(V =−1) = 0.5
Then, we can show that:
• The marginal CDFs in both models are the same: that is, F_{Y1} = F_{X1} and F_{Y2} = F_{X2}.
• The correlation coefficient in both models is 0. But:
• Model 1 has the Gauss copula CGaI2 .
• Model 2 has copula C, where
C(u1,u2) = 0.5min{u1,u2}+0.5max{u1+u2−1,0}.
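A quick simulation (added here as an illustration, not part of the original notes) makes the point concrete: both models have standard normal marginals and (approximately) zero sample correlation, yet the dependence structures are completely different.

import numpy as np

rng = np.random.default_rng(3)
n = 100_000

# Model 1: (X1, X2) ~ N_2(0, I_2)
x1, x2 = rng.standard_normal(n), rng.standard_normal(n)

# Model 2: (Y1, Y2) = (X1, V * X1), with P(V = 1) = P(V = -1) = 0.5 and V independent of X1
v = rng.choice([-1.0, 1.0], size=n)
y1, y2 = x1, v * x1

print(np.corrcoef(x1, x2)[0, 1])             # approximately 0
print(np.corrcoef(y1, y2)[0, 1])             # approximately 0 as well
print(np.mean(np.abs(y2) == np.abs(y1)))     # 1.0: in Model 2, |Y2| = |Y1| always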
More about statement (7)
Proposition (Attainable correlations)
Let (X1,X2)′ be a random vector with marginal CDFs F1 and F2. We assume moreover that
0< var(X1)< ∞ and 0< var(X2)< ∞.
• The set of possible linear correlations is a closed interval [ρmin,ρmax] with ρmin < 0 <
ρmax.
• ρmin =−1 if and only if X2 is a decreasing linear function of X1.
• ρmax = 1 if and only if X2 is an increasing linear function of X1.
3.7.2 Rank correlation coefficients
Rank correlation coefficients
We place ourselves in dimension d = 2.
Spearman’s rho
Let X1 and X2 be two random variables with marginal CDFs F1 and F2. Spearman’s rank
correlation coefficient (or just Spearman’s rho) between X1 and X2, denoted by ρS, is defined
by
ρS(X1,X2) = ρ(F1(X1),F2(X2)),
where ρ is the (usual) linear correlation coefficient.
Rank correlation coefficients
We place ourselves in dimension d = 2.
Kendall’s tau
Let (X1,X2)′ and (X̃1,X̃2)′ be two independent and identically distributed random vectors (that is, (X̃1,X̃2)′ is an independent copy of (X1,X2)′). Kendall's rank correlation coefficient (or just Kendall's tau) between X1 and X2, denoted by ρτ, is defined by

ρτ(X1,X2) = P( (X1 − X̃1)(X2 − X̃2) > 0 ) − P( (X1 − X̃1)(X2 − X̃2) < 0 )
          = E[ sign( (X1 − X̃1)(X2 − X̃2) ) ],

where

sign(x) = 1 if x > 0,  0 if x = 0,  −1 if x < 0.
Properties of Spearman’s rho
We have:
• −1≤ ρS(X1,X2)≤ 1.
• If X1 and X2 are independent, then ρS(X1,X2) = 0.
• ρS(X1,X2) = 0 does not, in general, imply that X1 and X2 are independent.
• If the vector (X1,X2)′ has continuous marginal CDFs F1 and F2 and (unique) copula C,
then
– ρS is invariant under increasing transformations.
– ρS(X1,X2) = 1 if and only if (X1,X2)′ has the co-monotonicity copula.
– ρS(X1,X2) =−1 if and only if (X1,X2)′ has the counter-monotonicity copula.
Properties of Kendall’s tau
We have:
• −1≤ ρτ ≤ 1.
• If X1 and X2 are independent, then ρτ(X1,X2) = 0.
• ρτ(X1,X2) = 0 does not, in general, imply that X1 and X2 are independent.
• If the vector (X1,X2)′ has continuous marginal CDFs F1 and F2 and (unique) copula C,
then
– ρτ is invariant under increasing transformations.
– ρτ(X1,X2) = 1 if and only if (X1,X2)′ has the co-monotonicity copula.
– ρτ(X1,X2) =−1 if and only if (X1,X2)′ has the counter-monotonicity copula.
Properties of the rank correlation coefficients (continued)
Proposition 3.7.1. If X1, X2 have continuous marginal CDFs (and unique copula C), then

ρτ(X1,X2) = 4 ∫_0^1 ∫_0^1 C(u1,u2) dC(u1,u2) − 1,

ρS(X1,X2) = 12 ∫_0^1 ∫_0^1 ( C(u1,u2) − u1 u2 ) du1 du2.
Remark: By applying this proposition, we see that, in the case of continuous marginal
CDFs, the coefficients ρτ and ρS do not depend on the marginal CDFs (they depend only on
the copula C).
Properties of the rank correlation coefficients (continued)
Proposition 3.7.2 (The case of Archimedean copulas). Let X1 and X2 be random variables with an Archimedean copula C generated by a generator ψ. Then, Kendall's tau of X1 and X2 can be computed by:

ρτ(X1,X2) = 1 + 4 ∫_0^1 ψ(u)/ψ′(u) du.
Linear correlation vs. Rank correlations
• Marginal distributions and pairwise rank correlations do not fully determine the joint distribution.
• However, for any choice of continuous marginal distributions it is possible to specify a
bivariate distribution with the desired rank coefficient in [−1,1].
3.7.3 Tail dependence
Tail dependence coefficients
We place ourselves in the case d=2.
• Coefficients of tail dependence are "measures" of pairwise dependence based on copu-
las.
• Coefficients of tail dependence "measure" extremal dependence.
Definition (Tail dependence coefficients)
Let (X1,X2)′ be a random vector with marginal CDFs F1 and F2 and quantile functions q1 and
q2.
• The coefficient of upper tail dependence λupper is defined by the following limit (if the limit exists in [0,1]):

λupper(X1,X2) = lim_{α↑1} P( X2 > q2(α) | X1 > q1(α) ).

• The coefficient of lower tail dependence λlower is defined by the following limit (if the limit exists in [0,1]):

λlower(X1,X2) = lim_{α↓0} P( X2 ≤ q2(α) | X1 ≤ q1(α) ).
Tail dependence coefficients
• If λupper(X1,X2) = 0, we speak about asymptotic upper tail independence.
• If λlower(X1,X2) = 0, we speak about asymptotic lower tail independence.
We can show that:
If the marginal CDFs F1 and F2 are continuous, then

• λupper(X1,X2) = lim_{α↑1} ( 1 − 2α + C(α,α) ) / ( 1 − α ),

• λlower(X1,X2) = lim_{α↓0} C(α,α) / α,

where C denotes the unique copula of (X1,X2)′.
The coefficients of tail dependence can be computed explicitly in the case of a number of
well-known copulas.
Coefficients of tail dependence for Gumbel Copula
Recall the Gumbel copula with parameter θ:

C^Gu_θ(u1,u2) = exp( −[ (−ln(u1))^θ + (−ln(u2))^θ ]^{1/θ} ).

Note that C^Gu_θ(α,α) = α^{2^{1/θ}}. We have

λupper = lim_{α↑1} ( 1 − 2α + C^Gu_θ(α,α) ) / ( 1 − α ) = 2 − lim_{α↑1} 2^{1/θ} α^{2^{1/θ} − 1} = 2 − 2^{1/θ},

λlower = lim_{α↓0} C^Gu_θ(α,α)/α = lim_{α↓0} α^{2^{1/θ} − 1} = 0.

Interpretation: Gumbel copulas exhibit upper tail dependence and asymptotic lower tail independence.
Coefficients of tail dependence for Clayton Copula
The Clayton copula with parameter θ:

C^Cl_θ(u1,u2) = (u1^{−θ} + u2^{−θ} − 1)^{−1/θ}.

We have

λlower = lim_{α↓0} C^Cl_θ(α,α)/α = lim_{α↓0} (2α^{−θ} − 1)^{−1/θ} / α = lim_{α↓0} (2 − α^θ)^{−1/θ} = 2^{−1/θ},

λupper = 0.

Interpretation: Clayton copulas exhibit lower tail dependence and asymptotic upper tail independence.
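These limits can be checked numerically (an illustration added here; θ = 2 is an arbitrary choice): evaluate (1 − 2α + C(α,α))/(1 − α) for α close to 1 and C(α,α)/α for α close to 0, and compare with 2 − 2^{1/θ} and 2^{−1/θ}.

import numpy as np

theta = 2.0
C_gu_diag = lambda a: np.exp(-(2.0 * (-np.log(a)) ** theta) ** (1.0 / theta))   # C^Gu_theta(a, a)
C_cl_diag = lambda a: (2.0 * a ** (-theta) - 1.0) ** (-1.0 / theta)             # C^Cl_theta(a, a)

alpha = 1.0 - 1e-6
print((1.0 - 2.0 * alpha + C_gu_diag(alpha)) / (1.0 - alpha), 2.0 - 2.0 ** (1.0 / theta))  # Gumbel lambda_upper
alpha = 1e-6
print(C_cl_diag(alpha) / alpha, 2.0 ** (-1.0 / theta))                                     # Clayton lambda_lower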
Coefficients of tail dependence for Gaussian copula
We have λupper = λlower because of radial symmetry. We can show that (for ρ < 1)

λupper = λlower = 2 lim_{x→−∞} Φ( x(1−ρ)/√(1−ρ²) ) = 0.

Interpretation: Gaussian copulas exhibit asymptotic upper and asymptotic lower tail independence.
Coefficients of tail dependence for Student’s t-copula
We have λupper = λlower because of radial symmetry. Moreover, we have

λupper = λlower = 2 t_{ν+1}( −√( (ν+1)(1−ρ) / (1+ρ) ) ).

(Here, t_{ν+1} denotes the CDF of the univariate Student t distribution with ν+1 degrees of freedom.) Remark: We note that the tail dependence coefficient for the t-copula decreases as the degrees of freedom ν increase.
3.8 Calibration of copulas
Why are measures of dependence important?
To calibrate copulas:
• There is a known relationship between dependence measures and copula parameters.
• It is very hard to estimate copula parameters directly from raw data, as the data also reflects the marginal distributions!
Calibration of a Gauss copula
Suppose we assume a meta-Gaussian model for (X1,...,Xd) with copula C^Ga_R and we wish to estimate the correlation matrix R = (r_{i,j})_{i,j=1}^d.
It can be shown (see Thm. 5.36 in McNeil et al.) that

ρS(Xi,Xj) = (6/π) arcsin( r_{i,j}/2 ) ≈ r_{i,j},

where the final approximation is very accurate. This suggests we estimate R by the matrix of pairwise Spearman's rank correlation coefficients.
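A sketch of this calibration in Python (added here as an illustration; the data are simulated from a known meta-Gauss model with standard normal marginals, so the estimate can be compared with the true correlation):

import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(4)
R_true = np.array([[1.0, 0.4],
                   [0.4, 1.0]])

# simulate data from a meta-Gauss model (here with standard normal marginals for simplicity)
X = rng.standard_normal((50_000, 2)) @ np.linalg.cholesky(R_true).T

rho_S, _ = spearmanr(X[:, 0], X[:, 1])       # pairwise Spearman's rho
r_hat = 2.0 * np.sin(np.pi * rho_S / 6.0)    # exact inversion of rho_S = (6/pi) arcsin(r/2)
print(rho_S, r_hat)                          # both close to 0.4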
Calibration of Gumbel and Clayton copula
Recall the proposition: Let X1 and X2 be random variables with an Archimedean copula C with generator ψ. Then, Kendall's tau of X1 and X2 equals

ρτ(X1,X2) = 1 + 4 ∫_0^1 ψ(u)/ψ′(u) du.

Gumbel copula C^Gu_θ:

ρτ(Xi,Xj) = 1 − 1/θ.
Clayton copula C^Cl_θ:

ρτ(Xi,Xj) = θ/(θ+2).
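These relations give a simple method-of-moments calibration: estimate Kendall's tau from the data and invert the corresponding formula. A Python sketch (added here as an illustration; the data are generated with the Clayton sampler from Section 3.6.2 with true θ = 2, so Kendall's tau is approximately 0.5):

import numpy as np
from scipy.stats import kendalltau

rng = np.random.default_rng(5)
theta_true = 2.0

# sample from the Clayton copula by the conditional method (cf. Section 3.6.2)
u1 = rng.uniform(size=20_000)
y = rng.uniform(size=20_000)
u2 = (u1 ** (-theta_true) * (y ** (-theta_true / (1.0 + theta_true)) - 1.0) + 1.0) ** (-1.0 / theta_true)

tau_hat, _ = kendalltau(u1, u2)
theta_clayton = 2.0 * tau_hat / (1.0 - tau_hat)   # invert tau = theta/(theta + 2)
theta_gumbel = 1.0 / (1.0 - tau_hat)              # invert tau = 1 - 1/theta (if a Gumbel copula were assumed)
print(tau_hat, theta_clayton)                     # tau_hat close to 0.5, theta_clayton close to 2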
Estimation of Kendall’s tau
Recall that

ρτ(Xi,Xj) = P( (Xi − X̃i)(Xj − X̃j) > 0 ) − P( (Xi − X̃i)(Xj − X̃j) < 0 ) = E[ sign( (Xi − X̃i)(Xj − X̃j) ) ].

Then, having a dataset (X_{i,n}, X_{j,n}) for n = 1,...,N, we have the following estimator:

ρ̂τ(Xi,Xj) = \binom{N}{2}^{−1} ∑_{1 ≤ m < n ≤ N} sign( (X_{i,m} − X_{i,n})(X_{j,m} − X_{j,n}) ).
Estimation of Spearman’s rho
Recall the definition of Spearman's rho: ρS(Xi,Xj) = ρ(Fi(Xi), Fj(Xj)).
Consider a dataset (X_{i,n}, X_{j,n}) for n = 1,...,N.
Direct estimator: compute the empirical CDFs F̂i and F̂j, and estimate the linear correlation for the dataset

( F̂i(X_{i,n}), F̂j(X_{j,n}) ),  n = 1,...,N.

Rank-based estimator: construct a rank dataset

( rank_i(X_{i,n}), rank_j(X_{j,n}) ),  n = 1,...,N,

and estimate the linear correlation for this dataset.
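Both estimators are readily available in SciPy. The sketch below (added here as an illustration, on simulated bivariate normal data) compares the rank-based estimator of Spearman's rho with scipy.stats.spearmanr and also reports Kendall's tau.

import numpy as np
from scipy.stats import kendalltau, spearmanr, rankdata

rng = np.random.default_rng(6)
X = rng.standard_normal((10_000, 2)) @ np.linalg.cholesky([[1.0, 0.5], [0.5, 1.0]]).T

tau_hat, _ = kendalltau(X[:, 0], X[:, 1])          # estimator of Kendall's tau
rho_S_hat, _ = spearmanr(X[:, 0], X[:, 1])         # estimator of Spearman's rho

# rank-based estimator of Spearman's rho: linear correlation of the ranks
rho_S_ranks = np.corrcoef(rankdata(X[:, 0]), rankdata(X[:, 1]))[0, 1]

print(tau_hat, rho_S_hat, rho_S_ranks)             # the two Spearman estimates agree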
Lecture 4
Credit risk
Credit Risk and Credit Risk Management
Definition (Credit Risk)
Credit risk is the risk that the value of a portfolio changes due to default of a counterparty or
unexpected changes (downgrades) in the credit quality of a counterparty.
Default = the counterparty cannot honour a financial commitment, for example, repay the
debt.
This is a relevant risk component in all portfolios.
4.1 Types of credit risk models
Static and dynamic models of credit risk
Static models:
• Credit standing of a counterparty is assessed at the end of the period over which the
loss distribution is calculated
• Used in credit risk management.
Dynamic models:
• An action is performed at the time when the counterparty defaults.
• Used in modelling and valuation of credit-risk derivatives.
We will mainly focus on static models for credit risk and credit risk management.
Modelling credit risk
Overview of the main types of models:
• Reduced form models:
– Mixture models (Bernoulli mixture and Poisson mixture)
• Structural models (or Firm-value models):
– univariate threshold models and Merton’s model
– multivariate threshold models
Challenges in credit risk modelling:
• lack of public data
• skewed loss distribution
• dependence of defaults
Overview of types of credit risk models
• Reduced form models: the mechanism leading to default is not specified. In static reduced form models the occurrence of a default is usually modelled by a Bernoulli random variable.
– Mixture models: Defaults occur independently given the values of common (random) factors. Hence, unconditionally, defaults are not independent.
• Structural models: Default occurs when the value of some random variable (e.g. the value of the company's assets) falls below a certain threshold.
Notation
• We consider a portfolio of m obligors (where m is a positive integer).
• T is a fixed time horizon.
• We will focus on the binary outcomes of default and non-default (we will ignore down-
grading of the credit ranking of obligors).
• We will denote by Y1,Y2, . . . ,Ym the default indicator random variables of obligors 1, 2,
. . . ,m, defined by:
Yi = 1, if obligor i defaults,
Yi = 0, if obligor i does not default.
• We denote by M the random variable corresponding to the number of obligors which
default, that is,
M = Y1+Y2+ . . .+Ym.
4.2 Mixture models
Mixture models
• These are static reduced-form models.
• The mechanism leading to default is left unspecified.
• There are K common economic factors.
• The default risk of each obligor is assumed to depend on the common (random) eco-
nomic factors.
• Given a realisation of the factors, defaults of individual obligors are assumed to be inde-
pendent.
• Examples:
– Bernoulli mixture model.
– Poisson mixture model.
4.2.1 Bernoulli mixture model
Bernoulli mixture model
Definition 4.2.1. The random vector Y= (Y1,Y2 . . . ,Ym)′ follows a Bernoulli mixture model if
• there is a K-dimensional random vector (of risk factors) Ψ = (Ψ1,...,ΨK)′, and
• for every i = 1,...,m, there is a function pi : R^K → [0,1] such that, conditionally on the values of Ψ, the components of Y are independent Bernoulli random variables with

P(Yi = 1 | Ψ = ψ) = pi(ψ),
P(Yi = 0 | Ψ = ψ) = 1 − pi(ψ),

for every i = 1,...,m.
Default of a single obligor
Conditional on the realisation ψ of the common economic factors Ψ we have

P(Yi = 1 | Ψ = ψ) = pi(ψ).

Remark 4.2.2. The default probability p̄i of a single obligor i is

p̄i = P(Yi = 1) = E( pi(Ψ) ).

Defaults of different obligors are not independent - they all depend on the common economic factors Ψ!
Default for multiple obligors
Remark 4.2.3. For y = (y1, y2,..., ym) ∈ {0,1}^m:

• P(Y = y | Ψ = ψ) = ∏_{i=1}^m pi(ψ)^{yi} (1 − pi(ψ))^{1−yi},

• P(Y = y) = E[ ∏_{i=1}^m pi(Ψ)^{yi} (1 − pi(Ψ))^{1−yi} ].

→ Example: 2 companies with default indicators Y1 and Y2 and a 1-dimensional risk factor Ψ.
One-factor model: the homogeneous case
We consider the following particular case:
• Ψ is a univariate random variable, that is, there is only one factor.
• The same function p for all m obligors, that is,
p1 = p2 = . . .= pm = p (homogeneous)
We define the random variable Q by:
• Q= p(Ψ).
Theorem 4.2.4. Conditionally on Q = q, the number of defaults M = ∑_{i=1}^m Yi has a binomial distribution:

P(M = j | Q = q) = \binom{m}{j} q^j (1 − q)^{m−j}.
One-factor model: the homogeneous case
Moreover:
If the random variable Q follows a discrete distribution with values {q1,...,qL}, then

P(M = j) = \binom{m}{j} ∑_{n=1}^L q_n^j (1 − q_n)^{m−j} P(Q = q_n).

If the random variable Q follows a continuous distribution with density g(q), then

P(M = j) = \binom{m}{j} ∫_0^1 q^j (1 − q)^{m−j} g(q) dq.
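As a numerical illustration (added here; the Beta mixing distribution and the parameter values are assumptions made for this example only), the loss distribution P(M = j) in the continuous case can be computed by numerical integration:

import numpy as np
from scipy.integrate import quad
from scipy.stats import beta, binom

m = 100                  # number of obligors
a, b = 1.0, 9.0          # assume Q ~ Beta(a, b), so that E[Q] = a/(a+b) = 0.1

def prob_defaults(j):
    """P(M = j) = binom(m, j) * integral_0^1 q^j (1-q)^(m-j) g(q) dq with g the Beta(a, b) density."""
    integrand = lambda q: binom.pmf(j, m, q) * beta.pdf(q, a, b)
    value, _ = quad(integrand, 0.0, 1.0)
    return value

probs = np.array([prob_defaults(j) for j in range(m + 1)])
print(probs.sum())                        # sanity check: close to 1
print(probs @ np.arange(m + 1))           # E[M] = m * E[Q] = 10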
4.3 Structural models
Structural models (or Firm-value models)
• Threshold models
• Merton model (1974)
4.3.1 Threshold models
Univariate threshold model
univariate = default of one counterparty
The default occurs when the value of a (random) state variable X1 lies below a threshold d1, i.e., the default indicator Y1 is given by

Y1 = 0, if X1 > d1,
Y1 = 1, if X1 ≤ d1.
Examples:
• Merton model: X1 denotes the firm value at time T , d1 = D is the debt.
• Credit rating model: X1 denotes a rating at time T taking values in {0,1,2, . . . ,N} with 0
denoting bankruptcy and d1 = 0.
Different forms of default indicators
Depending on the interpretation of X1, the default indicator Y1 may take the following forms:

Y1 = 1I{X1 ≥ d1}, where X1 may be the debt-to-equity ratio,

or versions with strict inequalities: Y1 = 1I{X1 < d1} or Y1 = 1I{X1 > d1}.
Multivariate threshold model
There are m firms. Default of firm i occurs if Xi ≤ di, i.e., the default indicator is

Yi = 1I{Xi ≤ di},

or one of the forms from the previous slide.
The marginal distributions of the Xi's are linked using a copula C.
Why this new model?
• This model offers an alternative to mixture models.
• Such models have been popular in the industry: CreditMetrics and the KMV model.
Example of a multivariate threshold model
• There are m obligors
• The state variable Xi takes two values: 0 or 1
• Threshold is di = 0
• The dependence between state variables is described by a given copula C
What is the probability that all counterparties default, i.e.,

P(M = m) = ?,

where

M = ∑_{i=1}^m Yi.
CreditMetrics and KMV
• There are m obligors (called firms)
• The state variable Vi represents firm i’s value
• Threshold for firm i is di
• (log(V1), . . . , log(Vm))∼ N(µ,Σ)
Default:

Vi ≤ di ⇐⇒ log(Vi) ≤ log(di),

so there is an equivalent model with state variables Xi = log(Vi) with

(X1,...,Xm) ∼ N(µ,Σ)

and thresholds d̂i = log(di).
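For instance, the probability that all m firms default is P(X1 ≤ d̂1, ..., Xm ≤ d̂m), which in this Gaussian set-up is a multivariate normal CDF. A Python sketch (added here as an illustration; the number of firms, the covariance matrix and the 5% marginal default probabilities are arbitrary assumptions):

import numpy as np
from scipy.stats import multivariate_normal, norm

m = 3
mu = np.zeros(m)                                                  # assumed means of X_i = log(V_i)
Sigma = 0.04 * (0.5 * np.eye(m) + 0.5 * np.ones((m, m)))          # variances 0.04, correlations 0.5
d_hat = norm.ppf(0.05, loc=mu, scale=np.sqrt(np.diag(Sigma)))     # thresholds giving 5% marginal default prob.

p_all_default = multivariate_normal(mean=mu, cov=Sigma).cdf(d_hat)
print(p_all_default)    # much larger than 0.05**3: defaults are positively dependent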
4.3.2 Merton’s model
Merton’s model
Merton’s model is an extension of a univariate threshold model.
• There is one firm (company, obligor).
• T is the fixed time horizon.
• The value of the firm's assets at time t is denoted by V^firm_t.
Debt
• The structure of the firm's debt is simple: it consists of D units of zero-coupon bonds with maturity T which the firm issues at time 0. In other words, the firm owes the amount D to its bond holders.
• At T, there are two situations:
– If V^firm_T > D, the firm does not default: the firm pays D to its bond holders and the residual V^firm_T − D is left for the shareholders.
– If V^firm_T ≤ D, the firm defaults: the firm owes D but can repay only V^firm_T. The bond holders receive V^firm_T, the shareholders receive nothing.
Payoffs
In summary:
• The amount received by the shareholders at time T is

0 · 1I{V^firm_T ≤ D} + (V^firm_T − D) 1I{V^firm_T > D} = (V^firm_T − D)^+.

• The amount received at T by the bondholders is

V^firm_T 1I{V^firm_T ≤ D} + D 1I{V^firm_T > D} = min(D, V^firm_T) = D − max(0, D − V^firm_T) = D − (D − V^firm_T)^+.
Remark: Bondholders are "compensated" for the credit risk by the so-called credit spread (cf.
later in the slides).
Dynamics of firm’s value
In Merton's model, the value of the firm V^firm_t is modelled by a geometric Brownian motion. More precisely,

dV^firm_t = µ_V V^firm_t dt + σ_V V^firm_t dW_t,

where µ_V is the drift parameter, σ_V ≠ 0 is the volatility parameter, and (W_t) is a Brownian motion under P. The explicit solution of the above SDE is:

V^firm_t = V^firm_0 exp( (µ_V − σ_V²/2) t + σ_V W_t ),

i.e.

log(V^firm_t) ∼ N( log(V^firm_0) + (µ_V − σ_V²/2) t, σ_V² t ).
Probability of default
Question: What is the default probability of the firm in this model? In other words, compute

P(V^firm_T ≤ D) = ?

Answer: We use the same type of computations as in the Black-Scholes model to get:

P(V^firm_T ≤ D) = Φ( −( log(V^firm_0 / D) + (µ_V − σ_V²/2) T ) / ( σ_V √T ) ).
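A direct implementation of this formula (added here as an illustration; the parameter values are arbitrary) also shows the effect of the volatility discussed below:

import numpy as np
from scipy.stats import norm

def merton_default_prob(V0, D, mu_V, sigma_V, T):
    """P(V_T^firm <= D) in Merton's model."""
    d = (np.log(V0 / D) + (mu_V - 0.5 * sigma_V ** 2) * T) / (sigma_V * np.sqrt(T))
    return norm.cdf(-d)

print(merton_default_prob(V0=100.0, D=80.0, mu_V=0.08, sigma_V=0.25, T=1.0))   # about 0.14
print(merton_default_prob(V0=100.0, D=80.0, mu_V=0.08, sigma_V=0.40, T=1.0))   # higher volatility, higher default probability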
Probability of default v. parameters
Question: How is the default probability affected by changes in the parameters?
Answer: We can see from the formula for the default probability that:
• The default probability increases, when the debt D increases.
• The default probability decreases, when the initial value of the firm V f irm0 (at time 0)
increases.
• The default probability decreases, when the drift parameter µ_V increases (upward tendency in the dynamics of the process (V^firm_t)).
• If V^firm_0 > D and µ_V ≥ σ_V²/2, then, when the volatility parameter σ_V increases, the default probability increases (a solvent firm becomes more likely to fall below D when its value is more volatile).
Derivatives on firm’s value
By using the tools developed for the Black-Scholes option pricing model, we can price derivative contracts in Merton's model when the underlying is the value of the firm's assets V^firm_T.
More specifically, let r be the risk-free interest rate (where r > 0).
Let X = h(V^firm_T) be a pay-off payable at T (with h deterministic). We have:

price_0(X) = E_Q( e^{−rT} h(V^firm_T) ),

where Q denotes the risk-neutral probability measure in this model.
Risk neutral measure Q
Under the risk-neutral measure the discounted firm's value (V̂_t) is a Q-martingale, i.e.,

V̂_t = e^{−rt} V^firm_t

and for any t ≥ s ≥ 0 we have

E_Q( V̂_t | F_s ) = V̂_s,

where the filtration (F_t) is generated by the Brownian motion W_t.
Furthermore,

dV^firm_t = r V^firm_t dt + σ_V V^firm_t dW̃_t,

where (W̃_t) is a Q-Brownian motion.
Defaultable zero-coupon bond
A defaultable zero-coupon bond with maturity T and face value D issued by the firm is a bond which
• pays off D at the maturity T if the firm has not defaulted;
• pays off V^firm_T at the maturity T if the firm has defaulted.
The amount received at T by the bondholders is

V^firm_T 1I{V^firm_T ≤ D} + D 1I{V^firm_T > D} = min(D, V^firm_T) = D − (D − V^firm_T)^+.

Notice that the amount received at T for a non-defaultable zero-coupon bond would be D.
Price of the defaultable zero-coupon bond
The payoff to the bondholders at time T is

D − (D − V^firm_T)^+.

Its price at time 0 is:

E_Q( e^{−rT} ( D − (D − V^firm_T)^+ ) )                                  (4.1)
  = e^{−rT} D − E_Q( e^{−rT} (D − V^firm_T)^+ ) = e^{−rT} D − p^BS_0,    (4.2)

where p^BS_0 denotes the Black-Scholes price at time 0 of a put option on V^firm_T with strike D and maturity T.
Price of put option on the firm’s value
The price of the put is computed identically as in the Black-Scholes model of the stock market:

p^BS_0 = e^{−rT} D Φ(−d2) − V^firm_0 Φ(−d1),

where

d1 = ( log(V^firm_0 / D) + (r + σ_V²/2) T ) / ( σ_V √T ),
d2 = d1 − σ_V √T.
Price per unit of debt
Hence, the pay-off at T of one unit of the defaultable zero-coupon bond is

(1/D) ( D − (D − V^firm_T)^+ ) = 1 − (1/D) (D − V^firm_T)^+.

The price at 0 of one unit of the defaultable zero-coupon bond, denoted by P(0,T), is

P(0,T) = (1/D) ( e^{−rT} D − p^BS_0 ) = e^{−rT} − (1/D) p^BS_0.

For comparison:
The pay-off at T of one unit of the risk-free zero-coupon bond is 1. The price at 0 of one unit of the risk-free zero-coupon bond, denoted by P^free(0,T), is e^{−rT}.
Credit spread
Definition (Credit spread)
The credit spread at time 0 (for maturity T ), denoted by Spread(0,T ), is defined by:
Spread(0,T) = −(1/T) ( log(P(0,T)) − log(P^free(0,T)) ) = −(1/T) log( P(0,T) / P^free(0,T) ),

where
• P^free(0,T) is the price at time 0 of the default-free zero-coupon bond, and
• P(0,T) is the price at time 0 of the defaultable zero-coupon bond.
Credit spread in Merton’s model
In Merton’s model, we can compute:
Spread(0,T ) =− 1
T
log
(
Φ(d2)+
V f irm0
DP f ree(0,T )
Φ(−d1)
)
.
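The chain of formulas (put price, defaultable bond price, credit spread) is straightforward to implement. The sketch below (added here as an illustration; the parameter values are arbitrary) also checks numerically that the two expressions for the spread agree.

import numpy as np
from scipy.stats import norm

V0, D, r, sigma_V, T = 100.0, 80.0, 0.03, 0.25, 1.0    # arbitrary example parameters

d1 = (np.log(V0 / D) + (r + 0.5 * sigma_V ** 2) * T) / (sigma_V * np.sqrt(T))
d2 = d1 - sigma_V * np.sqrt(T)

p_put = np.exp(-r * T) * D * norm.cdf(-d2) - V0 * norm.cdf(-d1)   # Black-Scholes put on V_T^firm
P_def = np.exp(-r * T) - p_put / D                                # defaultable zero-coupon bond (unit face value)
P_free = np.exp(-r * T)                                           # risk-free zero-coupon bond

spread = -(1.0 / T) * np.log(P_def / P_free)
spread_direct = -(1.0 / T) * np.log(norm.cdf(d2) + V0 / (D * P_free) * norm.cdf(-d1))
print(spread, spread_direct)    # the two expressions coincide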
Questions:
• Why is there a logarithm in the definition of the spread?
• How to derive the formula for the credit spread?