
© University of London 2018

UL18/0326 Page 1 of 6 D0

~~ST3133_ZA_2016_d0

This paper is not to be removed from the Examination Hall

UNIVERSITY OF LONDON ST3133 ZA

BSc degrees and Diplomas for Graduates in Economics, Management, Finance

and the Social Sciences, the Diplomas in Economics and Social Sciences

Advanced Statistics: Distribution Theory

Monday, 14 May 2018: 10:00 to 12:00

Candidates should answer all FOUR questions: QUESTION 1 of Section A (40 marks)

and all THREE questions from Section B (60 marks in total). Candidates are strongly

advised to divide their time accordingly.

A calculator may be used when answering questions on this paper and it must comply

in all respects with the specification given with your Admission Notice. The make and

type of machine must be clearly stated on the front cover of the answer book.

PLEASE TURN OVER

Section A

Answer all three parts of question 1 (40 marks in total)

1. (a) Let g(x) be a function taking on integer values of x, with

$$g(x) = \begin{cases} a, & x = -2, -1; \\ 3a, & x = 0, 1; \\ 4a, & x = 2, 3; \\ 0, & \text{otherwise.} \end{cases}$$

i. Find a so that g(x) is a probability mass function. [3 marks]

ii. Let X be a discrete random variable with probability mass function g(x).

Find E(X) and Var(X). [5 marks]

iii. Write down the probability mass function for $Y = X^2 - 2|X| + 1$. [4 marks]

(b) The cumulative distribution function FX(x) for the continuous random variable

X is defined by

$$F_X(x) = \begin{cases} 0, & x < 0; \\ ax^3/3, & 0 \le x < 1; \\ (a(x-2)^3 + 2a)/3, & 1 \le x < 2; \\ 1, & x \ge 2. \end{cases}$$

i. Find the value of a. [2 marks]

ii. Derive the probability density function of X. [3 marks]

iii. Let $W = X^2$. Derive the cumulative distribution function of $W$. Hence, derive the probability density function of $W$. [7 marks]


(c) Let X follow an exponential distribution with rate λ, i.e., X has a density

function

$$f_X(x) = \begin{cases} \lambda e^{-\lambda x}, & x > 0; \\ 0, & \text{otherwise.} \end{cases}$$

i. Derive the moment generating function of X. [3 marks]

ii. Let Y be an independent and identically distributed copy of X. For w > 0,

show that

$$P(X - Y \le w) = 1 - \frac{e^{-\lambda w}}{2}.$$

(Hint: find the joint density of X and Y first. Determine the valid region in the double integral involved.) [5 marks]

iii. For w ≤ 0, show that

$$P(X - Y \le w) = \frac{e^{\lambda w}}{2}.$$

[5 marks]

iv. Using parts ii and iii of question (c), show that the density function of W = X − Y is given by

$$f_W(w) = \frac{\lambda e^{-\lambda |w|}}{2}, \quad w \in \mathbb{R}.$$

[3 marks]


Section B

Answer all three questions in this section (60 marks in total)

2. The conditional density of a random variable X given Y = y is given by

$$f_{X|Y}(x \mid y) = \begin{cases} 3x^2/y^3, & 0 < x < y < 3; \\ 0, & \text{otherwise.} \end{cases}$$

The conditional density of Y given X = x is given by

$$f_{Y|X}(y \mid x) = \begin{cases} 3y^2/(27 - x^3), & 0 < x < y < 3; \\ 0, & \text{otherwise.} \end{cases}$$

(a) Find the ratio $f_X(x)/f_Y(y)$, where $f_X(x)$ and $f_Y(y)$ are the marginal densities of X and Y, respectively. [2 marks]

(b) By integrating out x first in the answer in (a), show that

$$f_Y(y) = \begin{cases} 2y^5/243, & 0 < y < 3; \\ 0, & \text{otherwise.} \end{cases}$$

Is X independent of Y ? Justify your answer. [9 marks]

(c) Let U = XY and V = X/Y . Derive the joint density for U, V , and carefully

state the region for (U, V ) where this joint density is non-zero. [9 marks]


3. If X is Gamma distributed with parameters α and β, i.e., X ∼ Gamma(α, β),

then it has density

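$$f_X(x) = \frac{\beta^{\alpha}}{\Gamma(\alpha)} x^{\alpha-1} e^{-\beta x}, \quad x > 0,$$

and $\Gamma(\alpha) = \int_0^{\infty} y^{\alpha-1} e^{-y}\,dy$ for $\alpha > 0$.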

(a) If X ∼ Gamma(α1, β), Y ∼ Gamma(α2, β), and X is independent of Y ,

derive the distribution of X + Y . You may use the moment generating

function of a Gamma random variable without proof, as long as you state

it clearly. [7 marks]

(b) Let Xi ∼ Gamma(α, β), i = 1, . . . , N , be independent of each other and

α, β > 0. Each Xi is also independent of N , which is Poisson distributed

with mean µ, so that the probability mass function for N is given by

$$p_N(n) = \frac{\mu^n e^{-\mu}}{n!}, \quad n = 0, 1, \ldots.$$

Consider the random variable

$$W = \sum_{i=1}^{N} X_i,$$

with the convention that W = 0 if N = 0.

i. Derive the moment generating function of W. [8 marks]

ii. Find the mean of W. You can use the means of a Poisson and a Gamma

random variable without proof. If you use any standard results about

random sums, you must first state them clearly. [5 marks]


4. Suppose we have a biased coin, which comes up heads with probability u. An

experiment is carried out so that X is the number of independent flips of the

coin required for r heads to show up, where r ≥ 1 is known.

(a) Show that the probability mass function of X is

$$p_X(x) = \begin{cases} \binom{x-1}{r-1} u^r (1-u)^{x-r}, & x = r, r+1, \ldots; \\ 0, & \text{otherwise.} \end{cases}$$

[5 marks]

(b) Suppose U is uniformly distributed on (0, 1), and the distribution in part

(a) becomes

$$p_{X|U}(x \mid u) = \begin{cases} \binom{x-1}{r-1} u^r (1-u)^{x-r}, & x = r, r+1, \ldots; \\ 0, & \text{otherwise.} \end{cases}$$

i. Find the marginal probability mass function for X. You can use

$$\int_0^1 y^a (1-y)^b\,dy = \frac{a!\,b!}{(a+b+1)!}$$

for non-negative integers a and b without proof. [6 marks]

ii. Show that the density of U | X = x is given by

$$f_{U|X}(u \mid x) = \frac{(x+1)!}{r!\,(x-r)!}\, u^r (1-u)^{x-r}, \quad 0 < u < 1.$$

Hence find the mean of U | X = x. [5 marks]

(c) Another independent experiment is carried out, with Y denoting the number of independent flips of the coin required for r heads to show up (the same r as for the first experiment).

State (no need for a derivation) the density of U | (X, Y) = (x, y) and its mean, where U is still uniformly distributed on (0, 1) as in part (b). [4 marks]

END OF PAPER


UL18/0327 Page 1 of 7 D0

~~ST3133_ZA_2016_d0

This paper is not to be removed from the Examination Hall

UNIVERSITY OF LONDON ST3133 ZB

BSc degrees and Diplomas for Graduates in Economics, Management, Finance

and the Social Sciences, the Diplomas in Economics and Social Sciences

Advanced Statistics: Distribution Theory

Monday, 14 May 2018: 10:00 to 12:00

Candidates should answer all FOUR questions: QUESTION 1 of Section A (40 marks)

and all THREE questions from Section B (60 marks in total). Candidates are strongly

advised to divide their time accordingly.

A calculator may be used when answering questions on this paper and it must comply

in all respects with the specification given with your Admission Notice. The make and

type of machine must be clearly stated on the front cover of the answer book.

PLEASE TURN OVER

Section A

Answer all three parts of question 1 (40 marks in total)

1. (a) Let g(x) be a function taking on integer values of x, with

$$g(x) = \begin{cases} 2a, & x = -3, -1; \\ a, & x = 0, 2; \\ 3a, & x = 1, 3; \\ 0, & \text{otherwise.} \end{cases}$$

i. Find a so that g(x) is a probability mass function. [3 marks]

ii. Let X be a discrete random variable with probability mass function g(x).

Find E(X) and Var(X). [5 marks]

iii. Write down the probability mass function of $Y = X^2 - 4|X| + 4$. [4 marks]

(b) The cumulative distribution function FX(·) for the continuous random variable

X is defined by

$$F_X(x) = \begin{cases} 0, & x < 0; \\ ax^2/4, & 0 \le x < 1; \\ ((x-1)^3 + a)/4, & 1 \le x < 2; \\ 1, & x \ge 2. \end{cases}$$

i. Find the value of a. [1 mark]

ii. Derive the probability density function of X. [4 marks]

iii. Let $W = X^2$. Derive the cumulative distribution function of $W$. Hence, derive the probability density function of $W$. [7 marks]

(c) Let X follow an exponential distribution with rate λ, i.e., X has a density

function

$$f_X(x) = \begin{cases} \lambda e^{-\lambda x}, & x > 0; \\ 0, & \text{otherwise.} \end{cases}$$


i. Derive the moment generating function of X. [3 marks]

ii. Let Y be an independent and identically distributed copy of X. For w > 0,

show that

$$P(X - Y \le w) = 1 - \frac{e^{-\lambda w}}{2}.$$

(Hint: find the joint density of X and Y first. Determine the valid region in the double integral involved.) [5 marks]

iii. For w ≤ 0, show that

$$P(X - Y \le w) = \frac{e^{\lambda w}}{2}.$$

[5 marks]

iv. Using parts ii and iii of question (c), show that the density function of W = X − Y is given by

$$f_W(w) = \frac{\lambda e^{-\lambda |w|}}{2}, \quad w \in \mathbb{R}.$$

[3 marks]


Section B

Answer all three questions in this section (60 marks in total)

2. The conditional density of a random variable X given Y = y is given by

$$f_{X|Y}(x \mid y) = \begin{cases} x/(2y^2), & 0 < x < 2y < 2; \\ 0, & \text{otherwise.} \end{cases}$$

The conditional density of Y given X = x is given by

$$f_{Y|X}(y \mid x) = \begin{cases} 24y^2/(8 - x^3), & 0 < x < 2y < 2; \\ 0, & \text{otherwise.} \end{cases}$$

(a) Find the ratio $f_Y(y)/f_X(x)$, where $f_X(x)$ and $f_Y(y)$ are the marginal densities of X and Y, respectively. [2 marks]

(b) By integrating out y first in the answer in (a), show that

$$f_X(x) = \begin{cases} 5x(8 - x^3)/48, & 0 < x < 2; \\ 0, & \text{otherwise.} \end{cases}$$

Is X independent of Y ? Justify your answer. [9 marks]

(c) Let U = XY and V = X/Y . Derive the joint density for U, V , and carefully

state the region for (U, V ) where this joint density is non-zero. [9 marks]


3. If X is Gamma distributed with parameters α and β, i.e., X ∼ Gamma(α, β),

then it has density

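$$f_X(x) = \frac{\beta^{\alpha}}{\Gamma(\alpha)} x^{\alpha-1} e^{-\beta x}, \quad x > 0,$$

and $\Gamma(\alpha) = \int_0^{\infty} y^{\alpha-1} e^{-y}\,dy$ for $\alpha > 0$.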

(a) Suppose X ∼ Gamma(α1, β1), Y ∼ Gamma(α2, β2), and X is independent

of Y . Derive the distribution of β1X + β2Y . You may use the moment

generating function of a Gamma random variable without proof, as long as

you state it clearly. [7 marks]

(b) Let Xi ∼ Gamma(α, βi), i = 1, . . . , N , be independent of each other and

α, βi > 0. Each Xi is also independent of N , which is Poisson distributed

with mean µ, so that the probability mass function for N is given by

$$p_N(n) = \frac{\mu^n e^{-\mu}}{n!}, \quad n = 0, 1, \ldots.$$

Consider the random variable

$$W = \sum_{i=1}^{N} \beta_i X_i,$$

with the convention that W = 0 if N = 0.

i. Derive the moment generating function of W . [8 marks]

ii. Find the mean of W. You can use the mean of a Poisson random variable without proof. The mean of X ∼ Gamma(α, β) is α/β. [5 marks]


4. Suppose we have a biased coin, which comes up heads with probability u. An

experiment is carried out so that X is the number of independent flips of the

coin required for r heads to show up, where r ≥ 1 is known.

(a) Show that the probability mass function for X is

$$p_X(x) = \begin{cases} \binom{x-1}{r-1} u^r (1-u)^{x-r}, & x = r, r+1, \ldots; \\ 0, & \text{otherwise.} \end{cases}$$

[5 marks]

(b) Suppose U is random and has a density given by

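$$f_U(u) = \begin{cases} \dfrac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\, u^{\alpha-1} (1-u)^{\beta-1}, & 0 < u < 1; \\ 0, & \text{otherwise,} \end{cases}$$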

where α, β > 0, and Γ(α) is defined in question 3, which has the property

that Γ(α) = (α − 1)Γ(α − 1) for α ≥ 1, and Γ(k) = (k − 1)! for a positive

integer k. The distribution in part (a) thus becomes

$$p_{X|U}(x \mid u) = \begin{cases} \binom{x-1}{r-1} u^r (1-u)^{x-r}, & x = r, r+1, \ldots; \\ 0, & \text{otherwise.} \end{cases}$$

i. Find the marginal probability mass function of X if α = β = 2. [6 marks]

ii. With α = β = 2 still, show that the density of U | X = x is given by

$$f_{U|X}(u \mid x) = \begin{cases} \dfrac{(x+3)!}{(r+1)!\,(x-r+1)!}\, u^{r+1} (1-u)^{x-r+1}, & 0 < u < 1; \\ 0, & \text{otherwise.} \end{cases}$$

Hence find the mean of U | X = x. [5 marks]

(c) Another independent experiment is carried out, with Y denoting the number of independent flips of the coin required for r heads to show up (the same r as for the first experiment).

State (no need for a derivation) the density of U | (X, Y) = (x, y) and its mean, where U still has the density in part (b) with α = β = 2. [4 marks]

END OF PAPER


Examiners’ commentaries 2018


ST3133 Advanced statistics: distribution theory

Important note

This commentary reflects the examination and assessment arrangements for this course in the

academic year 2017–18. The format and structure of the examination may change in future years,

and any such changes will be publicised on the virtual learning environment (VLE).

Information about the subject guide and the Essential reading

references

Unless otherwise stated, all cross-references will be to the latest version of the subject guide (2011).

You should always attempt to use the most recent edition of any Essential reading textbook, even if

the commentary and/or online reading list and/or subject guide refer to an earlier edition. If

different editions of Essential reading are listed, please check the VLE for reading supplements – if

none are available, please use the contents list and index of the new edition to find the relevant

section.

General remarks

Learning outcomes

At the end of this half course and having completed the Essential reading and activities, you should

be able to:

• recall a large number of distributions and be a competent user of their mass/density and

distribution functions and moment generating functions

• explain relationships between variables, conditioning, independence and correlation

• relate the theory and method taught in the half course to solve practical problems.

Format of the examination

As in previous years, the ability of candidates varied widely. As every year, though, many candidates still had not studied the subject guide thoroughly. You should study it and know the topics covered by the syllabus, not only practise on past papers.

The format of this year’s examination will be retained for next year’s examination.

Key steps to improvement

Many candidates found it difficult to perform variable transformations when the random variable

involved is discrete, showing a lack of understanding of probability distributions. The same goes for


bivariate random variables with continuous distributions. No matter whether finding the

distribution function first (for example, Question 1 (c)) or using the variable transformation formula

(Question 2 (c)), many candidates struggled to find the correct answer either because of inaccurate

calculations or, worse, a lack of knowledge of the subject.

When calculating a probability or an expectation, especially when evaluating double integrals, many

candidates got the results wrong because of carelessly placing the wrong limits of integration. Please

practise more on how to find the limits correctly for a particular region of a joint density.

Question 1 (c) iv., for example, can be answered even without finishing the previous parts; similarly,

Question 2 (b) when determining if X and Y are independent. Candidates should always spare some

time to read the questions carefully, then see if there are parts which can be completed quickly

without the need to solve previous parts.

You should be ready to derive the moment generating functions of standard random variables, like

the normal, gamma, chi-squared, exponential (all continuous), or the geometric, binomial, Poisson

(all discrete), and ideally know the forms by heart. It is also important to know basic applications of

these distributions, and apply the correct formulae in probability questions.

Examination revision strategy

Many candidates are disappointed to find that their examination performance is poorer than they

expected. This may be due to a number of reasons, but one particular failing is ‘question

spotting’, that is, confining your examination preparation to a few questions and/or topics which

have come up in past papers for the course. This can have serious consequences.

We recognise that candidates might not cover all topics in the syllabus in the same depth, but you

need to be aware that examiners are free to set questions on any aspect of the syllabus. This

means that you need to study enough of the syllabus to enable you to answer the required number of

examination questions.

The syllabus can be found in the Course information sheet available on the VLE. You should read

the syllabus carefully and ensure that you cover sufficient material in preparation for the

examination. Examiners will vary the topics and questions from year to year and may well set

questions that have not appeared in past papers. Examination papers may legitimately include

questions on any topic in the syllabus. So, although past papers can be helpful during your revision,

you cannot assume that topics or specific questions that have come up in past examinations will

occur again.

If you rely on a question-spotting strategy, it is likely you will find yourself in difficulties

when you sit the examination. We strongly advise you not to adopt this strategy.


Comments on specific questions – Zone A

Candidates should answer all FOUR questions: QUESTION 1 of Section A (40 marks) and all

THREE questions from Section B (60 marks in total). Candidates are strongly advised to

divide their time accordingly.

Section A

Answer all three parts of question 1 (40 marks in total).

Question 1

(a) Let g(x) be a function taking on integer values of x, with:

$$g(x) = \begin{cases} a & \text{for } x = -2, -1 \\ 3a & \text{for } x = 0, 1 \\ 4a & \text{for } x = 2, 3 \\ 0 & \text{otherwise.} \end{cases}$$

i. Find a so that g(x) is a probability mass function.

(3 marks)

ii. Let X be a discrete random variable with probability mass function g(x).

Find E(X) and Var(X).

(5 marks)

iii. Write down the probability mass function for $Y = X^2 - 2|X| + 1$.

(4 marks)


Reading for this question

This question is about discrete probability distributions and basic moment calculations, and

has been overall well-answered except for part iii. Discrete random variables are discussed in

Section 3.3.1 of the subject guide, with examples. The mean and variance are covered in

Sections 3.4.2 and 3.4.3, respectively.

Part i. needs the application of Claim 3.3.6 iii. that the sum of probabilities over the

support is 1 to find the value of a. This part was done well in general although many

candidates calculated the answer incorrectly because of careless mistakes. Some candidates

just equated a+ 3a+ 4a = 1, not realising that there are two values of x which have

probability a, 3a and 4a, respectively. This indeed shows a lack of understanding of discrete

probability distributions, which is disappointing.

Part ii. was done well in general even if a could be incorrect, and marks were awarded in full

if the value of a was the only thing that was wrong.

For part iii., there is no formal transformation formula like that in the continuous case. To

do this question, candidates should find out the support (Definition 3.3.2 on page 53 of the

subject guide) of Y first, which is 0, 1, 4. For instance, X = −2, 0, 2 are all mapped to

Y = 1, so

$$P(Y = 1) = g(-2) + g(0) + g(2) = a + 3a + 4a = 8a = \frac{1}{2}.$$

Approaching the question

i. We must have:

$$1 = \sum_x g(x) = (a + a) + (3a + 3a) + (4a + 4a)$$

so that $a = 1/16$.

ii. We have:

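$$E(X) = \sum_x x\, g(x) = (-2 - 1) \times a + (0 + 1) \times 3a + (2 + 3) \times 4a = 20a = \frac{5}{4}$$

and:

$$\mathrm{Var}(X) = E(X^2) - (E(X))^2 = \sum_x x^2 g(x) - \frac{25}{16} = 5a + 3a + 13 \times 4a - \frac{25}{16} = 60a - \frac{25}{16} = \frac{35}{16}.$$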

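iii. Since $Y = (|X| - 1)^2$, it is easy to see that −2, 0 and 2 are mapped to 1, −1 and 1 are mapped to 0 and, finally, 3 is mapped to 4. Hence the probability mass function of Y is:

$$g_Y(y) = \begin{cases} a + 3a & \text{for } y = 0 \\ a + 3a + 4a & \text{for } y = 1 \\ 4a & \text{for } y = 4 \\ 0 & \text{otherwise} \end{cases} = \begin{cases} 1/4 & \text{for } y = 0 \\ 1/2 & \text{for } y = 1 \\ 1/4 & \text{for } y = 4 \\ 0 & \text{otherwise.} \end{cases}$$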
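The three parts of Question 1 (a) can be cross-checked with exact rational arithmetic. The following is only a sketch for checking your own working, not part of the marking scheme; the names `weights`, `pmf_x` and `pmf_y` are illustrative.

```python
from fractions import Fraction

# Weights of g(x) in units of a, as given in the question.
weights = {-2: 1, -1: 1, 0: 3, 1: 3, 2: 4, 3: 4}

# Part i: the probabilities must sum to 1, so a = 1 / (sum of the weights).
a = Fraction(1, sum(weights.values()))
pmf_x = {x: k * a for x, k in weights.items()}

# Part ii: mean and variance straight from the definitions.
mean_x = sum(x * p for x, p in pmf_x.items())
var_x = sum(x * x * p for x, p in pmf_x.items()) - mean_x ** 2

# Part iii: push each x through y = x^2 - 2|x| + 1 and add up the probabilities
# of all x values that land on the same y.
pmf_y = {}
for x, p in pmf_x.items():
    y = x * x - 2 * abs(x) + 1
    pmf_y[y] = pmf_y.get(y, Fraction(0)) + p

print(a, mean_x, var_x, dict(sorted(pmf_y.items())))
```

Accumulating probabilities per value of y is exactly the "find the support first" approach described above: no transformation formula is needed in the discrete case.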

(b) The cumulative distribution function FX(x) for the continuous random variable

X is defined by:

$$F_X(x) = \begin{cases} 0 & \text{for } x < 0 \\ ax^3/3 & \text{for } 0 \le x < 1 \\ (a(x-2)^3 + 2a)/3 & \text{for } 1 \le x < 2 \\ 1 & \text{for } x \ge 2. \end{cases}$$


i. Find the value of a.

(2 marks)

ii. Derive the probability density function of X.

(3 marks)

iii. Let $W = X^2$. Derive the cumulative distribution function of $W$. Hence, derive the probability density function of $W$.

(7 marks)

Reading for this question

This question was answered badly in general. Part i. requires candidates to understand that

the distribution function of a continuous random variable is continuous itself. This can be

seen from Proposition 3.2.4 equation iv. on page 50 of the subject guide. Since for a

continuous random variable X we have P (X = x) = 0 for all x, then:

$$0 = F_X(x) - F_X(x^-)$$

so that $F_X(x) = F_X(x^-)$. With this, we have $F_X(2) = F_X(2^-)$, which is:

$$1 = \frac{a(2 - 2)^3 + 2a}{3}$$

so that $a = 3/2$.

Part ii. was better answered in general, and candidates received full marks for a correct density function even if a was left unknown. To find the probability density function you just need to differentiate the distribution function with respect to x. See Claim 3.3.14 on page 58 of the subject guide.

For part iii., many candidates could get that:

$$P(W \le w) = P(X^2 \le w) = P(-\sqrt{w} \le X \le \sqrt{w}) = F_X(\sqrt{w}) - F_X(-\sqrt{w})$$

but failed to realise $F_X(-\sqrt{w}) = 0$ since it is given that $F_X(x) = 0$ for $x < 0$.

Approaching the question

i. For a continuous random variable, we must have $\lim_{x \nearrow 2} F_X(x) = F_X(2) = 1$, so that $2a/3 = 1$, meaning $a = 3/2$.

ii. We have $f_X(x) = F_X'(x)$, so that:

$$f_X(x) = \begin{cases} 3x^2/2 & \text{for } 0 \le x < 1 \\ 3(x - 2)^2/2 & \text{for } 1 \le x < 2 \\ 0 & \text{otherwise.} \end{cases}$$

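iii. For $w \ge 0$, the cumulative distribution function of W is $F_W(w) = P(W \le w) = P(X \le \sqrt{w}) = F_X(\sqrt{w})$, while $F_W(w) = 0$ for $w < 0$. Hence:

$$F_W(w) = \begin{cases} 0 & \text{for } w < 0 \\ w^{3/2}/2 & \text{for } 0 \le w < 1 \\ (\sqrt{w} - 2)^3/2 + 1 & \text{for } 1 \le w < 4 \\ 1 & \text{for } w \ge 4. \end{cases}$$

The probability density function of W is then $f_W(w) = F_W'(w)$, which is:

$$f_W(w) = \begin{cases} 3\sqrt{w}/4 & \text{for } 0 \le w < 1 \\ 3(\sqrt{w} - 2)^2/(4\sqrt{w}) & \text{for } 1 \le w < 4 \\ 0 & \text{otherwise.} \end{cases}$$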
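A quick numerical sanity check of part iii. is sketched below, assuming a = 3/2 as found in part i. It compares the stated density of W with a finite-difference derivative of the stated distribution function, and confirms that the density integrates to (approximately) 1. The helper names and the step sizes are illustrative choices.

```python
import math

def cdf_w(w):
    """F_W(w) from the worked solution, with a = 3/2."""
    if w < 0:
        return 0.0
    if w < 1:
        return w ** 1.5 / 2
    if w < 4:
        return (math.sqrt(w) - 2) ** 3 / 2 + 1
    return 1.0

def pdf_w(w):
    """f_W(w) from the worked solution."""
    if 0 <= w < 1:
        return 3 * math.sqrt(w) / 4
    if 1 <= w < 4:
        return 3 * (math.sqrt(w) - 2) ** 2 / (4 * math.sqrt(w))
    return 0.0

def numeric_derivative(f, w, h=1e-6):
    """Central-difference approximation to f'(w)."""
    return (f(w + h) - f(w - h)) / (2 * h)

def midpoint_integral(f, lo, hi, n=200_000):
    """Midpoint-rule approximation to the integral of f over (lo, hi)."""
    step = (hi - lo) / n
    return step * sum(f(lo + (i + 0.5) * step) for i in range(n))

# The density should match the slope of the distribution function ...
for w in (0.25, 0.5, 2.0, 3.5):
    print(w, pdf_w(w), numeric_derivative(cdf_w, w))

# ... and should integrate to 1 over its support (0, 4).
print(midpoint_integral(pdf_w, 0.0, 4.0))
```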


(c) Let X follow an exponential distribution with rate λ, i.e. X has a density

function:

$$f_X(x) = \begin{cases} \lambda e^{-\lambda x} & \text{for } x > 0 \\ 0 & \text{otherwise.} \end{cases}$$

i. Derive the moment generating function of X.

(3 marks)

ii. Let Y be an independent and identically distributed copy of X. For w > 0, show that:

$$P(X - Y \le w) = 1 - \frac{e^{-\lambda w}}{2}.$$

(Hint: find the joint density of X and Y first. Determine the valid region in the double integral involved.)

(5 marks)

iii. For w ≤ 0, show that:

$$P(X - Y \le w) = \frac{e^{\lambda w}}{2}.$$

(5 marks)

iv. Using parts ii. and iii. of question (c), show that the density function of W = X − Y is given by:

$$f_W(w) = \frac{\lambda e^{-\lambda |w|}}{2}, \quad \text{for } w \in \mathbb{R}.$$

(3 marks)

Reading for this question

Part i. was well-answered in general, and finding moment generating functions for a random

variable is a basic technique which candidates need to practise. See Section 3.5 of the

subject guide for more details. For parts ii. and iii., many candidates were not able to

pinpoint the exact limits of integration which should be used in the double integration.

Some candidates did not even realise it should be a double integration because two random

variables are involved. The first thing you should do is to find the region of integration, then

to work out the joint probability density function of (X,Y ), which is just fX(x) fY (y), since

X and Y are independent. See Sections 4.1 and 4.2 of the subject guide for more basic

knowledge on joint density functions.

Some candidates were confused as to how w > 0 or w < 0 affects the calculation. For part

iii., since X ≤ Y + w and w < 0, the limits of Y cannot start from 0, otherwise X will be

negative, which is not allowed. To make sure X ≥ 0, Y has to start from −w. This is

exactly the difference between parts ii. and iii.

For part iv., many candidates could not work out the answer, which is disappointing since

you do not even need to know how to calculate the answers to parts ii. and iii. As long as

you realise we are calculating the distribution function of W in parts ii. and iii., you only

need to differentiate those given answers with respect to w, the result will then follow.

Approaching the question

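i. The moment generating function of X is:

$$M_X(s) = E(e^{sX}) = \int_0^{\infty} \lambda e^{-(\lambda - s)x}\,dx = \frac{\lambda}{\lambda - s} \int_0^{\infty} (\lambda - s) e^{-(\lambda - s)x}\,dx = \frac{\lambda}{\lambda - s}, \quad \text{for } s < \lambda.$$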

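ii. For $w \ge 0$, $X - Y \le w$ implies that $0 < X \le Y + w$, where $Y > 0$. Hence:

$$\begin{aligned} P(X - Y \le w) &= \int_0^{\infty} \int_0^{y+w} f_{X,Y}(x, y)\,dx\,dy = \int_0^{\infty} \lambda e^{-\lambda y} \left[-e^{-\lambda x}\right]_0^{y+w} dy \\ &= \int_0^{\infty} \lambda e^{-\lambda y} \left(1 - e^{-\lambda (y+w)}\right) dy \\ &= 1 - \frac{1}{2} e^{-\lambda w}. \end{aligned}$$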


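iii. For $w < 0$, $X - Y \le w$ implies that $0 < X \le Y + w$ where $Y > -w$. Hence:

$$\begin{aligned} P(X - Y \le w) &= \int_{-w}^{\infty} \int_0^{y+w} f_{X,Y}(x, y)\,dx\,dy = \int_{-w}^{\infty} \lambda e^{-\lambda y} \left(1 - e^{-\lambda (y+w)}\right) dy \\ &= e^{\lambda w} - e^{-\lambda w} \times \frac{1}{2} e^{2\lambda w} \\ &= \frac{1}{2} e^{\lambda w}. \end{aligned}$$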

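iv. Differentiating the answers in ii. and iii. with respect to w, we have:

$$f_W(w) = \begin{cases} \lambda e^{\lambda w}/2 & \text{for } w < 0 \\ \lambda e^{-\lambda w}/2 & \text{for } w \ge 0 \end{cases} = \frac{1}{2} \lambda e^{-\lambda |w|}, \quad \text{for } w \in \mathbb{R}.$$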
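The distribution function of W = X − Y from parts ii. and iii. can be checked by simulation. The sketch below is illustrative only: the rate `lam = 2.0`, the sample size and the seed are arbitrary assumptions, and the empirical values agree with the formula only up to Monte Carlo error.

```python
import math
import random

def laplace_cdf(w, lam):
    """P(X - Y <= w) as stated in parts ii. (w >= 0) and iii. (w < 0)."""
    if w >= 0:
        return 1 - math.exp(-lam * w) / 2
    return math.exp(lam * w) / 2

def empirical_cdf(w, lam, n=100_000, seed=0):
    """Fraction of simulated differences X - Y that fall at or below w."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        x = rng.expovariate(lam)  # expovariate takes the rate, not the mean
        y = rng.expovariate(lam)
        if x - y <= w:
            hits += 1
    return hits / n

lam = 2.0  # illustrative rate, not taken from the question
for w in (-1.0, -0.25, 0.0, 0.5, 1.5):
    print(w, laplace_cdf(w, lam), empirical_cdf(w, lam))
```

Note that both branches of the formula give 1/2 at w = 0, as they must by the symmetry of X − Y.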

Section B

Answer all three questions in this section (60 marks in total).

Question 2

The conditional density of a random variable X given Y = y is given by:

$$f_{X|Y}(x \mid y) = \begin{cases} 3x^2/y^3 & \text{for } 0 < x < y < 3 \\ 0 & \text{otherwise.} \end{cases}$$

The conditional density of Y given X = x is given by:

$$f_{Y|X}(y \mid x) = \begin{cases} 3y^2/(27 - x^3) & \text{for } 0 < x < y < 3 \\ 0 & \text{otherwise.} \end{cases}$$

(a) Find the ratio $f_X(x)/f_Y(y)$, where $f_X(x)$ and $f_Y(y)$ are the marginal densities of X and Y, respectively.

(2 marks)

(b) By integrating out x first in the answer in (a), show that:

$$f_Y(y) = \begin{cases} 2y^5/243 & \text{for } 0 < y < 3 \\ 0 & \text{otherwise.} \end{cases}$$

Is X independent of Y ? Justify your answer.

(9 marks)

(c) Let U = XY and V = X/Y . Derive the joint density for U, V , and carefully

state the region for (U, V ) where this joint density is non-zero.

(9 marks)

Reading for this question

This question was not well-answered in general, which is a little unexpected. You should look at

the marks allocated to each part to determine approximately how long the answers should be.

Part (a) is only worth two marks, so you should not expect the answer to have long derivations.

See Section 5.2 of the subject guide for the definition of continuous conditional distributions.

For part (b), many candidates knew to follow the hint and integrate out x, but the limits should be the limits for the marginal density of X, i.e. the limits should not involve y. In fact, you only know $\int f_X(x)\,dx = 1$, and the lower and upper limits are for the marginal density of X, which in this case should be 0 and 3, respectively. Many candidates went on to check whether $f_{X,Y}(x, y) = f_X(x)\, f_Y(y)$, which is correct but is not a quick way to see if X and Y are independent, since you still need to calculate $f_{X,Y}(x, y)$ and $f_X(x)$, neither of which is given to you. In the process, some candidates unfortunately got the wrong answers. To determine independence between X and Y more quickly, you should check whether $f_{Y|X}(y \mid x) = f_Y(y)$ or not. This is equivalent to the criterion $f_{X,Y}(x, y) = f_X(x)\, f_Y(y)$, of course, but $f_{Y|X}(y \mid x)$ and $f_Y(y)$ are both given to you! So you do not even need to calculate anything to know that X and Y are not independent! See Section 4.4 of the subject guide for more details on independence of a pair of random variables.

Part (c) was not done well because of inaccurate calculations mostly, especially the calculations

of the Jacobian. See Section 4.6 of the subject guide for more details.

Approaching the question

(a) We have:

$$\frac{f_{X|Y}(x \mid y)}{f_{Y|X}(y \mid x)} = \frac{f_{X,Y}(x, y)/f_Y(y)}{f_{X,Y}(x, y)/f_X(x)} = \frac{f_X(x)}{f_Y(y)}$$

so that:

$$\frac{f_X(x)}{f_Y(y)} = \frac{x^2 (27 - x^3)}{y^5}.$$

(b) Since 0 < x < y < 3, integrating out the effect of y means that 0 < x < 3. Hence:

$$\frac{1}{f_Y(y)} = \int_0^3 \frac{f_X(x)}{f_Y(y)}\,dx = \frac{1}{y^5} \int_0^3 (27x^2 - x^5)\,dx = \frac{1}{y^5} \left[ 9x^3 - \frac{x^6}{6} \right]_0^3 = \frac{243}{2y^5}$$

so that:

$$f_Y(y) = \frac{2y^5}{243} \quad \text{for } 0 < y < 3.$$

Since $f_Y(y) \ne f_{Y|X}(y \mid x)$, X is not independent of Y.

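(c) We have:

$$X = \sqrt{UV} \quad \text{and} \quad Y = \sqrt{\frac{U}{V}}$$

so that 0 < X < Y < 3 implies:

$$0 < \sqrt{UV} < \sqrt{\frac{U}{V}} < 3$$

meaning:

$$0 < U < 9V \quad \text{and} \quad V < 1.$$

Hence:

$$\begin{aligned} f_{U,V}(u, v) &= f_{X,Y}\!\left(\sqrt{uv}, \sqrt{u/v}\right) \left| \det \begin{pmatrix} \dfrac{\sqrt{v}}{2\sqrt{u}} & \dfrac{\sqrt{u}}{2\sqrt{v}} \\ \dfrac{1}{2\sqrt{uv}} & -\dfrac{\sqrt{u}}{2v^{3/2}} \end{pmatrix} \right| \\ &= \frac{3uv}{u^{3/2}/v^{3/2}} \times \frac{2}{243} \cdot \frac{u^{5/2}}{v^{5/2}} \times \frac{1}{2v} \\ &= \frac{u^2}{81v} \quad \text{for } u < 9v,\ 0 < v < 1. \end{aligned}$$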
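Since the commentary notes that most errors in part (c) came from the Jacobian, a numerical check is a useful habit when practising. The sketch below estimates the Jacobian determinant by finite differences, which should be close to −1/(2v), and confirms that the resulting joint density integrates to approximately 1 over its region. The function names, test point and grid size are illustrative assumptions.

```python
import math

def joint_uv(u, v):
    """Joint density from the worked solution: u^2/(81 v) on 0 < u < 9v, 0 < v < 1."""
    if 0 < v < 1 and 0 < u < 9 * v:
        return u * u / (81 * v)
    return 0.0

def inverse_map(u, v):
    """Inverse transformation (x, y) = (sqrt(uv), sqrt(u/v))."""
    return math.sqrt(u * v), math.sqrt(u / v)

def jacobian_det(u, v, h=1e-6):
    """Central-difference estimate of det d(x, y)/d(u, v)."""
    x_up, y_up = inverse_map(u + h, v)
    x_um, y_um = inverse_map(u - h, v)
    x_vp, y_vp = inverse_map(u, v + h)
    x_vm, y_vm = inverse_map(u, v - h)
    dx_du, dy_du = (x_up - x_um) / (2 * h), (y_up - y_um) / (2 * h)
    dx_dv, dy_dv = (x_vp - x_vm) / (2 * h), (y_vp - y_vm) / (2 * h)
    return dx_du * dy_dv - dx_dv * dy_du

def total_mass(n=400):
    """Midpoint rule over v in (0, 1), then over u in (0, 9v)."""
    total = 0.0
    dv = 1.0 / n
    for i in range(n):
        v = (i + 0.5) * dv
        du = 9 * v / n
        total += dv * du * sum(joint_uv((j + 0.5) * du, v) for j in range(n))
    return total

print(jacobian_det(2.0, 0.5), -1 / (2 * 0.5))  # both should be close to -1
print(total_mass())                             # should be close to 1
```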

Question 3

If X is Gamma distributed with parameters α and β, i.e. X ∼ Gamma(α, β), then it

has density:

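$$f_X(x) = \frac{\beta^{\alpha}}{\Gamma(\alpha)} x^{\alpha-1} e^{-\beta x}, \quad \text{for } x > 0$$

and $\Gamma(\alpha) = \int_0^{\infty} y^{\alpha-1} e^{-y}\,dy$ for $\alpha > 0$.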


(a) If X ∼ Gamma(α1, β), Y ∼ Gamma(α2, β), and X is independent of Y , derive

the distribution of X + Y . You may use the moment generating function of a

Gamma random variable without proof, as long as you state it clearly.

(7 marks)

(b) Let Xi ∼ Gamma(α, β), i = 1, . . . , N , be independent of each other and

α, β > 0. Each Xi is also independent of N , which is Poisson distributed with

mean µ, so that the probability mass function for N is given by:

$$p_N(n) = \frac{\mu^n e^{-\mu}}{n!}, \quad \text{for } n = 0, 1, \ldots.$$

Consider the random variable:

$$W = \sum_{i=1}^{N} X_i$$

with the convention that W = 0 if N = 0.

i. Derive the moment generating function of W .

(8 marks)

ii. Find the mean of W . You can use the means of a Poisson and a Gamma

random variable without proof. If you use any standard results about

random sums, you must first state them clearly.

(5 marks)

Reading for this question

This question was not answered as well as it should have been. Parts (a) and (b) i. are both standard exercises. For part (a), see Proposition 4.7.3 for finding the moment generating function of the sum of two independent random variables. Identifying the form of the answer is then the key to knowing that the sum still has a Gamma distribution.

For part (b) i., the derivation of the moment generating function of a random sum is covered in

the subject guide in Section 5.6. See Lemma 5.6.2 iii. and Proposition 5.6.3 iii. on page 165 of

the subject guide. Indeed many candidates realised this which is good, and scored decent marks

already even though they could not work out the final answer in the end.

For part (b) ii., you can use Proposition 5.6.3 i., or differentiate the moment generating function

which you obtained in part (b) i. Of course, the former will give you the answer much quicker!

Approaching the question

(a) First, let W = X + Y. Since $M_X(t) = (\beta/(\beta - t))^{\alpha}$ for X ∼ Gamma(α, β), then:

$$M_W(t) = E(e^{tW}) = E(e^{tX})\, E(e^{tY}) = \left( \frac{\beta}{\beta - t} \right)^{\alpha_1 + \alpha_2}, \quad \text{for } t < \beta.$$

This shows that, by the one-to-one correspondence between distribution and moment generating function, W = X + Y has a Gamma(α1 + α2, β) distribution.

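(b) i. We have:

$$M_W(t) = E(e^{tW}) = E\left( E\left( \exp\left( t \sum_{i=1}^{N} X_i \right) \,\middle|\, N \right) \right).$$

However, given N, we have:

$$E\left( \exp\left( t \sum_{i=1}^{N} X_i \right) \right) = E\left( \prod_{i=1}^{N} \exp(tX_i) \right) = \prod_{i=1}^{N} E\left( e^{tX_i} \right) = \left( \frac{\beta}{\beta - t} \right)^{\alpha N}, \quad \text{for } t < \beta.$$

At the same time:

$$M_N(s) = E(e^{sN}) = \sum_{n=0}^{\infty} \frac{(\mu e^s)^n e^{-\mu}}{n!} = e^{\mu(e^s - 1)}, \quad \text{for } s \in \mathbb{R}.$$

Hence:

$$M_W(t) = E\left( \exp\left( N \log \left( \frac{\beta}{\beta - t} \right)^{\alpha} \right) \right) = \exp\left( \mu \left( \frac{\beta}{\beta - t} \right)^{\alpha} - \mu \right), \quad \text{for } t < \beta.$$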

ii. We have:

$$E(W) = E(E(W \mid N)) = E(N\, E(X_i)) = \frac{\mu \alpha}{\beta}.$$
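The two routes to E(W) mentioned in the commentary (the random-sum result, or differentiating the moment generating function at zero) can be compared numerically. The sketch below approximates the derivative of the moment generating function from part (b) i. by a central difference; the parameter values are arbitrary illustrations, not values from the question.

```python
import math

def mgf_w(t, alpha, beta, mu):
    """M_W(t) = exp(mu * (beta / (beta - t))**alpha - mu), valid for t < beta."""
    if t >= beta:
        raise ValueError("the MGF is only finite for t < beta")
    return math.exp(mu * (beta / (beta - t)) ** alpha - mu)

def mean_from_mgf(alpha, beta, mu, h=1e-5):
    """Central-difference estimate of M_W'(0), which equals E(W)."""
    return (mgf_w(h, alpha, beta, mu) - mgf_w(-h, alpha, beta, mu)) / (2 * h)

alpha, beta, mu = 2.5, 4.0, 3.0  # illustrative parameter values
print(mean_from_mgf(alpha, beta, mu), mu * alpha / beta)
```

Both printed numbers should agree to several decimal places, which also serves as a check that the moment generating function satisfies M_W(0) = 1.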

Question 4

Suppose we have a biased coin, which comes up heads with probability u. An

experiment is carried out so that X is the number of independent flips of the coin

required for r heads to show up, where r ≥ 1 is known.

(a) Show that the probability mass function of X is:

$$p_X(x) = \begin{cases} \binom{x-1}{r-1} u^r (1-u)^{x-r} & \text{for } x = r, r+1, \ldots \\ 0 & \text{otherwise.} \end{cases}$$

(5 marks)

(b) Suppose U is uniformly distributed on (0, 1), and the distribution in part (a)

becomes

pX|U(x |u) =

{(x−1

r−1

)

ur (1− u)x−r for x = r, r + 1, . . .

0 otherwise.

i. Find the marginal probability mass function for X. You can use:

$$\int_0^1 y^{a}(1-y)^{b}\,dy = \frac{a!\,b!}{(a+b+1)!}$$

for non-negative integers a and b without proof.

(6 marks)

ii. Show that the density of U |X = x is given by:

$$f_{U|X}(u \mid x) = \frac{(x+1)!}{r!\,(x-r)!}\, u^{r} (1-u)^{x-r}, \quad \text{for } 0 < u < 1.$$

Hence find the mean of U |X = x.

(5 marks)

(c) Another independent experiment is carried out, with Y denoting the number of

independent flips of the coin required for r heads to show up (the same r as for

the first experiment).

State (no need for a derivation) the density of U |(X,Y ) = (x, y) and its mean,

where U is still uniformly distributed on (0, 1) as in part (b).

(4 marks)

Reading for this question

This question was not well-answered in general. Part (a) needs you to explain why the probability

mass function is as stated. Many candidates stated that this is a negative binomial distribution


and hence the density is as given, which is not a proof or an explanation at all. See Example

3.3.10 on page 56 of the subject guide for the justification of its probability mass function.

Part (b) i. needs you to work out the joint density function of X and U, then integrate out U to obtain the marginal probability mass function of X. Some candidates realised that they had to integrate out u in $p_{X|U}(x \mid u)\, f_U(u)$ but were not careful in their calculations. Even if you were not able to do (b) i., you should be able to do (b) ii. using the answer given to you in (b) i. To find the mean, you need to apply the integral formula given in (b) i.

Part (c) was done worst since it is supposed to be difficult. You need to realise X and Y are

independent experiments which can be seen as one, so that x is replaced by x+ y and r is

replaced by 2r in the answer in (b) ii.

Approaching the question

(a) To wait for r heads to show up, suppose x flips are required. The last flip must be a head,

with r − 1 heads randomly appearing in the first x− 1 flips. In each particular combination

of heads and tails, there must be r heads by definition of the experiment, as well as x− r

tails (so adding together to x flips in total), with probability:

$$u^{r} (1-u)^{x-r}.$$

Hence we have:

$$p_X(x) = \binom{x-1}{r-1}\, u^{r} (1-u)^{x-r}, \quad \text{for } x = r, r+1, \ldots.$$

(b) i. The joint probability density for X,U is:

$$f_{X,U}(x, u) = \binom{x-1}{r-1}\, u^{r} (1-u)^{x-r}, \quad \text{for } 0 < u < 1,\ x = r, r+1, \ldots.$$

Therefore, the marginal probability mass function of X is:

$$\begin{aligned} p_X(x) &= \int_0^1 \binom{x-1}{r-1}\, u^{r} (1-u)^{x-r}\,du = \binom{x-1}{r-1} \int_0^1 u^{r} (1-u)^{x-r}\,du \\ &= \binom{x-1}{r-1} \frac{r!\,(x-r)!}{(x+1)!} = \frac{r}{x(x+1)}, \quad \text{for } x = r, r+1, \ldots. \end{aligned}$$

ii. We have:

$$f_{U|X}(u \mid x) = \frac{f_{X,U}(x, u)}{p_X(x)} = \frac{x(x+1)}{r} \times \binom{x-1}{r-1}\, u^{r} (1-u)^{x-r} = \frac{(x+1)!}{r!\,(x-r)!}\, u^{r} (1-u)^{x-r}, \quad \text{for } 0 < u < 1.$$

The mean is:

$$E(U \mid X = x) = \frac{(x+1)!}{r!\,(x-r)!} \int_0^1 u^{r+1} (1-u)^{x-r}\,du = \frac{(x+1)!}{r!\,(x-r)!} \times \frac{(r+1)!\,(x-r)!}{(x+2)!} = \frac{r+1}{x+2}.$$
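The marginal probability mass function and the conditional mean can be confirmed with exact rational arithmetic. The following is a minimal sketch; the function names and the test values of r and x are illustrative assumptions. It uses only the integral formula quoted in part (b) i.

```python
from fractions import Fraction
from math import comb, factorial


def beta_integral(a, b):
    """Integral of y**a * (1 - y)**b over (0, 1) for non-negative integers a, b."""
    return Fraction(factorial(a) * factorial(b), factorial(a + b + 1))


def marginal_pmf(x, r):
    """p_X(x): the uniform prior on u integrated against the pmf from part (a)."""
    return comb(x - 1, r - 1) * beta_integral(r, x - r)


def posterior_mean(x, r):
    """E(U | X = x) as a ratio of two integrals of the same form."""
    return beta_integral(r + 1, x - r) / beta_integral(r, x - r)
```

For example, with r = 3 and x = 7 the functions return 3/56 and 4/9, which agree with r/(x(x+1)) and (r+1)/(x+2).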

(c) Mathematically, we have:

$$p_{X,Y|U}(x, y \mid u) = p_{X|U}(x \mid u)\, p_{Y|U}(y \mid u)$$

so that with $f_U(u) = 1$, we have:

$$\begin{aligned} p_{U|X,Y}(u \mid x, y) &= \frac{p_{X|U}(x \mid u)\, p_{Y|U}(y \mid u)}{\int_0^1 p_{X|U}(x \mid u)\, p_{Y|U}(y \mid u)\,du} = \frac{u^{2r}(1-u)^{x+y-2r}}{\int_0^1 u^{2r}(1-u)^{x+y-2r}\,du} \\ &= \frac{(x+y+1)!}{(2r)!\,(x+y-2r)!}\, u^{2r}(1-u)^{x+y-2r}, \quad \text{for } 0 < u < 1 \end{aligned}$$

which parallels the density in part (b) ii. The mean is:

$$\frac{2r+1}{x+y+2}$$

which likewise parallels the mean in part (b) ii.

To see these two answers more quickly, note that X and Y can be seen as one experiment,

waiting for 2r heads to show up. So we need x+ y flips for 2r heads to come up, and hence

we can replace x by x+ y and r by 2r directly from answers in (b) ii.


Examiners’ commentaries 2018

ST3133 Advanced statistics: distribution theory

Important note

This commentary reflects the examination and assessment arrangements for this course in the

academic year 2017–18. The format and structure of the examination may change in future years,

and any such changes will be publicised on the virtual learning environment (VLE).

Information about the subject guide and the Essential reading

references

Unless otherwise stated, all cross-references will be to the latest version of the subject guide (2011).

You should always attempt to use the most recent edition of any Essential reading textbook, even if

the commentary and/or online reading list and/or subject guide refer to an earlier edition. If

different editions of Essential reading are listed, please check the VLE for reading supplements – if

none are available, please use the contents list and index of the new edition to find the relevant

section.

Comments on specific questions – Zone B

Candidates should answer all FOUR questions: QUESTION 1 of Section A (40 marks) and all

THREE questions from Section B (60 marks in total). Candidates are strongly advised to

divide their time accordingly.

Section A

Answer all three parts of question 1 (40 marks in total).

Question 1

(a) Let g(x) be a function taking on integer values of x, with:

$$g(x) = \begin{cases} 2a & \text{for } x = -3, -1 \\ a & \text{for } x = 0, 2 \\ 3a & \text{for } x = 1, 3 \\ 0 & \text{otherwise.} \end{cases}$$

i. Find a so that g(x) is a probability mass function.

(3 marks)

ii. Let X be a discrete random variable with probability mass function g(x).

Find E(X) and Var(X).

(5 marks)

iii. Write down the probability mass function of $Y = X^2 - 4|X| + 4$.

(4 marks)


Reading for this question

This question is about discrete probability distributions and basic moment calculations, and

has been overall well-answered except for part iii. Discrete random variables are discussed in

Section 3.3.1 of the subject guide, with examples. The mean and variance are covered in

Sections 3.4.2 and 3.4.3, respectively.

Part i. needs the application of Claim 3.3.6 iii., that the probabilities sum to 1 over the support, to find the value of a. This part was done well in general, although many candidates calculated the answer incorrectly because of careless mistakes. Some candidates just set 2a + a + 3a = 1, not realising that each of the probabilities 2a, a and 3a is attached to two values of x. This shows a lack of understanding of discrete probability distributions, which is disappointing.

Part ii. was done well in general even if a could be incorrect, and marks were awarded in full

if the value of a was the only thing that was wrong.

For part iii., there is no formal transformation formula like that in the continuous case. To

do this question, candidates should find out the support (Definition 3.3.2 on page 53 of the

subject guide) of Y first, which is 0, 1, 4. For instance, X = −3,−1, 1, 3 are all mapped to

Y = 1, so

$$P(Y = 1) = g(-3) + g(-1) + g(1) + g(3) = 2a + 2a + 3a + 3a = 10a = \frac{5}{6}.$$

Approaching the question

i. We must have:

$$1 = \sum_x g(x) = (2a + 2a) + (a + a) + (3a + 3a)$$

so that $a = 1/12$.

ii. We have:

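$$E(X) = \sum_x x\,g(x) = (-3 - 1) \times 2a + (0 + 2) \times a + (1 + 3) \times 3a = 6a = \frac{1}{2}$$

and:

$$\text{Var}(X) = E(X^2) - (E(X))^2 = \sum_x x^2 g(x) - \frac{1}{4} = 10 \times 2a + 4a + 10 \times 3a - \frac{1}{4} = 54a - \frac{1}{4} = \frac{51}{12}.$$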

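iii. Since $Y = (|X| - 2)^2$, it is easy to see that $-3$, $-1$, $1$ and $3$ are mapped to 1, 0 is mapped to 4 and, finally, 2 is mapped to 0. Hence the probability mass function of Y is:

$$g_Y(y) = \begin{cases} a & \text{for } y = 0 \\ 2a + 2a + 3a + 3a & \text{for } y = 1 \\ a & \text{for } y = 4 \\ 0 & \text{otherwise} \end{cases} = \begin{cases} 1/12 & \text{for } y = 0 \\ 5/6 & \text{for } y = 1 \\ 1/12 & \text{for } y = 4 \\ 0 & \text{otherwise.} \end{cases}$$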
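All three parts of this question can be reproduced with exact fractions. The sketch below is an illustration only (the variable names are assumptions): it normalises the weights, computes the two moments, and pushes each probability through the map from x to y.

```python
from collections import defaultdict
from fractions import Fraction

# g(x) as multiples of a, taken from the question.
weights = {-3: 2, -1: 2, 0: 1, 2: 1, 1: 3, 3: 3}

a = Fraction(1, sum(weights.values()))            # the weights times a must sum to 1
pmf_x = {x: w * a for x, w in weights.items()}

mean_x = sum(x * p for x, p in pmf_x.items())
var_x = sum(x * x * p for x, p in pmf_x.items()) - mean_x ** 2

# No transformation formula is needed in the discrete case:
# add up the probabilities of all x that map to the same y.
pmf_y = defaultdict(Fraction)
for x, p in pmf_x.items():
    pmf_y[x * x - 4 * abs(x) + 4] += p
```

The variance comes out as 17/4, which is the same number as 51/12 in the solution above.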

(b) The cumulative distribution function FX(·) for the continuous random variable

X is defined by:

$$F_X(x) = \begin{cases} 0 & \text{for } x < 0 \\ ax^2/4 & \text{for } 0 \le x < 1 \\ ((x-1)^3 + a)/4 & \text{for } 1 \le x < 2 \\ 1 & \text{for } x \ge 2. \end{cases}$$

i. Find the value of a.

(1 mark)

ii. Derive the probability density function of X.

(4 marks)

iii. Let W = X2. Derive the cumulative distribution function of W . Hence,

derive the probability density function of W .

(7 marks)

Reading for this question

This question was answered badly in general. Part i. requires candidates to understand that

the distribution function of a continuous random variable is continuous itself. This can be

seen from Proposition 3.2.4 equation iv. on page 50 of the subject guide. Since for a

continuous random variable X we have P (X = x) = 0 for all x, then:

$$0 = F_X(x) - F_X(x^-)$$

so that $F_X(x) = F_X(x^-)$. With this, we have $F_X(2) = F_X(2^-)$, which is:

$$1 = \frac{(2-1)^3 + a}{4}$$

so that $a = 3$.

Part ii. was better answered in general, and candidates received full marks for a correct density function even if a was left as an unknown. To find the probability density function you just need to differentiate the distribution function with respect to x. See Claim 3.3.14 on page 58 of the subject guide.

For part iii., many candidates could get that:

$$P(W \le w) = P(X^2 \le w) = P(-\sqrt{w} \le X \le \sqrt{w}) = F_X(\sqrt{w}) - F_X(-\sqrt{w})$$

but failed to realise $F_X(-\sqrt{w}) = 0$ since it is given that $F_X(x) = 0$ for $x < 0$.

Approaching the question

i. For a continuous random variable, we must have $\lim_{x \nearrow 2} F_X(x) = F_X(2) = 1$, so that $1/4 + a/4 = 1$, meaning $a = 3$.

ii. We have $f_X(x) = F_X'(x)$, so that:

$$f_X(x) = \begin{cases} 3x/2 & \text{for } 0 \le x < 1 \\ 3(x-1)^2/4 & \text{for } 1 \le x < 2 \\ 0 & \text{otherwise.} \end{cases}$$

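iii. Since X takes no negative values, for $w \ge 0$ we have $F_W(w) = P(W \le w) = P(X \le \sqrt{w}) = F_X(\sqrt{w})$. The cumulative distribution function of W is therefore:

$$F_W(w) = \begin{cases} 0 & \text{for } w < 0 \\ 3w/4 & \text{for } 0 \le w < 1 \\ (\sqrt{w} - 1)^3/4 + 3/4 & \text{for } 1 \le w < 4 \\ 1 & \text{for } w \ge 4. \end{cases}$$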

The probability density function of W is then $f_W(w) = F_W'(w)$, which is:

$$f_W(w) = \begin{cases} 3/4 & \text{for } 0 \le w < 1 \\ 3(\sqrt{w} - 1)^2/(8\sqrt{w}) & \text{for } 1 \le w < 4 \\ 0 & \text{otherwise.} \end{cases}$$
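A quick numerical check of part iii. is to build the distribution function of W directly from that of X and compare its numerical derivative with the stated density. This is only a sketch under the assumption a = 3 from part i.; the function names and the step size are illustrative.

```python
import math

A = 3.0  # value of a found in part i.


def cdf_x(x):
    """Distribution function of X from the question, with a = 3."""
    if x < 0:
        return 0.0
    if x < 1:
        return A * x ** 2 / 4
    if x < 2:
        return ((x - 1) ** 3 + A) / 4
    return 1.0


def cdf_w(w):
    """F_W(w) = F_X(sqrt(w)) for w > 0, since X takes no negative values."""
    return cdf_x(math.sqrt(w)) if w > 0 else 0.0


def pdf_w(w):
    """Density of W = X**2 as stated in the solution."""
    if 0 <= w < 1:
        return 0.75
    if 1 <= w < 4:
        s = math.sqrt(w)
        return 3 * (s - 1) ** 2 / (8 * s)
    return 0.0


def numeric_derivative(f, w, h=1e-6):
    """Central difference approximation to f'(w)."""
    return (f(w + h) - f(w - h)) / (2 * h)
```

Evaluating `numeric_derivative(cdf_w, w)` at interior points such as w = 0.5 and w = 2.5 should agree with `pdf_w(w)` up to the discretisation error.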


(c) Let X follow an exponential distribution with rate λ, i.e. X has a density

function:

$$f_X(x) = \begin{cases} \lambda e^{-\lambda x} & \text{for } x > 0 \\ 0 & \text{otherwise.} \end{cases}$$

i. Derive the moment generating function of X.

(3 marks)

ii. Let Y be an independent and identically distributed copy of X. For w > 0,

show that:

$$P(X - Y \le w) = 1 - \frac{e^{-\lambda w}}{2}.$$

(Hint: find the joint density of X and Y first. Determine the valid region in

the double integral involved.)

(5 marks)

iii. For w ≤ 0, show that:

$$P(X - Y \le w) = \frac{e^{\lambda w}}{2}.$$

(5 marks)

iv. Using parts ii. and iii. of question (c), show that the density function of

W = X − Y is given by:

$$f_W(w) = \frac{\lambda e^{-\lambda |w|}}{2}, \quad \text{for } w \in \mathbb{R}.$$

(3 marks)

Reading for this question

Part i. was well-answered in general, and finding moment generating functions for a random

variable is a basic technique which candidates need to practise. See Section 3.5 of the

subject guide for more details. For parts ii. and iii., many candidates were not able to

pinpoint the exact limits of integration which should be used in the double integration.

Some candidates did not even realise it should be a double integration because two random

variables are involved. The first thing you should do is to find the region of integration, then

to work out the joint probability density function of (X,Y ), which is just fX(x) fY (y), since

X and Y are independent. See Sections 4.1 and 4.2 of the subject guide for more basic

knowledge on joint density functions.

Some candidates were confused as to how w > 0 or w < 0 affects the calculation. For part

iii., since X ≤ Y + w and w < 0, the limits of Y cannot start from 0, otherwise X will be

negative, which is not allowed. To make sure X ≥ 0, Y has to start from −w. This is

exactly the difference between parts ii. and iii.

For part iv., many candidates could not work out the answer, which is disappointing since you do not even need to know how to calculate the answers to parts ii. and iii. As long as you realise that parts ii. and iii. give the distribution function of W, you only need to differentiate those given answers with respect to w and the result follows.

Approaching the question

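i. The moment generating function of X is:

$$M_X(s) = E(e^{sX}) = \int_0^\infty \lambda e^{-(\lambda - s)x}\,dx = \frac{\lambda}{\lambda - s} \int_0^\infty (\lambda - s) e^{-(\lambda - s)x}\,dx = \frac{\lambda}{\lambda - s}, \quad \text{for } s < \lambda.$$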

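ii. For $w \ge 0$, $X - Y \le w$ implies that $0 < X \le Y + w$, where $Y > 0$. Hence:

$$\begin{aligned} P(X - Y \le w) &= \int_0^\infty \int_0^{y+w} f_{X,Y}(x, y)\,dx\,dy = \int_0^\infty \lambda e^{-\lambda y} \left[-e^{-\lambda x}\right]_0^{y+w} dy \\ &= \int_0^\infty \lambda e^{-\lambda y} \left(1 - e^{-\lambda (y+w)}\right) dy \\ &= 1 - \frac{1}{2} e^{-\lambda w}. \end{aligned}$$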

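iii. For $w < 0$, $X - Y \le w$ implies that $0 < X \le Y + w$ where $Y > -w$. Hence:

$$\begin{aligned} P(X - Y \le w) &= \int_{-w}^\infty \int_0^{y+w} f_{X,Y}(x, y)\,dx\,dy = \int_{-w}^\infty \lambda e^{-\lambda y} \left(1 - e^{-\lambda (y+w)}\right) dy \\ &= e^{\lambda w} - e^{-\lambda w} \times \frac{1}{2} e^{2\lambda w} \\ &= \frac{1}{2} e^{\lambda w}. \end{aligned}$$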

iv. Differentiating the answers in ii. and iii. with respect to w, we have:

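$$f_W(w) = \begin{cases} \lambda e^{\lambda w}/2 & \text{for } w < 0 \\ \lambda e^{-\lambda w}/2 & \text{for } w \ge 0 \end{cases} = \frac{1}{2} \lambda e^{-\lambda |w|}, \quad \text{for } w \in \mathbb{R}.$$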
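The closed forms in parts ii. and iii. can be checked by evaluating the double integral numerically. In the sketch below the inner integral over x is done analytically, as in the solution, and the outer integral over y uses a midpoint rule. The rate, the truncation point and the grid size are illustrative assumptions.

```python
import math


def prob_by_integration(w, lam, upper=40.0, n=40000):
    """Midpoint-rule value of the integral over y > 0 of f_Y(y) * P(X <= y + w)."""
    h = upper / n
    total = 0.0
    for i in range(n):
        y = (i + 0.5) * h
        # P(X <= y + w) is zero when y + w <= 0; this is why the limits for
        # y effectively start at -w when w is negative.
        cdf_x = 1.0 - math.exp(-lam * (y + w)) if y + w > 0 else 0.0
        total += lam * math.exp(-lam * y) * cdf_x
    return total * h


def cdf_w(w, lam):
    """Closed forms from parts ii. (w >= 0) and iii. (w < 0)."""
    if w >= 0:
        return 1.0 - 0.5 * math.exp(-lam * w)
    return 0.5 * math.exp(lam * w)
```

Calling both functions with the same w and rate, for one positive and one negative w, should give values that agree to several decimal places.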

Section B

Answer all three questions in this section (60 marks in total).

Question 2

The conditional density of a random variable X given Y = y is given by:

$$f_{X|Y}(x \mid y) = \begin{cases} x/(2y^2) & \text{for } 0 < x < 2y < 2 \\ 0 & \text{otherwise.} \end{cases}$$

The conditional density of Y given X = x is given by:

$$f_{Y|X}(y \mid x) = \begin{cases} 24y^2/(8 - x^3) & \text{for } 0 < x < 2y < 2 \\ 0 & \text{otherwise.} \end{cases}$$

(a) Find the ratio fY (y)/fX(x), where fX(x) and fY (y) are the marginal densities

of X and Y , respectively.

(2 marks)

(b) By integrating out y first in the answer in (a), show that:

$$f_X(x) = \begin{cases} 5x(8 - x^3)/48 & \text{for } 0 < x < 2 \\ 0 & \text{otherwise.} \end{cases}$$

Is X independent of Y ? Justify your answer.

(9 marks)

(c) Let U = XY and V = X/Y . Derive the joint density for U, V , and carefully

state the region for (U, V ) where this joint density is non-zero.

(9 marks)

Reading for this question

This question was not well-answered in general, which is a little unexpected. You should look at

the marks allocated to each part to determine approximately how long the answers should be.

Part (a) is only worth two marks, so you should not expect the answer to have long derivations.

See Section 5.2 of the subject guide for the definition of continuous conditional distributions.

For part (b), many candidates knew to follow the hint and integrate out y, but the limits should be the limits for the marginal density of Y, i.e. the limits should not involve x. In fact, you only know $\int f_Y(y)\,dy = 1$, and the lower and upper limits are those for the marginal density of Y, which in this case should be 0 and 1, respectively. Many candidates went on to check whether $f_{X,Y}(x, y) = f_X(x)\, f_Y(y)$, which is correct but is not a quick way to see if X and Y are independent since you still need to calculate $f_{X,Y}(x, y)$ and $f_Y(y)$, both of which are not given to you. In the process, some candidates unfortunately got the wrong answers. To determine independence between X and Y more quickly, you should check whether $f_{X|Y}(x \mid y) = f_X(x)$ or not. This is equivalent to the criterion $f_{X,Y}(x, y) = f_X(x)\, f_Y(y)$, of course, but $f_{X|Y}(x \mid y)$ and $f_X(x)$ are both given to you! So you do not even need to calculate anything to know that X and Y are not independent! See Section 4.4 of the subject guide for more details on independence of a pair of random variables.

Part (c) was not done well because of inaccurate calculations mostly, especially the calculations

of the Jacobian. See Section 4.6 of the subject guide for more details.

Approaching the question

(a) We have:

$$\frac{f_{Y|X}(y \mid x)}{f_{X|Y}(x \mid y)} = \frac{f_{X,Y}(x, y)/f_X(x)}{f_{X,Y}(x, y)/f_Y(y)} = \frac{f_Y(y)}{f_X(x)}$$

so that:

$$\frac{f_Y(y)}{f_X(x)} = \frac{48y^4}{8x - x^4}.$$

(b) Since $0 < x < 2y < 2$, integrating out the effect of y means that $0 < x < 2$. Hence:

$$\frac{1}{f_X(x)} = \int_0^1 \frac{f_Y(y)}{f_X(x)}\,dy = \frac{1}{8x - x^4} \int_0^1 48y^4\,dy = \frac{1}{x(8 - x^3)} \cdot \frac{48}{5}$$

so that:

$$f_X(x) = \frac{5x(8 - x^3)}{48}, \quad \text{for } 0 < x < 2.$$

Since $f_X(x) \ne f_{X|Y}(x \mid y)$, X is not independent of Y.

(c) We have:

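$$X = \sqrt{UV} \quad \text{and} \quad Y = \sqrt{\frac{U}{V}}$$

so that $0 < X < 2Y < 2$ implies:

$$0 < \sqrt{UV} < 2\sqrt{\frac{U}{V}} < 2$$

meaning:

$$0 < U < V \quad \text{and} \quad V < 2.$$

Hence:

$$\begin{aligned} f_{U,V}(u, v) &= f_{X,Y}\left(\sqrt{uv}, \sqrt{u/v}\right) \left| \det \begin{pmatrix} \dfrac{\sqrt{v}}{2\sqrt{u}} & \dfrac{\sqrt{u}}{2\sqrt{v}} \\[2mm] \dfrac{1}{2\sqrt{uv}} & -\dfrac{\sqrt{u}}{2v^{3/2}} \end{pmatrix} \right| \\ &= \frac{24u/v}{8 - (uv)^{3/2}} \times \frac{5}{48}\left(8\sqrt{uv} - u^2 v^2\right) \times \frac{1}{2v} \\ &= \frac{5u^{3/2}}{4v^{3/2}} \end{aligned}$$

for $0 < u < v < 2$.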
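Since the Jacobian is where most calculation errors arose, it helps to verify the final density in two ways: it should match the change-of-variables expression pointwise, and it should integrate to 1 over the region 0 < u < v < 2. The sketch below does both with plain Python. The function names, grid size and test point are illustrative assumptions.

```python
import math


def joint_xy(x, y):
    """f_{X,Y}(x, y) = f_{Y|X}(y | x) f_X(x) = 5 x y^2 / 2 on 0 < x < 2y < 2."""
    return 2.5 * x * y * y if 0 < x < 2 * y < 2 else 0.0


def joint_uv(u, v):
    """Joint density of (U, V) = (XY, X/Y) as stated in the solution."""
    return 5 * u ** 1.5 / (4 * v ** 1.5) if 0 < u < v < 2 else 0.0


def joint_uv_from_xy(u, v):
    """Same density via the change of variables, with |Jacobian| = 1 / (2v)."""
    return joint_xy(math.sqrt(u * v), math.sqrt(u / v)) / (2 * v)


def total_mass(n=200):
    """Midpoint rule: outer integral over v in (0, 2), inner over u in (0, v)."""
    hv = 2.0 / n
    total = 0.0
    for j in range(n):
        v = (j + 0.5) * hv
        hu = v / n
        inner = sum(joint_uv((i + 0.5) * hu, v) for i in range(n))
        total += inner * hu * hv
    return total
```

The exact integral is 1: the inner integral over u equals v/2, and integrating v/2 over (0, 2) gives 1.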

Question 3

If X is Gamma distributed with parameters α and β, i.e. X ∼ Gamma(α, β), then it

has density:

$$f_X(x) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\, x^{\alpha - 1} e^{-\beta x}, \quad \text{for } x > 0$$

and $\Gamma(\alpha) = \int_0^\infty y^{\alpha - 1} e^{-y}\,dy$ for $\alpha > 0$.

(a) Suppose X ∼ Gamma(α1, β1), Y ∼ Gamma(α2, β2), and X is independent of Y .

Derive the distribution of β1X + β2Y . You may use the moment generating

function of a Gamma random variable without proof, as long as you state it

clearly.

(7 marks)

(b) Let Xi ∼ Gamma(α, βi), i = 1, . . . , N , be independent of each other and

α, βi > 0. Each Xi is also independent of N , which is Poisson distributed with

mean µ, so that the probability mass function for N is given by:

$$p_N(n) = \frac{\mu^{n} e^{-\mu}}{n!}, \quad \text{for } n = 0, 1, \ldots.$$

Consider the random variable:

$$W = \sum_{i=1}^{N} \beta_i X_i$$

with the convention that W = 0 if N = 0.

i. Derive the moment generating function of W .

(8 marks)

ii. Find the mean of W . You can use the mean of a Poisson random variable

without proof. The mean of X ∼ Gamma(α, β) is α/β.

(5 marks)

Reading for this question

This question was not answered as well as it should have been. Parts (a) and (b) i. are both

standard exercises. For part (a), see Proposition 4.7.3 for finding the moment generating function

of two independent random variables. Identifying the form of the answer is then the key to

knowing that the sum is still a Gamma distribution. Many candidates did not realise:

$$M_{\beta_1 X}(t) = E(e^{t\beta_1 X}) = M_X(\beta_1 t) = (1 - t)^{-\alpha}$$

and could not get the correct answer in the end by eliminating β1 in the moment generating

function. We also see that β1X ∼ Gamma(α, 1).

For part (b) i., the derivation of the moment generating function of a random sum is covered in Section 5.6 of the subject guide. See Lemma 5.6.2 iii. and Proposition 5.6.3 iii. on page 165 of the subject guide. Many candidates realised this, which is good, and scored decent marks even though they could not work out the final answer.

For part (b) ii., you can use Proposition 5.6.3 i. since the $\beta_i X_i$ are all i.i.d. Gamma($\alpha$, 1) from the calculations in part (a) and so have a common mean $\alpha$, or differentiate the moment generating function you obtained in part (b) i. Of course, the former will give you the answer much quicker!

Approaching the question

(a) First, let $W = \beta_1 X + \beta_2 Y$. Since $M_X(t) = (\beta/(\beta - t))^{\alpha}$ for $X \sim \text{Gamma}(\alpha, \beta)$, then:

$$M_W(t) = E(e^{tW}) = E(e^{t\beta_1 X})\,E(e^{t\beta_2 Y}) = \left(\frac{\beta_1}{\beta_1 - \beta_1 t}\right)^{\alpha_1} \left(\frac{\beta_2}{\beta_2 - \beta_2 t}\right)^{\alpha_2} = \left(\frac{1}{1 - t}\right)^{\alpha_1 + \alpha_2}$$

for $t < 1$. This shows that, by the one-to-one correspondence between distribution and moment generating function, $\beta_1 X + \beta_2 Y$ has a Gamma($\alpha_1 + \alpha_2$, 1) distribution.

(b) i. We have:

$$M_W(t) = E(e^{tW}) = E\left(E\left(\exp\left(t\sum_{i=1}^{N} \beta_i X_i\right) \,\Big|\, N\right)\right).$$

However, given N , we have:

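$$E\left(\exp\left(t\sum_{i=1}^{N} \beta_i X_i\right)\right) = E\left(\prod_{i=1}^{N} \exp(t\beta_i X_i)\right) = \prod_{i=1}^{N} E\left(e^{t\beta_i X_i}\right) = \left(\frac{1}{1 - t}\right)^{\alpha N}, \quad \text{for } t < 1.$$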

At the same time:

$$M_N(s) = E(e^{sN}) = \sum_{n=0}^{\infty} \frac{(\mu e^{s})^{n} e^{-\mu}}{n!} = e^{\mu(e^{s} - 1)}, \quad \text{for } s \in \mathbb{R}.$$

Hence:

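$$M_W(t) = E\left(\exp\left(N \log\left(\frac{1}{1 - t}\right)^{\alpha}\right)\right) = \exp\left(\mu\left(\frac{1}{1 - t}\right)^{\alpha} - \mu\right), \quad \text{for } t < 1.$$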

ii. We have:

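$$E(W) = E(E(W \mid N)) = E\left(\sum_{i=1}^{N} \beta_i\,E(X_i)\right) = E(N\alpha) = \mu\alpha$$

since $\beta_i\,E(X_i) = \beta_i \times \alpha/\beta_i = \alpha$ for every i.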
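The result E(W) = µα does not depend on the individual rates, which a small simulation makes visible. The sketch below is illustrative only: the parameter values, the list of rates and the helper names are assumptions, and the Poisson sampler is Knuth's simple multiplication method, swapped in because the standard library has no Poisson generator. Note that `random.gammavariate` takes a scale as its second argument, so the rate must be inverted.

```python
import math
import random


def sample_poisson(mu, rng):
    """Knuth's multiplication method; adequate for small mu."""
    limit = math.exp(-mu)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def sample_w(alpha, betas, mu, rng):
    """One draw of W, the sum of beta_i * X_i with X_i ~ Gamma(alpha, rate beta_i)."""
    n = sample_poisson(mu, rng)
    total = 0.0
    for i in range(n):
        beta_i = betas[i % len(betas)]                # hypothetical rates, reused cyclically
        x_i = rng.gammavariate(alpha, 1.0 / beta_i)   # second argument is the scale 1 / rate
        total += beta_i * x_i
    return total


def estimate_mean(n_samples, alpha=2.0, betas=(0.5, 2.0, 4.0), mu=3.0, seed=0):
    """Monte Carlo estimate of E(W) with a fixed seed for reproducibility."""
    rng = random.Random(seed)
    return sum(sample_w(alpha, betas, mu, rng) for _ in range(n_samples)) / n_samples
```

With these assumed values the estimate should be close to µα = 6, whatever rates are placed in `betas`.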

Question 4

Suppose we have a biased coin, which comes up heads with probability u. An

experiment is carried out so that X is the number of independent flips of the coin

required for r heads to show up, where r ≥ 1 is known.

(a) Show that the probability mass function for X is:

$$p_X(x) = \begin{cases} \binom{x-1}{r-1}\, u^{r} (1-u)^{x-r} & \text{for } x = r, r+1, \ldots \\ 0 & \text{otherwise.} \end{cases}$$

(5 marks)

(b) Suppose U is random and has a density given by:

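$$f_U(u) = \begin{cases} \dfrac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\, u^{\alpha - 1} (1-u)^{\beta - 1} & \text{for } 0 < u < 1 \\ 0 & \text{otherwise} \end{cases}$$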

where α, β > 0, and Γ(α) is defined in Question 3, which has the property that

Γ(α) = (α− 1)Γ(α− 1) for α ≥ 1, and Γ(k) = (k − 1)! for a positive integer k.

The distribution in part (a) thus becomes:

$$p_{X|U}(x \mid u) = \begin{cases} \binom{x-1}{r-1}\, u^{r} (1-u)^{x-r} & \text{for } x = r, r+1, \ldots \\ 0 & \text{otherwise.} \end{cases}$$

i. Find the marginal probability mass function of X if α = β = 2.

(6 marks)

ii. With α = β = 2 still, show that the density of U |X = x is given by:

$$f_{U|X}(u \mid x) = \begin{cases} \dfrac{(x+3)!}{(r+1)!\,(x-r+1)!}\, u^{r+1} (1-u)^{x-r+1} & \text{for } 0 < u < 1 \\ 0 & \text{otherwise.} \end{cases}$$

Hence find the mean of U |X = x.

(5 marks)


(c) Another independent experiment is carried out, with Y denoting the number of

independent flips of the coin required for r heads to show up (the same r as for

the first experiment).

State (no need for a derivation) the density of U | (X,Y ) = (x, y) and its mean,

where U still has the density in part (b) with α = β = 2.

(4 marks)

Reading for this question

This question was not well-answered in general. Part (a) needs you to explain why the probability

mass function is as stated. Many candidates stated that this is a negative binomial distribution

and hence the density is as given, which is not a proof or an explanation at all. See Example

3.3.10 on page 56 of the subject guide for the justification of its probability mass function.

Part (b) i. needs you to work out the joint density function of X and U, then integrate out U to obtain the marginal probability mass function of X. Some candidates realised that they had to integrate out u in $p_{X|U}(x \mid u)\, f_U(u)$ but were not careful in their calculations. Even if you were not able to do (b) i., you should be able to do (b) ii. using the answer given to you in (b) i. To find the mean, you need to find a general formula for $\int_0^1 u^{\alpha - 1}(1-u)^{\beta - 1}\,du$ by realising that $\int_0^1 f_U(u)\,du = 1$ in the probability density function given in part (b).

Part (c) was done worst since it is supposed to be difficult. You need to realise X and Y are

independent experiments which can be seen as one, so that x is replaced by x+ y and r is

replaced by 2r in the answer in (b) ii.

Approaching the question

(a) To wait for r heads to show up, suppose x flips are required. Therefore, the last flip must be

a head, with r − 1 heads randomly appearing in the first x− 1 flips. In each particular

combination of heads and tails, there must be r heads by definition of the experiment, as

well as x− r tails (so adding together to x flips in total), with probability:

$$u^{r} (1-u)^{x-r}.$$

Hence we have:

$$p_X(x) = \binom{x-1}{r-1}\, u^{r} (1-u)^{x-r}, \quad \text{for } x = r, r+1, \ldots.$$

(b) i. The joint probability density for X,U is:

$$f_{X,U}(x, u) = \binom{x-1}{r-1} \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\, u^{r+\alpha-1} (1-u)^{x-r+\beta-1}, \quad \text{for } 0 < u < 1,\ x = r, r+1, \ldots.$$

Therefore, the marginal probability mass function of X is:

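$$\begin{aligned} p_X(x) &= \int_0^1 \binom{x-1}{r-1} \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\, u^{r+\alpha-1} (1-u)^{x-r+\beta-1}\,du \\ &= \binom{x-1}{r-1} \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\,\Gamma(\beta)} \int_0^1 u^{r+\alpha-1} (1-u)^{x-r+\beta-1}\,du \\ &= \binom{x-1}{r-1} \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\,\Gamma(\beta)} \cdot \frac{\Gamma(r + \alpha)\,\Gamma(x - r + \beta)}{\Gamma(x + \alpha + \beta)} \\ &= \frac{6r(r+1)(x-r+1)}{x(x+1)(x+2)(x+3)}, \quad \text{for } x = r, r+1, \ldots. \end{aligned}$$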

ii. We have:

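$$\begin{aligned} f_{U|X}(u \mid x) &= \frac{f_{X,U}(x, u)}{p_X(x)} = \frac{u^{r+\alpha-1}(1-u)^{x-r+\beta-1}}{\int_0^1 u^{r+\alpha-1}(1-u)^{x-r+\beta-1}\,du} \\ &= \frac{\Gamma(x+4)}{\Gamma(r+2)\,\Gamma(x-r+2)}\, u^{r+1} (1-u)^{x-r+1}, \quad \text{for } 0 < u < 1. \end{aligned}$$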

The mean is:

$$\begin{aligned} E(U \mid X = x) &= \frac{(x+3)!}{(r+1)!\,(x-r+1)!} \int_0^1 u^{r+2} (1-u)^{x-r+1}\,du \\ &= \frac{(x+3)!}{(r+1)!\,(x-r+1)!} \times \frac{(r+2)!\,(x-r+1)!}{(x+4)!} \\ &= \frac{r+2}{x+4}. \end{aligned}$$
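The algebra in parts (b) i. and ii. is easy to get wrong, so an exact check is worthwhile. The sketch below works with integer arguments only, using Γ(k) = (k − 1)! as stated in the question. The function names and default arguments are illustrative assumptions.

```python
from fractions import Fraction
from math import comb, factorial


def beta_fn(p, q):
    """Integral of u**(p-1) * (1-u)**(q-1) over (0, 1) for positive integers p, q."""
    return Fraction(factorial(p - 1) * factorial(q - 1), factorial(p + q - 1))


def marginal_pmf(x, r, alpha=2, beta=2):
    """p_X(x): the pmf from part (a) averaged over the prior density of U."""
    return comb(x - 1, r - 1) * beta_fn(r + alpha, x - r + beta) / beta_fn(alpha, beta)


def posterior_mean(x, r, alpha=2, beta=2):
    """E(U | X = x): one extra power of u in the numerator integral."""
    return beta_fn(r + alpha + 1, x - r + beta) / beta_fn(r + alpha, x - r + beta)
```

With r = 2 and x = 5 the two functions return 3/35 and 4/9, matching 6r(r+1)(x−r+1)/(x(x+1)(x+2)(x+3)) and (r+2)/(x+4).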

(c) Mathematically, we have:

$$p_{X,Y|U}(x, y \mid u) = p_{X|U}(x \mid u)\, p_{Y|U}(y \mid u)$$

so that:

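$$\begin{aligned} p_{U|X,Y}(u \mid x, y) &= \frac{p_{X|U}(x \mid u)\, p_{Y|U}(y \mid u)\, f_U(u)}{\int_0^1 p_{X|U}(x \mid u)\, p_{Y|U}(y \mid u)\, f_U(u)\,du} = \frac{u^{2r+1}(1-u)^{x+y-2r+1}}{\int_0^1 u^{2r+1}(1-u)^{x+y-2r+1}\,du} \\ &= \frac{(x+y+3)!}{(2r+1)!\,(x+y-2r+1)!}\, u^{2r+1} (1-u)^{x+y-2r+1}, \quad \text{for } 0 < u < 1 \end{aligned}$$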

which parallels the density in part (b) ii. The mean is:

$$\frac{2r+2}{x+y+4}$$

which likewise parallels the mean in part (b) ii.

To see these two answers more quickly, note that X and Y can be seen as one experiment, waiting for 2r heads to show up. So we need x + y flips for 2r heads to come up, and hence we can replace x by x + y and r by 2r directly in the answers to (b) ii.



© University of London 2018

UL18/0327 Page 1 of 7 D0


This paper is not to be removed from the Examination Hall

UNIVERSITY OF LONDON ST3133 ZB

BSc degrees and Diplomas for Graduates in Economics, Management, Finance

and the Social Sciences, the Diplomas in Economics and Social Sciences

Advanced Statistics: Distribution Theory

Monday, 14 May 2018: 10:00 to 12:00

Candidates should answer all FOUR questions: QUESTION 1 of Section A (40 marks)

and all THREE questions from Section B (60 marks in total). Candidates are strongly

advised to divide their time accordingly.

A calculator may be used when answering questions on this paper and it must comply

in all respects with the specification given with your Admission Notice. The make and

type of machine must be clearly stated on the front cover of the answer book.

PLEASE TURN OVER

Section A

Answer all three parts of question 1 (40 marks in total)

1. (a) Let g(x) be a function taking on integer values of x, with

$$g(x) = \begin{cases} 2a, & x = -3, -1; \\ a, & x = 0, 2; \\ 3a, & x = 1, 3; \\ 0, & \text{otherwise.} \end{cases}$$

i. Find a so that g(x) is a probability mass function. [3 marks]

ii. Let X be a discrete random variable with probability mass function g(x).

Find E(X) and Var(X). [5 marks]

iii. Write down the probability mass function of $Y = X^2 - 4|X| + 4$. [4 marks]

(b) The cumulative distribution function FX(·) for the continuous random variable

X is defined by

$$F_X(x) = \begin{cases} 0, & x < 0; \\ ax^2/4, & 0 \le x < 1; \\ ((x-1)^3 + a)/4, & 1 \le x < 2; \\ 1, & x \ge 2. \end{cases}$$

i. Find the value of a. [1 mark]

ii. Derive the probability density function of X. [4 marks]

iii. Let W = X2. Derive the cumulative distribution function of W . Hence,

derive the probability density function of W . [7 marks]

(c) Let X follow an exponential distribution with rate λ, i.e., X has a density

function

$$f_X(x) = \begin{cases} \lambda e^{-\lambda x}, & x > 0; \\ 0, & \text{otherwise.} \end{cases}$$

i. Derive the moment generating function of X. [3 marks]

ii. Let Y be an independent and identically distributed copy of X. For w > 0,

show that

$$P(X - Y \le w) = 1 - \frac{e^{-\lambda w}}{2}.$$

(Hint: find the joint density of X and Y first. Determine the valid region

in the double integral involved.) [5 marks]

iii. For w ≤ 0, show that

$$P(X - Y \le w) = \frac{e^{\lambda w}}{2}.$$

[5 marks]

iv. Using parts ii and iii of question (c), show that the density function of

W = X − Y is given by

$$f_W(w) = \frac{\lambda e^{-\lambda |w|}}{2}, \quad w \in \mathbb{R}.$$

[3 marks]


Section B

Answer all three questions in this section (60 marks in total)

2. The conditional density of a random variable X given Y = y is given by

$$f_{X|Y}(x \mid y) = \begin{cases} x/(2y^2), & 0 < x < 2y < 2; \\ 0, & \text{otherwise.} \end{cases}$$

The conditional density of Y given X = x is given by

$$f_{Y|X}(y \mid x) = \begin{cases} 24y^2/(8 - x^3), & 0 < x < 2y < 2; \\ 0, & \text{otherwise.} \end{cases}$$

(a) Find the ratio $f_Y(y)/f_X(x)$, where $f_X(x)$ and $f_Y(y)$ are the marginal densities of X and Y, respectively. [2 marks]

(b) By integrating out y first in the answer in (a), show that

\[
f_X(x) = \begin{cases}
5x(8 - x^3)/48, & 0 < x < 2; \\
0, & \text{otherwise.}
\end{cases}
\]

Is X independent of Y ? Justify your answer. [9 marks]

(c) Let U = XY and V = X/Y . Derive the joint density for U, V , and carefully

state the region for (U, V ) where this joint density is non-zero. [9 marks]


3. If X is Gamma distributed with parameters α and β, i.e., X ∼ Gamma(α, β),

then it has density

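\[
f_X(x) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\, x^{\alpha - 1} e^{-\beta x}, \quad x > 0,
\]

and $\Gamma(\alpha) = \int_0^{\infty} y^{\alpha - 1} e^{-y}\,dy$ for $\alpha > 0$.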

(a) Suppose $X \sim \mathrm{Gamma}(\alpha_1, \beta_1)$, $Y \sim \mathrm{Gamma}(\alpha_2, \beta_2)$, and X is independent of Y. Derive the distribution of $\beta_1 X + \beta_2 Y$. You may use the moment generating function of a Gamma random variable without proof, as long as you state it clearly. [7 marks]

(b) Let $X_i \sim \mathrm{Gamma}(\alpha, \beta_i)$, $i = 1, \ldots, N$, be independent of each other and $\alpha, \beta_i > 0$. Each $X_i$ is also independent of N, which is Poisson distributed with mean $\mu$, so that the probability mass function for N is given by

\[
p_N(n) = \frac{\mu^n e^{-\mu}}{n!}, \quad n = 0, 1, \ldots.
\]

Consider the random variable

\[
W = \sum_{i=1}^{N} \beta_i X_i,
\]

with the convention that W = 0 if N = 0.

i. Derive the moment generating function of W . [8 marks]

ii. Find the mean of W. You can use the mean of a Poisson random variable without proof. The mean of $X \sim \mathrm{Gamma}(\alpha, \beta)$ is $\alpha/\beta$. [5 marks]


4. Suppose we have a biased coin, which comes up heads with probability u. An

experiment is carried out so that X is the number of independent flips of the

coin required for r heads to show up, where r ≥ 1 is known.

(a) Show that the probability mass function for X is

\[
p_X(x) = \begin{cases}
\binom{x-1}{r-1} u^r (1-u)^{x-r}, & x = r, r+1, \ldots; \\
0, & \text{otherwise.}
\end{cases}
\]

[5 marks]

(b) Suppose U is random and has a density given by

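\[
f_U(u) = \begin{cases}
\dfrac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\, u^{\alpha-1}(1-u)^{\beta-1}, & 0 < u < 1; \\
0, & \text{otherwise,}
\end{cases}
\]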

where α, β > 0, and Γ(α) is defined in question 3, which has the property

that Γ(α) = (α − 1)Γ(α − 1) for α ≥ 1, and Γ(k) = (k − 1)! for a positive

integer k. The distribution in part (a) thus becomes

\[
p_{X|U}(x \mid u) = \begin{cases}
\binom{x-1}{r-1} u^r (1-u)^{x-r}, & x = r, r+1, \ldots; \\
0, & \text{otherwise.}
\end{cases}
\]

i. Find the marginal probability mass function of X if $\alpha = \beta = 2$. [6 marks]

ii. With α = β = 2 still, show that the density of U |X = x is given by

\[
f_{U|X}(u \mid x) = \begin{cases}
\dfrac{(x+3)!}{(r+1)!\,(x-r+1)!}\, u^{r+1}(1-u)^{x-r+1}, & 0 < u < 1; \\
0, & \text{otherwise.}
\end{cases}
\]

Hence find the mean of U |X = x. [5 marks]

(c) Another independent experiment is carried out, with Y denoting the number of independent flips of the coin required for r heads to show up (the same r as for the first experiment).

State (no need for a derivation) the density of $U \mid (X, Y) = (x, y)$ and its mean, where U still has the density in part (b) with $\alpha = \beta = 2$. [4 marks]

END OF PAPER


Examiners’ commentaries 2018


ST3133 Advanced statistics: distribution theory

Important note

This commentary reflects the examination and assessment arrangements for this course in the

academic year 2017–18. The format and structure of the examination may change in future years,

and any such changes will be publicised on the virtual learning environment (VLE).

Information about the subject guide and the Essential reading

references

Unless otherwise stated, all cross-references will be to the latest version of the subject guide (2011).

You should always attempt to use the most recent edition of any Essential reading textbook, even if

the commentary and/or online reading list and/or subject guide refer to an earlier edition. If

different editions of Essential reading are listed, please check the VLE for reading supplements – if

none are available, please use the contents list and index of the new edition to find the relevant

section.

General remarks

Learning outcomes

At the end of this half course and having completed the Essential reading and activities, you should

be able to:

• recall a large number of distributions and be a competent user of their mass/density and

distribution functions and moment generating functions

• explain relationships between variables, conditioning, independence and correlation

• relate the theory and method taught in the half course to solve practical problems.

Format of the examination

As in previous years, the ability of candidates varied widely. As every year, though, many candidates had not studied the subject guide thoroughly. You should do so, and make sure that you know the topics covered by the syllabus rather than only practising on past papers.

The format of this year’s examination will be retained for next year’s examination.

Key steps to improvement

Many candidates found it difficult to perform variable transformations when the random variable involved is discrete, showing a lack of understanding of probability distributions. The same goes for bivariate random variables with continuous distributions. Whether finding the distribution function first (for example, Question 1 (c)) or using the variable transformation formula (Question 2 (c)), many candidates struggled to find the correct answer, either because of inaccurate calculations or, worse, a lack of knowledge of the subject.

When calculating a probability or an expectation, especially when evaluating double integrals, many

candidates got the results wrong because of carelessly placing the wrong limits of integration. Please

practise more on how to find the limits correctly for a particular region of a joint density.

Question 1 (c) iv., for example, can be answered even without finishing the previous parts; similarly,

Question 2 (b) when determining if X and Y are independent. Candidates should always spare some

time to read the questions carefully, then see if there are parts which can be completed quickly

without the need to solve previous parts.

You should be ready to derive the moment generating functions of standard random variables, like

the normal, gamma, chi-squared, exponential (all continuous), or the geometric, binomial, Poisson

(all discrete), and ideally know the forms by heart. It is also important to know basic applications of

these distributions, and apply the correct formulae in probability questions.

Examination revision strategy

Many candidates are disappointed to find that their examination performance is poorer than they

expected. This may be due to a number of reasons, but one particular failing is ‘question

spotting’, that is, confining your examination preparation to a few questions and/or topics which

have come up in past papers for the course. This can have serious consequences.

We recognise that candidates might not cover all topics in the syllabus in the same depth, but you

need to be aware that examiners are free to set questions on any aspect of the syllabus. This

means that you need to study enough of the syllabus to enable you to answer the required number of

examination questions.

The syllabus can be found in the Course information sheet available on the VLE. You should read

the syllabus carefully and ensure that you cover sufficient material in preparation for the

examination. Examiners will vary the topics and questions from year to year and may well set

questions that have not appeared in past papers. Examination papers may legitimately include

questions on any topic in the syllabus. So, although past papers can be helpful during your revision,

you cannot assume that topics or specific questions that have come up in past examinations will

occur again.

If you rely on a question-spotting strategy, it is likely you will find yourself in difficulties

when you sit the examination. We strongly advise you not to adopt this strategy.


Comments on specific questions – Zone A

Candidates should answer all FOUR questions: QUESTION 1 of Section A (40 marks) and all

THREE questions from Section B (60 marks in total). Candidates are strongly advised to

divide their time accordingly.

Section A

Answer all three parts of question 1 (40 marks in total).

Question 1

(a) Let g(x) be a function taking on integer values of x, with:

\[
g(x) = \begin{cases}
a & \text{for } x = -2, -1 \\
3a & \text{for } x = 0, 1 \\
4a & \text{for } x = 2, 3 \\
0 & \text{otherwise.}
\end{cases}
\]

i. Find a so that g(x) is a probability mass function.

(3 marks)

ii. Let X be a discrete random variable with probability mass function g(x).

Find E(X) and Var(X).

(5 marks)

iii. Write down the probability mass function for $Y = X^2 - 2|X| + 1$.

(4 marks)


Reading for this question

This question is about discrete probability distributions and basic moment calculations, and

has been overall well-answered except for part iii. Discrete random variables are discussed in

Section 3.3.1 of the subject guide, with examples. The mean and variance are covered in

Sections 3.4.2 and 3.4.3, respectively.

Part i. needs the application of Claim 3.3.6 iii. that the sum of probabilities over the

support is 1 to find the value of a. This part was done well in general although many

candidates calculated the answer incorrectly because of careless mistakes. Some candidates

just equated a+ 3a+ 4a = 1, not realising that there are two values of x which have

probability a, 3a and 4a, respectively. This indeed shows a lack of understanding of discrete

probability distributions, which is disappointing.

Part ii. was done well in general even if a could be incorrect, and marks were awarded in full

if the value of a was the only thing that was wrong.

For part iii., there is no formal transformation formula like that in the continuous case. To

do this question, candidates should find out the support (Definition 3.3.2 on page 53 of the

subject guide) of Y first, which is 0, 1, 4. For instance, X = −2, 0, 2 are all mapped to

Y = 1, so

\[
P(Y = 1) = g(-2) + g(0) + g(2) = a + 3a + 4a = 8a = \frac{1}{2}.
\]

Approaching the question

i. We must have:

\[
1 = \sum_x g(x) = (a + a) + (3a + 3a) + (4a + 4a)
\]

so that a = 1/16.

ii. We have:

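\[
E(X) = \sum_x x\, g(x) = (-2 - 1) \times a + (0 + 1) \times 3a + (2 + 3) \times 4a = 20a = \frac{5}{4}
\]

and:

\[
\mathrm{Var}(X) = E(X^2) - (E(X))^2 = \sum_x x^2 g(x) - \frac{25}{16} = 5a + 3a + 13 \times 4a - \frac{25}{16} = 60a - \frac{25}{16} = \frac{35}{16}.
\]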

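iii. Since $Y = (|X| - 1)^2$, it is easy to see that −2, 0 and 2 are mapped to 1, −1 and 1 are mapped to 0 and, finally, 3 is mapped to 4. Hence the probability mass function of Y is:

\[
g_Y(y) = \begin{cases}
a + 3a & \text{for } y = 0 \\
a + 3a + 4a & \text{for } y = 1 \\
4a & \text{for } y = 4 \\
0 & \text{otherwise}
\end{cases}
= \begin{cases}
1/4 & \text{for } y = 0 \\
1/2 & \text{for } y = 1 \\
1/4 & \text{for } y = 4 \\
0 & \text{otherwise.}
\end{cases}
\]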
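The arithmetic in parts i. to iii. can be checked exactly with rational numbers. The following is only a sketch: the names `weights`, `pmf_x` and `pmf_y` are illustrative and not part of the commentary.

```python
from collections import defaultdict
from fractions import Fraction

# Weights of g(x) in units of a (Zone A, Question 1 (a)).
weights = {-2: 1, -1: 1, 0: 3, 1: 3, 2: 4, 3: 4}

# The probabilities must sum to 1, which fixes a.
a = Fraction(1, sum(weights.values()))
pmf_x = {x: w * a for x, w in weights.items()}

mean_x = sum(x * p for x, p in pmf_x.items())
var_x = sum(x * x * p for x, p in pmf_x.items()) - mean_x ** 2

# Push the mass at each x onto y = x^2 - 2|x| + 1.
pmf_y = defaultdict(Fraction)
for x, p in pmf_x.items():
    pmf_y[x * x - 2 * abs(x) + 1] += p

print(a, mean_x, var_x, dict(pmf_y))
```

Summing the masses of all x values that share the same y is exactly the "find the support first" approach described above.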

(b) The cumulative distribution function FX(x) for the continuous random variable

X is defined by:

\[
F_X(x) = \begin{cases}
0 & \text{for } x < 0 \\
ax^3/3 & \text{for } 0 \le x < 1 \\
(a(x - 2)^3 + 2a)/3 & \text{for } 1 \le x < 2 \\
1 & \text{for } x \ge 2.
\end{cases}
\]


i. Find the value of a.

(2 marks)

ii. Derive the probability density function of X.

(3 marks)

iii. Let $W = X^2$. Derive the cumulative distribution function of W. Hence, derive the probability density function of W.

(7 marks)

Reading for this question

This question was answered badly in general. Part i. requires candidates to understand that

the distribution function of a continuous random variable is continuous itself. This can be

seen from Proposition 3.2.4 equation iv. on page 50 of the subject guide. Since for a

continuous random variable X we have P (X = x) = 0 for all x, then:

\[
0 = F_X(x) - F_X(x-)
\]

so that $F_X(x) = F_X(x-)$. With this, we have $F_X(2) = F_X(2-)$, which is:

\[
1 = \frac{a(2 - 2)^3 + 2a}{3}
\]

so that a = 3/2.

Part ii. was better answered in general, and candidates received full marks if the density function was correct even when a was left as an unknown. To find the probability density function you just need to differentiate the distribution function with respect to x. See Claim 3.3.14 on page 58 of the subject guide.

For part iii., many candidates could get that:

\[
P(W \le w) = P(X^2 \le w) = P(-\sqrt{w} \le X \le \sqrt{w}) = F_X(\sqrt{w}) - F_X(-\sqrt{w})
\]

but failed to realise $F_X(-\sqrt{w}) = 0$ since it is given that $F_X(x) = 0$ for x < 0.

Approaching the question

i. For a continuous random variable, we must have $\lim_{x \nearrow 2} F_X(x) = F_X(2) = 1$, so that 2a/3 = 1, meaning a = 3/2.

ii. We have $f_X(x) = F_X'(x)$, so that:

\[
f_X(x) = \begin{cases}
3x^2/2 & \text{for } 0 \le x < 1 \\
3(x - 2)^2/2 & \text{for } 1 \le x < 2 \\
0 & \text{otherwise.}
\end{cases}
\]

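iii. For w > 0 we have $F_W(w) = P(W \le w) = P(X \le \sqrt{w}) = F_X(\sqrt{w})$, so the cumulative distribution function of W is:

\[
F_W(w) = \begin{cases}
0 & \text{for } w < 0 \\
w^{3/2}/2 & \text{for } 0 \le w < 1 \\
(\sqrt{w} - 2)^3/2 + 1 & \text{for } 1 \le w < 4 \\
1 & \text{for } w \ge 4.
\end{cases}
\]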

The probability density function of W is then $f_W(w) = F_W'(w)$, which is:

\[
f_W(w) = \begin{cases}
3\sqrt{w}/4 & \text{for } 0 \le w < 1 \\
3(\sqrt{w} - 2)^2/(4\sqrt{w}) & \text{for } 1 \le w < 4 \\
0 & \text{otherwise.}
\end{cases}
\]
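A quick numerical sanity check of this density is to confirm that it integrates to 1 and that its partial integral agrees with the distribution function. The sketch below uses a plain midpoint rule; the function names and the step count `n` are illustrative choices, not part of the commentary.

```python
import math

def f_w(w):
    """Density of W = X^2 with a = 3/2 (Zone A, Question 1 (b))."""
    if 0 < w < 1:
        return 3 * math.sqrt(w) / 4
    if 1 <= w < 4:
        return 3 * (math.sqrt(w) - 2) ** 2 / (4 * math.sqrt(w))
    return 0.0

def F_w(w):
    """Distribution function of W derived in part iii."""
    if w < 0:
        return 0.0
    if w < 1:
        return w ** 1.5 / 2
    if w < 4:
        return (math.sqrt(w) - 2) ** 3 / 2 + 1
    return 1.0

def midpoint_integral(f, lo, hi, n=200_000):
    """Midpoint-rule approximation of the integral of f over (lo, hi)."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

total = midpoint_integral(f_w, 0.0, 4.0)      # should be close to 1
partial = midpoint_integral(f_w, 0.0, 2.25)   # should be close to F_w(2.25)
print(total, partial, F_w(2.25))
```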


(c) Let X follow an exponential distribution with rate λ, i.e. X has a density

function:

\[
f_X(x) = \begin{cases}
\lambda e^{-\lambda x} & \text{for } x > 0 \\
0 & \text{otherwise.}
\end{cases}
\]

i. Derive the moment generating function of X.

(3 marks)

ii. Let Y be an independent and identically distributed copy of X. For w > 0,

show that:

\[
P(X - Y \le w) = 1 - \frac{e^{-\lambda w}}{2}.
\]

(Hint: find the joint density of X and Y first. Determine the valid region in

the double integral involved.)

(5 marks)

iii. For w ≤ 0, show that:

\[
P(X - Y \le w) = \frac{e^{\lambda w}}{2}.
\]

(5 marks)

iv. Using parts ii. and iii. of question (c), show that the density function of

W = X − Y is given by:

\[
f_W(w) = \frac{\lambda e^{-\lambda |w|}}{2}, \quad \text{for } w \in \mathbb{R}.
\]

(3 marks)

Reading for this question

Part i. was well-answered in general, and finding moment generating functions for a random

variable is a basic technique which candidates need to practise. See Section 3.5 of the

subject guide for more details. For parts ii. and iii., many candidates were not able to

pinpoint the exact limits of integration which should be used in the double integration.

Some candidates did not even realise it should be a double integration because two random

variables are involved. The first thing you should do is to find the region of integration, then

to work out the joint probability density function of (X,Y ), which is just fX(x) fY (y), since

X and Y are independent. See Sections 4.1 and 4.2 of the subject guide for more basic

knowledge on joint density functions.

Some candidates were confused as to how w > 0 or w < 0 affects the calculation. For part

iii., since X ≤ Y + w and w < 0, the limits of Y cannot start from 0, otherwise X will be

negative, which is not allowed. To make sure X ≥ 0, Y has to start from −w. This is

exactly the difference between parts ii. and iii.

For part iv., many candidates could not work out the answer, which is disappointing since

you do not even need to know how to calculate the answers to parts ii. and iii. As long as

you realise we are calculating the distribution function of W in parts ii. and iii., you only

need to differentiate those given answers with respect to w, the result will then follow.

Approaching the question

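i. The moment generating function of X is:

\[
M_X(s) = E(e^{sX}) = \int_0^{\infty} \lambda e^{-(\lambda - s)x}\,dx = \frac{\lambda}{\lambda - s} \int_0^{\infty} (\lambda - s) e^{-(\lambda - s)x}\,dx = \frac{\lambda}{\lambda - s}, \quad \text{for } s < \lambda.
\]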

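ii. For w ≥ 0, X − Y ≤ w implies that 0 < X ≤ Y + w, where Y > 0. Hence:

\[
\begin{aligned}
P(X - Y \le w) &= \int_0^{\infty} \int_0^{y+w} f_{X,Y}(x, y)\,dx\,dy = \int_0^{\infty} \lambda e^{-\lambda y} \left[ -e^{-\lambda x} \right]_0^{y+w} dy \\
&= \int_0^{\infty} \lambda e^{-\lambda y} \left( 1 - e^{-\lambda (y+w)} \right) dy \\
&= 1 - \frac{1}{2} e^{-\lambda w}.
\end{aligned}
\]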


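iii. For w < 0, X − Y ≤ w implies that 0 < X ≤ Y + w where Y > −w. Hence:

\[
\begin{aligned}
P(X - Y \le w) &= \int_{-w}^{\infty} \int_0^{y+w} f_{X,Y}(x, y)\,dx\,dy = \int_{-w}^{\infty} \lambda e^{-\lambda y} \left( 1 - e^{-\lambda (y+w)} \right) dy \\
&= e^{\lambda w} - e^{-\lambda w} \times \frac{1}{2} e^{2\lambda w} \\
&= \frac{1}{2} e^{\lambda w}.
\end{aligned}
\]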

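iv. Differentiating the answers in ii. and iii. with respect to w, we have:

\[
f_W(w) = \begin{cases}
\lambda e^{\lambda w}/2 & \text{for } w < 0 \\
\lambda e^{-\lambda w}/2 & \text{for } w \ge 0
\end{cases}
= \frac{1}{2} \lambda e^{-\lambda |w|}, \quad \text{for } w \in \mathbb{R}.
\]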
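The two-sided distribution function from parts ii. and iii. can also be checked by simulation. This is a rough Monte Carlo sketch using Python's standard `random.expovariate`; the helper names, sample size and seed are illustrative assumptions.

```python
import math
import random

def laplace_cdf(w, lam):
    """P(X - Y <= w) from parts ii. and iii. for i.i.d. Exp(lam) X and Y."""
    if w > 0:
        return 1 - math.exp(-lam * w) / 2
    return math.exp(lam * w) / 2

def empirical_cdf_of_difference(w, lam, n=200_000, seed=0):
    """Fraction of simulated pairs (X, Y) with X - Y <= w."""
    rng = random.Random(seed)
    hits = sum(rng.expovariate(lam) - rng.expovariate(lam) <= w for _ in range(n))
    return hits / n

for w in (-1.0, 0.0, 0.5):
    print(w, empirical_cdf_of_difference(w, lam=2.0), laplace_cdf(w, lam=2.0))
```

The simulated and exact values should agree to roughly two decimal places at this sample size.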

Section B

Answer all three questions in this section (60 marks in total).

Question 2

The conditional density of a random variable X given Y = y is given by:

\[
f_{X|Y}(x \mid y) = \begin{cases}
3x^2/y^3 & \text{for } 0 < x < y < 3 \\
0 & \text{otherwise.}
\end{cases}
\]

The conditional density of Y given X = x is given by:

\[
f_{Y|X}(y \mid x) = \begin{cases}
3y^2/(27 - x^3) & \text{for } 0 < x < y < 3 \\
0 & \text{otherwise.}
\end{cases}
\]

(a) Find the ratio fX(x)/fY (y), where fX(x) and fY (y) are the marginal densities

of X and Y , respectively.

(2 marks)

(b) By integrating out x first in the answer in (a), show that:

\[
f_Y(y) = \begin{cases}
2y^5/243 & \text{for } 0 < y < 3 \\
0 & \text{otherwise.}
\end{cases}
\]

Is X independent of Y ? Justify your answer.

(9 marks)

(c) Let U = XY and V = X/Y . Derive the joint density for U, V , and carefully

state the region for (U, V ) where this joint density is non-zero.

(9 marks)

Reading for this question

This question was not well-answered in general, which is a little unexpected. You should look at

the marks allocated to each part to determine approximately how long the answers should be.

Part (a) is only worth two marks, so you should not expect the answer to have long derivations.

See Section 5.2 of the subject guide for the definition of continuous conditional distributions.

For part (b), many candidates knew to follow the hint and integrate out x, but the limits should be the limits for the marginal density of X, i.e. the limits should not involve y. In fact, you only know $\int f_X(x)\,dx = 1$, and the lower and upper limits are those for the marginal density of X, which in this case should be 0 and 3, respectively. Many candidates went on to check whether $f_{X,Y}(x, y) = f_X(x)\, f_Y(y)$, which is correct but is not a quick way to see if X and Y are independent, since you still need to calculate $f_{X,Y}(x, y)$ and $f_X(x)$, neither of which is given to you. In the process, some candidates unfortunately got the wrong answers. To determine independence between X and Y more quickly, you should check whether $f_{Y|X}(y \mid x) = f_Y(y)$ or not. This is equivalent to the criterion $f_{X,Y}(x, y) = f_X(x)\, f_Y(y)$, of course, but $f_{Y|X}(y \mid x)$ and $f_Y(y)$ are both given to you! So you do not even need to calculate anything to know that X and Y are not independent! See Section 4.4 of the subject guide for more details on independence of a pair of random variables.

Part (c) was not done well because of inaccurate calculations mostly, especially the calculations

of the Jacobian. See Section 4.6 of the subject guide for more details.

Approaching the question

(a) We have:

\[
\frac{f_{X|Y}(x \mid y)}{f_{Y|X}(y \mid x)} = \frac{f_{X,Y}(x, y)/f_Y(y)}{f_{X,Y}(x, y)/f_X(x)} = \frac{f_X(x)}{f_Y(y)}
\]

so that:

\[
\frac{f_X(x)}{f_Y(y)} = \frac{x^2 (27 - x^3)}{y^5}.
\]

(b) Since 0 < x < y < 3, integrating out the effect of y means that 0 < x < 3. Hence:

\[
\frac{1}{f_Y(y)} = \int_0^3 \frac{f_X(x)}{f_Y(y)}\,dx = \frac{1}{y^5} \int_0^3 (27x^2 - x^5)\,dx = \frac{1}{y^5} \left[ 9x^3 - \frac{x^6}{6} \right]_0^3 = \frac{243}{2y^5}
\]

so that:

\[
f_Y(y) = \frac{2y^5}{243} \quad \text{for } 0 < y < 3.
\]

Since $f_Y(y) \ne f_{Y|X}(y \mid x)$, X is not independent of Y.

(c) We have:

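\[
X = \sqrt{UV} \quad \text{and} \quad Y = \sqrt{\frac{U}{V}}
\]

so that 0 < X < Y < 3 implies:

\[
0 < \sqrt{UV} < \sqrt{\frac{U}{V}} < 3
\]

meaning:

\[
0 < U < 9V \quad \text{and} \quad V < 1.
\]

Hence:

\[
\begin{aligned}
f_{U,V}(u, v) &= f_{X,Y}\!\left( \sqrt{uv}, \sqrt{u/v} \right)
\left| \det \begin{pmatrix}
\dfrac{\sqrt{v}}{2\sqrt{u}} & \dfrac{\sqrt{u}}{2\sqrt{v}} \\[2mm]
\dfrac{1}{2\sqrt{uv}} & -\dfrac{\sqrt{u}}{2v^{3/2}}
\end{pmatrix} \right| \\
&= \frac{3uv}{u^{3/2}/v^{3/2}} \times \frac{2}{243} \cdot \frac{u^{5/2}}{v^{5/2}} \times \frac{1}{2v} \\
&= \frac{u^2}{81v} \quad \text{for } 0 < u < 9v,\ 0 < v < 1.
\end{aligned}
\]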
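As a sanity check (not part of the required answer), the joint density should integrate to 1 over the stated region. A short worked calculation, integrating out u first:

```latex
\[
\int_0^1 \int_0^{9v} \frac{u^2}{81v}\,du\,dv
= \int_0^1 \frac{(9v)^3}{3 \times 81\,v}\,dv
= \int_0^1 3v^2\,dv
= 1.
\]
```

If this integral did not equal 1, either the Jacobian or the region for (U, V) would be wrong, so it is a cheap way to catch the calculation errors mentioned above.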

Question 3

If X is Gamma distributed with parameters α and β, i.e. X ∼ Gamma(α, β), then it

has density:

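\[
f_X(x) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\, x^{\alpha - 1} e^{-\beta x}, \quad \text{for } x > 0
\]

and $\Gamma(\alpha) = \int_0^{\infty} y^{\alpha - 1} e^{-y}\,dy$ for $\alpha > 0$.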


(a) If X ∼ Gamma(α1, β), Y ∼ Gamma(α2, β), and X is independent of Y , derive

the distribution of X + Y . You may use the moment generating function of a

Gamma random variable without proof, as long as you state it clearly.

(7 marks)

(b) Let Xi ∼ Gamma(α, β), i = 1, . . . , N , be independent of each other and

α, β > 0. Each Xi is also independent of N , which is Poisson distributed with

mean µ, so that the probability mass function for N is given by:

\[
p_N(n) = \frac{\mu^n e^{-\mu}}{n!}, \quad \text{for } n = 0, 1, \ldots.
\]

Consider the random variable:

\[
W = \sum_{i=1}^{N} X_i
\]

with the convention that W = 0 if N = 0.

i. Derive the moment generating function of W .

(8 marks)

ii. Find the mean of W . You can use the means of a Poisson and a Gamma

random variable without proof. If you use any standard results about

random sums, you must first state them clearly.

(5 marks)

Reading for this question

This question was not answered as well as it should have been. Parts (a) and (b) i. are both

standard exercises. For part (a), see Proposition 4.7.3 for finding the moment generating function

of two independent random variables. Identifying the form of the answer is then the key to

knowing that the sum is still a Gamma distribution.

For part (b) i., the derivation of the moment generating function of a random sum is covered in

the subject guide in Section 5.6. See Lemma 5.6.2 iii. and Proposition 5.6.3 iii. on page 165 of

the subject guide. Indeed many candidates realised this which is good, and scored decent marks

already even though they could not work out the final answer in the end.

For part (b) ii., you can use Proposition 5.6.3 i., or differentiate the moment generating function

which you obtained in part (b) i. Of course, the former will give you the answer much quicker!

Approaching the question

(a) First, let W = X + Y. Since $M_X(t) = (\beta/(\beta - t))^{\alpha}$ for $X \sim \mathrm{Gamma}(\alpha, \beta)$, then:

\[
M_W(t) = E(e^{tW}) = E(e^{tX})\, E(e^{tY}) = \left( \frac{\beta}{\beta - t} \right)^{\alpha_1 + \alpha_2}, \quad \text{for } t < \beta.
\]

This shows that, by the one-to-one correspondence between distribution and moment generating function, W = X + Y has a $\mathrm{Gamma}(\alpha_1 + \alpha_2, \beta)$ distribution.

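(b) i. We have:

\[
M_W(t) = E(e^{tW}) = E\left( E\left( \exp\left( t \sum_{i=1}^{N} X_i \right) \,\Big|\, N \right) \right).
\]

However, given N, we have:

\[
E\left( \exp\left( t \sum_{i=1}^{N} X_i \right) \right) = E\left( \prod_{i=1}^{N} \exp(tX_i) \right) = \prod_{i=1}^{N} E\left( e^{tX_i} \right) = \left( \frac{\beta}{\beta - t} \right)^{\alpha N}, \quad \text{for } t < \beta.
\]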


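At the same time:

\[
M_N(s) = E(e^{sN}) = \sum_{n=0}^{\infty} \frac{(\mu e^{s})^n e^{-\mu}}{n!} = e^{\mu (e^{s} - 1)}, \quad \text{for } s \in \mathbb{R}.
\]

Hence:

\[
M_W(t) = E\left( \exp\left( N \log \left( \frac{\beta}{\beta - t} \right)^{\alpha} \right) \right) = \exp\left( \mu \left( \frac{\beta}{\beta - t} \right)^{\alpha} - \mu \right), \quad \text{for } t < \beta.
\]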

ii. We have:

\[
E(W) = E(E(W \mid N)) = E(N\, E(X_i)) = \frac{\mu \alpha}{\beta}.
\]
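The two routes to the mean mentioned above (conditioning on N, or differentiating the moment generating function at zero) should give the same number. The sketch below checks this numerically with a central difference; the function names, the step size `h` and the parameter values are illustrative assumptions.

```python
import math

def mgf_w(t, alpha, beta, mu):
    """M_W(t) = exp(mu * ((beta / (beta - t))**alpha - 1)), valid for t < beta."""
    if t >= beta:
        raise ValueError("the moment generating function requires t < beta")
    return math.exp(mu * ((beta / (beta - t)) ** alpha - 1.0))

def mean_from_mgf(alpha, beta, mu, h=1e-5):
    """Central-difference approximation of M_W'(0), which equals E(W)."""
    return (mgf_w(h, alpha, beta, mu) - mgf_w(-h, alpha, beta, mu)) / (2 * h)

alpha, beta, mu = 2.0, 3.0, 4.0
print(mean_from_mgf(alpha, beta, mu), mu * alpha / beta)
```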

Question 4

Suppose we have a biased coin, which comes up heads with probability u. An

experiment is carried out so that X is the number of independent flips of the coin

required for r heads to show up, where r ≥ 1 is known.

(a) Show that the probability mass function of X is:

\[
p_X(x) = \begin{cases}
\binom{x-1}{r-1} u^r (1 - u)^{x-r} & \text{for } x = r, r+1, \ldots \\
0 & \text{otherwise.}
\end{cases}
\]

(5 marks)

(b) Suppose U is uniformly distributed on (0, 1), and the distribution in part (a)

becomes

\[
p_{X|U}(x \mid u) = \begin{cases}
\binom{x-1}{r-1} u^r (1 - u)^{x-r} & \text{for } x = r, r+1, \ldots \\
0 & \text{otherwise.}
\end{cases}
\]

i. Find the marginal probability mass function for X. You can use:

\[
\int_0^1 y^a (1 - y)^b\,dy = \frac{a!\, b!}{(a + b + 1)!}
\]

for non-negative integers a and b without proof.

(6 marks)

ii. Show that the density of U |X = x is given by:

\[
f_{U|X}(u \mid x) = \frac{(x + 1)!}{r!\,(x - r)!}\, u^r (1 - u)^{x-r}, \quad \text{for } 0 < u < 1.
\]

Hence find the mean of U |X = x.

(5 marks)

(c) Another independent experiment is carried out, with Y denoting the number of

independent flips of the coin required for r heads to show up (the same r as for

the first experiment).

State (no need for a derivation) the density of U |(X,Y ) = (x, y) and its mean,

where U is still uniformly distributed on (0, 1) as in part (b).

(4 marks)

Reading for this question

This question was not well-answered in general. Part (a) needs you to explain why the probability mass function is as stated. Many candidates stated that this is a negative binomial distribution and hence the probability mass function is as given, which is not a proof or an explanation at all. See Example 3.3.10 on page 56 of the subject guide for the justification of its probability mass function.

Part (b) i. needs you to work out the joint density function of X and U, then integrate out U to obtain the marginal probability mass function of X. Some candidates were not careful in their calculations even though they realised that they needed to integrate out u in $p_{X|U}(x \mid u)\, f_U(u)$. Even if you are not able to do (b) i., you should be able to do (b) ii. using the answer given to you in (b) i. To find the mean, you need to apply the integral formula given in (b) i.

Part (c) was answered worst, which is unsurprising since it is meant to be difficult. You need to realise that X and Y come from independent experiments which can be seen as one, so that x is replaced by x + y and r is replaced by 2r in the answer in (b) ii.

Approaching the question

(a) To wait for r heads to show up, suppose x flips are required. The last flip must be a head,

with r − 1 heads randomly appearing in the first x− 1 flips. In each particular combination

of heads and tails, there must be r heads by definition of the experiment, as well as x− r

tails (so adding together to x flips in total), with probability:

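\[
u^r (1 - u)^{x-r}.
\]

There are $\binom{x-1}{r-1}$ such combinations, one for each way of placing the r − 1 heads among the first x − 1 flips. Hence we have:

\[
p_X(x) = \binom{x-1}{r-1} u^r (1 - u)^{x-r}, \quad \text{for } x = r, r+1, \ldots.
\]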

(b) i. The joint probability density for X, U is:

\[
f_{X,U}(x, u) = \binom{x-1}{r-1} u^r (1 - u)^{x-r}, \quad \text{for } 0 < u < 1,\ x = r, r+1, \ldots.
\]

Therefore, the marginal probability mass function of X is:

\[
\begin{aligned}
p_X(x) &= \int_0^1 \binom{x-1}{r-1} u^r (1 - u)^{x-r}\,du = \binom{x-1}{r-1} \int_0^1 u^r (1 - u)^{x-r}\,du \\
&= \binom{x-1}{r-1} \frac{r!\,(x - r)!}{(x + 1)!} = \frac{r}{x(x + 1)}, \quad \text{for } x = r, r+1, \ldots.
\end{aligned}
\]

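ii. We have:

\[
\begin{aligned}
f_{U|X}(u \mid x) &= \frac{f_{X,U}(x, u)}{p_X(x)} = \frac{x(x + 1)}{r} \times \binom{x-1}{r-1} u^r (1 - u)^{x-r} \\
&= \frac{(x + 1)!}{r!\,(x - r)!}\, u^r (1 - u)^{x-r}, \quad \text{for } 0 < u < 1.
\end{aligned}
\]

The mean is:

\[
E(U \mid X = x) = \frac{(x + 1)!}{r!\,(x - r)!} \int_0^1 u^{r+1} (1 - u)^{x-r}\,du = \frac{(x + 1)!}{r!\,(x - r)!} \times \frac{(r + 1)!\,(x - r)!}{(x + 2)!} = \frac{r + 1}{x + 2}.
\]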
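The factorial manipulations in (b) i. and ii. are easy to get wrong by hand, so an exact check with rational arithmetic may be useful. This is only a sketch; `beta_integral`, `marginal_pmf` and `posterior_mean` are illustrative names built directly on the integral formula quoted in the question.

```python
from fractions import Fraction
from math import comb, factorial

def beta_integral(a, b):
    """Integral of y^a (1 - y)^b over (0, 1) for non-negative integers a, b."""
    return Fraction(factorial(a) * factorial(b), factorial(a + b + 1))

def marginal_pmf(x, r):
    """p_X(x) obtained by integrating u out of the joint density, U uniform."""
    return comb(x - 1, r - 1) * beta_integral(r, x - r)

def posterior_mean(x, r):
    """E(U | X = x) as a ratio of two integrals of the form above."""
    return beta_integral(r + 1, x - r) / beta_integral(r, x - r)

r = 3
for x in range(r, r + 5):
    print(x, marginal_pmf(x, r), posterior_mean(x, r))
```

The printed values should match r/(x(x + 1)) and (r + 1)/(x + 2), and the partial sums of the marginal telescope towards 1.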

(c) Mathematically, we have:

\[
p_{X,Y|U}(x, y \mid u) = p_{X|U}(x \mid u)\, p_{Y|U}(y \mid u)
\]

so that with $f_U(u) = 1$, we have:

\[
\begin{aligned}
f_{U|X,Y}(u \mid x, y) &= \frac{p_{X|U}(x \mid u)\, p_{Y|U}(y \mid u)}{\int_0^1 p_{X|U}(x \mid u)\, p_{Y|U}(y \mid u)\,du} = \frac{u^{2r} (1 - u)^{x+y-2r}}{\int_0^1 u^{2r} (1 - u)^{x+y-2r}\,du} \\
&= \frac{(x + y + 1)!}{(2r)!\,(x + y - 2r)!}\, u^{2r} (1 - u)^{x+y-2r}, \quad \text{for } 0 < u < 1
\end{aligned}
\]

which parallels the density in part (b) ii. The mean is:

\[
\frac{2r + 1}{x + y + 2}
\]

which likewise parallels the mean in (b) ii.

To see these two answers more quickly, note that X and Y can be seen as one experiment,

waiting for 2r heads to show up. So we need x+ y flips for 2r heads to come up, and hence

we can replace x by x+ y and r by 2r directly from answers in (b) ii.


Comments on specific questions – Zone B

Candidates should answer all FOUR questions: QUESTION 1 of Section A (40 marks) and all

THREE questions from Section B (60 marks in total). Candidates are strongly advised to

divide their time accordingly.

Section A

Answer all three parts of question 1 (40 marks in total).

Question 1

(a) Let g(x) be a function taking on integer values of x, with:

\[
g(x) = \begin{cases}
2a & \text{for } x = -3, -1 \\
a & \text{for } x = 0, 2 \\
3a & \text{for } x = 1, 3 \\
0 & \text{otherwise.}
\end{cases}
\]

i. Find a so that g(x) is a probability mass function.

(3 marks)

ii. Let X be a discrete random variable with probability mass function g(x).

Find E(X) and Var(X).

(5 marks)

iii. Write down the probability mass function of $Y = X^2 - 4|X| + 4$.

(4 marks)


Reading for this question

This question is about discrete probability distributions and basic moment calculations, and

has been overall well-answered except for part iii. Discrete random variables are discussed in

Section 3.3.1 of the subject guide, with examples. The mean and variance are covered in

Sections 3.4.2 and 3.4.3, respectively.

Part i. needs the application of Claim 3.3.6 iii. that the sum of probabilities over the

support is 1 to find the value of a. This part was done well in general although many

candidates calculated the answer incorrectly because of careless mistakes. Some candidates

just equated 2a+ a+ 3a = 1, not realising that there are two values of x which have

probability 2a, a and 3a, respectively. This indeed shows a lack of understanding of discrete

probability distributions, which is disappointing.

Part ii. was done well in general even if a could be incorrect, and marks were awarded in full

if the value of a was the only thing that was wrong.

For part iii., there is no formal transformation formula like that in the continuous case. To

do this question, candidates should find out the support (Definition 3.3.2 on page 53 of the

subject guide) of Y first, which is 0, 1, 4. For instance, X = −3,−1, 1, 3 are all mapped to

Y = 1, so

\[
P(Y = 1) = g(-3) + g(-1) + g(1) + g(3) = 2a + 2a + 3a + 3a = 10a = \frac{5}{6}.
\]

Approaching the question

i. We must have:

\[
1 = \sum_x g(x) = (2a + 2a) + (a + a) + (3a + 3a)
\]

so that a = 1/12.

ii. We have:

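\[
E(X) = \sum_x x\, g(x) = (-3 - 1) \times 2a + (0 + 2) \times a + (1 + 3) \times 3a = 6a = \frac{1}{2}
\]

and:

\[
\mathrm{Var}(X) = E(X^2) - (E(X))^2 = \sum_x x^2 g(x) - \frac{1}{4} = 10 \times 2a + 4a + 10 \times 3a - \frac{1}{4} = 54a - \frac{1}{4} = \frac{51}{12} = \frac{17}{4}.
\]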

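iii. Since $Y = (|X| - 2)^2$, it is easy to see that −3, −1, 1 and 3 are mapped to 1, 0 is mapped to 4 and, finally, 2 is mapped to 0. Hence the probability mass function of Y is:

\[
g_Y(y) = \begin{cases}
a & \text{for } y = 0 \\
2a + 2a + 3a + 3a & \text{for } y = 1 \\
a & \text{for } y = 4 \\
0 & \text{otherwise}
\end{cases}
= \begin{cases}
1/12 & \text{for } y = 0 \\
5/6 & \text{for } y = 1 \\
1/12 & \text{for } y = 4 \\
0 & \text{otherwise.}
\end{cases}
\]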

(b) The cumulative distribution function FX(·) for the continuous random variable

X is defined by:

\[
F_X(x) = \begin{cases}
0 & \text{for } x < 0 \\
ax^2/4 & \text{for } 0 \le x < 1 \\
((x - 1)^3 + a)/4 & \text{for } 1 \le x < 2 \\
1 & \text{for } x \ge 2.
\end{cases}
\]


i. Find the value of a.

(1 mark)

ii. Derive the probability density function of X.

(4 marks)

iii. Let $W = X^2$. Derive the cumulative distribution function of W. Hence, derive the probability density function of W.

(7 marks)

Reading for this question

This question was answered badly in general. Part i. requires candidates to understand that

the distribution function of a continuous random variable is continuous itself. This can be

seen from Proposition 3.2.4 equation iv. on page 50 of the subject guide. Since for a

continuous random variable X we have P (X = x) = 0 for all x, then:

\[
0 = F_X(x) - F_X(x-)
\]

so that $F_X(x) = F_X(x-)$. With this, we have $F_X(2) = F_X(2-)$, which is:

\[
1 = \frac{(2 - 1)^3 + a}{4}
\]

so that a = 3.

Part ii. was better answered in general, and candidates received full marks if the density function was correct even when a was left as an unknown. To find the probability density function you just need to differentiate the distribution function with respect to x. See Claim 3.3.14 on page 58 of the subject guide.

For part iii., many candidates could get that:

\[ P(W \le w) = P(X^2 \le w) = P(-\sqrt{w} \le X \le \sqrt{w}) = F_X(\sqrt{w}) - F_X(-\sqrt{w}) \]

but failed to realise that $F_X(-\sqrt{w}) = 0$, since it is given that $F_X(x) = 0$ for x < 0.

Approaching the question

i. For a continuous random variable, we must have $\lim_{x \nearrow 2} F_X(x) = F_X(2) = 1$, so that 1/4 + a/4 = 1, meaning a = 3.

ii. We have $f_X(x) = F_X'(x)$, so that:

\[ f_X(x) = \begin{cases} 3x/2 & \text{for } 0 \le x < 1 \\ 3(x - 1)^2/4 & \text{for } 1 \le x < 2 \\ 0 & \text{otherwise.} \end{cases} \]

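iii. Since $F_X(x) = 0$ for x < 0, we have, for w ≥ 0, $F_W(w) = P(W \le w) = P(X \le \sqrt{w}) = F_X(\sqrt{w})$. The cumulative distribution function of W is therefore:

\[ F_W(w) = \begin{cases} 0 & \text{for } w < 0 \\ 3w/4 & \text{for } 0 \le w < 1 \\ (\sqrt{w} - 1)^3/4 + 3/4 & \text{for } 1 \le w < 4 \\ 1 & \text{for } w \ge 4. \end{cases} \]

The probability density function of W is then $f_W(w) = F_W'(w)$, which is:

\[ f_W(w) = \begin{cases} 3/4 & \text{for } 0 \le w < 1 \\ \dfrac{3(\sqrt{w} - 1)^2}{8\sqrt{w}} & \text{for } 1 \le w < 4 \\ 0 & \text{otherwise.} \end{cases} \]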
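A quick way to catch slips in part iii. is to differentiate the distribution function numerically and compare it with the stated density. The sketch below assumes a = 3 from part i.; the function names and the step size `h` are illustrative choices.

```python
import math

def cdf_x(x):
    """F_X with a = 3, as derived in part i."""
    if x < 0:
        return 0.0
    if x < 1:
        return 3 * x ** 2 / 4
    if x < 2:
        return ((x - 1) ** 3 + 3) / 4
    return 1.0

def cdf_w(w):
    """F_W(w) = F_X(sqrt(w)) for w >= 0, and 0 for w < 0."""
    return cdf_x(math.sqrt(w)) if w >= 0 else 0.0

def pdf_w(w):
    """Density of W = X^2 as stated in part iii."""
    if 0 <= w < 1:
        return 3 / 4
    if 1 <= w < 4:
        root = math.sqrt(w)
        return 3 * (root - 1) ** 2 / (8 * root)
    return 0.0

def numerical_derivative(f, x, h=1e-6):
    """Central-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

# Compare the stated density with the numerical derivative of the CDF.
for w in (0.25, 0.5, 1.5, 2.0, 3.5):
    print(w, pdf_w(w), numerical_derivative(cdf_w, w))
```

The two printed columns should agree to several decimal places at every interior point. The points w = 1 and w = 4 are avoided because the density has a jump at w = 1 and a kink at w = 4.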


(c) Let X follow an exponential distribution with rate λ, i.e. X has a density function:

\[ f_X(x) = \begin{cases} \lambda e^{-\lambda x} & \text{for } x > 0 \\ 0 & \text{otherwise.} \end{cases} \]

i. Derive the moment generating function of X.

(3 marks)

ii. Let Y be an independent and identically distributed copy of X. For w > 0, show that:

\[ P(X - Y \le w) = 1 - \frac{e^{-\lambda w}}{2}. \]

(Hint: find the joint density of X and Y first. Determine the valid region in

the double integral involved.)

(5 marks)

iii. For w ≤ 0, show that:

\[ P(X - Y \le w) = \frac{e^{\lambda w}}{2}. \]

(5 marks)

iv. Using parts ii. and iii. of question (c), show that the density function of W = X − Y is given by:

\[ f_W(w) = \frac{\lambda e^{-\lambda |w|}}{2}, \quad \text{for } w \in \mathbb{R}. \]

(3 marks)

Reading for this question

Part i. was well-answered in general, and finding moment generating functions for a random

variable is a basic technique which candidates need to practise. See Section 3.5 of the

subject guide for more details. For parts ii. and iii., many candidates were not able to

pinpoint the exact limits of integration which should be used in the double integration.

Some candidates did not even realise it should be a double integration because two random

variables are involved. The first thing you should do is to find the region of integration, then

to work out the joint probability density function of (X,Y ), which is just fX(x) fY (y), since

X and Y are independent. See Sections 4.1 and 4.2 of the subject guide for more basic

knowledge on joint density functions.

Some candidates were confused as to how w > 0 or w < 0 affects the calculation. For part

iii., since X ≤ Y + w and w < 0, the limits of Y cannot start from 0, otherwise X will be

negative, which is not allowed. To make sure X ≥ 0, Y has to start from −w. This is

exactly the difference between parts ii. and iii.

For part iv., many candidates could not work out the answer, which is disappointing since you do not even need to know how to calculate the answers to parts ii. and iii. As long as you realise that parts ii. and iii. give the distribution function of W, you only need to differentiate those given answers with respect to w, and the result follows.

Approaching the question

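i. The moment generating function of X is:

\[ M_X(s) = E(e^{sX}) = \int_0^\infty \lambda e^{-(\lambda - s)x}\, dx = \frac{\lambda}{\lambda - s} \int_0^\infty (\lambda - s) e^{-(\lambda - s)x}\, dx = \frac{\lambda}{\lambda - s}, \quad \text{for } s < \lambda. \]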

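ii. For w ≥ 0, X − Y ≤ w implies that 0 < X ≤ Y + w, where Y > 0. Since X and Y are independent, $f_{X,Y}(x, y) = \lambda e^{-\lambda x} \lambda e^{-\lambda y}$ for x, y > 0. Hence:

\[
\begin{aligned}
P(X - Y \le w) &= \int_0^\infty \int_0^{y+w} f_{X,Y}(x, y)\, dx\, dy = \int_0^\infty \lambda e^{-\lambda y} \left[ -e^{-\lambda x} \right]_0^{y+w} dy \\
&= \int_0^\infty \lambda e^{-\lambda y} \left( 1 - e^{-\lambda (y + w)} \right) dy \\
&= 1 - \frac{1}{2} e^{-\lambda w}.
\end{aligned}
\]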


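iii. For w < 0, X − Y ≤ w implies that 0 < X ≤ Y + w where Y > −w. Hence:

\[
\begin{aligned}
P(X - Y \le w) &= \int_{-w}^\infty \int_0^{y+w} f_{X,Y}(x, y)\, dx\, dy = \int_{-w}^\infty \lambda e^{-\lambda y} \left( 1 - e^{-\lambda (y + w)} \right) dy \\
&= e^{\lambda w} - e^{-\lambda w} \times \frac{1}{2} e^{2\lambda w} \\
&= \frac{1}{2} e^{\lambda w}.
\end{aligned}
\]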

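iv. Differentiating the answers in ii. and iii. with respect to w, we have:

\[ f_W(w) = \begin{cases} \lambda e^{\lambda w}/2 & \text{for } w < 0 \\ \lambda e^{-\lambda w}/2 & \text{for } w \ge 0 \end{cases} \;=\; \frac{1}{2} \lambda e^{-\lambda |w|}, \quad \text{for } w \in \mathbb{R}. \]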
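The distribution function obtained in parts ii. and iii. can be sanity-checked by simulation. The following is a rough sketch, not part of the official solution: it draws independent exponential pairs with Python's `random.expovariate` (whose argument is the rate) and compares the empirical proportion of X − Y ≤ w with the formula. The rate 1.5, the seed and the sample size are arbitrary illustrative choices.

```python
import math
import random

def laplace_cdf(w, rate):
    """P(X - Y <= w) from parts ii. and iii."""
    if w < 0:
        return 0.5 * math.exp(rate * w)
    return 1 - 0.5 * math.exp(-rate * w)

def empirical_cdf(w, rate, n_samples, rng):
    """Fraction of simulated differences X - Y that are <= w."""
    hits = 0
    for _ in range(n_samples):
        x = rng.expovariate(rate)
        y = rng.expovariate(rate)
        if x - y <= w:
            hits += 1
    return hits / n_samples

rng = random.Random(2018)   # fixed seed so the run is reproducible
rate = 1.5                  # illustrative value of lambda
for w in (-1.0, -0.2, 0.0, 0.5, 2.0):
    print(w, laplace_cdf(w, rate), empirical_cdf(w, rate, 100_000, rng))
```

With 100,000 draws the two columns should typically agree to about two decimal places; a larger discrepancy would suggest wrong limits of integration.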

Section B

Answer all three questions in this section (60 marks in total).

Question 2

The conditional density of a random variable X given Y = y is given by:

\[ f_{X|Y}(x \mid y) = \begin{cases} x/(2y^2) & \text{for } 0 < x < 2y < 2 \\ 0 & \text{otherwise.} \end{cases} \]

The conditional density of Y given X = x is given by:

\[ f_{Y|X}(y \mid x) = \begin{cases} 24y^2/(8 - x^3) & \text{for } 0 < x < 2y < 2 \\ 0 & \text{otherwise.} \end{cases} \]

(a) Find the ratio fY (y)/fX(x), where fX(x) and fY (y) are the marginal densities

of X and Y , respectively.

(2 marks)

(b) By integrating out y first in the answer in (a), show that:

\[ f_X(x) = \begin{cases} 5x(8 - x^3)/48 & \text{for } 0 < x < 2 \\ 0 & \text{otherwise.} \end{cases} \]

Is X independent of Y ? Justify your answer.

(9 marks)

(c) Let U = XY and V = X/Y . Derive the joint density for U, V , and carefully

state the region for (U, V ) where this joint density is non-zero.

(9 marks)

Reading for this question

This question was not well-answered in general, which is a little unexpected. You should look at

the marks allocated to each part to determine approximately how long the answers should be.

Part (a) is only worth two marks, so you should not expect the answer to have long derivations.

See Section 5.2 of the subject guide for the definition of continuous conditional distributions.

For part (b), many candidates knew to follow the hint and integrate out y, but the limits should be the limits for the marginal density of Y, i.e. the limits should not involve x. In fact, you only know $\int f_Y(y)\, dy = 1$, and the lower and upper limits are for the marginal density of Y, which in this case should be 0 and 1, respectively. Many candidates went on to check whether $f_{X,Y}(x, y) = f_X(x) f_Y(y)$, which is correct but is not a quick way to see if X and Y are independent since you still need to calculate $f_{X,Y}(x, y)$ and $f_Y(y)$, neither of which is given to you. In the process, some candidates unfortunately got the wrong answers. To determine independence between X and Y more quickly, you should check whether $f_{X|Y}(x \mid y) = f_X(x)$ or not. This is equivalent to the criterion $f_{X,Y}(x, y) = f_X(x) f_Y(y)$, of course, but $f_{X|Y}(x \mid y)$ and $f_X(x)$ are both given to you! So you do not even need to calculate anything to know that X and Y are not independent! See Section 4.4 of the subject guide for more details on independence of a pair of random variables.

Part (c) was not done well because of inaccurate calculations mostly, especially the calculations

of the Jacobian. See Section 4.6 of the subject guide for more details.

Approaching the question

(a) We have:

\[ \frac{f_{Y|X}(y \mid x)}{f_{X|Y}(x \mid y)} = \frac{f_{X,Y}(x, y)/f_X(x)}{f_{X,Y}(x, y)/f_Y(y)} = \frac{f_Y(y)}{f_X(x)} \]

so that:

\[ \frac{f_Y(y)}{f_X(x)} = \frac{48y^4}{8x - x^4}. \]

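(b) Since 0 < x < 2y < 2, integrating out the effect of y (over 0 < y < 1, where $f_Y$ integrates to 1) means that 0 < x < 2. Hence:

\[ \frac{1}{f_X(x)} = \int_0^1 \frac{f_Y(y)}{f_X(x)}\, dy = \frac{1}{8x - x^4} \int_0^1 48y^4\, dy = \frac{1}{x(8 - x^3)} \cdot \frac{48}{5} \]

so that:

\[ f_X(x) = \frac{5x(8 - x^3)}{48}, \quad \text{for } 0 < x < 2. \]

Since $f_X(x) \ne f_{X|Y}(x \mid y)$, X is not independent of Y.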

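(c) We have:

\[ X = \sqrt{UV} \quad \text{and} \quad Y = \sqrt{\frac{U}{V}} \]

so that 0 < X < 2Y < 2 implies:

\[ 0 < \sqrt{UV} < 2\sqrt{\frac{U}{V}} < 2 \]

meaning:

\[ 0 < U < V \quad \text{and} \quad V < 2. \]

Hence:

\[
\begin{aligned}
f_{U,V}(u, v) &= f_{X,Y}\left( \sqrt{uv}, \sqrt{u/v} \right) \left| \det \begin{pmatrix} \dfrac{\sqrt{v}}{2\sqrt{u}} & \dfrac{\sqrt{u}}{2\sqrt{v}} \\[2mm] \dfrac{1}{2\sqrt{uv}} & -\dfrac{\sqrt{u}}{2v^{3/2}} \end{pmatrix} \right| \\
&= \frac{24u/v}{8 - (uv)^{3/2}} \times \frac{5}{48} \left( 8\sqrt{uv} - u^2 v^2 \right) \times \frac{1}{2v} \\
&= \frac{5u^{3/2}}{4v^{3/2}}
\end{aligned}
\]

for 0 < u < v < 2.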
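One inexpensive check on a change of variables is that both joint densities integrate to 1 over their stated regions. The sketch below uses a simple nested midpoint rule. It assumes the simplification $f_{X,Y}(x, y) = f_{Y|X}(y \mid x) f_X(x) = 5xy^2/2$, which follows from the expressions above, and the helper `double_integral` is an illustrative name rather than a library function.

```python
def double_integral(f, outer_lo, outer_hi, inner_lo, inner_hi, n=400):
    """Midpoint-rule approximation of the iterated integral of f(inner, outer).

    inner_lo and inner_hi are functions of the outer variable, so the
    region of integration need not be a rectangle.
    """
    total = 0.0
    outer_step = (outer_hi - outer_lo) / n
    for i in range(n):
        outer = outer_lo + (i + 0.5) * outer_step
        lo, hi = inner_lo(outer), inner_hi(outer)
        inner_step = (hi - lo) / n
        inner_sum = sum(f(lo + (j + 0.5) * inner_step, outer) for j in range(n))
        total += inner_sum * inner_step * outer_step
    return total

def joint_xy(x, y):
    """f_{X,Y}(x, y) = 5 x y^2 / 2 on 0 < x < 2y < 2 (assumed simplification)."""
    return 5 * x * y ** 2 / 2

def joint_uv(u, v):
    """f_{U,V}(u, v) = 5 u^{3/2} / (4 v^{3/2}) on 0 < u < v < 2."""
    return 5 * u ** 1.5 / (4 * v ** 1.5)

# Integrate x over (0, 2y) for y in (0, 1), then u over (0, v) for v in (0, 2).
mass_xy = double_integral(joint_xy, 0, 1, lambda y: 0, lambda y: 2 * y)
mass_uv = double_integral(joint_uv, 0, 2, lambda v: 0, lambda v: v)
print(mass_xy, mass_uv)
```

Both printed values should be very close to 1. A total mass noticeably different from 1 usually points to a wrong Jacobian or a wrong region for (U, V).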

Question 3

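If X is Gamma distributed with parameters α and β, i.e. X ∼ Gamma(α, β), then it has density:

\[ f_X(x) = \frac{\beta^\alpha}{\Gamma(\alpha)} x^{\alpha - 1} e^{-\beta x}, \quad \text{for } x > 0 \]

and $\Gamma(\alpha) = \int_0^\infty y^{\alpha - 1} e^{-y}\, dy$ for α > 0.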


(a) Suppose $X \sim \mathrm{Gamma}(\alpha_1, \beta_1)$, $Y \sim \mathrm{Gamma}(\alpha_2, \beta_2)$, and X is independent of Y. Derive the distribution of $\beta_1 X + \beta_2 Y$. You may use the moment generating function of a Gamma random variable without proof, as long as you state it clearly.

(7 marks)

(b) Let $X_i \sim \mathrm{Gamma}(\alpha, \beta_i)$, i = 1, ..., N, be independent of each other and $\alpha, \beta_i > 0$. Each $X_i$ is also independent of N, which is Poisson distributed with mean µ, so that the probability mass function for N is given by:

\[ p_N(n) = \frac{\mu^n e^{-\mu}}{n!}, \quad \text{for } n = 0, 1, \ldots. \]

Consider the random variable:

\[ W = \sum_{i=1}^{N} \beta_i X_i \]

with the convention that W = 0 if N = 0.

i. Derive the moment generating function of W .

(8 marks)

ii. Find the mean of W . You can use the mean of a Poisson random variable

without proof. The mean of X ∼ Gamma(α, β) is α/β.

(5 marks)

Reading for this question

This question was not answered as well as it should have been. Parts (a) and (b) i. are both standard exercises. For part (a), see Proposition 4.7.3 for finding the moment generating function of a sum of two independent random variables. Identifying the form of the answer is then the key to knowing that the sum is still a Gamma distribution. Many candidates did not realise that, for $X \sim \mathrm{Gamma}(\alpha, \beta_1)$:

\[ M_{\beta_1 X}(t) = E(e^{t \beta_1 X}) = M_X(\beta_1 t) = (1 - t)^{-\alpha} \]

and so could not eliminate $\beta_1$ from the moment generating function to reach the correct answer. We also see that $\beta_1 X \sim \mathrm{Gamma}(\alpha, 1)$.

For part (b) i., the derivation of the moment generating function of a random sum is covered in Section 5.6 of the subject guide. See Lemma 5.6.2 iii. and Proposition 5.6.3 iii. on page 165 of the subject guide. Many candidates realised this, which is good, and scored decent marks even though they could not work out the final answer.

For part (b) ii., you can use Proposition 5.6.3 i., since the $\beta_i X_i$ are all i.i.d. Gamma(α, 1) from the calculations in part (a) and so have a common mean α, or differentiate the moment generating function you obtained in part (b) i. Of course, the former will give you the answer much more quickly!

Approaching the question

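(a) First, let $W = \beta_1 X + \beta_2 Y$. Since $M_X(t) = (\beta/(\beta - t))^\alpha$ for X ∼ Gamma(α, β), then:

\[ M_W(t) = E(e^{tW}) = E(e^{t \beta_1 X})\, E(e^{t \beta_2 Y}) = \left( \frac{\beta_1}{\beta_1 - \beta_1 t} \right)^{\alpha_1} \left( \frac{\beta_2}{\beta_2 - \beta_2 t} \right)^{\alpha_2} = \left( \frac{1}{1 - t} \right)^{\alpha_1 + \alpha_2} \]

for t < 1. This shows that, by the one-to-one correspondence between distributions and moment generating functions, $\beta_1 X + \beta_2 Y$ has a $\mathrm{Gamma}(\alpha_1 + \alpha_2, 1)$ distribution.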


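(b) i. We have:

\[ M_W(t) = E(e^{tW}) = E\left( E\left( \exp\left( t \sum_{i=1}^{N} \beta_i X_i \right) \,\Big|\, N \right) \right). \]

However, given N, we have:

\[ E\left( \exp\left( t \sum_{i=1}^{N} \beta_i X_i \right) \right) = E\left( \prod_{i=1}^{N} \exp(t \beta_i X_i) \right) = \prod_{i=1}^{N} E\left( e^{t \beta_i X_i} \right) = \left( \frac{1}{1 - t} \right)^{\alpha N}, \quad \text{for } t < 1. \]

At the same time:

\[ M_N(s) = E(e^{sN}) = \sum_{n=0}^{\infty} \frac{(\mu e^s)^n e^{-\mu}}{n!} = e^{\mu(e^s - 1)}, \quad \text{for } s \in \mathbb{R}. \]

Hence, evaluating $M_N(s)$ at $s = \log\left( (1 - t)^{-\alpha} \right)$:

\[ M_W(t) = E\left( \exp\left( N \log \left( \frac{1}{1 - t} \right)^{\alpha} \right) \right) = \exp\left( \mu \left( \frac{1}{1 - t} \right)^{\alpha} - \mu \right), \quad \text{for } t < 1. \]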

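ii. Since $\beta_i E(X_i) = \beta_i \times \alpha/\beta_i = \alpha$ for every i, we have:

\[ E(W) = E(E(W \mid N)) = E\left( \sum_{i=1}^{N} \beta_i E(X_i) \right) = E(N\alpha) = \mu\alpha. \]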
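Both routes to E(W) mentioned in the commentary can be checked numerically. The sketch below differentiates the moment generating function from part (b) i. at t = 0 and, separately, simulates the random sum. Several choices are assumptions made only for illustration: µ = 3, α = 2, the rates β_i = i, and the Poisson sampler built from exponential inter-arrival times. Note that `random.gammavariate` takes a scale parameter, so a rate β is passed as 1/β.

```python
import math
import random

def mgf_w(t, mu, alpha):
    """M_W(t) = exp(mu * (1 - t)^(-alpha) - mu), valid for t < 1."""
    return math.exp(mu * (1 - t) ** (-alpha) - mu)

def mean_from_mgf(mu, alpha, h=1e-5):
    """E(W) = M_W'(0), approximated by a central difference."""
    return (mgf_w(h, mu, alpha) - mgf_w(-h, mu, alpha)) / (2 * h)

def sample_poisson(mu, rng):
    """Draw N ~ Poisson(mu) by counting rate-mu arrivals in unit time."""
    count, elapsed = 0, rng.expovariate(mu)
    while elapsed <= 1.0:
        count += 1
        elapsed += rng.expovariate(mu)
    return count

def sample_w(mu, alpha, rng):
    """Draw W = sum_{i=1}^{N} beta_i X_i, with W = 0 when N = 0."""
    total = 0.0
    for i in range(1, sample_poisson(mu, rng) + 1):
        beta_i = float(i)  # hypothetical rates beta_i = i; any positive values work
        # gammavariate's second argument is a scale, so pass 1 / rate.
        total += beta_i * rng.gammavariate(alpha, 1 / beta_i)
    return total

mu, alpha = 3.0, 2.0  # illustrative values
rng = random.Random(0)
draws = [sample_w(mu, alpha, rng) for _ in range(50_000)]
print(mu * alpha, mean_from_mgf(mu, alpha), sum(draws) / len(draws))
```

The three printed numbers should be close to one another. Changing the hypothetical rates β_i should not move the simulated mean, which reflects the fact that each β_i X_i is Gamma(α, 1) whatever the value of β_i.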

Question 4

Suppose we have a biased coin, which comes up heads with probability u. An

experiment is carried out so that X is the number of independent flips of the coin

required for r heads to show up, where r ≥ 1 is known.

(a) Show that the probability mass function for X is:

\[ p_X(x) = \begin{cases} \dbinom{x - 1}{r - 1} u^r (1 - u)^{x - r} & \text{for } x = r, r + 1, \ldots \\ 0 & \text{otherwise.} \end{cases} \]

(5 marks)

(b) Suppose U is random and has a density given by:

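\[ f_U(u) = \begin{cases} \dfrac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\Gamma(\beta)} u^{\alpha - 1} (1 - u)^{\beta - 1} & \text{for } 0 < u < 1 \\ 0 & \text{otherwise} \end{cases} \]

where α, β > 0, and Γ(α) is defined in Question 3, which has the property that Γ(α) = (α − 1)Γ(α − 1) for α ≥ 1, and Γ(k) = (k − 1)! for a positive integer k. The distribution in part (a) thus becomes:

\[ p_{X|U}(x \mid u) = \begin{cases} \dbinom{x - 1}{r - 1} u^r (1 - u)^{x - r} & \text{for } x = r, r + 1, \ldots \\ 0 & \text{otherwise.} \end{cases} \]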

i. Find the marginal probability mass function of X if α = β = 2.

(6 marks)

ii. With α = β = 2 still, show that the density of U |X = x is given by:

\[ f_{U|X}(u \mid x) = \begin{cases} \dfrac{(x + 3)!}{(r + 1)!\, (x - r + 1)!} u^{r+1} (1 - u)^{x - r + 1} & \text{for } 0 < u < 1 \\ 0 & \text{otherwise.} \end{cases} \]

Hence find the mean of U |X = x.

(5 marks)


(c) Another independent experiment is carried out, with Y denoting the number of

independent flips of the coin required for r heads to show up (the same r as for

the first experiment).

State (no need for a derivation) the density of U | (X,Y ) = (x, y) and its mean,

where U still has the density in part (b) with α = β = 2.

(4 marks)

Reading for this question

This question was not well-answered in general. Part (a) needs you to explain why the probability

mass function is as stated. Many candidates stated that this is a negative binomial distribution

and hence the density is as given, which is not a proof or an explanation at all. See Example

3.3.10 on page 56 of the subject guide for the justification of its probability mass function.

Part (b) i. needs you to work out the joint density function of X and U, then integrate out U to obtain the marginal probability mass function of X. Some candidates were not careful in their calculations even though they realised that they had to integrate out u in $p_{X|U}(x \mid u) f_U(u)$. Even if you were not able to do (b) i., you should be able to do (b) ii. using the density stated in the question. To find the mean, you need to find a general formula for $\int_0^1 u^{\alpha - 1}(1 - u)^{\beta - 1}\, du$ by realising that $\int_0^1 f_U(u)\, du = 1$ in the probability density function given in part (b).

Part (c) was done worst since it is supposed to be difficult. You need to realise X and Y are

independent experiments which can be seen as one, so that x is replaced by x+ y and r is

replaced by 2r in the answer in (b) ii.

Approaching the question

(a) Suppose x flips are required for r heads to show up. Then the last flip must be a head, with the other r − 1 heads appearing anywhere in the first x − 1 flips. Each particular sequence of heads and tails contains r heads by definition of the experiment, as well as x − r tails (adding up to x flips in total), and so has probability:

\[ u^r (1 - u)^{x - r}. \]

There are $\binom{x-1}{r-1}$ such sequences, one for each choice of the positions of the first r − 1 heads. Hence we have:

\[ p_X(x) = \binom{x - 1}{r - 1} u^r (1 - u)^{x - r}, \quad \text{for } x = r, r + 1, \ldots. \]

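(b) i. The joint probability density for X, U is:

\[ f_{X,U}(x, u) = \binom{x - 1}{r - 1} \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\, \Gamma(\beta)} u^{r + \alpha - 1} (1 - u)^{x - r + \beta - 1}, \quad \text{for } 0 < u < 1,\ x = r, r + 1, \ldots. \]

Therefore, the marginal probability mass function of X is:

\[
\begin{aligned}
p_X(x) &= \int_0^1 \binom{x - 1}{r - 1} \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\, \Gamma(\beta)} u^{r + \alpha - 1} (1 - u)^{x - r + \beta - 1}\, du \\
&= \binom{x - 1}{r - 1} \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\, \Gamma(\beta)} \int_0^1 u^{r + \alpha - 1} (1 - u)^{x - r + \beta - 1}\, du \\
&= \binom{x - 1}{r - 1} \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\, \Gamma(\beta)} \cdot \frac{\Gamma(r + \alpha)\, \Gamma(x - r + \beta)}{\Gamma(x + \alpha + \beta)} \\
&= \frac{6r(r + 1)(x - r + 1)}{x(x + 1)(x + 2)(x + 3)}, \quad \text{for } x = r, r + 1, \ldots
\end{aligned}
\]

where the last line uses α = β = 2.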


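ii. We have:

\[
\begin{aligned}
f_{U|X}(u \mid x) = \frac{f_{X,U}(x, u)}{p_X(x)} &= \frac{u^{r + \alpha - 1}(1 - u)^{x - r + \beta - 1}}{\int_0^1 u^{r + \alpha - 1}(1 - u)^{x - r + \beta - 1}\, du} \\
&= \frac{\Gamma(x + 4)}{\Gamma(r + 2)\, \Gamma(x - r + 2)} u^{r + 1} (1 - u)^{x - r + 1}, \quad \text{for } 0 < u < 1.
\end{aligned}
\]

The mean is:

\[
\begin{aligned}
E(U \mid X = x) &= \frac{(x + 3)!}{(r + 1)!\, (x - r + 1)!} \int_0^1 u^{r + 2}(1 - u)^{x - r + 1}\, du \\
&= \frac{(x + 3)!}{(r + 1)!\, (x - r + 1)!} \times \frac{(r + 2)!\, (x - r + 1)!}{(x + 4)!} \\
&= \frac{r + 2}{x + 4}.
\end{aligned}
\]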
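Because every integral in part (b) is a Beta-type integral with integer exponents, the results can be verified exactly with rational arithmetic. This is a minimal sketch assuming α = β = 2 (so that $f_U(u) = 6u(1 - u)$); the function names and the value r = 3 are illustrative only.

```python
from fractions import Fraction
from math import comb, factorial

def beta_integral(p, q):
    """Exact integral of u^p (1 - u)^q over (0, 1) for integers p, q >= 0."""
    return Fraction(factorial(p) * factorial(q), factorial(p + q + 1))

def marginal_pmf(x, r):
    """p_X(x) for alpha = beta = 2: integrate p_{X|U}(x|u) f_U(u) over u."""
    # f_U(u) = 6 u (1 - u) when alpha = beta = 2, since Gamma(4) / Gamma(2)^2 = 6.
    return comb(x - 1, r - 1) * 6 * beta_integral(r + 1, x - r + 1)

def closed_form_pmf(x, r):
    """The simplified expression obtained in part (b) i."""
    return Fraction(6 * r * (r + 1) * (x - r + 1), x * (x + 1) * (x + 2) * (x + 3))

def posterior_mean(x, r):
    """E(U | X = x) as a ratio of two Beta-type integrals."""
    return beta_integral(r + 2, x - r + 1) / beta_integral(r + 1, x - r + 1)

r = 3  # illustrative number of heads
for x in range(r, r + 5):
    print(x, marginal_pmf(x, r), closed_form_pmf(x, r), posterior_mean(x, r))
```

For each x, the second and third columns should be identical fractions, and the last column should equal (r + 2)/(x + 4).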

(c) Mathematically, we have:

\[ p_{X,Y|U}(x, y \mid u) = p_{X|U}(x \mid u)\, p_{Y|U}(y \mid u) \]

so that:

\[
\begin{aligned}
f_{U|X,Y}(u \mid x, y) &= \frac{p_{X|U}(x \mid u)\, p_{Y|U}(y \mid u)\, f_U(u)}{\int_0^1 p_{X|U}(x \mid u)\, p_{Y|U}(y \mid u)\, f_U(u)\, du} \\
&= \frac{u^{2r + 1}(1 - u)^{x + y - 2r + 1}}{\int_0^1 u^{2r + 1}(1 - u)^{x + y - 2r + 1}\, du} \\
&= \frac{(x + y + 3)!}{(2r + 1)!\, (x + y - 2r + 1)!} u^{2r + 1} (1 - u)^{x + y - 2r + 1}, \quad \text{for } 0 < u < 1
\end{aligned}
\]

which parallels the density in part (b) ii. The mean is:

\[ \frac{2r + 2}{x + y + 4} \]

which likewise parallels the mean in (b) ii.

To see these two answers more quickly, note that X and Y can be seen as one experiment, waiting for 2r heads to show up. So we need x + y flips for 2r heads to come up, and hence we can replace x by x + y and r by 2r directly in the answers to (b) ii.
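The stated mean for part (c) can also be confirmed without any closed-form integrals, by integrating the unnormalised posterior numerically. The sketch below is only a check under the assumption α = β = 2; the function name, the grid size `n` and the values of r, x and y are illustrative.

```python
def posterior_mean_two_experiments(x, y, r, n=20_000):
    """Numerically integrate the unnormalised posterior of U given X = x, Y = y."""
    def unnormalised(u):
        # p_{X|U} p_{Y|U} f_U, dropping every factor that does not involve u.
        return u ** (2 * r + 1) * (1 - u) ** (x + y - 2 * r + 1)

    step = 1.0 / n
    grid = [(k + 0.5) * step for k in range(n)]  # midpoint rule on (0, 1)
    normaliser = sum(unnormalised(u) for u in grid)
    # The common factor `step` cancels in the ratio of the two sums.
    return sum(u * unnormalised(u) for u in grid) / normaliser

r, x, y = 2, 5, 7  # illustrative values with x, y >= r
print(posterior_mean_two_experiments(x, y, r), (2 * r + 2) / (x + y + 4))
```

The two printed values should agree to many decimal places, since the integrand is a polynomial and the midpoint rule is very accurate here.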
