UL18/0326 Page 1 of 6 D0
This paper is not to be removed from the Examination Hall
UNIVERSITY OF LONDON ST3133 ZA
BSc degrees and Diplomas for Graduates in Economics, Management, Finance
and the Social Sciences, the Diplomas in Economics and Social Sciences
Monday, 14 May 2018: 10:00 to 12:00
Candidates should answer all FOUR questions: QUESTION 1 of Section A (40 marks)
and all THREE questions from Section B (60 marks in total). Candidates are strongly
advised to divide their time accordingly.
A calculator may be used when answering questions on this paper and it must comply
in all respects with the specification given with your Admission Notice. The make and
type of machine must be clearly stated on the front cover of the answer book.
Section A
Answer all three parts of question 1 (40 marks in total)
1. (a) Let g(x) be a function taking on integer values of x, with
g(x) =
  a,  x = −2, −1;
  3a, x = 0, 1;
  4a, x = 2, 3;
  0,  otherwise.
i. Find a so that g(x) is a probability mass function. [3 marks]
ii. Let X be a discrete random variable with probability mass function g(x).
Find E(X) and Var(X). [5 marks]
iii. Write down the probability mass function for Y = X^2 − 2|X| + 1. [4 marks]
(b) The cumulative distribution function FX(x) for the continuous random variable
X is defined by
FX(x) =
  0,                    x < 0;
  ax^3/3,               0 ≤ x < 1;
  (a(x − 2)^3 + 2a)/3,  1 ≤ x < 2;
  1,                    x ≥ 2.
i. Find the value of a. [2 marks]
ii. Derive the probability density function of X. [3 marks]
iii. Let W = X2. Derive the cumulative distribution function of W . Hence,
derive the probability density function of W . [7 marks]
(c) Let X follow an exponential distribution with rate λ, i.e., X has a density
function
fX(x) =
  λe^(−λx), x > 0;
  0,        otherwise.
i. Derive the moment generating function of X. [3 marks]
ii. Let Y be an independent and identically distributed copy of X. For w > 0,
show that
P(X − Y ≤ w) = 1 − e^(−λw)/2.
(Hint: find the joint density of X and Y first. Determine the valid region
in the double integral involved.) [5 marks]
iii. For w ≤ 0, show that
P(X − Y ≤ w) = e^(λw)/2.
[5 marks]
iv. Using parts ii and iii of question (c), show that the density function of
W = X − Y is given by
fW(w) = λe^(−λ|w|)/2, w ∈ R.
[3 marks]
Section B
Answer all three questions in this section (60 marks in total)
2. The conditional density of a random variable X given Y = y is given by
fX|Y(x|y) =
  3x^2/y^3, 0 < x < y < 3;
  0,        otherwise.
The conditional density of Y given X = x is given by
fY|X(y|x) =
  3y^2/(27 − x^3), 0 < x < y < 3;
  0,               otherwise.
(a) Find the ratio fX(x)/fY (y), where fX(x) and fY (y) are the marginal den-
sities of X and Y , respectively. [2 marks]
(b) By integrating out x first in the answer in (a), show that
fY(y) =
  2y^5/243, 0 < y < 3;
  0,        otherwise.
[9 marks]
(c) Let U = XY and V = X/Y . Derive the joint density for U, V , and carefully
state the region for (U, V ) where this joint density is non-zero. [9 marks]
3. If X is Gamma distributed with parameters α and β, i.e., X ∼ Gamma(α, β),
then it has density
fX(x) = (β^α/Γ(α)) x^(α−1) e^(−βx), x > 0,
and Γ(α) = ∫_0^∞ y^(α−1) e^(−y) dy for α > 0.
(a) If X ∼ Gamma(α1, β), Y ∼ Gamma(α2, β), and X is independent of Y ,
derive the distribution of X + Y . You may use the moment generating
function of a Gamma random variable without proof, as long as you state
it clearly. [7 marks]
(b) Let Xi ∼ Gamma(α, β), i = 1, . . . , N , be independent of each other and
α, β > 0. Each Xi is also independent of N , which is Poisson distributed
with mean µ, so that the probability mass function for N is given by
pN(n) = μ^n e^(−μ)/n!, n = 0, 1, . . . .
Consider the random variable
W = Σ_{i=1}^N Xi,
with the convention that W = 0 if N = 0.
i. Derive the moment generating function of W . [8 marks]
ii. Find the mean ofW . You can use the means of a Poisson and a Gamma
random variable without proof. If you use any standard results about
random sums, you must first state them clearly. [5 marks]
4. Suppose we have a biased coin, which comes up heads with probability u. An
experiment is carried out so that X is the number of independent flips of the
coin required for r heads to show up, where r ≥ 1 is known.
(a) Show that the probability mass function of X is
pX(x) =
  (x−1 choose r−1) u^r (1 − u)^(x−r), x = r, r + 1, . . .;
  0, otherwise.
[5 marks]
(b) Suppose U is uniformly distributed on (0, 1), and the distribution in part
(a) becomes
pX|U(x|u) =
  (x−1 choose r−1) u^r (1 − u)^(x−r), x = r, r + 1, . . .;
  0, otherwise.
i. Find the marginal probability mass function for X. You can use
∫_0^1 y^a (1 − y)^b dy = a! b!/(a + b + 1)!
for non-negative integers a and b without proof. [6 marks]
ii. Show that the density of U |X = x is given by
fU|X(u|x) = ((x + 1)!/(r!(x − r)!)) u^r (1 − u)^(x−r), 0 < u < 1.
Hence find the mean of U |X = x. [5 marks]
(c) Another independent experiment is carried out, with Y denoting the num-
ber of independent flips of the coin required for r heads to show up (the
same r as for the first experiment).
State (no need for a derivation) the density of U |(X,Y ) = (x, y) and its
mean, where U is still uniformly distributed on (0, 1) as in part (b).
[4 marks]
END OF PAPER
UL18/0327 Page 1 of 7 D0
This paper is not to be removed from the Examination Hall
UNIVERSITY OF LONDON ST3133 ZB
BSc degrees and Diplomas for Graduates in Economics, Management, Finance
and the Social Sciences, the Diplomas in Economics and Social Sciences
Monday, 14 May 2018: 10:00 to 12:00
Candidates should answer all FOUR questions: QUESTION 1 of Section A (40 marks)
and all THREE questions from Section B (60 marks in total). Candidates are strongly
advised to divide their time accordingly.
A calculator may be used when answering questions on this paper and it must comply
in all respects with the specification given with your Admission Notice. The make and
type of machine must be clearly stated on the front cover of the answer book.
Section A
Answer all three parts of question 1 (40 marks in total)
1. (a) Let g(x) be a function taking on integer values of x, with
g(x) =
  2a, x = −3, −1;
  a,  x = 0, 2;
  3a, x = 1, 3;
  0,  otherwise.
i. Find a so that g(x) is a probability mass function. [3 marks]
ii. Let X be a discrete random variable with probability mass function g(x).
Find E(X) and Var(X). [5 marks]
iii. Write down the probability mass function of Y = X^2 − 4|X| + 4. [4 marks]
(b) The cumulative distribution function FX(·) for the continuous random variable
X is defined by
FX(x) =
  0,                  x < 0;
  ax^2/4,             0 ≤ x < 1;
  ((x − 1)^3 + a)/4,  1 ≤ x < 2;
  1,                  x ≥ 2.
i. Find the value of a. [1 mark]
ii. Derive the probability density function of X. [4 marks]
iii. Let W = X2. Derive the cumulative distribution function of W . Hence,
derive the probability density function of W . [7 marks]
(c) Let X follow an exponential distribution with rate λ, i.e., X has a density
function
fX(x) =
  λe^(−λx), x > 0;
  0,        otherwise.
i. Derive the moment generating function of X. [3 marks]
ii. Let Y be an independent and identically distributed copy of X. For w > 0,
show that
P(X − Y ≤ w) = 1 − e^(−λw)/2.
(Hint: find the joint density of X and Y first. Determine the valid region
in the double integral involved.) [5 marks]
iii. For w ≤ 0, show that
P(X − Y ≤ w) = e^(λw)/2.
[5 marks]
iv. Using parts ii and iii of question (c), show that the density function of
W = X − Y is given by
fW(w) = λe^(−λ|w|)/2, w ∈ R.
[3 marks]
Section B
Answer all three questions in this section (60 marks in total)
2. The conditional density of a random variable X given Y = y is given by
fX|Y(x|y) =
  x/(2y^2), 0 < x < 2y < 2;
  0,        otherwise.
The conditional density of Y given X = x is given by
fY|X(y|x) =
  24y^2/(8 − x^3), 0 < x < 2y < 2;
  0,               otherwise.
(a) Find the ratio fY (y)/fX(x), where fX(x) and fY (y) are the marginal den-
sities of X and Y , respectively. [2 marks]
(b) By integrating out y first in the answer in (a), show that
fX(x) =
  5x(8 − x^3)/48, 0 < x < 2;
  0,              otherwise.
[9 marks]
(c) Let U = XY and V = X/Y . Derive the joint density for U, V , and carefully
state the region for (U, V ) where this joint density is non-zero. [9 marks]
3. If X is Gamma distributed with parameters α and β, i.e., X ∼ Gamma(α, β),
then it has density
fX(x) = (β^α/Γ(α)) x^(α−1) e^(−βx), x > 0,
and Γ(α) = ∫_0^∞ y^(α−1) e^(−y) dy for α > 0.
(a) Suppose X ∼ Gamma(α1, β1), Y ∼ Gamma(α2, β2), and X is independent
of Y . Derive the distribution of β1X + β2Y . You may use the moment
generating function of a Gamma random variable without proof, as long as
you state it clearly. [7 marks]
(b) Let Xi ∼ Gamma(α, βi), i = 1, . . . , N , be independent of each other and
α, βi > 0. Each Xi is also independent of N , which is Poisson distributed
with mean µ, so that the probability mass function for N is given by
pN(n) = μ^n e^(−μ)/n!, n = 0, 1, . . . .
Consider the random variable
W = Σ_{i=1}^N βi Xi,
with the convention that W = 0 if N = 0.
i. Derive the moment generating function of W . [8 marks]
ii. Find the mean of W . You can use the mean of a Poisson random vari-
able without proof. The mean of X ∼ Gamma(α, β) is α/β. [5 marks]
4. Suppose we have a biased coin, which comes up heads with probability u. An
experiment is carried out so that X is the number of independent flips of the
coin required for r heads to show up, where r ≥ 1 is known.
(a) Show that the probability mass function for X is
pX(x) =
  (x−1 choose r−1) u^r (1 − u)^(x−r), x = r, r + 1, . . .;
  0, otherwise.
[5 marks]
(b) Suppose U is random and has a density given by
fU(u) =
  (Γ(α + β)/(Γ(α)Γ(β))) u^(α−1) (1 − u)^(β−1), 0 < u < 1;
  0, otherwise,
where α, β > 0, and Γ(α) is defined in question 3; Γ has the property
that Γ(α) = (α − 1)Γ(α − 1) for α > 1, and Γ(k) = (k − 1)! for a positive
integer k. The distribution in part (a) thus becomes
pX|U(x|u) =
  (x−1 choose r−1) u^r (1 − u)^(x−r), x = r, r + 1, . . .;
  0, otherwise.
i. Find the marginal probability mass function of X if α = β = 2. [6 marks]
ii. With α = β = 2 still, show that the density of U |X = x is given by
fU|X(u|x) =
  ((x + 3)!/((r + 1)!(x − r + 1)!)) u^(r+1) (1 − u)^(x−r+1), 0 < u < 1;
  0, otherwise.
Hence find the mean of U |X = x. [5 marks]
(c) Another independent experiment is carried out, with Y denoting the num-
ber of independent flips of the coin required for r heads to show up (the
same r as for the first experiment).
State (no need for a derivation) the density of U |(X,Y ) = (x, y) and its
mean, where U still has the density in part (b) with α = β = 2. [4 marks]
END OF PAPER
Examiners’ commentaries 2018
Important note
This commentary reflects the examination and assessment arrangements for this course in the
academic year 2017–18. The format and structure of the examination may change in future years,
and any such changes will be publicised on the virtual learning environment (VLE).
References
Unless otherwise stated, all cross-references will be to the latest version of the subject guide (2011).
You should always attempt to use the most recent edition of any Essential reading textbook, even if
the commentary and/or online reading list and/or subject guide refer to an earlier edition. If
none are available, please use the contents list and index of the new edition to find the relevant
section.
General remarks
Learning outcomes
At the end of this half course and having completed the Essential reading and activities, you should
be able to:
• recall a large number of distributions and be a competent user of their mass/density and
distribution functions and moment generating functions
• explain relationships between variables, conditioning, independence and correlation
• apply the theory and methods taught in the half course to solve practical problems.
Format of the examination
As in previous years, the ability of candidates varied widely. As every year, though, many
candidates had clearly not studied the subject guide thoroughly. You should work through the
subject guide and learn the topics covered by the syllabus, not only practise on past papers.
The format of this year’s examination will be retained for next year’s examination.
Key steps to improvement
Many candidates found it difficult to perform variable transformations when the random variable
involved is discrete, showing a lack of understanding of probability distributions. The same goes for
bivariate random variables with continuous distributions. No matter whether finding the
distribution function first (for example, Question 1 (c)) or using the variable transformation formula
(Question 2 (c)), many candidates struggled to find the correct answer either because of inaccurate
calculations or, worse, a lack of knowledge of the subject.
When calculating a probability or an expectation, especially when evaluating double integrals, many
candidates got the results wrong because of carelessly placing the wrong limits of integration. Please
practise more on how to find the limits correctly for a particular region of a joint density.
Question 1 (c) iv., for example, can be answered even without finishing the previous parts; the
same is true of determining whether X and Y are independent in Question 2 (b). Candidates
should always spare some time to read the questions carefully, then see whether any parts can
be completed quickly without needing to solve previous parts.
You should be ready to derive the moment generating functions of standard random variables, like
the normal, gamma, chi-squared, exponential (all continuous), or the geometric, binomial, Poisson
(all discrete), and ideally know the forms by heart. It is also important to know basic applications of
these distributions, and apply the correct formulae in probability questions.
Examination revision strategy
Many candidates are disappointed to find that their examination performance is poorer than they
expected. This may be due to a number of reasons, but one particular failing is ‘question
spotting’, that is, confining your examination preparation to a few questions and/or topics which
have come up in past papers for the course. This can have serious consequences.
We recognise that candidates might not cover all topics in the syllabus in the same depth, but you
need to be aware that examiners are free to set questions on any aspect of the syllabus. This
means that you need to study enough of the syllabus to enable you to answer the required number of
examination questions.
The syllabus can be found in the Course information sheet available on the VLE. You should read
the syllabus carefully and ensure that you cover sufficient material in preparation for the
examination. Examiners will vary the topics and questions from year to year and may well set
questions that have not appeared in past papers. Examination papers may legitimately include
questions on any topic in the syllabus. So, although past papers can be helpful during your revision,
you cannot assume that topics or specific questions that have come up in past examinations will
occur again.
If you rely on a question-spotting strategy, it is likely you will find yourself in difficulties
when you sit the examination. We strongly advise you not to adopt this strategy.
Comments on specific questions – Zone A
Candidates should answer all FOUR questions: QUESTION 1 of Section A (40 marks) and all
THREE questions from Section B (60 marks in total). Candidates are strongly advised to
divide their time accordingly.
Section A
Answer all three parts of question 1 (40 marks in total).
Question 1
(a) Let g(x) be a function taking on integer values of x, with:
g(x) =
  a for x = −2, −1
  3a for x = 0, 1
  4a for x = 2, 3
  0 otherwise.
i. Find a so that g(x) is a probability mass function.
(3 marks)
ii. Let X be a discrete random variable with probability mass function g(x).
Find E(X) and Var(X).
(5 marks)
iii. Write down the probability mass function for Y = X^2 − 2|X| + 1.
(4 marks)
This question is about discrete probability distributions and basic moment calculations, and
has been overall well-answered except for part iii. Discrete random variables are discussed in
Section 3.3.1 of the subject guide, with examples. The mean and variance are covered in
Sections 3.4.2 and 3.4.3, respectively.
Part i. needs the application of Claim 3.3.6 iii. that the sum of probabilities over the
support is 1 to find the value of a. This part was done well in general although many
candidates calculated the answer incorrectly because of careless mistakes. Some candidates
just equated a+ 3a+ 4a = 1, not realising that there are two values of x which have
probability a, 3a and 4a, respectively. This indeed shows a lack of understanding of discrete
probability distributions, which is disappointing.
Part ii. was done well in general even if a could be incorrect, and marks were awarded in full
if the value of a was the only thing that was wrong.
For part iii., there is no formal transformation formula like that in the continuous case. To
do this question, candidates should find out the support (Definition 3.3.2 on page 53 of the
subject guide) of Y first, which is 0, 1, 4. For instance, X = −2, 0, 2 are all mapped to
Y = 1, so
P(Y = 1) = g(−2) + g(0) + g(2) = a + 3a + 4a = 8a = 1/2.
Approaching the question
i. We must have:
1 = Σ_x g(x) = (a + a) + (3a + 3a) + (4a + 4a)
so that a = 1/16.
ii. We have:
E(X) = Σ_x x g(x) = (−2 − 1) × a + (0 + 1) × 3a + (2 + 3) × 4a = 20a = 5/4
and:
Var(X) = E(X^2) − (E(X))^2 = Σ_x x^2 g(x) − 25/16 = 5a + 3a + 13 × 4a − 25/16 = 60a − 25/16 = 35/16.
iii. Since Y = (|X| − 1)^2, it is easy to see that −2, 0 and 2 are mapped to 1, −1 and 1 are
mapped to 0 and, finally, 3 is mapped to 4. Hence the probability mass function of Y is:
gY(y) =
  a + 3a = 1/4 for y = 0
  a + 3a + 4a = 1/2 for y = 1
  4a = 1/4 for y = 4
  0 otherwise.
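Candidates who want to check this kind of discrete transformation mechanically can do so in a few lines; the following Python sketch (not examinable; the value a = 1/16 and the map Y = (|X| − 1)^2 are taken from the solution above) pushes each atom of probability mass through the map:

```python
from fractions import Fraction

# pmf g(x) from Question 1(a), Zone A, with a = 1/16 as derived above.
a = Fraction(1, 16)
g = {-2: a, -1: a, 0: 3 * a, 1: 3 * a, 2: 4 * a, 3: 4 * a}
assert sum(g.values()) == 1  # valid pmf

# Mean and variance by direct summation over the support.
EX = sum(x * p for x, p in g.items())
VarX = sum(x**2 * p for x, p in g.items()) - EX**2

# pmf of Y = X^2 - 2|X| + 1 = (|X| - 1)^2: accumulate the mass of every x
# that lands on the same value y.
gY = {}
for x, p in g.items():
    y = (abs(x) - 1) ** 2
    gY[y] = gY.get(y, Fraction(0)) + p
```

Running this reproduces E(X) = 5/4, Var(X) = 35/16 and gY = {0: 1/4, 1: 1/2, 4: 1/4}, matching the solution.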
(b) The cumulative distribution function FX(x) for the continuous random variable
X is defined by:
FX(x) =
  0 for x < 0
  ax^3/3 for 0 ≤ x < 1
  (a(x − 2)^3 + 2a)/3 for 1 ≤ x < 2
  1 for x ≥ 2.
i. Find the value of a.
(2 marks)
ii. Derive the probability density function of X.
(3 marks)
iii. Let W = X2. Derive the cumulative distribution function of W . Hence,
derive the probability density function of W .
(7 marks)
This question was answered badly in general. Part i. requires candidates to understand that
the distribution function of a continuous random variable is continuous itself. This can be
seen from Proposition 3.2.4 equation iv. on page 50 of the subject guide. Since for a
continuous random variable X we have P (X = x) = 0 for all x, then:
0 = FX(x)− FX(x−)
so that FX(x) = FX(x−). With this, we have FX(2) = FX(2−), which is:
1 = (a(2 − 2)^3 + 2a)/3 = 2a/3
so that a = 3/2.
Part ii. was better answered in general; even if a was left unknown, candidates received full
marks provided the density function was otherwise correct. To find the probability density
function you just need to differentiate the distribution function with respect to x. See Claim
3.3.14 on page 58 of the subject guide.
For part iii., many candidates could get that:
For part iii., many candidates could get that:
P(W ≤ w) = P(X^2 ≤ w) = P(−√w ≤ X ≤ √w) = FX(√w) − FX(−√w)
but failed to realise FX(−√w) = 0 since it is given that FX(x) = 0 for x < 0.
Approaching the question
i. For a continuous random variable, we must have lim_{x↗2} F(x) = F(2) = 1, so that 2a/3 = 1,
meaning a = 3/2.
ii. We have fX(x) = F′X(x), so that:
fX(x) =
  3x^2/2 for 0 ≤ x < 1
  3(x − 2)^2/2 for 1 ≤ x < 2
  0 otherwise.
iii. The cumulative distribution function for W is, for w > 0, FW(w) = P(W ≤ w) = P(X ≤ √w) = FX(√w), giving:
FW(w) =
  0 for w < 0
  w^(3/2)/2 for 0 ≤ w < 1
  ((√w − 2)^3)/2 + 1 for 1 ≤ w < 4
  1 for w ≥ 4.
The probability density function of W is then fW(w) = F′W(w), which is:
fW(w) =
  3√w/4 for 0 ≤ w < 1
  3(√w − 2)^2/(4√w) for 1 ≤ w < 4
  0 otherwise.
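A quick numerical sanity check of the density just derived is also possible; the sketch below (not required in the examination; the piecewise formulas are copied from the solution above) verifies that fW integrates to 1 over its support:

```python
import math

def f_W(w):
    # Density of W = X^2 as derived above (with a = 3/2).
    if 0 <= w < 1:
        return 3 * math.sqrt(w) / 4
    if 1 <= w < 4:
        return 3 * (math.sqrt(w) - 2) ** 2 / (4 * math.sqrt(w))
    return 0.0

# Midpoint-rule integration over the support (0, 4); a density must integrate to 1.
n = 200_000
h = 4 / n
total = sum(f_W((i + 0.5) * h) for i in range(n)) * h
```

The two pieces each contribute 1/2, so `total` should come out very close to 1.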
(c) Let X follow an exponential distribution with rate λ, i.e. X has a density
function:
fX(x) =
  λe^(−λx) for x > 0
  0 otherwise.
i. Derive the moment generating function of X.
(3 marks)
ii. Let Y be an independent and identically distributed copy of X. For w > 0,
show that:
P(X − Y ≤ w) = 1 − e^(−λw)/2.
(Hint: find the joint density of X and Y first. Determine the valid region in
the double integral involved.)
(5 marks)
iii. For w ≤ 0, show that:
P(X − Y ≤ w) = e^(λw)/2.
(5 marks)
iv. Using parts ii. and iii. of question (c), show that the density function of
W = X − Y is given by:
fW(w) = λe^(−λ|w|)/2, for w ∈ R.
(3 marks)
Part i. was well-answered in general, and finding moment generating functions for a random
variable is a basic technique which candidates need to practise. See Section 3.5 of the
subject guide for more details. For parts ii. and iii., many candidates were not able to
pinpoint the exact limits of integration which should be used in the double integration.
Some candidates did not even realise it should be a double integration because two random
variables are involved. The first thing you should do is to find the region of integration, then
to work out the joint probability density function of (X,Y ), which is just fX(x) fY (y), since
X and Y are independent. See Sections 4.1 and 4.2 of the subject guide for more basic
knowledge on joint density functions.
Some candidates were confused as to how w > 0 or w < 0 affects the calculation. For part
iii., since X ≤ Y + w and w < 0, the limits of Y cannot start from 0, otherwise X will be
negative, which is not allowed. To make sure X ≥ 0, Y has to start from −w. This is
exactly the difference between parts ii. and iii.
For part iv., many candidates could not work out the answer, which is disappointing since
you do not even need to know how to calculate the answers to parts ii. and iii. As long as
you realise we are calculating the distribution function of W in parts ii. and iii., you only
need to differentiate those given answers with respect to w, the result will then follow.
Approaching the question
i. The moment generating function of X is:
MX(s) = E(e^(sX)) = ∫_0^∞ e^(sx) λe^(−λx) dx = (λ/(λ − s)) ∫_0^∞ (λ − s)e^(−(λ−s)x) dx = λ/(λ − s), for s < λ.
ii. For w ≥ 0, X − Y ≤ w implies that 0 < X ≤ Y + w, where Y > 0. Hence:
P(X − Y ≤ w) = ∫_0^∞ ∫_0^(y+w) fX,Y(x, y) dx dy = ∫_0^∞ λe^(−λy) [−e^(−λx)]_0^(y+w) dy
= ∫_0^∞ λe^(−λy) (1 − e^(−λ(y+w))) dy
= 1 − (1/2)e^(−λw).
iii. For w < 0, X − Y ≤ w implies that 0 < X ≤ Y + w where Y > −w. Hence:
P(X − Y ≤ w) = ∫_(−w)^∞ ∫_0^(y+w) fX,Y(x, y) dx dy = ∫_(−w)^∞ λe^(−λy) (1 − e^(−λ(y+w))) dy
= e^(λw) − e^(−λw) × (1/2)e^(2λw)
= (1/2)e^(λw).
iv. Differentiating the answers in ii. and iii. with respect to w, we have:
fW(w) = λe^(λw)/2 for w < 0 and λe^(−λw)/2 for w ≥ 0, i.e. fW(w) = (λ/2)e^(−λ|w|), for w ∈ R.
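The two-sided answer can also be checked by simulation; a sketch with an arbitrary illustrative rate λ = 1.3 (any λ > 0 would do; `random.expovariate` takes the rate directly):

```python
import math
import random

random.seed(0)
lam = 1.3
n = 200_000
# Draws of X - Y for independent Exp(lam) variables X and Y.
diffs = [random.expovariate(lam) - random.expovariate(lam) for _ in range(n)]

def cdf_theory(w):
    # The distribution function derived in parts ii. and iii. above.
    return 1 - math.exp(-lam * w) / 2 if w >= 0 else math.exp(lam * w) / 2

# Largest gap between the empirical and theoretical CDF on a few test points.
max_err = max(abs(sum(d <= w for d in diffs) / n - cdf_theory(w))
              for w in (-1.5, -0.4, 0.0, 0.3, 1.0, 2.5))
```

With 200,000 draws the empirical CDF should agree with the closed form to well within 0.01 at every test point.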
Section B
Answer all three questions in this section (60 marks in total).
Question 2
The conditional density of a random variable X given Y = y is given by:
fX|Y(x|y) =
  3x^2/y^3 for 0 < x < y < 3
  0 otherwise.
The conditional density of Y given X = x is given by:
fY|X(y|x) =
  3y^2/(27 − x^3) for 0 < x < y < 3
  0 otherwise.
(a) Find the ratio fX(x)/fY (y), where fX(x) and fY (y) are the marginal densities
of X and Y , respectively.
(2 marks)
(b) By integrating out x first in the answer in (a), show that:
fY(y) =
  2y^5/243 for 0 < y < 3
  0 otherwise.
(9 marks)
(c) Let U = XY and V = X/Y . Derive the joint density for U, V , and carefully
state the region for (U, V ) where this joint density is non-zero.
(9 marks)
This question was not well-answered in general, which is a little unexpected. You should look at
the marks allocated to each part to determine approximately how long the answers should be.
Part (a) is only worth two marks, so you should not expect the answer to have long derivations.
See Section 5.2 of the subject guide for the definition of continuous conditional distributions.
For part (b), many candidates knew to follow the hint and integrate out x, but the limits should
be those for the marginal density of X, i.e. they should not involve y. In fact, you only know
that ∫ fX(x) dx = 1, and the lower and upper limits are those for the marginal density of X,
which in this case are 0 and 3, respectively. Many candidates went on to check whether
fX,Y(x, y) = fX(x) fY(y), which is correct but is not a quick way to see if X and Y are
independent, since you still need to calculate fX,Y(x, y) and fX(x), neither of which is given to
you. In the process, some candidates unfortunately got the wrong answers. To determine
independence between X and Y more quickly, you should check whether fY|X(y|x) = fY(y) or
not. This is equivalent to the criterion fX,Y(x, y) = fX(x) fY(y), of course, but fY|X(y|x) and
fY(y) are both given to you! So you do not even need to calculate anything to know that X and
Y are not independent! See Section 4.4 of the subject guide for more details on independence of
a pair of random variables.
Part (c) was not done well because of inaccurate calculations mostly, especially the calculations
of the Jacobian. See Section 4.6 of the subject guide for more details.
Approaching the question
(a) We have:
fX|Y(x|y)/fY|X(y|x) = [fX,Y(x, y)/fY(y)] / [fX,Y(x, y)/fX(x)] = fX(x)/fY(y)
so that:
fX(x)/fY(y) = x^2 (27 − x^3)/y^5.
(b) Since 0 < x < y < 3, integrating out the effect of y means that 0 < x < 3. Hence:
1/fY(y) = ∫_0^3 fX(x)/fY(y) dx = (1/y^5) ∫_0^3 (27x^2 − x^5) dx = (1/y^5) [9x^3 − x^6/6]_0^3 = 243/(2y^5)
so that:
fY(y) = 2y^5/243 for 0 < y < 3.
Since fY(y) ≠ fY|X(y|x), X is not independent of Y.
(c) We have:
X = √(UV) and Y = √(U/V)
so that 0 < X < Y < 3 implies:
0 < √(UV) < √(U/V) < 3
meaning:
0 < U < 9V and V < 1.
Hence, with the Jacobian matrix of (x, y) with respect to (u, v) being
[ √v/(2√u)   √u/(2√v) ; 1/(2√(uv))   −√u/(2v^(3/2)) ],
whose determinant has absolute value 1/(2v), we have:
fU,V(u, v) = fX,Y(√(uv), √(u/v)) × 1/(2v) = (3uv/(u/v)^(3/2)) × (2/243)(u/v)^(5/2) × 1/(2v) = u^2/(81v)
for 0 < u < 9v, 0 < v < 1.
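As a check (not asked for in the question), the derived joint density should integrate to 1 over the stated region; a small midpoint-rule sketch:

```python
def f_UV(u, v):
    # Joint density of (U, V) derived above, supported on {0 < u < 9v, 0 < v < 1}.
    return u * u / (81 * v) if 0 < u < 9 * v and 0 < v < 1 else 0.0

n = 500
total = 0.0
for i in range(n):
    v = (i + 0.5) / n        # midpoint grid over v in (0, 1)
    du = 9 * v / n           # the u-range (0, 9v) depends on v
    for j in range(n):
        u = (j + 0.5) * du   # midpoint grid over u in (0, 9v)
        total += f_UV(u, v) * du / n
```

Analytically the inner integral is 3v^2, whose integral over (0, 1) is 1, and the numerical `total` should confirm this.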
Question 3
If X is Gamma distributed with parameters α and β, i.e. X ∼ Gamma(α, β), then it
has density:
fX(x) = (β^α/Γ(α)) x^(α−1) e^(−βx), for x > 0
and Γ(α) = ∫_0^∞ y^(α−1) e^(−y) dy for α > 0.
(a) If X ∼ Gamma(α1, β), Y ∼ Gamma(α2, β), and X is independent of Y , derive
the distribution of X + Y . You may use the moment generating function of a
Gamma random variable without proof, as long as you state it clearly.
(7 marks)
(b) Let Xi ∼ Gamma(α, β), i = 1, . . . , N , be independent of each other and
α, β > 0. Each Xi is also independent of N , which is Poisson distributed with
mean µ, so that the probability mass function for N is given by:
pN(n) = μ^n e^(−μ)/n!, for n = 0, 1, . . . .
Consider the random variable:
W = Σ_{i=1}^N Xi
with the convention that W = 0 if N = 0.
i. Derive the moment generating function of W .
(8 marks)
ii. Find the mean of W . You can use the means of a Poisson and a Gamma
random variable without proof. If you use any standard results about
random sums, you must first state them clearly.
(5 marks)
This question was not answered as well as it should have been. Parts (a) and (b) i. are both
standard exercises. For part (a), see Proposition 4.7.3 for finding the moment generating function
of two independent random variables. Identifying the form of the answer is then the key to
knowing that the sum is still a Gamma distribution.
For part (b) i., the derivation of the moment generating function of a random sum is covered in
the subject guide in Section 5.6. See Lemma 5.6.2 iii. and Proposition 5.6.3 iii. on page 165 of
the subject guide. Indeed, many candidates realised this, which is good, and scored decent
marks even if they could not work out the final answer.
For part (b) ii., you can use Proposition 5.6.3 i., or differentiate the moment generating function
which you obtained in part (b) i. Of course, the former will give you the answer much quicker!
Approaching the question
(a) First, let W = X + Y. Since MX(t) = (β/(β − t))^α for X ∼ Gamma(α, β), then:
MW(t) = E(e^(tW)) = E(e^(tX)) E(e^(tY)) = (β/(β − t))^(α1+α2), for t < β.
This shows that, by the one-to-one correspondence between distribution and moment
generating function, W = X + Y has a Gamma(α1 + α2, β) distribution.
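This closure property is easy to see in simulation; a sketch with arbitrary illustrative parameters (α1 = 2, α2 = 5, β = 3). Note that Python's `random.gammavariate` is parametrised by shape and scale, so the scale argument is 1/β:

```python
import random

random.seed(1)
a1, a2, beta = 2.0, 5.0, 3.0
n = 200_000
# Draws of X + Y; a Gamma(a1 + a2, beta) variable has mean (a1 + a2)/beta
# and variance (a1 + a2)/beta**2, so the sample moments should match.
s = [random.gammavariate(a1, 1 / beta) + random.gammavariate(a2, 1 / beta)
     for _ in range(n)]
mean = sum(s) / n
var = sum((x - mean) ** 2 for x in s) / n
```

Here the sample mean should be close to 7/3 and the sample variance close to 7/9, as predicted for Gamma(7, 3).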
(b) i. We have:
MW(t) = E(e^(tW)) = E(E(exp(t Σ_{i=1}^N Xi) | N)).
However, given N, we have:
E(exp(t Σ_{i=1}^N Xi)) = E(Π_{i=1}^N exp(tXi)) = Π_{i=1}^N E(e^(tXi)) = (β/(β − t))^(αN), for t < β.
At the same time:
MN(s) = E(e^(sN)) = Σ_{n=0}^∞ (μe^s)^n e^(−μ)/n! = e^(μ(e^s − 1)), for s ∈ R.
Hence:
MW(t) = E(exp(N log((β/(β − t))^α))) = exp(μ(β/(β − t))^α − μ), for t < β.
ii. We have:
E(W) = E(E(W | N)) = E(N E(Xi)) = μα/β.
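The whole compound-Poisson setup, and this mean in particular, can be simulated; a sketch with arbitrary illustrative parameters μ = 4, α = 2, β = 5, using Knuth's multiplication method for the Poisson draw:

```python
import math
import random

random.seed(2)
mu, alpha, beta = 4.0, 2.0, 5.0
n = 100_000

def poisson_draw(mu):
    # Knuth's multiplication method; adequate for small mu.
    L = math.exp(-mu)
    k, p = 0, 1.0
    while True:
        p *= random.random()
        if p <= L:
            return k
        k += 1

# W = X_1 + ... + X_N with X_i ~ Gamma(alpha, beta), and W = 0 when N = 0.
# random.gammavariate takes (shape, scale), so the scale is 1/beta.
ws = []
for _ in range(n):
    N = poisson_draw(mu)
    ws.append(sum(random.gammavariate(alpha, 1 / beta) for _ in range(N)))
mean_w = sum(ws) / n
```

The sample mean of W should be close to μα/β = 1.6 for these parameters.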
Question 4
Suppose we have a biased coin, which comes up heads with probability u. An
experiment is carried out so that X is the number of independent flips of the coin
required for r heads to show up, where r ≥ 1 is known.
(a) Show that the probability mass function of X is:
pX(x) =
  (x−1 choose r−1) u^r (1 − u)^(x−r) for x = r, r + 1, . . .
  0 otherwise.
(5 marks)
(b) Suppose U is uniformly distributed on (0, 1), and the distribution in part (a)
becomes
pX|U(x|u) =
  (x−1 choose r−1) u^r (1 − u)^(x−r) for x = r, r + 1, . . .
  0 otherwise.
i. Find the marginal probability mass function for X. You can use:
∫_0^1 y^a (1 − y)^b dy = a! b!/(a + b + 1)!
for non-negative integers a and b without proof.
(6 marks)
ii. Show that the density of U |X = x is given by:
fU|X(u|x) = ((x + 1)!/(r!(x − r)!)) u^r (1 − u)^(x−r), for 0 < u < 1.
Hence find the mean of U |X = x.
(5 marks)
(c) Another independent experiment is carried out, with Y denoting the number of
independent flips of the coin required for r heads to show up (the same r as for
the first experiment).
State (no need for a derivation) the density of U |(X,Y ) = (x, y) and its mean,
where U is still uniformly distributed on (0, 1) as in part (b).
(4 marks)
This question was not well-answered in general. Part (a) needs you to explain why the probability
mass function is as stated. Many candidates stated that this is a negative binomial distribution
and hence the density is as given, which is not a proof or an explanation at all. See Example
3.3.10 on page 56 of the subject guide for the justification of its probability mass function.
Part (b) i. needs you to work out the joint density function of X and U, then integrate out U to
obtain the marginal probability mass function of X. Some candidates were not careful in their
calculations even though they realised they should integrate out u in pX|U(x|u) fU(u). Even if
you were unable to do (b) i., you should still be able to do (b) ii. using the answer given to you
in (b) i. To find the mean, you need to apply the integral formula given in (b) i.
Part (c) was done worst, as it was meant to be the most difficult. You need to realise that X
and Y are independent experiments which can be combined into one, so that x is replaced by
x + y and r by 2r in the answer to (b) ii.
Approaching the question
(a) To wait for r heads to show up, suppose x flips are required. The last flip must be a head,
with the other r − 1 heads appearing somewhere among the first x − 1 flips. Each particular
sequence of heads and tails must contain r heads, by the definition of the experiment, as well
as x − r tails (adding up to x flips in total), and so has probability u^r (1 − u)^(x−r).
Hence we have:
pX(x) = (x−1 choose r−1) u^r (1 − u)^(x−r), for x = r, r + 1, . . . .
(b) i. The joint probability density for X, U is:
f_{X,U}(x, u) = \binom{x-1}{r-1} u^r (1-u)^{x-r}, for 0 < u < 1, x = r, r+1, \ldots.
Therefore, the marginal probability mass function of X is:
p_X(x) = \int_0^1 \binom{x-1}{r-1} u^r (1-u)^{x-r} \, du = \binom{x-1}{r-1} \int_0^1 u^r (1-u)^{x-r} \, du
= \binom{x-1}{r-1} \frac{r! \, (x-r)!}{(x+1)!} = \frac{r}{x(x+1)}, for x = r, r+1, \ldots.
ii. We have:
f_{U|X}(u | x) = \frac{f_{X,U}(x, u)}{p_X(x)} = \frac{x(x+1)}{r} \binom{x-1}{r-1} u^r (1-u)^{x-r} = \frac{(x+1)!}{r! \, (x-r)!} u^r (1-u)^{x-r}, for 0 < u < 1.
The mean is:
E(U | X = x) = \frac{(x+1)!}{r! \, (x-r)!} \int_0^1 u^{r+1} (1-u)^{x-r} \, du = \frac{(x+1)!}{r! \, (x-r)!} \times \frac{(r+1)! \, (x-r)!}{(x+2)!} = \frac{r+1}{x+2}.
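The two results above can be verified exactly with a short script; this is a hypothetical sanity check (not part of the examination answer), with r and x chosen arbitrarily:

```python
from fractions import Fraction
from math import comb, factorial

# A hypothetical check (not part of the examination answer): verify
# p_X(x) = r/(x(x+1)) and E(U | X = x) = (r+1)/(x+2) exactly via the
# Beta integral  ∫_0^1 u^a (1-u)^b du = a! b! / (a+b+1)!.
def beta_int(a, b):
    """Exact value of the integral of u^a (1-u)^b over (0, 1)."""
    return Fraction(factorial(a) * factorial(b), factorial(a + b + 1))

r, x = 3, 7                                   # arbitrary illustrative values
p_x = comb(x - 1, r - 1) * beta_int(r, x - r)
post_mean = comb(x - 1, r - 1) * beta_int(r + 1, x - r) / p_x

assert p_x == Fraction(r, x * (x + 1))
assert post_mean == Fraction(r + 1, x + 2)
```

Using `fractions.Fraction` keeps the arithmetic exact, so the identities are confirmed with no rounding error.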
(c) Mathematically, we have:
p_{X,Y|U}(x, y | u) = p_{X|U}(x | u) \, p_{Y|U}(y | u)
so that with f_U(u) = 1, we have:
p_{U|X,Y}(u | x, y) = \frac{p_{X|U}(x | u) \, p_{Y|U}(y | u)}{\int_0^1 p_{X|U}(x | u) \, p_{Y|U}(y | u) \, du} = \frac{u^{2r} (1-u)^{x+y-2r}}{\int_0^1 u^{2r} (1-u)^{x+y-2r} \, du} = \frac{(x+y+1)!}{(2r)! \, (x+y-2r)!} u^{2r} (1-u)^{x+y-2r}, for 0 < u < 1
which parallels the answer in part (b) ii. The mean is:
\frac{2r+1}{x+y+2}
which again parallels the answer in (b) ii.
To see these two answers more quickly, note that X and Y can be seen as one experiment,
waiting for 2r heads to show up. So we need x + y flips for 2r heads to come up, and hence
we can replace x by x + y and r by 2r directly in the answers in (b) ii.
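This substitution argument can be checked exactly; the following is a hypothetical verification (not part of the examination answer) with arbitrary values of r, x and y:

```python
from fractions import Fraction
from math import factorial

# A hypothetical check: the posterior mean in (c) should equal the (b) ii
# answer with x replaced by x + y and r replaced by 2r.  Values of r, x, y
# below are arbitrary.
def beta_int(a, b):
    # ∫_0^1 u^a (1-u)^b du = a! b! / (a+b+1)!
    return Fraction(factorial(a) * factorial(b), factorial(a + b + 1))

def post_mean(heads, flips):
    """E(U | data) under a uniform prior: ratio of two Beta integrals."""
    return beta_int(heads + 1, flips - heads) / beta_int(heads, flips - heads)

r, x, y = 2, 5, 6
assert post_mean(2 * r, x + y) == Fraction(2 * r + 1, x + y + 2)
```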
Examiners’ commentaries 2018
Important note
This commentary reflects the examination and assessment arrangements for this course in the
academic year 2017–18. The format and structure of the examination may change in future years,
and any such changes will be publicised on the virtual learning environment (VLE).
References
Unless otherwise stated, all cross-references will be to the latest version of the subject guide (2011).
You should always attempt to use the most recent edition of any Essential reading textbook, even if
the commentary and/or online reading list and/or subject guide refer to an earlier edition. If
none are available, please use the contents list and index of the new edition to find the relevant
section.
Comments on specific questions – Zone B
Candidates should answer all FOUR questions: QUESTION 1 of Section A (40 marks) and all
THREE questions from Section B (60 marks in total). Candidates are strongly advised to
divide their time accordingly.
Section A
Answer all three parts of question 1 (40 marks in total).
Question 1
(a) Let g(x) be a function taking on integer values of x, with:
g(x) =
2a for x = −3,−1
a for x = 0, 2
3a for x = 1, 3
0 otherwise.
i. Find a so that g(x) is a probability mass function.
(3 marks)
ii. Let X be a discrete random variable with probability mass function g(x).
Find E(X) and Var(X).
(5 marks)
iii. Write down the probability mass function of Y = X2 − 4|X|+ 4.
(4 marks)
This question is about discrete probability distributions and basic moment calculations, and
has been overall well-answered except for part iii. Discrete random variables are discussed in
Section 3.3.1 of the subject guide, with examples. The mean and variance are covered in
Sections 3.4.2 and 3.4.3, respectively.
Part i. needs the application of Claim 3.3.6 iii. that the sum of probabilities over the
support is 1 to find the value of a. This part was done well in general although many
candidates calculated the answer incorrectly because of careless mistakes. Some candidates
just equated 2a+ a+ 3a = 1, not realising that there are two values of x which have
probability 2a, a and 3a, respectively. This indeed shows a lack of understanding of discrete
probability distributions, which is disappointing.
Part ii. was done well in general; full marks were awarded even if an incorrect value of a from
part i. was the only error carried forward.
For part iii., there is no formal transformation formula like that in the continuous case. To
do this question, candidates should find out the support (Definition 3.3.2 on page 53 of the
subject guide) of Y first, which is 0, 1, 4. For instance, X = −3,−1, 1, 3 are all mapped to
Y = 1, so
P(Y = 1) = g(-3) + g(-1) + g(1) + g(3) = 2a + 2a + 3a + 3a = 10a = \frac{5}{6}.
Approaching the question
i. We must have:
1 = \sum_x g(x) = (2a + 2a) + (a + a) + (3a + 3a)
so that a = 1/12.
ii. We have:
E(X) = \sum_x x \, g(x) = (-3 - 1) \times 2a + (0 + 2) \times a + (1 + 3) \times 3a = 6a = \frac{1}{2}
and:
Var(X) = E(X^2) - (E(X))^2 = \sum_x x^2 \, g(x) - \frac{1}{4} = 10 \times 2a + 4a + 10 \times 3a - \frac{1}{4} = 54a - \frac{1}{4} = \frac{51}{12}.
iii. Since Y = (|X| − 2)2, it is easy to see that −3,−1, 1 and 3 are mapped to 1, 0 is mapped
to 4 and, finally, 2 is mapped to 0. Hence the probability mass function of Y is:
g_Y(y) =
  a for y = 0
  2a + 2a + 3a + 3a for y = 1
  a for y = 4
  0 otherwise
=
  1/12 for y = 0
  5/6 for y = 1
  1/12 for y = 4
  0 otherwise.
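As a quick sanity check (not part of the examination answer), the value of a, the moments, and the distribution of Y can be verified by exact enumeration:

```python
from fractions import Fraction

# A hypothetical enumeration check of Question 1 (a): with a = 1/12 the
# pmf sums to 1, E(X) = 1/2, Var(X) = 51/12, and Y = (|X| - 2)^2 has the
# stated distribution.
a = Fraction(1, 12)
g = {-3: 2 * a, -1: 2 * a, 0: a, 2: a, 1: 3 * a, 3: 3 * a}

mean = sum(x * p for x, p in g.items())
var = sum(x**2 * p for x, p in g.items()) - mean**2

g_Y = {}
for x, p in g.items():
    y = (abs(x) - 2) ** 2          # the transformation in part iii.
    g_Y[y] = g_Y.get(y, Fraction(0)) + p

assert sum(g.values()) == 1
assert mean == Fraction(1, 2)
assert var == Fraction(51, 12)
assert g_Y == {0: a, 1: Fraction(5, 6), 4: a}
```

Enumerating the support directly mirrors the advice above: there is no transformation formula in the discrete case, so probabilities are accumulated over the preimages of each value of Y.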
(b) The cumulative distribution function FX(·) for the continuous random variable
X is defined by:
F_X(x) =
  0 for x < 0
  ax^2/4 for 0 ≤ x < 1
  ((x - 1)^3 + a)/4 for 1 ≤ x < 2
  1 for x ≥ 2.
i. Find the value of a.
(1 mark)
ii. Derive the probability density function of X.
(4 marks)
iii. Let W = X2. Derive the cumulative distribution function of W . Hence,
derive the probability density function of W .
(7 marks)
This question was answered badly in general. Part i. requires candidates to understand that
the distribution function of a continuous random variable is continuous itself. This can be
seen from Proposition 3.2.4 equation iv. on page 50 of the subject guide. Since for a
continuous random variable X we have P (X = x) = 0 for all x, then:
0 = F_X(x) - F_X(x-)
so that F_X(x) = F_X(x-). With this, we have F_X(2) = F_X(2-), which is:
1 = \frac{(2-1)^3 + a}{4}
so that a = 3.
Part ii. was better answered in general; even if a was left unknown, candidates received full
marks provided the density function was otherwise correct. To find the probability density
function you just need to differentiate the distribution function with respect to x. See Claim
3.3.14 on page 58 of the subject guide.
For part iii., many candidates could get that:
P(W ≤ w) = P(X^2 ≤ w) = P(-\sqrt{w} ≤ X ≤ \sqrt{w}) = F_X(\sqrt{w}) - F_X(-\sqrt{w})
but failed to realise F_X(-\sqrt{w}) = 0 since it is given that F_X(x) = 0 for x < 0.
Approaching the question
i. For a continuous random variable, we must have \lim_{x \nearrow 2} F(x) = F(2) = 1, so that
1/4 + a/4 = 1, meaning a = 3.
ii. We have f_X(x) = F'_X(x), so that:
f_X(x) =
  3x/2 for 0 ≤ x < 1
  3(x - 1)^2/4 for 1 ≤ x < 2
  0 otherwise.
iii. The cumulative distribution function for W is, for w > 0:
F_W(w) = P(W ≤ w) = P(X ≤ \sqrt{w}) = F_X(\sqrt{w}) =
  0 for w < 0
  3w/4 for 0 ≤ w < 1
  (\sqrt{w} - 1)^3/4 + 3/4 for 1 ≤ w < 4
  1 for w ≥ 4.
The probability density function of W is then f_W(w) = F'_W(w), which is:
f_W(w) =
  3/4 for 0 ≤ w < 1
  3(\sqrt{w} - 1)^2/(8\sqrt{w}) for 1 ≤ w < 4
  0 otherwise.
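A numerical sanity check (hypothetical, not part of the examination answer) is that this density integrates to 1 over its support; a simple midpoint rule suffices:

```python
from math import sqrt

# A hypothetical midpoint-rule check that f_W integrates to 1 over (0, 4).
# The grid size is an arbitrary choice.
def f_W(w):
    if 0 <= w < 1:
        return 3 / 4
    if 1 <= w < 4:
        return 3 * (sqrt(w) - 1) ** 2 / (8 * sqrt(w))
    return 0.0

n = 200_000
h = 4 / n
total = sum(f_W((k + 0.5) * h) * h for k in range(n))
assert abs(total - 1.0) < 1e-4
```

The exact contributions are 3/4 from (0, 1) and 1/4 from (1, 4), so the numerical total should be very close to 1.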
(c) Let X follow an exponential distribution with rate λ, i.e. X has a density
function:
fX(x) =
{
λe−λx for x > 0
0 otherwise.
i. Derive the moment generating function of X.
(3 marks)
ii. Let Y be an independent and identically distributed copy of X. For w > 0,
show that:
P(X - Y ≤ w) = 1 - \frac{e^{-λw}}{2}.
(Hint: find the joint density of X and Y first. Determine the valid region in
the double integral involved.)
(5 marks)
iii. For w ≤ 0, show that:
P(X - Y ≤ w) = \frac{e^{λw}}{2}.
(5 marks)
iv. Using parts ii. and iii. of question (c), show that the density function of
W = X − Y is given by:
f_W(w) = \frac{λ e^{-λ|w|}}{2}, for w ∈ R.
(3 marks)
Part i. was well-answered in general, and finding moment generating functions for a random
variable is a basic technique which candidates need to practise. See Section 3.5 of the
subject guide for more details. For parts ii. and iii., many candidates were not able to
pinpoint the exact limits of integration which should be used in the double integration.
Some candidates did not even realise it should be a double integration because two random
variables are involved. The first thing you should do is to find the region of integration, then
to work out the joint probability density function of (X,Y ), which is just fX(x) fY (y), since
X and Y are independent. See Sections 4.1 and 4.2 of the subject guide for more basic
knowledge on joint density functions.
Some candidates were confused as to how w > 0 or w < 0 affects the calculation. For part
iii., since X ≤ Y + w and w < 0, the limits of Y cannot start from 0, otherwise X will be
negative, which is not allowed. To make sure X ≥ 0, Y has to start from −w. This is
exactly the difference between parts ii. and iii.
For part iv., many candidates could not work out the answer, which is disappointing since
you do not even need to know how to calculate the answers to parts ii. and iii. As long as
you realise we are calculating the distribution function of W in parts ii. and iii., you only
need to differentiate those given answers with respect to w, the result will then follow.
Approaching the question
i. The moment generating function of X is:
M_X(s) = E(e^{sX}) = \int_0^∞ e^{sx} λe^{-λx} \, dx = \int_0^∞ λe^{-(λ-s)x} \, dx = \frac{λ}{λ-s} \int_0^∞ (λ-s) e^{-(λ-s)x} \, dx = \frac{λ}{λ-s}, for s < λ.
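This closed form can be checked by simulation; the following is a hypothetical Monte Carlo check (not part of the examination answer), with λ, s, the seed and the sample size chosen arbitrarily:

```python
import random
from math import exp

# A hypothetical Monte Carlo check of M_X(s) = λ/(λ - s) for s < λ.
random.seed(0)
lam, s = 2.0, 0.5
n = 200_000
# estimate E(e^{sX}) by averaging over exponential draws with rate λ
mgf_est = sum(exp(s * random.expovariate(lam)) for _ in range(n)) / n
assert abs(mgf_est - lam / (lam - s)) < 0.02
```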
ii. For w ≥ 0, X - Y ≤ w implies that 0 < X ≤ Y + w, where Y > 0. Hence:
P(X - Y ≤ w) = \int_0^∞ \int_0^{y+w} f_{X,Y}(x, y) \, dx \, dy = \int_0^∞ λe^{-λy} [-e^{-λx}]_0^{y+w} \, dy
= \int_0^∞ λe^{-λy} (1 - e^{-λ(y+w)}) \, dy
= 1 - \frac{1}{2} e^{-λw}.
16
Examiners’ commentaries 2018
iii. For w < 0, X - Y ≤ w implies that 0 < X ≤ Y + w, where Y > -w. Hence:
P(X - Y ≤ w) = \int_{-w}^∞ \int_0^{y+w} f_{X,Y}(x, y) \, dx \, dy = \int_{-w}^∞ λe^{-λy} (1 - e^{-λ(y+w)}) \, dy
= e^{λw} - e^{-λw} \times \frac{1}{2} e^{2λw}
= \frac{1}{2} e^{λw}.
iv. Differentiating the answers in ii. and iii. with respect to w, we have:
f_W(w) =
  λe^{λw}/2 for w < 0
  λe^{-λw}/2 for w ≥ 0
= \frac{1}{2} λe^{-λ|w|}, for w ∈ R.
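Both branches of the distribution function can be checked by simulation; this is a hypothetical check (not part of the examination answer), with λ, w, the seed and the sample size chosen arbitrarily:

```python
import random
from math import exp

# A hypothetical simulation check that P(X - Y <= w) equals
# 1 - e^{-λw}/2 for w > 0 and e^{λw}/2 for w <= 0.
random.seed(1)
lam, n = 1.5, 200_000
diffs = [random.expovariate(lam) - random.expovariate(lam) for _ in range(n)]

w_pos, w_neg = 0.7, -0.4
emp_pos = sum(d <= w_pos for d in diffs) / n
emp_neg = sum(d <= w_neg for d in diffs) / n

assert abs(emp_pos - (1 - exp(-lam * w_pos) / 2)) < 0.01
assert abs(emp_neg - exp(lam * w_neg) / 2) < 0.01
```

The empirical proportions agree with the two formulas, consistent with W following the double-exponential (Laplace) density derived in part iv.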
Section B
Answer all three questions in this section (60 marks in total).
Question 2
The conditional density of a random variable X given Y = y is given by:
fX|Y (x | y) =
{
x/(2y2) for 0 < x < 2y < 2
0 otherwise.
The conditional density of Y given X = x is given by:
fY |X(y |x) =
{
24y2/(8− x3) for 0 < x < 2y < 2
0 otherwise.
(a) Find the ratio fY (y)/fX(x), where fX(x) and fY (y) are the marginal densities
of X and Y , respectively.
(2 marks)
(b) By integrating out y first in the answer in (a), show that:
fX(x) =
{
(5x(8− x3))/48 for 0 < x < 2
0 otherwise.
(9 marks)
(c) Let U = XY and V = X/Y . Derive the joint density for U, V , and carefully
state the region for (U, V ) where this joint density is non-zero.
(9 marks)
This question was not well-answered in general, which is a little unexpected. You should look at
the marks allocated to each part to determine approximately how long the answers should be.
Part (a) is only worth two marks, so you should not expect the answer to have long derivations.
See Section 5.2 of the subject guide for the definition of continuous conditional distributions.
For part (b), many candidates knew to follow the hint and integrate out y, but when normalising,
the limits should be those for the marginal density of X, i.e. the limits should not involve y.
In fact, you only know \int f_X(x) \, dx = 1, and the lower and upper limits are those for the
marginal density of X, which in this case should be 0 and 2, respectively. Many candidates went
on to check whether f_{X,Y}(x, y) = f_X(x) f_Y(y), which is correct but is not a quick way to see
if X and Y are independent, since you still need to calculate f_{X,Y}(x, y) and f_X(x), neither
of which is given to you. In the process, some candidates unfortunately got the wrong answers. To
determine independence between X and Y more quickly, you should check whether
f_{Y|X}(y | x) = f_Y(y) or not. This is equivalent to the criterion f_{X,Y}(x, y) = f_X(x) f_Y(y),
of course, but f_{Y|X}(y | x) and f_Y(y) are both given to you! So you do not even need to
calculate anything to know that X and Y are not independent! See Section 4.4 of the subject guide
for more details on independence of a pair of random variables.
Part (c) was not done well because of inaccurate calculations mostly, especially the calculations
of the Jacobian. See Section 4.6 of the subject guide for more details.
Approaching the question
(a) We have:
\frac{f_{Y|X}(y | x)}{f_{X|Y}(x | y)} = \frac{f_{X,Y}(x, y)/f_X(x)}{f_{X,Y}(x, y)/f_Y(y)} = \frac{f_Y(y)}{f_X(x)}
so that:
\frac{f_Y(y)}{f_X(x)} = \frac{48y^4}{8x - x^4}.
(b) Since 0 < x < 2y < 2, integrating out the effect of y means that 0 < x < 2. Hence:
\frac{1}{f_X(x)} = \int_0^1 \frac{f_Y(y)}{f_X(x)} \, dy = \frac{1}{8x - x^4} \int_0^1 48y^4 \, dy = \frac{1}{x(8 - x^3)} \times \frac{48}{5}
so that:
f_X(x) = \frac{5x(8 - x^3)}{48}, for 0 < x < 2.
Since f_X(x) ≠ f_{X|Y}(x | y), X is not independent of Y.
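As a sanity check (hypothetical, not part of the examination answer), this marginal density should integrate to 1 over (0, 2), which can be confirmed exactly from its polynomial antiderivative:

```python
from fractions import Fraction

# A hypothetical exact check that f_X(x) = 5x(8 - x^3)/48 integrates to 1
# over (0, 2).  The antiderivative of (40x - 5x^4)/48 is (20x^2 - x^5)/48.
def antideriv(x):
    x = Fraction(x)
    return (20 * x**2 - x**5) / 48

area = antideriv(2) - antideriv(0)
assert area == 1
```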
(c) We have:
X = \sqrt{UV} and Y = \sqrt{U/V}
so that 0 < X < 2Y < 2 implies:
0 < \sqrt{UV} < 2\sqrt{U/V} < 2
meaning:
0 < U < V and V < 2.
Hence:
f_{U,V}(u, v) = f_{X,Y}(\sqrt{uv}, \sqrt{u/v}) \left| \det \begin{pmatrix} \frac{\sqrt{v}}{2\sqrt{u}} & \frac{\sqrt{u}}{2\sqrt{v}} \\ \frac{1}{2\sqrt{uv}} & -\frac{\sqrt{u}}{2v^{3/2}} \end{pmatrix} \right|
= \frac{24u/v}{8 - (uv)^{3/2}} \times \frac{5}{48} \left( 8\sqrt{uv} - u^2v^2 \right) \times \frac{1}{2v}
= \frac{5u^{3/2}}{4v^{3/2}}, for 0 < u < v < 2.
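The resulting joint density should itself integrate to 1 over the stated region; the following is a hypothetical midpoint-rule check (not part of the examination answer), with the grid size an arbitrary choice:

```python
# A hypothetical numerical check that f_{U,V}(u, v) = 5u^{3/2}/(4v^{3/2})
# integrates to 1 over the region 0 < u < v < 2.
n = 1000
h = 2 / n
total = 0.0
for i in range(n):
    v = (i + 0.5) * h
    for j in range(n):
        u = (j + 0.5) * h
        if u < v:                      # restrict to the region 0 < u < v
            total += 5 * u**1.5 / (4 * v**1.5) * h * h
assert abs(total - 1.0) < 1e-2
```

Integrating out u exactly gives v/2 on (0, 2), which integrates to 1, so the numerical total should be close to 1 up to the boundary error along u = v.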
Question 3
If X is Gamma distributed with parameters α and β, i.e. X ∼ Gamma(α, β), then it
has density:
f_X(x) = \frac{β^α}{Γ(α)} x^{α-1} e^{-βx}, for x > 0
and Γ(α) = \int_0^∞ y^{α-1} e^{-y} \, dy for α > 0.
(a) Suppose X ∼ Gamma(α1, β1), Y ∼ Gamma(α2, β2), and X is independent of Y .
Derive the distribution of β1X + β2Y . You may use the moment generating
function of a Gamma random variable without proof, as long as you state it
clearly.
(7 marks)
(b) Let Xi ∼ Gamma(α, βi), i = 1, . . . , N , be independent of each other and
α, βi > 0. Each Xi is also independent of N , which is Poisson distributed with
mean µ, so that the probability mass function for N is given by:
p_N(n) = \frac{μ^n e^{-μ}}{n!}, for n = 0, 1, \ldots.
Consider the random variable:
W = \sum_{i=1}^N β_i X_i
with the convention that W = 0 if N = 0.
i. Derive the moment generating function of W .
(8 marks)
ii. Find the mean of W . You can use the mean of a Poisson random variable
without proof. The mean of X ∼ Gamma(α, β) is α/β.
(5 marks)
This question was not answered as well as it should have been. Parts (a) and (b) i. are both
standard exercises. For part (a), see Proposition 4.7.3 for finding the moment generating function
of two independent random variables. Identifying the form of the answer is then the key to
knowing that the sum is still a Gamma distribution. Many candidates did not realise:
M_{β_1X}(t) = E(e^{tβ_1X}) = M_X(β_1t) = (1 - t)^{-α_1}
and so could not get the correct answer in the end, because they did not eliminate β_1 from the
moment generating function. We also see that β_1X ∼ Gamma(α_1, 1).
For part (b) i., the derivation of the moment generating function of a random sum is covered in
the subject guide in Section 5.6. See Lemma 5.6.2 iii. and Proposition 5.6.3 iii. on page 165 of
the subject guide. Indeed many candidates realised this which is good, and scored decent marks
already even though they could not work out the final answer in the end.
For part (b) ii., you can use Proposition 5.6.3 i. since the βiXis are all i.i.d. Gamma(α, 1) from
the calculations in part (a) and so have a common mean α, or differentiate the moment
generating function you obtained in part (b) i. Of course, the former will give you the answer
much quicker!
Approaching the question
(a) First, let W = β_1X + β_2Y. Since M_X(t) = (β/(β - t))^α for X ∼ Gamma(α, β), then:
M_W(t) = E(e^{tW}) = E(e^{tβ_1X}) \, E(e^{tβ_2Y}) = \left( \frac{β_1}{β_1 - β_1t} \right)^{α_1} \left( \frac{β_2}{β_2 - β_2t} \right)^{α_2} = \left( \frac{1}{1 - t} \right)^{α_1 + α_2}
for t < 1. This shows that, by the one-to-one correspondence between distribution and
moment generating function, β_1X + β_2Y has a Gamma(α_1 + α_2, 1) distribution.
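The conclusion can be checked by simulation; this is a hypothetical check (not part of the examination answer), with the parameters, seed and sample size chosen arbitrarily. A Gamma(α_1 + α_2, 1) variable has mean and variance both equal to α_1 + α_2:

```python
import random

# A hypothetical simulation check of part (a): β1 X + β2 Y should have
# mean and variance both equal to α1 + α2, as Gamma(α1 + α2, 1) does.
random.seed(2)
a1, b1, a2, b2 = 2.0, 3.0, 1.5, 0.5
n = 100_000
# gammavariate takes (shape, scale); rate β corresponds to scale 1/β
w = [b1 * random.gammavariate(a1, 1 / b1) + b2 * random.gammavariate(a2, 1 / b2)
     for _ in range(n)]
mean = sum(w) / n
var = sum((t - mean) ** 2 for t in w) / n
assert abs(mean - (a1 + a2)) < 0.05
assert abs(var - (a1 + a2)) < 0.15
```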
(b) i. We have:
M_W(t) = E(e^{tW}) = E\left( E\left( \exp\left( t \sum_{i=1}^N β_i X_i \right) \Big| N \right) \right).
However, given N, we have:
E\left( \exp\left( t \sum_{i=1}^N β_i X_i \right) \right) = E\left( \prod_{i=1}^N \exp(tβ_i X_i) \right) = \prod_{i=1}^N E(e^{tβ_i X_i}) = \left( \frac{1}{1 - t} \right)^{αN}, for t < 1.
At the same time:
M_N(s) = E(e^{sN}) = \sum_{n=0}^∞ \frac{(μe^s)^n e^{-μ}}{n!} = e^{μ(e^s - 1)}, for s ∈ R.
Hence:
M_W(t) = E\left( \exp\left( N \log\left( \frac{1}{1 - t} \right)^α \right) \right) = \exp\left( μ \left( \frac{1}{1 - t} \right)^α - μ \right), for t < 1.
ii. We have:
E(W) = E(E(W | N)) = E\left( \sum_{i=1}^N β_i E(X_i) \right) = E(Nα) = μα
since β_i E(X_i) = β_i \times α/β_i = α for each i.
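The value E(W) = μα can be checked by simulating the random sum; this is a hypothetical check (not part of the examination answer). The `poisson()` helper and the uniformly drawn β_i values are illustrative choices, since E(W) = μα holds for any fixed rates β_i:

```python
import random
from math import exp

# A hypothetical simulation of W = Σ_{i=1}^N β_i X_i with N ~ Poisson(μ)
# and X_i ~ Gamma(α, β_i): the sample mean should be close to μα.
random.seed(3)
mu, alpha, n = 2.0, 1.5, 100_000

def poisson(m):
    """Inverse-transform sampling of a Poisson(m) variate."""
    u, k, p, c = random.random(), 0, exp(-m), exp(-m)
    while u > c:
        k += 1
        p *= m / k
        c += p
    return k

def draw_w():
    total = 0.0
    for _ in range(poisson(mu)):
        beta_i = random.uniform(0.5, 2.0)   # arbitrary rate for this term
        # gammavariate takes (shape, scale); rate β_i means scale 1/β_i
        total += beta_i * random.gammavariate(alpha, 1 / beta_i)
    return total

mean = sum(draw_w() for _ in range(n)) / n
assert abs(mean - mu * alpha) < 0.1
```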
Question 4
Suppose we have a biased coin, which comes up heads with probability u. An
experiment is carried out so that X is the number of independent flips of the coin
required for r heads to show up, where r ≥ 1 is known.
(a) Show that the probability mass function for X is:
p_X(x) =
  \binom{x-1}{r-1} u^r (1 - u)^{x-r} for x = r, r + 1, \ldots
  0 otherwise.
(5 marks)
(b) Suppose U is random and has a density given by:
f_U(u) =
  \frac{Γ(α+β)}{Γ(α)Γ(β)} u^{α-1} (1 - u)^{β-1} for 0 < u < 1
  0 otherwise
where α, β > 0, and Γ(α) is defined in Question 3, which has the property that
Γ(α) = (α - 1)Γ(α - 1) for α ≥ 1, and Γ(k) = (k - 1)! for a positive integer k.
The distribution in part (a) thus becomes:
p_{X|U}(x | u) =
  \binom{x-1}{r-1} u^r (1 - u)^{x-r} for x = r, r + 1, \ldots
  0 otherwise.
i. Find the marginal probability mass function of X if α = β = 2.
(6 marks)
ii. With α = β = 2 still, show that the density of U | X = x is given by:
f_{U|X}(u | x) =
  \frac{(x+3)!}{(r+1)! \, (x-r+1)!} u^{r+1} (1 - u)^{x-r+1} for 0 < u < 1
  0 otherwise.
Hence find the mean of U |X = x.
(5 marks)
(c) Another independent experiment is carried out, with Y denoting the number of
independent flips of the coin required for r heads to show up (the same r as for
the first experiment).
State (no need for a derivation) the density of U | (X,Y ) = (x, y) and its mean,
where U still has the density in part (b) with α = β = 2.
(4 marks)
This question was not well-answered in general. Part (a) needs you to explain why the probability
mass function is as stated. Many candidates stated that this is a negative binomial distribution
and hence the density is as given, which is not a proof or an explanation at all. See Example
3.3.10 on page 56 of the subject guide for the justification of its probability mass function.
Part (b) i. needs you to work out the joint density function of X and U, then integrate out U to
obtain the marginal probability mass function of X. Some candidates realised that u should be
integrated out in p_{X|U}(x | u) f_U(u), but were careless in their calculations. Even if you
could not do (b) i., you should still be able to do (b) ii. using the answer given to you in
(b) i. To find the mean, you need to find a general formula for
\int_0^1 u^{α-1} (1 - u)^{β-1} \, du by realising that \int_0^1 f_U(u) \, du = 1 in the
probability density function given in part (b).
Part (c) was answered worst, as it was intended to be the most difficult part. You need to realise
that X and Y are independent experiments which can be seen as one, so that x is replaced by x + y
and r is replaced by 2r in the answer in (b) ii.
Approaching the question
(a) To wait for r heads to show up, suppose x flips are required. Therefore, the last flip must be
a head, with r - 1 heads randomly appearing in the first x - 1 flips. In each particular
combination of heads and tails, there must be r heads by definition of the experiment, as
well as x - r tails (adding up to x flips in total), with probability:
u^r (1 - u)^{x-r}.
Hence we have:
p_X(x) = \binom{x-1}{r-1} u^r (1 - u)^{x-r}, for x = r, r + 1, \ldots.
(b) i. The joint probability density for X, U is:
f_{X,U}(x, u) = \binom{x-1}{r-1} \frac{Γ(α+β)}{Γ(α) Γ(β)} u^{r+α-1} (1 - u)^{x-r+β-1}, for 0 < u < 1, x = r, r + 1, \ldots.
Therefore, the marginal probability mass function of X is:
p_X(x) = \int_0^1 \binom{x-1}{r-1} \frac{Γ(α+β)}{Γ(α) Γ(β)} u^{r+α-1} (1 - u)^{x-r+β-1} \, du
= \binom{x-1}{r-1} \frac{Γ(α+β)}{Γ(α) Γ(β)} \int_0^1 u^{r+α-1} (1 - u)^{x-r+β-1} \, du
= \binom{x-1}{r-1} \frac{Γ(α+β)}{Γ(α) Γ(β)} \frac{Γ(r+α) Γ(x-r+β)}{Γ(x+α+β)}
= \frac{6r(r+1)(x-r+1)}{x(x+1)(x+2)(x+3)}, for x = r, r + 1, \ldots
where the last equality uses α = β = 2.
ii. We have:
f_{U|X}(u | x) = \frac{f_{X,U}(x, u)}{p_X(x)} = \frac{u^{r+α-1} (1 - u)^{x-r+β-1}}{\int_0^1 u^{r+α-1} (1 - u)^{x-r+β-1} \, du} = \frac{Γ(x+4)}{Γ(r+2) Γ(x-r+2)} u^{r+1} (1 - u)^{x-r+1}, for 0 < u < 1.
The mean is:
E(U | X = x) = \frac{(x+3)!}{(r+1)! \, (x-r+1)!} \int_0^1 u^{r+2} (1 - u)^{x-r+1} \, du = \frac{(x+3)!}{(r+1)! \, (x-r+1)!} \times \frac{(r+2)! \, (x-r+1)!}{(x+4)!} = \frac{r+2}{x+4}.
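Both results of part (b) can be verified exactly; this is a hypothetical check (not part of the examination answer), with r and x chosen arbitrarily:

```python
from fractions import Fraction
from math import comb, factorial

# A hypothetical exact check of part (b) with α = β = 2: the marginal pmf
# and the posterior mean, via ∫_0^1 u^a (1-u)^b du = a! b! / (a+b+1)!.
def beta_int(a, b):
    return Fraction(factorial(a) * factorial(b), factorial(a + b + 1))

r, x = 2, 6                                  # arbitrary illustrative values
prior_const = 6                              # Γ(4)/(Γ(2)Γ(2)) when α = β = 2
p_x = comb(x - 1, r - 1) * prior_const * beta_int(r + 1, x - r + 1)
post_mean = beta_int(r + 2, x - r + 1) / beta_int(r + 1, x - r + 1)

assert p_x == Fraction(6 * r * (r + 1) * (x - r + 1),
                       x * (x + 1) * (x + 2) * (x + 3))
assert post_mean == Fraction(r + 2, x + 4)
```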
(c) Mathematically, we have:
p_{X,Y|U}(x, y | u) = p_{X|U}(x | u) \, p_{Y|U}(y | u)
so that:
p_{U|X,Y}(u | x, y) = \frac{p_{X|U}(x | u) \, p_{Y|U}(y | u) \, f_U(u)}{\int_0^1 p_{X|U}(x | u) \, p_{Y|U}(y | u) \, f_U(u) \, du} = \frac{u^{2r+1} (1 - u)^{x+y-2r+1}}{\int_0^1 u^{2r+1} (1 - u)^{x+y-2r+1} \, du} = \frac{(x+y+3)!}{(2r+1)! \, (x+y-2r+1)!} u^{2r+1} (1 - u)^{x+y-2r+1}, for 0 < u < 1
which parallels the answer in part (b) ii. The mean is:
\frac{2r+2}{x+y+4}
which again parallels the answer in (b) ii.
To see these two answers more quickly, note that X and Y can be seen as one experiment,
waiting for 2r heads to show up. So we need x + y flips for 2r heads to come up, and hence
we can replace x by x + y and r by 2r directly in the answers in (b) ii.