STAT7055
Topic 3
Discrete Random Variables
STAT7055 - Topic 3 1 / 63
Random Variables
Random Variable
I Suppose we flip a fair coin three times.
I The sample space is:
S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}
I Each outcome is equally likely to occur.
I Define a new quantity (call it X) which is equal to
the number of heads that occur in the three coin
flips.
I X can take the value 0, 1, 2 or 3.
I The actual value that X takes is random and
depends on the outcome of the experiment.
I X is what we call a random variable.
I Formally, a random variable is a function that
assigns a numeric value to each simple event in a
sample space.
Notation
I Denote random variables using uppercase letters,
e.g., X, Y, Z.
I Denote the actual observed or realised value of the
random variable by lowercase letters, e.g., x, y, z.
I Back to coin flipping example:
I X is the random variable that can take values 0, 1, 2 or
3.
I If we actually perform the experiment and observe the
outcome HHT, then the realised value of X is x = 2.
Discrete Random Variable
I A discrete random variable is one that can take
on a countable number of possible values.
I For example:
I Flip a coin five times and let X be the number of heads
that occur. The possible values are
X = 0, 1, 2, 3, 4 or 5.
I Flip a coin until it comes up tails and let X be the total
number of flips needed. The possible values are
X = 1, 2, 3, 4, 5, 6, 7, . . .
Continuous Random Variable
I A continuous random variable is one that can
take on an uncountable number of possible values -
the number of possible values is infinite as a result
of continuous variation.
I For example:
I Let X be the time taken to finish a three hour exam.
I Let X be the weight of a boxer.
Discrete Random Variables Discrete Probability Distributions
Discrete Probability Distribution
I For a discrete random variable X, how can we
determine P(X = x) for any given value of x?
I The probability that a discrete random variable X
takes the value x is denoted by p(x) and is equal to
the sum of all the probabilities of the simple events
for which X = x.
I A discrete probability distribution is a table or
formula listing all possible values that a discrete
random variable can take, together with the
corresponding probability for each value.
I A discrete probability distribution must satisfy two
requirements:
1. $0 \le p(x) \le 1$ for all $x$.
2. $\sum_{\text{all } x} p(x) = 1$.
Example
I Flip a coin three times, let X be the number of
heads.
$$
\begin{aligned}
p(0) &= P(X = 0) = P(\{TTT\}) = \frac{1}{8} \\
p(1) &= P(X = 1) = P(\{HTT, THT, TTH\}) = \frac{3}{8} \\
p(2) &= P(X = 2) = P(\{HHT, HTH, THH\}) = \frac{3}{8} \\
p(3) &= P(X = 3) = P(\{HHH\}) = \frac{1}{8}
\end{aligned}
$$
x       0     1     2     3
p(x)    1/8   3/8   3/8   1/8
I What is the probability of at most one head?
$$P(X \le 1) = p(0) + p(1) = \frac{1}{8} + \frac{3}{8} = \frac{1}{2}$$
I What is the probability of at least one head?
$$
\begin{aligned}
P(X \ge 1) &= p(1) + p(2) + p(3) = \frac{3}{8} + \frac{3}{8} + \frac{1}{8} = \frac{7}{8} \\
&= 1 - p(0) = 1 - \frac{1}{8} = \frac{7}{8}
\end{aligned}
$$
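The distribution above can be reproduced by enumerating the sample space directly. The following is a minimal Python sketch, not part of the original slides; the names `outcomes` and `pmf` are illustrative. It adds up the probabilities of the simple events for which X = x, exactly as in the definition of p(x).

```python
from fractions import Fraction
from itertools import product

# Enumerate the 8 equally likely outcomes of three fair coin flips.
outcomes = ["".join(flips) for flips in product("HT", repeat=3)]

# p(x) is the sum of the probabilities of the outcomes for which X = x.
pmf = {}
for outcome in outcomes:
    x = outcome.count("H")  # X = number of heads
    pmf[x] = pmf.get(x, Fraction(0)) + Fraction(1, len(outcomes))

# P(X <= 1) by direct summation, P(X >= 1) by the complement rule.
p_at_most_one = sum(p for x, p in pmf.items() if x <= 1)
p_at_least_one = 1 - pmf[0]

print(pmf)
print(p_at_most_one, p_at_least_one)
```

Using `Fraction` keeps the arithmetic exact, so the results can be compared with the hand calculation without rounding.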
Probability Distributions and Populations
I Probability distributions represent populations.
I Rather than recording every observation in the
population, a probability distribution summarises the
population by listing only the possible values that
appear in the population, together with their
corresponding probabilities.
I We can calculate population parameters such as the
population mean and population variance from a
probability distribution.
Discrete Random Variables Expected Value
Expected Value
I Let X be a discrete random variable with probability
distribution p(x). The expected value (or
population mean) of X is defined to be:
$$\mu = E(X) = \sum_{\text{all } x} \left( x \times p(x) \right)$$
I Compare this to the formula for the population
mean given in topic 1:
$$\mu = \frac{1}{N} \sum_{i=1}^{N} x_i = \sum_{i=1}^{N} \left( x_i \times \frac{1}{N} \right)$$
I It is straightforward to calculate the expected value
of any function of a discrete random variable X.
I Let g(X) be some function of X. Then the
expected value of g(X) is defined to be:
$$E(g(X)) = \sum_{\text{all } x} \left( g(x) \times p(x) \right)$$
Example
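x       0     1     2     3
p(x)    1/8   3/8   3/8   1/8
$$
\begin{aligned}
E(X) &= \sum_{\text{all } x} \left( x \times p(x) \right) \\
&= 0 \times \frac{1}{8} + 1 \times \frac{3}{8} + 2 \times \frac{3}{8} + 3 \times \frac{1}{8} \\
&= \frac{12}{8} = 1.5
\end{aligned}
$$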
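x       0     1     2     3
p(x)    1/8   3/8   3/8   1/8
$$
\begin{aligned}
E(X^2) &= \sum_{\text{all } x} \left( x^2 \times p(x) \right) \\
&= 0^2 \times \frac{1}{8} + 1^2 \times \frac{3}{8} + 2^2 \times \frac{3}{8} + 3^2 \times \frac{1}{8} \\
&= \frac{24}{8} = 3
\end{aligned}
$$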
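Both calculations are instances of the same sum, E(g(X)) = sum of g(x) × p(x). A small Python sketch (the helper name `expect` is an illustrative choice, not from the slides) makes that explicit:

```python
from fractions import Fraction

# Distribution of X = number of heads in three fair coin flips.
pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

def expect(pmf, g=lambda x: x):
    """Return E(g(X)), the sum over all x of g(x) * p(x)."""
    return sum(g(x) * p for x, p in pmf.items())

mean = expect(pmf)                             # E(X)
second_moment = expect(pmf, lambda x: x ** 2)  # E(X^2)

print(mean, second_moment)
```

Passing a different `g` gives the expected value of any other function of X without changing the helper.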
Laws of Expected Value
I If X and Y are random variables (discrete or
continuous) and c is any constant, then:
1. $E(c) = c$
2. $E(cX) = cE(X)$
3. $E(X + Y) = E(X) + E(Y)$
4. $E(X - Y) = E(X) - E(Y)$
I And if X and Y are independent, then:
5. $E(XY) = E(X) \times E(Y)$
Laws of Expected Value - Example
I Let $Z = 3X + 2Y - 2XY + 3$ with $E(X) = 3$,
$E(Y) = 5$ and X and Y independent. Then:
$$
\begin{aligned}
E(Z) &= E(3X + 2Y - 2XY + 3) \\
&= E(3X) + E(2Y) - E(2XY) + E(3) \\
&= 3E(X) + 2E(Y) - 2E(X)E(Y) + 3 \\
&= 3 \times 3 + 2 \times 5 - 2 \times 3 \times 5 + 3 \\
&= -8
\end{aligned}
$$
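The result can be checked numerically. The slides give only E(X) = 3 and E(Y) = 5, so the sketch below assumes two hypothetical distributions with those means (X uniform on {2, 4}, Y uniform on {4, 6}) and builds the joint distribution as a product, which is what independence means. It then computes E(Z) directly from the definition.

```python
from fractions import Fraction

# Hypothetical distributions chosen only so that E(X) = 3 and E(Y) = 5.
p_x = {2: Fraction(1, 2), 4: Fraction(1, 2)}
p_y = {4: Fraction(1, 2), 6: Fraction(1, 2)}

# Independence: the joint probability is the product of the marginals.
e_z = sum(
    (3 * x + 2 * y - 2 * x * y + 3) * px * py
    for x, px in p_x.items()
    for y, py in p_y.items()
)

print(e_z)
```

Any other independent pair with the same means should give the same value, since the laws only use E(X) and E(Y).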
Discrete Random Variables Variance
Variance
I Let X be a discrete random variable with probability
distribution p(x) and µ = E(X).
I The (population) variance of X is defined as:
$$\sigma^2 = V(X) = E\left( (X - \mu)^2 \right) = \sum_{\text{all } x} \left( (x - \mu)^2 \times p(x) \right)$$
I A shortcut formula for the variance is given below:
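$$
\begin{aligned}
V(X) &= E\left( X^2 \right) - \left( E(X) \right)^2 \\
&= \left( \sum_{\text{all } x} \left( x^2 \times p(x) \right) \right) - \mu^2
\end{aligned}
$$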
Example
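x       0     1     2     3
p(x)    1/8   3/8   3/8   1/8
$$
\begin{aligned}
V(X) &= \left( \sum_{\text{all } x} \left( x^2 \times p(x) \right) \right) - \mu^2 \\
&= \left( 0^2 \times \frac{1}{8} + 1^2 \times \frac{3}{8} + 2^2 \times \frac{3}{8} + 3^2 \times \frac{1}{8} \right) - 1.5^2 \\
&= 0.75
\end{aligned}
$$
$$\sigma = SD(X) = \sqrt{V(X)} = \sqrt{0.75} \approx 0.866$$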
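The definition and the shortcut formula should agree. A short Python sketch (variable names are illustrative) computes the variance both ways for the coin-flipping distribution:

```python
import math
from fractions import Fraction

pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

mu = sum(x * p for x, p in pmf.items())

# Definition: V(X) = E((X - mu)^2)
var_definition = sum((x - mu) ** 2 * p for x, p in pmf.items())

# Shortcut: V(X) = E(X^2) - (E(X))^2
var_shortcut = sum(x ** 2 * p for x, p in pmf.items()) - mu ** 2

sd = math.sqrt(var_definition)

print(var_definition, var_shortcut, round(sd, 3))
```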
Laws of Variance
I If X and Y are random variables (discrete or
continuous) and c is any constant, then:
1. $V(c) = 0$
2. $V(cX) = c^2 V(X)$
3. $V(X + c) = V(X)$
I And if X and Y are independent, then:
4. $V(X + Y) = V(X) + V(Y)$
5. $V(X - Y) = V(X) + V(Y)$
Laws of Variance - Example
I Let $Z = 3X - 2Y - 7$ with $V(X) = 2$, $V(Y) = 1$
and X and Y independent. Then:
$$
\begin{aligned}
V(Z) &= V(3X - 2Y - 7) \\
&= V(3X - 2Y) \\
&= V(3X) + V(2Y) \\
&= 9V(X) + 4V(Y) \\
&= 9 \times 2 + 4 \times 1 \\
&= 22
\end{aligned}
$$
Discrete Random Variables Bivariate Distributions
Bivariate Distribution
I If X and Y are discrete random variables, then the
bivariate distribution of X and Y is a table or
formula that lists the joint probabilities
$P(\{X = x\} \cap \{Y = y\})$, denoted $p(x, y)$, for all
pairs of x and y.
I A bivariate distribution must satisfy two
requirements:
1. $0 \le p(x, y) \le 1$ for all $x$ and $y$.
2. $\sum_{\text{all } x} \sum_{\text{all } y} p(x, y) = 1$.
Example
I Flip a coin three times.
I Let X be the number of heads.
I Let Y be the number of sequence changes within
the three flips, i.e., the number of times we change
from H ⇒ T or T ⇒ H.
I For example:
I HHH: x = 3 (3 heads) and y = 0 (0 sequence changes
since H ⇒ H ⇒ H).
I HHT: x = 2 (2 heads) and y = 1 (1 sequence change
since H ⇒ H ⇒ T).
I HTH: x = 2 (2 heads) and y = 2 (2 sequence changes
since H ⇒ T ⇒ H).
Outcome x y
HHH 3 0
HHT 2 1
HTH 2 2
THH 2 1
TTH 1 1
THT 1 2
HTT 1 1
TTT 0 0
                  y
            0      1      2     pX(x)
x     0    1/8     0      0     1/8
      1     0     2/8    1/8    3/8
      2     0     2/8    1/8    3/8
      3    1/8     0      0     1/8
pY(y)      2/8    4/8    2/8     1
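The joint table can be built mechanically from the list of outcomes. This is a minimal Python sketch under the same setup (three fair flips, each outcome with probability 1/8); the dictionary name `joint` is illustrative. Pairs that never occur are simply absent from the dictionary, which corresponds to the zeros in the table.

```python
from fractions import Fraction
from itertools import product

joint = {}
for flips in product("HT", repeat=3):
    x = flips.count("H")                                # number of heads
    y = sum(a != b for a, b in zip(flips, flips[1:]))   # number of sequence changes
    joint[(x, y)] = joint.get((x, y), Fraction(0)) + Fraction(1, 8)

for (x, y), p in sorted(joint.items()):
    print(f"p({x}, {y}) = {p}")
```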
Discrete Random Variables Marginal Probability Distributions
Marginal Probability Distribution
I Just like we did last topic, we can calculate marginal
probabilities for X and Y by adding across the rows
and down the columns, respectively.
I Specifically, given p(x, y) (the bivariate distribution
of X and Y), the marginal probability
distribution of X is:
$$p_X(x) = P(X = x) = \sum_{\text{all } y} p(x, y)$$
I So for our example,
$$p_X(1) = P(X = 1) = p(1, 0) + p(1, 1) + p(1, 2) = \frac{3}{8}$$
I Considering Y for the moment, notice that the
events {Y = 0}, {Y = 1} and {Y = 2} are a
partition of the sample space!
I So, calculating a marginal probability distribution is
just a direct consequence of the Law of Total
Probability.
I The marginal distribution of X is:
x        0     1     2     3
pX(x)    1/8   3/8   3/8   1/8
I The marginal distribution of Y is:
y        0     1     2
pY(y)    2/8   4/8   2/8
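Summing the joint probabilities over the other variable is easy to express in code. In this Python sketch the joint table from the coin-flipping example is typed in by hand, and the helper `marginal` is a hypothetical name introduced for illustration:

```python
from fractions import Fraction

# Joint distribution p(x, y); pairs with probability 0 are omitted.
joint = {
    (0, 0): Fraction(1, 8),
    (1, 1): Fraction(2, 8), (1, 2): Fraction(1, 8),
    (2, 1): Fraction(2, 8), (2, 2): Fraction(1, 8),
    (3, 0): Fraction(1, 8),
}

def marginal(joint, axis):
    """Sum p(x, y) over the other variable (axis 0 gives X, axis 1 gives Y)."""
    result = {}
    for pair, p in joint.items():
        value = pair[axis]
        result[value] = result.get(value, Fraction(0)) + p
    return result

p_x = marginal(joint, 0)
p_y = marginal(joint, 1)

print(p_x)
print(p_y)
```

`Fraction` reduces automatically, so 2/8 and 4/8 print as 1/4 and 1/2.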
Discrete Random Variables Independence
Independence of Random Variables
I Two discrete random variables, X and Y, are
independent if and only if
$$p(x, y) = p_X(x) \times p_Y(y)$$
for all x and y.
I Note that this has to be true for all x and y. If
there is just one pair of x and y for which the above
is not true, then X and Y are not independent.
Example
I In our previous coin flipping example, X and Y are
clearly not independent since if we consider the pair
x = 0 and y = 0:
$$p(0, 0) = \frac{1}{8}$$
but
$$p_X(0) \times p_Y(0) = \frac{1}{8} \times \frac{2}{8} = \frac{1}{32}$$
I That is, we have found one pair for which
$$p(x, y) \ne p_X(x) \times p_Y(y)$$
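The "for all x and y" condition can be checked exhaustively. The Python sketch below, with illustrative names, lists every pair (including those with joint probability 0) where the product rule fails; X and Y are independent only if that list is empty.

```python
from fractions import Fraction

# Joint distribution from the coin-flipping example (zeros omitted).
joint = {
    (0, 0): Fraction(1, 8),
    (1, 1): Fraction(2, 8), (1, 2): Fraction(1, 8),
    (2, 1): Fraction(2, 8), (2, 2): Fraction(1, 8),
    (3, 0): Fraction(1, 8),
}

# Marginal distributions, obtained by summing over the other variable.
p_x, p_y = {}, {}
for (x, y), p in joint.items():
    p_x[x] = p_x.get(x, Fraction(0)) + p
    p_y[y] = p_y.get(y, Fraction(0)) + p

# Every (x, y) pair must satisfy p(x, y) = pX(x) * pY(y).
violations = [
    (x, y)
    for x in p_x
    for y in p_y
    if joint.get((x, y), Fraction(0)) != p_x[x] * p_y[y]
]
independent = not violations

print(independent, violations)
```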
Discrete Random Variables Functions of Random Variables
Sum of Two Random Variables
I Consider two real estate agents, Albert and Bob.
I Let X be the number of houses sold by Albert in a week.
I Let Y be the number of houses sold by Bob in a week.
                   x
            0       1       2      pY(y)
y     0    0.12    0.42    0.06    0.6
      1    0.21    0.06    0.03    0.3
      2    0.07    0.02    0.01    0.1
pX(x)      0.4     0.5     0.1     1
I From the marginal probability distributions of X and
Y, it is straightforward to calculate the following:
I E(X) = 0.7
I V(X) = 0.41
I E(Y) = 0.5
I V(Y) = 0.45
I Suppose we are interested in the total number of
houses Albert and Bob sell in a week.
I That is, we are interested in the quantity X + Y,
which itself is a random variable.
I From the bivariate distribution table, we know the
possible values of X + Y are 0, 1, 2, 3 or 4.
I Suppose we want to find the probability that a total
of two houses were sold in a week, i.e.,
P(X + Y = 2).
I From the table, we can find P(X + Y = 2) by
summing up all the joint probabilities for the values
of x and y which give x + y = 2.
I That is,
$$
\begin{aligned}
P(X + Y = 2) &= p(0, 2) + p(1, 1) + p(2, 0) \\
&= 0.07 + 0.06 + 0.06 \\
&= 0.19
\end{aligned}
$$
I We can repeat this for X + Y = 0, 1, 3 and 4 to
obtain the probability distribution for X + Y :
x + y       0      1      2      3      4
p(x + y)    0.12   0.63   0.19   0.05   0.01
I From this we can calculate the mean and variance
of X + Y:
I E(X + Y) = 1.2
I V(X + Y) = 0.56
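The same grouping can be done programmatically: add each joint probability to the bucket for its value of x + y. The Python sketch below uses the Albert and Bob table keyed as (x, y); the probabilities are parsed into `Fraction` so the sums are exact. The variable names are illustrative.

```python
from fractions import Fraction

# Joint distribution p(x, y): x = houses sold by Albert, y = houses sold by Bob.
table = {
    (0, 0): "0.12", (1, 0): "0.42", (2, 0): "0.06",
    (0, 1): "0.21", (1, 1): "0.06", (2, 1): "0.03",
    (0, 2): "0.07", (1, 2): "0.02", (2, 2): "0.01",
}
joint = {pair: Fraction(p) for pair, p in table.items()}

# Distribution of the sum: accumulate p(x, y) by the value of x + y.
pmf_sum = {}
for (x, y), p in joint.items():
    pmf_sum[x + y] = pmf_sum.get(x + y, Fraction(0)) + p

mean_sum = sum(s * p for s, p in pmf_sum.items())
var_sum = sum(s ** 2 * p for s, p in pmf_sum.items()) - mean_sum ** 2

print({s: float(p) for s, p in sorted(pmf_sum.items())})
print(float(mean_sum), float(var_sum))
```

Replacing `x + y` with another expression gives the distribution of any other function of X and Y.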
Functions of Two Random Variables
I Note that we could use the same approach to
calculate the probability distribution of any function
of two discrete random variables.
I For example:
I $g(X, Y) = XY$
I $g(X, Y) = \sqrt{XY^3}$
I $g(X, Y) = \frac{X}{Y + 1}$
I etc.
Expected Value
I If X and Y are two discrete random variables with
bivariate distribution p(x, y) and g(X, Y) is some
function of X and Y, the expected value of
g(X, Y) is given by:
$$E(g(X, Y)) = \sum_{\text{all } x} \sum_{\text{all } y} \left( g(x, y) \times p(x, y) \right)$$
Discrete Random Variables Covariance and Correlation
Covariance
I Let X and Y be discrete random variables with
joint probability distribution p(x, y).
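I If we denote $E(X) = \mu_X$ and $E(Y) = \mu_Y$, then the
(population) covariance between X and Y is:
$$
\begin{aligned}
\sigma_{XY} &= Cov(X, Y) \\
&= E\left( (X - \mu_X)(Y - \mu_Y) \right) \\
&= \sum_{\text{all } x} \sum_{\text{all } y} \left( (x - \mu_X)(y - \mu_Y) \times p(x, y) \right)
\end{aligned}
$$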
I Just like with the variance, there is a shortcut
formula for calculating the covariance:
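$$
\begin{aligned}
Cov(X, Y) &= E(XY) - E(X)E(Y) \\
&= \left( \sum_{\text{all } x} \sum_{\text{all } y} \left( xy \times p(x, y) \right) \right) - \mu_X \mu_Y
\end{aligned}
$$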
Correlation Coefficient
I The (population) correlation coefficient is
defined in exactly the same way as before:
$$\rho_{XY} = \frac{\sigma_{XY}}{\sigma_X \sigma_Y}$$
I Remember that the correlation always lies between
−1 and 1, i.e., $-1 \le \rho_{XY} \le 1$.
Example
I Flip a coin three times.
I X is the number of heads, Y is the number of
sequence changes.
I We know that (check for yourself):
I $\mu_X = \frac{3}{2}$
I $\sigma_X^2 = \frac{3}{4}$
I $\mu_Y = 1$
I $\sigma_Y^2 = \frac{1}{2}$
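$$
\begin{aligned}
Cov(X, Y) &= \left( \sum_{\text{all } x} \sum_{\text{all } y} \left( xy \times p(x, y) \right) \right) - \mu_X \mu_Y \\
&= \left( 0 \times 0 \times \frac{1}{8} + 1 \times 1 \times \frac{2}{8} + 1 \times 2 \times \frac{1}{8} \right. \\
&\qquad \left. + \, 2 \times 1 \times \frac{2}{8} + 2 \times 2 \times \frac{1}{8} + 3 \times 0 \times \frac{1}{8} \right) - \frac{3}{2} \times 1 \\
&= 0
\end{aligned}
$$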
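The covariance calculation follows directly from the formula for E(g(X, Y)). Here is a short Python sketch for the coin-flipping example; the helper `expect` is a hypothetical name, and pairs with probability 0 are left out because they contribute nothing to the sums.

```python
from fractions import Fraction

# Joint distribution of X (heads) and Y (sequence changes), zeros omitted.
joint = {
    (0, 0): Fraction(1, 8),
    (1, 1): Fraction(2, 8), (1, 2): Fraction(1, 8),
    (2, 1): Fraction(2, 8), (2, 2): Fraction(1, 8),
    (3, 0): Fraction(1, 8),
}

def expect(joint, g):
    """Return E(g(X, Y)), the double sum of g(x, y) * p(x, y)."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

mu_x = expect(joint, lambda x, y: x)
mu_y = expect(joint, lambda x, y: y)

# Shortcut formula: Cov(X, Y) = E(XY) - E(X)E(Y)
cov = expect(joint, lambda x, y: x * y) - mu_x * mu_y

print(mu_x, mu_y, cov)
```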
Independence and Being Uncorrelated
I In the coin flipping example Cov(X, Y) = 0, which implies ρXY = 0, so X and Y are uncorrelated.
I But remember we showed previously that X and Y
were not independent!
I Independence and being uncorrelated are not the
same thing.
I In fact, independence is a stronger condition than
being uncorrelated.
I Specifically, independence always implies a
correlation of zero, whereas being uncorrelated does
not always imply independence.
Discrete Random Variables Linear Combinations
Linear Combination of Random Variables
I The quantity $Z = aX + bY$, where a and b are
constants, is called a linear combination of the
random variables X and Y.
I It can be shown that:
$$
\begin{aligned}
E(aX + bY) &= aE(X) + bE(Y) \\
V(aX + bY) &= a^2 V(X) + b^2 V(Y) + 2ab \times Cov(X, Y) \\
&= a^2 \sigma_X^2 + b^2 \sigma_Y^2 + 2ab \rho_{XY} \sigma_X \sigma_Y
\end{aligned}
$$
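As a sanity check of the variance formula, the Python sketch below compares it with a direct calculation on the Albert and Bob house-sales table. The function names `var_direct` and `var_formula` are illustrative, and the second pair of constants (a = 3, b = −2) is an arbitrary choice for the check.

```python
from fractions import Fraction

# Joint distribution p(x, y) for the house-sales example, keyed as (x, y).
table = {
    (0, 0): "0.12", (1, 0): "0.42", (2, 0): "0.06",
    (0, 1): "0.21", (1, 1): "0.06", (2, 1): "0.03",
    (0, 2): "0.07", (1, 2): "0.02", (2, 2): "0.01",
}
joint = {pair: Fraction(p) for pair, p in table.items()}

def expect(g):
    return sum(g(x, y) * p for (x, y), p in joint.items())

mu_x, mu_y = expect(lambda x, y: x), expect(lambda x, y: y)
var_x = expect(lambda x, y: x ** 2) - mu_x ** 2
var_y = expect(lambda x, y: y ** 2) - mu_y ** 2
cov_xy = expect(lambda x, y: x * y) - mu_x * mu_y

def var_direct(a, b):
    """V(aX + bY) computed from the joint distribution."""
    mean = expect(lambda x, y: a * x + b * y)
    return expect(lambda x, y: (a * x + b * y) ** 2) - mean ** 2

def var_formula(a, b):
    """V(aX + bY) = a^2 V(X) + b^2 V(Y) + 2ab Cov(X, Y)."""
    return a ** 2 * var_x + b ** 2 * var_y + 2 * a * b * cov_xy

print(float(cov_xy), float(var_direct(1, 1)), float(var_formula(1, 1)))
```

With a = b = 1 this gives V(X + Y) = 0.41 + 0.45 + 2 × (−0.15) = 0.56, the value obtained earlier from the distribution of X + Y.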
Portfolio Diversification
I In finance, variance or standard deviation is often
used to assess the risk of an investment.
I Analysts reduce risk by diversifying their
investments - that is, combining investments where
the correlation is small.
I An investor forms a portfolio by putting 25% of his
money in stock A and 75% in stock B, with
population parameters given below.
           Expected Value    Standard Deviation
           of Return         of Return
Stock A    8%                12%
Stock B    15%               22%
Expected Portfolio Return
I Let $R_A$ and $R_B$ denote the returns of stocks A and
B, respectively.
I If we let $R_P$ denote the return of the portfolio, then
we can write:
$$R_P = 0.25 R_A + 0.75 R_B$$
I We are given that $E(R_A) = 8$ and $E(R_B) = 15$ (both in percent).
I Therefore, the expected value of $R_P$ is:
$$
\begin{aligned}
E(R_P) &= E(0.25 R_A + 0.75 R_B) \\
&= 0.25 \times E(R_A) + 0.75 \times E(R_B) \\
&= 0.25 \times 8 + 0.75 \times 15 \\
&= 13.25
\end{aligned}
$$
I That is, the expected portfolio return is 13.25%.
Variance of Portfolio Return
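I Calculate the variance when the two stock returns
are perfectly positively correlated, i.e., $\rho_{AB} = 1$.
$$
\begin{aligned}
V(R_P) &= 0.25^2 \sigma_A^2 + 0.75^2 \sigma_B^2 + 2 \times 0.25 \times 0.75 \times \rho_{AB} \sigma_A \sigma_B \\
&= 0.25^2 \times 12^2 + 0.75^2 \times 22^2 + 2 \times 0.25 \times 0.75 \times \rho_{AB} \times 12 \times 22 \\
&= 281.25 + 99 \times \rho_{AB} \\
&= 281.25 + 99 \times 1 \\
&= 380.25 \ \%^2
\end{aligned}
$$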
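I Calculate the variance when the two stock returns
are uncorrelated, i.e., $\rho_{AB} = 0$.
$$
\begin{aligned}
V(R_P) &= 0.25^2 \sigma_A^2 + 0.75^2 \sigma_B^2 + 2 \times 0.25 \times 0.75 \times \rho_{AB} \sigma_A \sigma_B \\
&= 0.25^2 \times 12^2 + 0.75^2 \times 22^2 + 2 \times 0.25 \times 0.75 \times \rho_{AB} \times 12 \times 22 \\
&= 281.25 + 99 \times \rho_{AB} \\
&= 281.25 + 99 \times 0 \\
&= 281.25 \ \%^2
\end{aligned}
$$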
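Because only the correlation changes between the two cases, the calculation is naturally a function of ρ. The Python sketch below is illustrative only; the function name `portfolio_variance` and its parameter names are assumptions, with weights as fractions and standard deviations in percent.

```python
import math

def portfolio_variance(w_a, w_b, sd_a, sd_b, rho):
    """V(w_a*R_A + w_b*R_B) from the linear combination formula."""
    return (w_a ** 2 * sd_a ** 2
            + w_b ** 2 * sd_b ** 2
            + 2 * w_a * w_b * rho * sd_a * sd_b)

# 25% in stock A (SD 12%), 75% in stock B (SD 22%).
for rho in (1, 0):
    variance = portfolio_variance(0.25, 0.75, 12, 22, rho)
    print(rho, variance, round(math.sqrt(variance), 2))
```

The standard deviation is printed alongside the variance because it is in the same units (percent) as the returns, which makes the two cases easier to compare.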
Discrete Random Variables Binomial Distribution
Bernoulli Trial
I A Bernoulli trial is a random experiment that has
the following special properties:
I On each trial there are only two possible outcomes,
which we call success and failure.
I On any given trial, the probability of a success is p and
the probability of a failure is 1− p.
I The trials are independent - that is, the result of one
trial does not affect the result of any other trial.
Binomial Distribution
I If a fixed number, n, of Bernoulli trials are
performed, the random variable representing the
number of successes in the n trials is called a
binomial random variable and its probability
distribution is called the binomial distribution.
I If X denotes a binomial random variable, then we
use the notation X ∼ Bin(n, p), where p is the
probability of success on any given trial.
Some Examples
I Flip a coin ten times and let X be the number of
heads.
I X ∼ Bin(n = 10, p = 0.5).
I Pull a card from a deck, with replacement, eight
times and let X be the number of clubs.
I X ∼ Bin(n = 8, p = 0.25).
I Survey 1000 people and let X be the number of
people who think the current prime minister is doing
a good job.
I X ∼ Bin(n = 1000, p =?).
Binomial Probability Distribution
I If X ∼ Bin(n, p) then the possible values that X
can take are 0, 1, 2, 3, . . . , n.
I The binomial probability distribution is given by
the following formula:
$$P(X = x) = \frac{n!}{x!(n - x)!} \, p^x (1 - p)^{n - x}$$
Note that $n! = n \times (n - 1) \times (n - 2) \times \dots \times 2 \times 1$.
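The formula translates directly into code. This is a minimal Python sketch using the standard library's `math.comb` for n!/(x!(n − x)!); the function name `binom_pmf` is an illustrative choice. As a check, Bin(3, 0.5) should reproduce the distribution of the number of heads in three fair coin flips.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

# Number of heads in three fair coin flips: X ~ Bin(3, 0.5).
coin = [binom_pmf(x, 3, 0.5) for x in range(4)]
print(coin)
print(sum(coin))
```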
Expected Value and Variance
I We could use the usual formula to calculate the
expected value:
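$$
\begin{aligned}
E(X) &= \sum_{\text{all } x} \left( x \times p(x) \right) \\
&= \sum_{x=0}^{n} \left( x \times \frac{n!}{x!(n - x)!} \, p^x (1 - p)^{n - x} \right) \\
&= \dots
\end{aligned}
$$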
I And similarly for the variance.
I But we don’t really want to.
I Instead, let’s define a new random variable for each
Bernoulli trial as follows:
$$
X_i =
\begin{cases}
1 & \text{if trial } i \text{ is a success} \\
0 & \text{if trial } i \text{ is a failure}
\end{cases}
$$
I Each $X_i$ is called a Bernoulli or indicator variable.
I We know that the $X_i$ are independent and we also
know that
$$X = \sum_{i=1}^{n} X_i$$
I Using the laws of expected value and variance:
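I For each indicator variable, $E(X_i) = 1 \times p + 0 \times (1 - p) = p$ and
$V(X_i) = E(X_i^2) - (E(X_i))^2 = p - p^2 = p(1 - p)$. Therefore:
$$
\begin{aligned}
E(X) &= E\left( \sum_{i=1}^{n} X_i \right) = \sum_{i=1}^{n} E(X_i) = \sum_{i=1}^{n} p = np \\
V(X) &= V\left( \sum_{i=1}^{n} X_i \right) = \sum_{i=1}^{n} V(X_i) = \sum_{i=1}^{n} p(1 - p) = np(1 - p)
\end{aligned}
$$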
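The closed forms can be compared with the "usual formula" numerically. The Python sketch below sums over the binomial distribution for one illustrative case, n = 10 and p = 0.5 (ten fair coin flips); the function and variable names are assumptions made for this example.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 10, 0.5

# Expected value and variance by brute-force summation over x = 0, ..., n.
mean = sum(x * binom_pmf(x, n, p) for x in range(n + 1))
second_moment = sum(x ** 2 * binom_pmf(x, n, p) for x in range(n + 1))
variance = second_moment - mean ** 2

print(mean, n * p)                 # summation vs np
print(variance, n * p * (1 - p))   # summation vs np(1 - p)
```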
Example
I A student sitting a statistics quiz decides to answer
each of the ten multiple choice questions entirely by
chance.
I Each question has five options, only one of which is
correct.
I Let X be the number of questions the student
answers correctly.
I Then X ∼ Bin(n = 10, p = 0.2).
I What is the probability the student gets half the
answers correct?
$$P(X = 5) = \frac{10!}{5!(10 - 5)!} \times 0.2^5 \times (1 - 0.2)^5 = 0.0264$$
I What is the probability that the student passes, i.e.,
gets five or more correct?
$$
\begin{aligned}
P(X \ge 5) &= P(X = 5) + P(X = 6) + P(X = 7) \\
&\quad + P(X = 8) + P(X = 9) + P(X = 10) \\
&= \text{a lot of calculations!}
\end{aligned}
$$
Binomial Tables
I There are tables available that list P(X ≤ k) for
different values of k, n and p.
I See Appendix B of the textbook.
I From tables, look up n = 10 and p = 0.2.
$$
\begin{aligned}
P(X \ge 5) &= 1 - P(X \le 4) \\
&= 1 - 0.9672 \quad \text{(from tables)} \\
&= 0.0328
\end{aligned}
$$
I What is the probability the student gets half the
answers correct?
$$
\begin{aligned}
P(X = 5) &= P(X \le 5) - P(X \le 4) \\
&= 0.9936 - 0.9672 \quad \text{(from tables)} \\
&= 0.0264
\end{aligned}
$$
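The table values are cumulative sums of the binomial formula, so they can be recomputed when a table is not at hand. This Python sketch is an illustration rather than a replacement for the tables; `binom_cdf` is a hypothetical helper name. It evaluates the quiz example with n = 10 and p = 0.2.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def binom_cdf(k, n, p):
    """P(X <= k), the quantity listed in binomial tables."""
    return sum(binom_pmf(x, n, p) for x in range(k + 1))

n, p = 10, 0.2
p_pass = 1 - binom_cdf(4, n, p)                     # P(X >= 5)
p_half = binom_cdf(5, n, p) - binom_cdf(4, n, p)    # P(X = 5)

print(round(binom_cdf(4, n, p), 4), round(binom_cdf(5, n, p), 4))
print(round(p_pass, 4), round(p_half, 4))
```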
I The binomial tables are a tool to make life easier by
helping us calculate binomial probabilities for
frequently used values of n and p.
I However, they are not a substitute for knowing and
being able to use the binomial probability
distribution formula - not all values of n or p will be
tabulated!
Reference
I Keller 10e or 11e chapter 7.