STAT / ELEC 321
Part I
Introduction to Probability Theory
Chapter 2
Random Variables
Outline
Introduction: What is a random variable?
Module 3: Discrete random variables
◦ probability mass function (pmf); p(x) or p_X(x)
◦ cumulative distribution function (cdf); F(x) or F_X(x)
◦ expected value; E(X)
◦ expected value of a function of an r.v.; E(g(X))
◦ variance and standard deviation; Var(X) and SD(X)
◦ common discrete random variables (Bernoulli, binomial, geometric, Poisson)
Module 4: Continuous random variables
Introduction
• A random variable is a real-valued function defined on the sample space:
X: S → R
◦ We use uppercase letters to denote random variables (e.g., X, Y, Z)
• Example: Toss a fair coin three times
◦ S = {HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}
◦ Define the random variable X to be the number of heads in the three tosses
Outcome in S | X
-------------|---
HHH          | 3
HHT          | 2
HTH          | 2
THH          | 2
HTT          | 1
THT          | 1
TTH          | 1
TTT          | 0
• The probability that X takes on a certain value k depends on the set of outcomes that give X = k
• In the coin-tossing example, we have
◦ P(tossing no heads) = P(X = 0) = P({TTT}) = 1/8
◦ P(tossing exactly one head) = P(X = 1) = P({HTT} ∪ {THT} ∪ {TTH}) = 3/8
◦ P(tossing exactly two heads) = P(X = 2) = P({HHT} ∪ {HTH} ∪ {THH}) = 3/8
◦ P(tossing exactly three heads) = P(X = 3) = P({HHH}) = 1/8
Note:
P(⋃_{k=0}^{3} {X = k}) = ∑_{k=0}^{3} P(X = k) = 1
since X must take on one of its possible values in {0, 1, 2, 3}
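These probabilities can be checked by enumerating the sample space directly; here is a minimal sketch in Python (variable names are illustrative, not from the course):

    from itertools import product

    # Enumerate the 8 equally likely outcomes of three tosses
    outcomes = ["".join(t) for t in product("HT", repeat=3)]

    # X = number of heads; tally P(X = k) for a fair coin
    pmf = {}
    for outcome in outcomes:
        k = outcome.count("H")
        pmf[k] = pmf.get(k, 0) + 1 / len(outcomes)

    print(pmf)                # {3: 0.125, 2: 0.375, 1: 0.375, 0: 0.125}
    print(sum(pmf.values()))  # 1.0, as the note above requires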
Discrete random variables
• Definition: A random variable (rv) that can take at most a countable
number of possible values is said to be discrete
For a discrete rv X, define:
• Probability mass function (pmf) of X:
p_X(x) = p(x) = P(X = x)
Note: 0 ≤ p_X(x) ≤ 1 for all x, and ∑_{all x} p_X(x) = 1
• Cumulative distribution function (cdf) of X:
F_X(x) = F(x) = P(X ≤ x) = ∑_{k: k ≤ x} p_X(k)
Note: F_X(x) is a non-decreasing step function, and 0 ≤ F_X(x) ≤ 1 for all x
Exercise:
• In the coin-tossing experiment, let X be the number of heads in 3 tosses of a fair coin
• Compute the cumulative distribution function of X:
◦ F(0) =
◦ F(1) =
◦ F(2) =
◦ F(3) =
Note: F(x) = P(X ≤ x) = P(X ≤ a) = F(a) for a ≤ x < a + 1, a = 0, 1, 2, 3
• Sketch the cdf of X
Expected Value
• Let X be a discrete random variable with pmf p(x)
• Definition: The expected value of X, denoted E(X), is defined as
E(X) = ∑_{x: p(x)>0} x p(x)
• Synonyms: expectation of X, mean of X
• Interpretations:
◦ It is a weighted average of the values that the rv X can take on, with weights given by the p(x)'s
◦ In the special case where all values of X are equally likely, it is simply the ordinary average of the values of X
◦ It is also the long-run average of observed values of the rv X in an infinite sequence of independent replications of an experiment (law of large numbers)
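To illustrate the long-run-average interpretation, here is a minimal simulation sketch for the fair-coin example above (standard library only; it previews the value E(X) = 1.5 that the next exercise asks for):

    import random

    random.seed(1)

    # Replicate "toss a fair coin 3 times, count heads" many times
    n_reps = 100_000
    total = sum(sum(random.choice((0, 1)) for _ in range(3)) for _ in range(n_reps))

    # The running average approaches E(X) as the number of replications grows
    print(total / n_reps)  # close to the exact value E(X) = 1.5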
Expected value:
E(X) = ∑_{x: p(x)>0} x p(x)
Exercise:
• Let X be the number of heads in 3 tosses of a fair coin
• Compute the expected value of X:
E(X) =
Expectation of a function of a random variable
• Proposition:
◦ Suppose X is a discrete rv taking on one of the values x_i, i = 1, 2, . . ., with respective probabilities p(x_i)
◦ Let g(X) be a real function of X
◦ The expected value of g(X) is given by
E[g(X)] = ∑_i g(x_i) p(x_i)
• Proof: . . .
Expectation of a function of a random variable (cont’d)
E[g(X)] = ∑_i g(x_i) p(x_i)
• Note: E(g(X)) ≠ g(E(X)), in general
• Let a, b be real constants, X, X_1, X_2 be discrete random variables, and g_1 and g_2 be real-valued functions. Then
E(aX + b) = aE(X) + b
E[a g_1(X_1) + b g_2(X_2)] = a E(g_1(X_1)) + b E(g_2(X_2))
Variance and standard deviation of X
• Let X be a discrete random variable with mean µ and pmf p(x)
• The variance of X is defined by
Var(X) = E[(X − µ)²] = ∑_{x: p(x)>0} (x − µ)² p(x)
• Standard deviation of X: SD(X) = √Var(X)
• Interpretations:
◦ Var(X) is the weighted sum of the squared deviations of the x values from the mean, with weights given by the p(x)'s
◦ Both the variance and the standard deviation of X measure the spread or dispersion of the x values taken on by the rv X (around its mean) if the experiment were to be repeated many times
• Alternative formula for the variance of X:
Var(X) = E(X²) − [E(X)]²
Proof:
Var(X) = E[(X − µ)²]
       = E(X² − 2µX + µ²)
       = E(X²) − 2µE(X) + µ²
       = E(X²) − 2[E(X)]² + [E(X)]²
       = E(X²) − [E(X)]²
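The two variance formulas can be checked numerically; a minimal sketch, using a hypothetical pmf that is not from the slides:

    # Hypothetical pmf for illustration: P(X=0)=0.2, P(X=1)=0.5, P(X=2)=0.3
    pmf = {0: 0.2, 1: 0.5, 2: 0.3}

    mu = sum(x * p for x, p in pmf.items())
    var_def = sum((x - mu) ** 2 * p for x, p in pmf.items())    # E[(X - mu)^2]
    var_alt = sum(x * x * p for x, p in pmf.items()) - mu ** 2  # E(X^2) - [E(X)]^2

    print(mu, var_def, var_alt)  # the two variance formulas agree (0.49)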
• Exercise: For X being the number of heads in 3 tosses of a fair coin,
compute the variance of X
Let a, b be real constants and X be a discrete random variable. Then
Var(aX + b) = a²Var(X)
Proof: Let Y = aX + b. Then E(Y) = E(aX + b) = aE(X) + b, and
Var(Y) = E(Y²) − [E(Y)]²
       = E(a²X² + 2abX + b²) − [aE(X) + b]²
       = a²E(X²) + 2abE(X) + b² − [a²[E(X)]² + 2abE(X) + b²]
       = a²E(X²) + 2abE(X) + b² − a²[E(X)]² − 2abE(X) − b²
       = a²{E(X²) − [E(X)]²} = a²Var(X)
Exercise 1:
An insurance policy pays $100 per day for up to 3 days of hospitalization and $50 per day for each day of hospitalization thereafter. The number of days of hospitalization, X, is a discrete random variable with probability mass function
P(X = k) = { (6 − k)/15   for k = 1, 2, 3, 4, 5
           { 0            otherwise
(a) Determine the expected number of days of hospitalization. Also find the standard deviation.
Exercise 1 (cont’d):
An insurance policy pays $100 per day for up to 3 days of hospitalization and $50 per day for each day of hospitalization thereafter. The number of days of hospitalization, X, is a discrete random variable with probability mass function
P(X = k) = { (6 − k)/15   for k = 1, 2, 3, 4, 5
           { 0            otherwise
(b) Determine the expected payment for hospitalization under this policy.
Common Discrete Random Variables
Bernoulli random variable
• A Bernoulli trial is a random experiment that gives only one of two
possible outcomes (generally referred to as “success” and “failure”)
• Examples:
◦ sample one electronic component: “success” if it is defective, “failure” if it is non-defective
◦ toss a fair coin: “success” if a head is tossed, “failure” if a tail is tossed
• The number of “successes” in a Bernoulli trial is a Bernoulli random variable with parameter p, denoted
X ∼ Bernoulli(p),
where p is the probability of the “success” outcome, 0 ≤ p ≤ 1
Bernoulli random variable (cont’d)
For X ∼ Bernoulli(p),
p_X(x) = p^x (1 − p)^{1−x}, x = 0, 1
E(X) = p, Var(X) = p(1 − p)
Binomial random variable
• A binomial experiment consists of n (fixed in advance) identical and independent Bernoulli trials
(Independence here means that the result of one trial does not affect the result of any other trial)
• In the binomial experiment, let X_1, X_2, . . . , X_n denote the corresponding Bernoulli random variables, X_i ∼ Bernoulli(p), i = 1, . . . , n
(p is constant across all n trials)
◦ Define the random variable X = X_1 + X_2 + . . . + X_n
◦ X is the total number of successes in the n trials
◦ X is a binomial random variable with parameters n and p, denoted X ∼ Bin(n, p)
For X ∼ Bin(n, p),
p_X(x) = C(n, x) p^x (1 − p)^{n−x}, x = 0, 1, . . . , n
where C(n, x) = n!/(x!(n − x)!) is the binomial coefficient, which counts the trial sequences that contain exactly x successes
E(X) = np, Var(X) = np(1 − p)
Example: Let X be the total number of heads obtained in 5 tosses of a biased coin (with a 30% chance of tossing a head)
Then X ∼ Bin(n = 5, p = 0.3) with pmf
p_X(x) = P(X = x) = C(5, x) (0.30)^x (1 − 0.30)^{5−x}, x = 0, 1, 2, . . . , 5
x      |   0     1     2     3     4     5
-------|-----------------------------------------
P(X=x) | 0.168 0.360 0.309 0.132 0.028 0.002
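The table can be reproduced with a few lines of standard-library Python (a sketch; math.comb requires Python 3.8+):

    from math import comb

    n, p = 5, 0.3

    # Binomial pmf: C(n, x) p^x (1 - p)^(n - x)
    for x in range(n + 1):
        prob = comb(n, x) * p**x * (1 - p)**(n - x)
        print(x, round(prob, 3))  # 0.168, 0.360, 0.309, 0.132, 0.028, 0.002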
More examples: Probability histograms of some other binomial random
variables
Exercise 2:
A firm sells items, randomly selected from a large lot that is known to contain 10% defectives. Defective items will be returned for repair, and the repair cost is given by
C = 3Y² + Y + 2
where Y is the number of defectives sold.
(a) Find the probability that at most one of the first four items sold is defective.
Exercise 2 (cont’d):
A firm sells items, randomly selected from a large lot that is known to contain 10% defectives. Defective items will be returned for repair, and the repair cost is given by
C = 3Y² + Y + 2
where Y is the number of defectives sold.
(b) Find the expected repair cost for the four items sold.
Exercise 2 (cont’d):
A firm sells items, randomly selected from a large lot that is known to contain 10% defectives. Defective items will be returned for repair, and the repair cost is given by
C = 3Y² + Y + 2
where Y is the number of defectives sold.
(c) Find the probability that the first defective item sold is the seventh item that is sold.
Geometric random variable
• Consider a setting identical to the binomial experiment except that the
number of Bernoulli trials, n, is not fixed in advance
• Our interest is in the number of trials until the “success” outcome
occurs for the first time
• Let X be the number of trials until the first “success” occurs, where
the probability of “success” is p on a single trial
• Then X is a geometric random variable with parameter p, denoted
X ∼ Geom(p)
Geometric random variable (cont’d)
For X ∼ Geom(p),
p_X(x) = (1 − p)^{x−1} p, x = 1, 2, 3, . . .
F_X(x) = ∑_{k=1}^{x} p_X(k) = ∑_{k=1}^{x} (1 − p)^{k−1} p = 1 − (1 − p)^x
E(X) = 1/p
Var(X) = (1 − p)/p²
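A short sketch verifying the closed-form cdf against a direct sum of the pmf (the value of p is illustrative):

    p = 0.3  # illustrative success probability

    for x in (1, 3, 10):
        cdf_sum = sum((1 - p) ** (k - 1) * p for k in range(1, x + 1))
        cdf_closed = 1 - (1 - p) ** x
        print(x, cdf_sum, cdf_closed)  # the two cdf values agree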
Poisson random variable
• A discrete rv X taking values 0, 1, 2, . . . is said to be a Poisson random variable with parameter λ, λ > 0, denoted X ∼ Poisson(λ), if its pmf is given by
p_X(k) = P(X = k) = e^{−λ} λ^k / k!, k = 0, 1, 2, . . .
• Note: this is indeed a probability mass function, since
∑_{k=0}^{∞} p_X(k) = ∑_{k=0}^{∞} e^{−λ} λ^k / k! = e^{−λ} ∑_{k=0}^{∞} λ^k / k! = e^{−λ} · e^{λ} = 1
• For X ∼ Poisson(λ),
E(X) = Var(X) = λ
Proof: . . .
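The proof is left as indicated; as an empirical illustration (a simulation sketch, standard library only, with an illustrative λ), the sample mean and sample variance of Poisson draws both track λ:

    import random
    from math import exp
    from statistics import mean, variance

    random.seed(1)
    lam = 2.5  # illustrative rate parameter

    def poisson_draw(lam):
        # Inverse-cdf sampling: walk up the Poisson pmf until it covers u
        u = random.random()
        k, p = 0, exp(-lam)
        cdf = p
        while u > cdf:
            k += 1
            p *= lam / k
            cdf += p
        return k

    draws = [poisson_draw(lam) for _ in range(50_000)]
    print(mean(draws), variance(draws))  # both close to lam = 2.5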
Poisson approximation to the Binomial
• The Poisson rv is used to model the number of events occurring in a fixed period of time, provided that events occur with a known average rate λ (e.g., the number of misprints on a page of a textbook, the number of customers entering a store on a given day, the number of phone calls at a call centre per minute, etc.)
• The reason for the wide applicability of the Poisson rv is that in many cases it provides a good approximation to the binomial rv
• Let X ∼ Bin(n, p) be such that n is large and p is small, so that np is moderate
• Then the distribution of X can be well approximated by the Poisson distribution with parameter λ = np
Poisson approximation to the Binomial (cont’d)
Proof: For X ∼ Bin(n, p), we have E(X) = np. Let λ = np, i.e., p = λ/n. Then
P(X = k) = C(n, k) p^k (1 − p)^{n−k}
         = [n! / (k!(n − k)!)] (λ/n)^k (1 − λ/n)^{n−k}
         = [n(n − 1)(n − 2)···(n − k + 1) / n^k] (λ^k / k!) (1 − λ/n)^n / (1 − λ/n)^k
         → λ^k e^{−λ} / k!   as n → ∞
since
• n(n − 1)(n − 2)···(n − k + 1) / n^k → 1 as n → ∞
• (1 − λ/n)^n → e^{−λ} and (1 − λ/n)^k → 1 as n → ∞
Poisson approximation to the Binomial (cont’d)
In practice, a Bin(n, p) rv can be reasonably well approximated by the
Poisson(λ) rv with λ = np if
• n is large (n ≥ 20), and
• p is small enough such that np < 5
Example
• A certain disease occurs in 1.2% of the population. 100 people are selected at random from the population.
• Note: this is a binomial experiment with the number of trials n = 100 being large, and the expected number of people with the disease λ = np = 100 × 0.012 = 1.2 being moderate.
• What is the probability that no one in a sample of 100 has the disease?
• Let X denote the number of people having the disease in a sample of 100. Then X ∼ Bin(n = 100, p = 0.012), so that
P(X = 0) = C(100, 0) (0.012)^0 (1 − 0.012)^{100} = 0.2990
• Here we can also approximate X by a Poisson(λ = 1.2) rv to get
P(X = 0) ≈ e^{−λ} λ^0 / 0! = 0.3012
Example (cont’d)
• The probability that exactly 2 people in the sample of 100 have the disease, based on the binomial distribution, is
P(X = 2) = C(100, 2) (0.012)^2 (1 − 0.012)^{98} = 0.2183
• The Poisson approximation gives
P(X = 2) ≈ e^{−λ} λ² / 2! = 0.2169
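A sketch comparing the exact binomial pmf with its Poisson approximation across several values of k for this example (standard library only):

    from math import comb, exp, factorial

    n, p = 100, 0.012
    lam = n * p  # 1.2

    for k in range(4):
        binom = comb(n, k) * p**k * (1 - p)**(n - k)
        poisson = exp(-lam) * lam**k / factorial(k)
        print(k, round(binom, 4), round(poisson, 4))  # e.g., k=0: 0.2990 vs 0.3012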
Exercise 4
At a local store that sells a large number of computers, only 0.1% of all computers sold experience CPU failure during the warranty period. Consider a sample of 4,000 computers. Use the Poisson approximation to answer the following parts.
(a) What is the approximate probability that no sampled computer has a CPU defect?
Exercise 4 (cont’d)
At a local store that sells a large number of computers, only 0.1% of all computers sold experience CPU failure during the warranty period. Consider a sample of 4,000 computers. Use the Poisson approximation to answer the following parts.
(b) Find the expected value and the standard deviation of the number of computers in the sample that have a CPU defect.
Outline
Introduction: What is a random variable?
Module 3: Discrete random variables
Module 4: Continuous random variables
◦ Introduction: Continuous random variables
◦ Probability density function; f_X(x) or f(x)
◦ Cumulative distribution function; F_X(x) or F(x)
◦ Expected value of a continuous r.v.
◦ Expected value of a function of a continuous r.v.
◦ Variance and standard deviation
◦ Common continuous distributions: uniform, normal, exponential, gamma
◦ Distribution of a function of an r.v.
Introduction
• Recall that discrete random variables take values on a finite or a
countable set of numbers (most commonly, integers)
• It is, however, possible for a random variable to take values on an
uncountable set of numbers (e.g., a real interval), in which case the
random variable is referred to as continuous
• Examples of continuous random variables: lifetime of batteries, daily
windspeed, weight of adults
Definition: A random variable X is said to be (absolutely) continuous if there exists a nonnegative function f, defined on R = (−∞, ∞), with the property that for any set B ⊂ R,
P(X ∈ B) = ∫_B f(x) dx
The function f is called the probability density function (pdf) of the rv X
Setting B = [a, b], we have
P(a ≤ X ≤ b) = ∫_a^b f(x) dx
[Figure: the pdf f(x), with the shaded area over the region B representing P(B)]
Properties of the pdf f(x)
• f(x) ≥ 0 for all x
• 1 = P(X ∈ (−∞, ∞)) = ∫_{−∞}^{∞} f(x) dx
(i.e., the total area under f(x) is equal to 1)
• P(X = a) = P(a ≤ X ≤ a) = ∫_a^a f(x) dx = . . .
• P(a ≤ X ≤ b) = P(a < X ≤ b) = P(a ≤ X < b) = P(a < X < b)
Interpretation of the pdf f(x)
• f(x) resembles the shape of the histogram of a large representative data set of observed x values
• f(x) can be interpreted as the relative probability of X at x:
P(x − d/2 ≤ X ≤ x + d/2) / d = (1/d) ∫_{x−d/2}^{x+d/2} f(t) dt ≈ f(x) for small d
• Cumulative distribution function (cdf) of X:
◦ Notation: F_X(x) or F(x)
◦ F(x) = P(X ≤ x) = ∫_{−∞}^{x} f(t) dt for all x
◦ F(x) is non-decreasing
◦ 0 ≤ F(x) ≤ 1 for all x
◦ P(a ≤ X ≤ b) = P(X ≤ b) − P(X ≤ a) = F(b) − F(a)
• Relationship between f(x) and F(x):
F(x) = ∫_{−∞}^{x} f(t) dt
f(x) = F′(x) = (d/dx) F(x)
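A small sketch of the relationship between f and F, using an illustrative pdf f(x) = 2x on (0, 1) that is not from the slides:

    # Illustrative pdf f(x) = 2x on (0, 1); its cdf is F(x) = x^2 there
    def f(x):
        return 2 * x if 0 < x < 1 else 0.0

    def F(x):
        return 0.0 if x <= 0 else (x * x if x < 1 else 1.0)

    # Check f = F' numerically at an interior point
    x, h = 0.4, 1e-6
    print((F(x + h) - F(x - h)) / (2 * h), f(x))  # both approximately 0.8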
Exercise 1
The effectiveness of solar-energy heating units depends on the amount of radiation available from the sun. During a typical October, daily total solar radiation approximately follows the probability density function below (units are hundreds of calories):
f(x) = { (3/32)(x − 2)(6 − x)   for 2 ≤ x ≤ 6
       { 0                      elsewhere
(a) Find the cumulative distribution function.
Exercise 1 (cont’d)
(b) Five days in October are randomly chosen. What is the probability that
at least 3 days have solar radiation that exceeds 300 calories?
Expected Value
• Let X be a continuous random variable with pdf f(x)
• The expected value of X is defined as
E(X) = ∫_{−∞}^{∞} x f(x) dx
• Synonyms of the expected value: the expectation of X, the mean of X (often denoted by µ)
• Proposition: Let g(X) be a real function of X. Then the expected value of g(X) is given by
E[g(X)] = ∫_{−∞}^{∞} g(x) f(x) dx
Expected value of functions of random variables
Proposition: Let a, b be real constants, X, X_1, X_2 be continuous random variables, and g_1 and g_2 be real functions. Then
E(aX + b) = aE(X) + b
E[a g_1(X_1) + b g_2(X_2)] = a E[g_1(X_1)] + b E[g_2(X_2)]
Proposition: For a non-negative continuous random variable Y,
E(Y) = ∫_0^∞ P(Y > y) dy = ∫_0^∞ [1 − F(y)] dy
Proof:
• Note: P(Y > y) = ∫_y^∞ f_Y(x) dx for y > 0, where f_Y denotes the pdf of Y. Then
∫_0^∞ P(Y > y) dy = ∫_0^∞ ∫_y^∞ f_Y(x) dx dy
                  = ∫_0^∞ (∫_0^x dy) f_Y(x) dx   [interchanging the order of integration]
                  = ∫_0^∞ x f_Y(x) dx   [x is a dummy variable here]
                  = ∫_0^∞ y f_Y(y) dy
                  = E(Y)
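A numeric sketch of this tail formula for an exponential variable, where 1 − F(y) = e^{−λy} and E(Y) = 1/λ (the rate and truncation point are illustrative):

    from math import exp

    lam = 2.0  # illustrative rate; for Y ~ Exp(lam), E(Y) = 1/lam = 0.5

    # Riemann-sum approximation of the integral of P(Y > y) = exp(-lam * y)
    dy, total, y = 1e-4, 0.0, 0.0
    while y < 20.0:  # truncate where the tail is negligible
        total += exp(-lam * y) * dy
        y += dy

    print(total)  # approximately 0.5 = E(Y)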
Variance and standard deviation
• Let X be a continuous random variable with pdf f(x)
• The variance of X is given by
Var(X) = E[(X − µ)²] = ∫_{−∞}^{∞} (x − µ)² f(x) dx
• Var(X) = E(X²) − [E(X)]²
• Standard deviation of X: SD(X) = √Var(X)
• Let a, b be real constants. Then
Var(aX + b) = a²Var(X)
Exercise 2
An insurance policy pays for a random loss X subject to a deductible of D = 0.5. The loss amount is modelled as a continuous random variable with the following cumulative distribution function:
F(x) = { 0              for x < 1
       { x² − 2x + 1    for 1 ≤ x < 2
       { 1              for x ≥ 2
(a) Find the probability density function of the loss amount
Exercise 2 (cont’d)
An insurance policy pays for a random loss X subject to a deductible of D = 0.5. The loss amount is modelled as a continuous random variable with the following cumulative distribution function:
F(x) = { 0              for x < 1
       { x² − 2x + 1    for 1 ≤ x < 2
       { 1              for x ≥ 2
(b) Find the probability that the insurance payment is between 0.9 and 1.2
Exercise 2 (cont’d)
An insurance policy pays for a random loss X subject to a deductible of D = 0.5. The loss amount is modelled as a continuous random variable with the following cumulative distribution function:
F(x) = { 0              for x < 1
       { x² − 2x + 1    for 1 ≤ x < 2
       { 1              for x ≥ 2
(c) Find the expected insurance payment
Common Continuous Distributions
Uniform distribution
A random variable X is said to be uniformly distributed over an interval (α, β) if its density function is given by
f(x) = { 1/(β − α)   for α < x < β
       { 0           otherwise
Notation: X ∼ U(α, β)
Parameters: α, β (β > α)
For X ∼ U(α, β),
f(x) = { 1/(β − α)   for α < x < β
       { 0           otherwise
F(x) = { 0                 for x ≤ α
       { (x − α)/(β − α)   for α < x < β
       { 1                 for x ≥ β
E(X) = (α + β)/2   [the midpoint of α and β]
Var(X) = (β − α)²/12
Exercise 3
An investment account earns an annual interest rate R that follows a uniform distribution on the interval (0.04, 0.08). The value of a $10,000 initial investment in this account after one year is given by
V = 10,000 e^R.
Determine the cumulative distribution function, F(v), of V for values that satisfy 0 < F(v) < 1.
Normal distribution
A random variable X follows the normal distribution if its probability density function is given by
f(x) = (1/(√(2π) σ)) exp{−(x − µ)²/(2σ²)}, −∞ < x < ∞
Notation: X ∼ N(µ, σ²)
Parameters: µ (mean) and σ² (variance)
f(x) is bell-shaped and symmetric about the mean µ
• For X ∼ N(µ, σ²),
f(x) = (1/(√(2π) σ)) exp{−(x − µ)²/(2σ²)}
F(x) = ∫_{−∞}^{x} (1/(√(2π) σ)) exp{−(u − µ)²/(2σ²)} du
E(X) = µ
Var(X) = σ²
• If Z ∼ N(0, 1), then Z is referred to as the standard normal random variable
• Remark: If X ∼ N(µ, σ²), then Z = (X − µ)/σ ∼ N(0, 1) (will be shown later)
• The cdf of the standard normal rv, usually denoted by Φ(z), is useful for probability calculations that involve normal random variables
Φ(z) values are tabulated in Table 5.1 (p. 201 in the 8th ed. / p. 190 in the 9th ed.)
• By the symmetry of f_Z about the value 0, we have
Φ(−z) = 1 − Φ(z)
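In code, Φ can be evaluated through the standard identity Φ(z) = (1 + erf(z/√2))/2 instead of a printed table; a sketch with illustrative numbers:

    from math import erf, sqrt

    def Phi(z):
        # Standard normal cdf via the error function
        return 0.5 * (1 + erf(z / sqrt(2)))

    # Illustrative: X ~ N(mu=10, sigma^2=4); P(9 <= X <= 13) by standardization
    mu, sigma = 10.0, 2.0
    prob = Phi((13 - mu) / sigma) - Phi((9 - mu) / sigma)
    print(prob)                      # about 0.6247
    print(Phi(-1.0), 1 - Phi(1.0))   # symmetry check: Phi(-z) = 1 - Phi(z)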
(Gauss) error function
• The error function is defined as
erf(z) = (2/√π) ∫_0^z e^{−x²} dx
• Interpretation: If X ∼ N(µ = 0, σ² = 1/2), then
erf(z) = P(−z ≤ X ≤ z)
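A numeric check of this interpretation (a sketch; the argument z and the step size are arbitrary):

    from math import erf, exp, pi, sqrt

    z = 0.8  # illustrative argument

    # Integrate the N(0, 1/2) density over [-z, z] with a simple Riemann sum
    sigma2 = 0.5
    dx, total, x = 1e-5, 0.0, -z
    while x < z:
        total += exp(-x * x / (2 * sigma2)) / sqrt(2 * pi * sigma2) * dx
        x += dx

    print(total, erf(z))  # both approximately 0.7421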
Exercise 4
The weight of a cruise ship passenger’s suitcase follows the normal
distribution with mean 25 kg and standard deviation 2.4 kg.
(a) Find the probability that a randomly chosen suitcase weighs between
22 kg and 27 kg.
Exercise 4 (cont’d)
The weight of a cruise ship passenger’s suitcase follows the normal
distribution with mean 25 kg and standard deviation 2.4 kg.
(b) If 15% of suitcases are overweight, find the maximum weight allowed
by the cruise line.
Normal approximation to the binomial distribution
• The normal distribution provides a reasonable approximation to a binomial random variable X ∼ Bin(n, p) as long as n is sufficiently large
• In practice, the approximation is good when both of the following conditions are satisfied: np ≥ 5 and n(1 − p) ≥ 5
• We then have:
X ·∼ N(µ = np, σ² = np(1 − p))
or, equivalently,
(X − np)/√(np(1 − p)) ·∼ N(0, 1)
where “·∼” is used to denote an approximate distribution
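A sketch comparing an exact binomial probability with its continuity-corrected normal approximation (the continuity correction is discussed in the Module 4 learning outcomes; n and p here are illustrative):

    from math import comb, erf, sqrt

    def Phi(z):
        return 0.5 * (1 + erf(z / sqrt(2)))  # standard normal cdf

    n, p = 40, 0.4
    mu, sigma = n * p, sqrt(n * p * (1 - p))  # np = 16 >= 5, n(1-p) = 24 >= 5

    # Exact P(X <= 18) versus the continuity-corrected normal approximation
    exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(19))
    approx = Phi((18 + 0.5 - mu) / sigma)

    print(exact, approx)  # the two values are close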
Normal approximation to the binomial distribution (cont’d)
Exponential distribution
A random variable X is said to be exponentially distributed with parameter λ (λ > 0) if it has the probability density function
f(x) = { λe^{−λx}   for x ≥ 0
       { 0          otherwise
Notation: X ∼ Exp(λ)
• For X ∼ Exp(λ),
f(x) = { λe^{−λx}   for x ≥ 0
       { 0          otherwise
F(x) = ∫_0^x λe^{−λu} du = 1 − e^{−λx} for x ≥ 0
E(X) = 1/λ
Var(X) = 1/λ²
• Exponential random variables satisfy the memoryless property:
P(X > s + t | X > t) = P(X > s) for all s, t ≥ 0
Proof: Let X ∼ Exp(λ). Then
P(X > s + t | X > t) = P({X > s + t} ∩ {X > t}) / P(X > t)
                     = P(X > s + t) / P(X > t)
                     = e^{−λ(s+t)} / e^{−λt}
                     = e^{−λs} = P(X > s)
• One can also write the memoryless property as:
P(X > s + t) = P(X > t) P(X > s)
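A simulation sketch of the memoryless property (the rate and time points are illustrative):

    import random

    random.seed(1)
    lam, s, t = 1.5, 0.4, 0.7  # illustrative rate and time points

    draws = [random.expovariate(lam) for _ in range(200_000)]

    # Compare P(X > s + t | X > t) with P(X > s)
    survivors_t = [x for x in draws if x > t]
    cond = sum(x > s + t for x in survivors_t) / len(survivors_t)
    uncond = sum(x > s for x in draws) / len(draws)

    print(cond, uncond)  # both approximately exp(-lam*s) = 0.549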
Exercise 6
The waiting time for the first claim from a good driver and the waiting time
for the first claim from a bad driver follow exponential distributions with
means 6 years and 3 years, respectively. What is the probability that the
first claim from a good driver will be filed within 3 years and the first claim
from a bad driver will be filed within 2 years? State any assumption you
make in your calculation.
Gamma distribution
• A random variable X is said to follow the gamma distribution if it has the following probability density function:
f(x) = { λe^{−λx} (λx)^{α−1} / Γ(α)   for x ≥ 0
       { 0                            otherwise
where Γ(α) = ∫_0^∞ e^{−y} y^{α−1} dy
• Notation: X ∼ Gamma(α, λ)
◦ α (α > 0) is the shape parameter
◦ λ (λ > 0) is the rate parameter
• One can show that Γ(α) = (α − 1) · Γ(α − 1)
If α is an integer, Γ(α) = (α − 1)!
For X ∼ Gamma(α, λ),
f(x) = { λe^{−λx} (λx)^{α−1} / Γ(α)   for x ≥ 0
       { 0                            otherwise
F(x) = ∫_0^x λe^{−λu} (λu)^{α−1} / Γ(α) du for x ≥ 0
E(X) = α/λ,  Var(X) = α/λ²
• Special case of Gamma(α, λ): when α = 1, Gamma(1, λ) = Exp(λ)
• The sum of n independent Exp(λ) random variables is a gamma random variable with shape parameter α = n and rate parameter λ (see the sketch below)
• Application:
◦ For an event that occurs according to a Poisson process with rate λ (to be introduced in Part II), the time between two consecutive events, T, is exponentially distributed
◦ The occurrences of events over non-overlapping intervals are independent
◦ Define X to be the total waiting time until n events have occurred:
X = T_1 + T_2 + . . . + T_n
◦ Then X ∼ Gamma(n, λ)
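A simulation sketch of the sum-of-exponentials characterization: the sample mean and variance of X = T_1 + ... + T_n should match E(X) = n/λ and Var(X) = n/λ² (parameter values illustrative):

    import random
    from statistics import mean, variance

    random.seed(1)
    n, lam = 4, 2.0  # illustrative shape (number of events) and rate

    # Each draw of X is the sum of n independent Exp(lam) waiting times
    draws = [sum(random.expovariate(lam) for _ in range(n)) for _ in range(100_000)]

    print(mean(draws), n / lam)         # both approximately 2.0
    print(variance(draws), n / lam**2)  # both approximately 1.0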
What if g(X) is a strictly increasing or decreasing function?
• Suppose g is strictly increasing, i.e., g(x_1) < g(x_2) for all x_1 < x_2. Then, with Y = g(X),
F_Y(y) = P(Y ≤ y)
       = P(g(X) ≤ y)
       = P(X ≤ g^{−1}(y))   where g^{−1} is a differentiable inverse function
       = F_X(g^{−1}(y))
⇒ f_Y(y) = (d/dy) F_Y(y) = f_X(g^{−1}(y)) · (d/dy) g^{−1}(y)
• Note that (d/dy) g^{−1}(y) is positive, because g^{−1}(y) is increasing when g is increasing
• Suppose g is strictly decreasing, i.e., g(x_1) > g(x_2) for all x_1 < x_2. Then
F_Y(y) = P(Y ≤ y)
       = 1 − P(Y > y)
       = 1 − P(X < g^{−1}(y))
       = 1 − P(X ≤ g^{−1}(y))   [since X is continuous, P(X = g^{−1}(y)) = 0]
       = 1 − F_X(g^{−1}(y))
⇒ f_Y(y) = (d/dy) [1 − F_X(g^{−1}(y))] = −f_X(g^{−1}(y)) · (d/dy) g^{−1}(y)
• Note that (d/dy) g^{−1}(y) is negative, because g^{−1}(y) is decreasing when g is decreasing
• Hence, in both cases we can write:
f_Y(y) = f_X(g^{−1}(y)) · |(d/dy) g^{−1}(y)|
Proposition: Let X have a probability density function f_X(x) and let Y = g(X) be a strictly increasing or decreasing, differentiable real function of X. Then
f_Y(y) = f_X(g^{−1}(y)) · |(d/dy) g^{−1}(y)|
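A sketch verifying this change-of-variables formula by simulation, for the illustrative case X ∼ Exp(1) and Y = g(X) = X², where g^{−1}(y) = √y and the formula gives f_Y(y) = e^{−√y}/(2√y):

    import random
    from math import exp, sqrt

    random.seed(1)

    # Simulate Y = X^2 with X ~ Exp(1)
    draws = [random.expovariate(1.0) ** 2 for _ in range(500_000)]

    # Estimate f_Y(y) by the fraction of draws in a small bin around y
    y, h = 1.5, 0.05
    density_mc = sum(y - h / 2 < v < y + h / 2 for v in draws) / (len(draws) * h)

    # Formula: f_Y(y) = f_X(sqrt(y)) * |d/dy sqrt(y)| = exp(-sqrt(y)) / (2*sqrt(y))
    density_formula = exp(-sqrt(y)) / (2 * sqrt(y))

    print(density_mc, density_formula)  # both approximately 0.12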
Example: Show that for X ∼ N(µ, σ²), Z = (X − µ)/σ ∼ N(0, 1)
Solution:
• Let Z = g(X) = (X − µ)/σ, and note that g(X) is strictly increasing
• x = g^{−1}(z) = µ + σz ⇒ (d/dz) g^{−1}(z) = σ
• Hence,
f_Z(z) = f_X(g^{−1}(z)) · |(d/dz) g^{−1}(z)|
       = (1/(√(2π) σ)) exp{−(µ + σz − µ)²/(2σ²)} × σ
       = (1/√(2π)) exp{−z²/2}
       = (1/(√(2π) · 1)) exp{−(z − 0)²/(2 · 1²)}
• That is, (X − µ)/σ ∼ N(0, 1)
Exercise 3 (revisited)
An investment account earns an annual interest rate R that follows a uniform distribution on the interval (0.04, 0.08). The value of a $10,000 initial investment in this account after one year is given by V = 10,000 e^R.
(a) Determine the probability density function of V.
(b) Also find the probability that the amount of the investment after one year exceeds $10,500.
Module 3 - Learning outcomes
Aim: Demonstrate an understanding of the basic concepts of discrete
random variables and a number of common discrete distributions
• Appreciate that a random variable is a function that maps outcomes in
a sample space to a numerical quantity
• Identify the random variable(s) of interest in a given scenario
• Tell whether a random variable is discrete or not
• Recall the definition and properties of the probability mass function of
a discrete random variable
• Recall the definition and properties of the cumulative distribution
function of a discrete random variable
Module 3 - Learning outcomes (cont’d)
• Obtain the probability mass function and cumulative distribution
function for a discrete random variable of interest
• Calculate probabilities associated with a discrete random variable
• Calculate the expected value of a discrete random variable and that of
a real function of a discrete variable
• Interpret the expected value of a discrete random variable
• Calculate the variance and standard deviation of a discrete random
variable
• Apply general properties of expectation and variance operators
• Recall the definitions of a Bernoulli trial and a Binomial experiment
Module 3 - Learning outcomes (cont’d)
• Recognize cases where the following distributions could be an applied
model: Bernoulli, Binomial, Geometric and Poisson
• Identify the parameters for the following distributions: Bernoulli,
Binomial, Geometric and Poisson
• Calculate probabilities, the mean and variance of the following random
variables: Bernoulli, Binomial, Geometric and Poisson
• Approximate Binomial probabilities using a Poisson distribution where
appropriate
Module 4 - Learning outcomes
Aim: Demonstrate an understanding of the basic concepts of continuous
random variables and a number of common continuous distributions
• Identify the random variable(s) of interest in a given scenario
• Differentiate between discrete and continuous random variables
• Recall the properties of the probability density function and cumulative
distribution function of a continuous random variable
• Calculate probabilities of a continuous random variable from a given
probability density function
• Recognize that the probability that a continuous random variable takes a value in a certain region is given by the area under the probability density function over that region
Module 4 - Learning outcomes (cont’d)
• Recall the relationship between the probability density function and the
cumulative distribution function of a continuous random variable
• Obtain the cumulative distribution function from a probability density
function for a continuous random variable, and vice versa
• Calculate the expected value of a continuous random variable and that
of a real function of a continuous variable
• Calculate the variance and standard deviation of a continuous random
variable
• Apply general properties of expectation and variance operators
• Recognize cases where the following distributions could be an applied
model: Uniform, Normal, Exponential, Gamma and Beta
Module 4 - Learning outcomes (cont’d)
• Identify the parameters for the following distributions: Uniform,
Normal, Exponential, Gamma and Beta
• Describe how the probability density function changes with the
parameter(s) for the following distributions: Uniform, Normal,
Exponential, Gamma and Beta
• Calculate probabilities, the mean and variance of the following random
variables: Uniform, Normal, Exponential, Gamma and Beta
• Recall the properties of a Normal distribution, and those of the
standard Normal distribution
• Obtain probabilities related to Normal random variables using the
standard Normal table
• Approximate Binomial probabilities using a Normal distribution where
appropriate
Module 4 - Learning outcomes (cont’d)
• Apply continuity correction when approximating Binomial probabilities
using a Normal distribution
• Recall that the time between two consecutive events that occur
according to a Poisson process follows the Exponential distribution
• Explain the memoryless property of a continuous random variable, with
the Exponential random variable as an example
• Recall the relationship between the Exponential distribution and the
Gamma distribution
• Derive the cumulative distribution function and the probability density
function of a real function of a given continuous random variable