IEOR 4150 Introduction to Probability and Statistics Fall 2019
Dr. A. B. Dieker
Final exam
You have 2 hours and 50 minutes
No calculators, phones, or digital watches are allowed.
4
3

, etc.
This problem sheet will not be graded.
This exam consists of 7 pages.
1. (9 points) Use one of the following for each of the blanks:
S, R, [0, 1], the set of events on S.
(Some may need to be used multiple times or not at all.)
(a) A probability function is a function from ______ to ______.
(b) A random variable is a function from ______ to ______.
(c) A probability mass function is a function from ______ to ______.
Solution.
(a) A probability function is a function from the set of events on S to [0, 1]. (Also correct:
a function from the set of events on S to R.)
(b) A random variable is a function from S to R.
(c) A probability mass function is a function from R to [0, 1]. (Also correct: a function from
R to R.)
2. (12 points) In the board game RISK, players battle each other using rolls of standard dice.
A battle is always between two players, with one of the players being the attacker and the
other the defender.
In a “1 on 1” battle, both players roll one die. To win such a battle, the roll of the attacker
must be strictly higher than that of the defender; otherwise the attacker loses. In a “2 on 1” battle,
the attacker rolls two dice and the defender rolls one die. To win such a battle, at least one
of the rolls of the attacker must be strictly higher than the roll of the defender; otherwise the
attacker loses.
(a) Define events that are suitable to answer parts (b) and (c) in this problem.
(b) Calculate the probability that the attacker loses a “1 on 1” battle. Your solution needs to
use the events you defined in part (a) and translate the above game rules into appropriate
(conditional) probabilities involving these events.
(c) Calculate the probability that the attacker loses a “2 on 1” battle. Your solution needs to
use the events you defined in part (a) and translate the above game rules into appropriate
(conditional) probabilities involving these events.
Solution.
(a) Let $D_i$ be the event that the defender rolls $i$, let $L$ be the event that the attacker loses a “1 on 1” battle, and let $L'$ be the event that the attacker loses a “2 on 1” battle.
(You can also solve these problems with $A_i$, the event that the attacker rolls $i$, instead of $D_i$.)
(b) We know that $P(L \mid D_i) = i/6$, and conclude that
\[ P(L) = \sum_{i=1}^{6} P(L \mid D_i) P(D_i) = \sum_{i=1}^{6} \frac{i}{36} = \frac{21}{36}. \]
(c) We know that $P(L' \mid D_i) = i^2/36$, and conclude that
\[ P(L') = \sum_{i=1}^{6} P(L' \mid D_i) P(D_i) = \sum_{i=1}^{6} \frac{i^2}{216} = \frac{91}{216}. \]
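The two answers to Problem 2 can be checked by brute force. The sketch below is not part of the exam solution: it enumerates every equally likely dice outcome and counts those in which no attacking die strictly beats the defender. The function name `attacker_loss_probability` is a hypothetical helper chosen for illustration.

```python
from fractions import Fraction
from itertools import product

DIE = range(1, 7)  # faces of a standard six-sided die


def attacker_loss_probability(num_attacker_dice: int) -> Fraction:
    """Exact probability that the attacker loses against a single defender die."""
    losses = 0
    total = 0
    for defender in DIE:
        for attacker_rolls in product(DIE, repeat=num_attacker_dice):
            total += 1
            # The attacker loses when no attacking die is strictly higher.
            if max(attacker_rolls) <= defender:
                losses += 1
    return Fraction(losses, total)


p_one_on_one = attacker_loss_probability(1)
p_two_on_one = attacker_loss_probability(2)
print(p_one_on_one, p_two_on_one)  # 7/12 (which equals 21/36) and 91/216
```

Enumeration avoids conditioning altogether, so it is an independent check on the law-of-total-probability argument rather than a restatement of it.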
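3. (5 points) Suppose $X$ and $Y$ are two independent Poisson random variables with parameters $\lambda_X$ and $\lambda_Y$, respectively. Show that $X + Y$ is again Poisson and calculate its parameter.
Solution. For $\ell \ge 0$, we have
\begin{align*}
P(X + Y = \ell) &= \sum_{k=0}^{\ell} P(X = k) P(Y = \ell - k) \\
&= \sum_{k=0}^{\ell} e^{-\lambda_X} \frac{\lambda_X^{k}}{k!} \, e^{-\lambda_Y} \frac{\lambda_Y^{\ell-k}}{(\ell-k)!} \\
&= e^{-(\lambda_X+\lambda_Y)} \frac{1}{\ell!} \sum_{k=0}^{\ell} \binom{\ell}{k} \lambda_X^{k} \lambda_Y^{\ell-k} \\
&= e^{-(\lambda_X+\lambda_Y)} \frac{(\lambda_X+\lambda_Y)^{\ell}}{\ell!},
\end{align*}
where we use independence in the first equality and the binomial theorem in the last equality. We now see that $X + Y$ is Poisson with parameter $\lambda_X + \lambda_Y$. The parameter can also be found by comparing expectations.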
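A quick numerical check of this identity, as a sketch only: the two parameter values below are arbitrary choices for illustration, and the helper names are hypothetical. The code compares the convolution sum with the Poisson probability mass function whose parameter is the sum of the two parameters.

```python
import math


def poisson_pmf(lam: float, k: int) -> float:
    """P(N = k) for a Poisson random variable N with parameter lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def convolution_pmf(lam_x: float, lam_y: float, ell: int) -> float:
    """P(X + Y = ell) for independent Poisson X and Y, summed over X = k."""
    return sum(
        poisson_pmf(lam_x, k) * poisson_pmf(lam_y, ell - k)
        for k in range(ell + 1)
    )


lam_x, lam_y = 1.5, 2.25  # arbitrary illustrative parameters
max_gap = max(
    abs(convolution_pmf(lam_x, lam_y, ell) - poisson_pmf(lam_x + lam_y, ell))
    for ell in range(20)
)
print(max_gap)  # should be at the level of floating-point rounding error
```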
4. (20 points) A random variable $X$ has the following cumulative distribution function:
\[
F(x) =
\begin{cases}
0 & x \le -1, \\
\frac{1}{2}(x+1) & -1 < x \le 0, \\
\frac{1}{2} & 0 < x \le 1, \\
\frac{1}{2}x & 1 < x \le 2, \\
1 & x > 2.
\end{cases}
\]
(a) Is $X$ continuous or discrete? Explain.
(b) Calculate $P(X \le 1)$, $P(X = 4)$, and $P(1 < X \le 4)$.
(c) If $X$ is continuous, calculate the probability density function of $X$. Otherwise calculate the probability mass function of $X$.
(d) Calculate $E(X)$.
(e) Calculate $\mathrm{Var}(3X + 2)$.
Solution.
(a) X is continuous because the cdf F is a continuous function.
(b) We have
\begin{align*}
P(X \le 1) &= F(1) = \tfrac{1}{2}, \\
P(X = 4) &= 0, \\
P(1 < X \le 4) &= P(X \le 4) - P(X \le 1) = F(4) - F(1) = 1 - \tfrac{1}{2} = \tfrac{1}{2}.
\end{align*}
The second probability follows from the fact that $X$ is continuous, but it can also be seen as follows:
\[ P(X = 4) = P(X \le 4) - P(X < 4) = F(4) - \lim_{x \uparrow 4} F(x) = 1 - 1 = 0. \]
(c) The pdf of $X$ is
\[
f(x) = \frac{dF(x)}{dx} =
\begin{cases}
0 & x \le -1, \\
\frac{1}{2} & -1 < x \le 0, \\
0 & 0 < x \le 1, \\
\frac{1}{2} & 1 < x \le 2, \\
0 & x > 2.
\end{cases}
\]
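(d) We have
\begin{align*}
E(X) &= \int_{-\infty}^{\infty} x f(x)\,dx \\
&= \int_{-1}^{0} x f(x)\,dx + \int_{1}^{2} x f(x)\,dx \\
&= \int_{-1}^{0} x \cdot \frac{1}{2}\,dx + \int_{1}^{2} x \cdot \frac{1}{2}\,dx \\
&= \frac{1}{2} \int_{-1}^{0} x\,dx + \frac{1}{2} \int_{1}^{2} x\,dx = \frac{1}{2}.
\end{align*}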
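(e) Since
\[ E(X^2) = \int_{-\infty}^{\infty} x^2 f(x)\,dx = \frac{1}{2} \int_{-1}^{0} x^2\,dx + \frac{1}{2} \int_{1}^{2} x^2\,dx = \frac{4}{3}, \]
we have that
\[ \mathrm{Var}(X) = E(X^2) - E(X)^2 = \frac{4}{3} - \left(\frac{1}{2}\right)^2 = \frac{13}{12}. \]
Hence we find that
\[ \mathrm{Var}(3X + 2) = \mathrm{Var}(3X) = 9\,\mathrm{Var}(X) = 9 \cdot \frac{13}{12} = \frac{39}{4}. \]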
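The moments in parts (d) and (e) can be reproduced exactly with rational arithmetic. This is a sketch under the assumption that the density equals 1/2 on the intervals from -1 to 0 and from 1 to 2 and is zero elsewhere, as found in part (c); the names `PIECES` and `moment` are hypothetical.

```python
from fractions import Fraction

# Density pieces from part (c): (left endpoint, right endpoint, constant height).
PIECES = [
    (Fraction(-1), Fraction(0), Fraction(1, 2)),
    (Fraction(1), Fraction(2), Fraction(1, 2)),
]


def moment(k: int) -> Fraction:
    """E(X^k) for a piecewise-constant density, integrating x^k exactly."""
    return sum(
        height * (right ** (k + 1) - left ** (k + 1)) / (k + 1)
        for left, right, height in PIECES
    )


mean_x = moment(1)
var_x = moment(2) - mean_x**2
var_3x_plus_2 = 9 * var_x  # adding a constant does not change the variance
print(mean_x, moment(2), var_x, var_3x_plus_2)
```

Calling `moment(0)` returns the total mass of the density, which is a cheap way to confirm that the pieces were entered correctly.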
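5. (5 points) Consider the following probability density function with an unknown parameter $\theta \in (0, 1)$:
\[
f_\theta(x) =
\begin{cases}
\dfrac{2x}{\theta} & 0 < x < \theta, \\[1ex]
\dfrac{2(1-x)}{1-\theta} & \theta \le x < 1, \\[1ex]
0 & \text{otherwise.}
\end{cases}
\]
The diagram below depicts $f_\theta$ for $\theta = 1/4$.
[Figure: graph of $f_\theta(x)$ against $x$ for $\theta = 1/4$; the horizontal axis is marked at 1/4 and 1, the vertical axis at 1 and 2.]
Suppose $X_1, \ldots, X_n$ is a random sample from the population probability density function $f_\theta$ for some unknown $\theta$. Calculate the moment estimator $\Theta$ of $\theta$.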
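Solution. First we need to calculate the expectation of $X_1$. We find that
\[ E(X_1) = \int_{0}^{1} x f_\theta(x)\,dx = \int_{0}^{\theta} \frac{2x^2}{\theta}\,dx + \int_{\theta}^{1} \frac{2x(1-x)}{1-\theta}\,dx = \frac{1+\theta}{3}. \]
By solving $E(X_1) = \bar{X}$, we find that the moment estimator $\Theta$ of $\theta$ is $3\bar{X} - 1$.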
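The expectation used in Problem 5 can be checked numerically. The sketch below integrates x times the density with a midpoint rule and compares the result with one third of one plus the parameter; it also writes the moment estimator as a function of the sample. The function names, the step count, and the small sample are hypothetical choices made for illustration only.

```python
def density(x: float, theta: float) -> float:
    """The triangular density from the problem statement."""
    if 0 < x < theta:
        return 2 * x / theta
    if theta <= x < 1:
        return 2 * (1 - x) / (1 - theta)
    return 0.0


def mean_by_midpoint_rule(theta: float, steps: int = 100_000) -> float:
    """Approximate the integral of x * density(x) over (0, 1)."""
    width = 1 / steps
    total = 0.0
    for i in range(steps):
        midpoint = (i + 0.5) * width
        total += midpoint * density(midpoint, theta)
    return total * width


def moment_estimator(sample: list[float]) -> float:
    """Moment estimator of theta: three times the sample mean, minus one."""
    return 3 * sum(sample) / len(sample) - 1


theta = 0.25
print(mean_by_midpoint_rule(theta), (1 + theta) / 3)
print(moment_estimator([0.3, 0.5, 0.4]))  # made-up sample with mean 0.4
```

Note that the estimator can fall outside the interval (0, 1) for some samples, since it is a linear function of the sample mean.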
6. (12 points) We have a random sample $X_1, \ldots, X_n$. We are interested in the following estimator for the population variance $\sigma^2$:
\[ T^2 = \frac{1}{n(n-1)} \sum_{1 \le i < j \le n} (X_i - X_j)^2. \]
(a) Give the definition of a statistic and explain why $T^2$ is a statistic.
(b) Give the definition of bias and calculate the bias of this estimator.
(c) Compute $T^2 - S^2$, where $S^2$ is the sample variance from class. Simplify as much as possible.
Solution.
(a) Any function of the random sample $X_1, \ldots, X_n$ is a statistic. The expression for $T^2$ shows that it is a function of $X_1, \ldots, X_n$, and thus it is a statistic.
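(b) Suppose $\theta$ is an unknown parameter and $\hat{\Theta}$ is an estimator of $\theta$. Then the bias of $\hat{\Theta}$ is defined as
\[ \mathrm{bias}(\hat{\Theta}) = E(\hat{\Theta}) - \theta. \]
By definition, the bias of $T^2$ is
\[ \mathrm{bias}(T^2) = E\left( \frac{1}{n(n-1)} \sum_{1 \le i < j \le n} (X_i - X_j)^2 \right) - \sigma^2 = \frac{1}{2} E\left[ (X_1 - X_2)^2 \right] - \sigma^2, \]
because the sum has $n(n-1)/2$ terms, each with the same expectation.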
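Since $X_1, \ldots, X_n$ is a random sample, we have that
\[ E\left[ (X_1 - X_2)^2 \right] = E\left[ X_1^2 \right] + E\left[ X_2^2 \right] - 2E(X_1 X_2) = (\sigma^2 + \mu^2) + (\sigma^2 + \mu^2) - 2\mu^2 = 2\sigma^2. \]
You can also derive this from $E\left[ (X_1 - X_2)^2 \right] = \mathrm{Var}(X_1 - X_2) = \mathrm{Var}(X_1) + \mathrm{Var}(X_2) = 2\sigma^2$. Thus, the bias of $T^2$ is 0.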
(c) We know that
\[ S^2 = \frac{1}{n-1} \left[ \sum_{i=1}^{n} X_i^2 - \frac{\left( \sum_{i=1}^{n} X_i \right)^2}{n} \right]. \]
It turns out that $T^2 = S^2$, so $T^2 - S^2 = 0$. There are many ways to show this, and here is one of them. Since the expressions for $S^2$ and $T^2$ are symmetric in each of the $X_i$, all we need to show is that the coefficients in front of $X_1^2$ and $X_1 X_2$ match. The above expression for $S^2$ shows that the coefficient in front of $X_1^2$ is
\[ \frac{1}{n-1} \left( 1 - \frac{1}{n} \right) = \frac{1}{n}, \]
and for $T^2$ it is also $1/n$. Similarly, in $S^2$ the coefficient in front of $X_1 X_2$ is $-2/(n(n-1))$ and this also agrees with $T^2$.
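The identity between the pairwise estimator and the sample variance can also be confirmed on concrete numbers. The sketch below is an illustration, not a proof: it uses exact rational arithmetic on one small data set (borrowed from the first sample of Problem 7), and `pairwise_estimator` is a hypothetical helper name.

```python
from fractions import Fraction
from itertools import combinations
from statistics import variance


def pairwise_estimator(sample: list[int]) -> Fraction:
    """Sum of squared differences over all pairs i < j, divided by n(n-1)."""
    n = len(sample)
    total = sum((x - y) ** 2 for x, y in combinations(sample, 2))
    return Fraction(total, n * (n - 1))


data = [13, 8, 6, 13, 9, 11]
t_squared = pairwise_estimator(data)
# statistics.variance is the sample variance (divisor n - 1).
s_squared = variance([Fraction(x) for x in data])
print(t_squared, s_squared)
```

`itertools.combinations` yields each unordered pair once, which matches the sum over pairs with the first index smaller than the second.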
7. (12 points) We are given two independent random samples. The first random sample is:
13, 8, 6, 13, 9, 11.
The sample variance for this data set is $s_1^2 = 8$. Suppose $\mu_1$ and $\sigma_1^2$ are the unknown population mean and variance, respectively. The second random sample is:
5, 10, 9, 13, 9, 11, 7, 8.
The sample variance for this data set is $s_2^2 = 6$. Suppose $\mu_2$ and $\sigma_2^2$ are the unknown population mean and variance, respectively.
We assume that these are normal samples (never mind that each data point is an integer).
(a) Calculate a two-sided 95% confidence interval for $\mu_1$.
(b) Assuming $\sigma_1 = \sigma_2$ for this part of the question only, test $H_0 : \mu_1 = \mu_2$ against $H_1 : \mu_1 > \mu_2$ at significance level $\alpha = 5\%$.
(c) Test $H_0 : \sigma_1 = \sigma_2$ against $H_1 : \sigma_1 < \sigma_2$ at significance level $\alpha = 5\%$.
Solution.
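(a) We have $n_1 = 6$ and $\bar{x}_1 = 10$. The two-sided 95% confidence interval for $\mu_1$ is
\[ \left[ \bar{x}_1 \pm t_{\alpha/2,\, n_1 - 1} \sqrt{\frac{s_1^2}{n_1}} \right] = \left[ 10 \pm 2.571 \sqrt{\frac{4}{3}} \right], \]
where we found $t_{0.025,5} = 2.571$ from the tables.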
(b) We have $n_1 = 6$ and $n_2 = 8$. The sample means are respectively $\bar{x}_1 = 10$ and $\bar{x}_2 = 9$. We also have that
\[ s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2} = \frac{5 \times 8 + 7 \times 6}{12} = \frac{41}{6}. \]
The test statistic is
\[ \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{S_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}} \]
and it has a $t_{n_1 + n_2 - 2}$ distribution under the null hypothesis. The realized value of the test statistic is $\sqrt{144/287}$ and $t_{0.05,12} = 1.782$. We know that $\sqrt{144/287} < 1 < 1.782$, so we do not reject the null hypothesis.
(c) Under $H_0$, $S_1^2 / S_2^2$ has an $F$ distribution with parameters 5 and 7. The realized value of this test statistic is $4/3$. Low values of this statistic support the alternative hypothesis, so the critical region is $(0, f_{0.95,5,7}) = (0, 1/f_{0.05,7,5}) = (0, 1/4.88)$. Since $4/3$ lies outside this region, we do not reject.
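The arithmetic in Problem 7 can be reproduced with the standard library. In the sketch below, the three critical values are copied from the solution text (they come from tables and are not computed here), and all variable names are hypothetical.

```python
import math
from statistics import mean, variance

sample_1 = [13, 8, 6, 13, 9, 11]
sample_2 = [5, 10, 9, 13, 9, 11, 7, 8]
n1, n2 = len(sample_1), len(sample_2)
xbar1, xbar2 = mean(sample_1), mean(sample_2)
s1_sq, s2_sq = variance(sample_1), variance(sample_2)  # divisor n - 1

# Table values quoted in the solution; assumed, not derived.
T_0025_5 = 2.571   # t critical value, upper 2.5%, 5 degrees of freedom
T_005_12 = 1.782   # t critical value, upper 5%, 12 degrees of freedom
F_005_7_5 = 4.88   # F critical value, upper 5%, (7, 5) degrees of freedom

# (a) Two-sided 95% confidence interval for the first population mean.
half_width = T_0025_5 * math.sqrt(s1_sq / n1)
confidence_interval = (xbar1 - half_width, xbar1 + half_width)

# (b) Pooled-variance t test, one-sided to the right.
pooled_var = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
t_stat = (xbar1 - xbar2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
reject_equal_means = t_stat > T_005_12

# (c) F test, one-sided to the left.
f_stat = s1_sq / s2_sq
reject_equal_variances = f_stat < 1 / F_005_7_5

print(confidence_interval, t_stat, reject_equal_means, f_stat, reject_equal_variances)
```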
8. (8 points) We are given the following random sample:
4.3, 6.6, 7.2, 3.4, 0.2, 8.3.
We want to test $H_0 : \mu = 6.8$ against $H_1 : \mu < 6.8$, where $\mu$ is the population mean.
(a) Assuming the sample is normal with known variance 13.5, carry out an appropriate test by computing the P-value.
(b) Under the assumptions of part (a), calculate the type II error probability under the alternative that $\mu = 6$. Use $\alpha = 5\%$. You do not need to give a numerical answer to this question.
Solution.
(a) The sample mean is $\bar{x} = 5$. Under the null hypothesis, we have that
\[ Z = \frac{\bar{X} - 6.8}{\sqrt{13.5/6}} \sim N(0, 1). \]
The test has a left critical region, since very negative realized values support the alternative but very large ones do not. The realized value of this test statistic is $-1.2$, so the P-value is $\Phi(-1.2) = 0.115$.
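(b) The critical region is $(-\infty, z_{0.95}) \approx (-\infty, -1.645)$. The type II error probability is
\begin{align*}
P_{\mu=6}\left( \frac{\bar{X} - 6.8}{\sqrt{13.5/6}} \ge -1.645 \right)
&= P_{\mu=6}\left( \frac{\bar{X} - 6}{\sqrt{13.5/6}} - \frac{0.8}{1.5} \ge -1.645 \right) \\
&= 1 - \Phi(-1.645 + 8/15) = \Phi(1.645 - 8/15).
\end{align*}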
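Both parts of Problem 8 can be evaluated with `statistics.NormalDist` from the Python standard library. This is a sketch only: the critical value -1.645 is taken from the solution text, and the variable names are hypothetical.

```python
import math
from statistics import NormalDist, mean

sample = [4.3, 6.6, 7.2, 3.4, 0.2, 8.3]
mu_null, mu_alternative, known_variance = 6.8, 6.0, 13.5
standard_normal = NormalDist()  # mean 0, standard deviation 1

# (a) Realized test statistic and left-tail P-value.
std_error = math.sqrt(known_variance / len(sample))
z = (mean(sample) - mu_null) / std_error
p_value = standard_normal.cdf(z)

# (b) Type II error probability: the statistic misses the critical region
# when the true mean is mu_alternative.
Z_CRITICAL = -1.645  # lower 5% point of N(0, 1), quoted from the solution
shift = (mu_null - mu_alternative) / std_error  # equals 0.8 / 1.5 = 8/15
type_ii_error = 1 - standard_normal.cdf(Z_CRITICAL + shift)

print(z, p_value, type_ii_error)
```

The exam does not ask for a numerical answer to part (b); the last line simply evaluates the closed-form expression.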
9. (8 points) The simple linear regression model takes the following form:
\[ Y_i = \beta_0 + \beta_1 x_i + \epsilon_i, \]
where the $\epsilon_i$ are i.i.d. normal with mean 0 and variance $\sigma^2$. In this model, $\beta_0$, $\beta_1$, and $\sigma^2$ are unknown parameters. We found estimators $B_0$ and $B_1$ for $\beta_0$ and $\beta_1$ by minimizing $\sum_{i=1}^{n} (Y_i - \beta_0 - \beta_1 x_i)^2$ over $\beta_0$ and $\beta_1$.
In this problem, we replicate the above strategy in a simpler setting where we know that $\beta_1 = 0$ and we need to find an estimator for $\beta_0$. The other underlying assumptions are as for the simple linear regression model.
(a) Find an estimator $B_0$ for $\beta_0$ by minimizing $\sum_{i=1}^{n} (Y_i - \beta_0)^2$ over $\beta_0$.
(b) What is the distribution of the estimator found in part (a)? Specify the parameters if it
is one of the common distributions from class.
Solution.
(a) Similar to the setting of standard simple linear regression, we compute the derivative with respect to $\beta_0$:
\[ \frac{d}{d\beta_0} \sum_{i=1}^{n} (Y_i - \beta_0)^2 = -2 \sum_{i=1}^{n} (Y_i - \beta_0). \]
Setting the above derivative to 0 gives
\[ B_0 = \frac{1}{n} \sum_{i=1}^{n} Y_i. \]
(b) From the model we deduce that the $Y_i$ are independent and normal with mean $\beta_0$ and variance $\sigma^2$. As a result, $B_0$ is normal with mean $\beta_0$ and variance $\sigma^2/n$.
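The minimization in part (a) can be illustrated on a toy data set. The sketch below is not a proof: it uses three made-up observations, evaluates the sum of squares on a grid of candidate intercepts around the sample mean, and confirms that the sample mean gives the smallest value. All names and numbers are hypothetical.

```python
from fractions import Fraction


def sum_of_squares(observations: list[Fraction], intercept: Fraction) -> Fraction:
    """Least-squares criterion for the intercept-only model."""
    return sum((y - intercept) ** 2 for y in observations)


observations = [Fraction(3), Fraction(7), Fraction(8)]  # made-up data
sample_mean = sum(observations) / len(observations)

# Candidate intercepts from sample_mean - 2 to sample_mean + 2 in steps of 1/10.
candidates = [sample_mean + Fraction(step, 10) for step in range(-20, 21)]
best = min(candidates, key=lambda b: sum_of_squares(observations, b))

print(sample_mean, best, sum_of_squares(observations, best))
```

Exact fractions keep the comparison free of rounding issues, so the grid minimum coincides with the sample mean exactly rather than approximately.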
10. (9 points) Project question – removed from this file.