CC0294 Semester 1 – 2016 Page 1 of 14

SEAT NUMBER:

LAST NAME:

FIRST NAME:

SID:

CONFIDENTIAL EXAM PAPER

This paper is not to be removed from the examination room.

ECMT1010

Introduction to Economic Statistics

End of Semester Examination

Semester 1 – 2016

Total Duration: 2 hours and 10 minutes

Writing Time: 2 hours

Reading Time: 10 minutes

INSTRUCTIONS TO CANDIDATES

1. This is a closed book exam. Electronic devices, apart from a simple, non-programmable

calculator, are not permitted. Formulas, definitions, and distribution tables are provided

at the end of the exam.

2. This exam contains 30 Multiple Choice Questions and 2 Problems with multiple parts.

ANSWER ALL QUESTIONS.

3. Multiple Choice Questions are worth 0.5 marks each. Problems are worth 20 marks each.

The marks for each part of the Problems are indicated. Marks total 55.

4. Answer all Multiple Choice Questions on the answer sheet provided for that purpose.

Answer the Problems in the Exam Booklets provided.

5. This question paper must be returned with the Multiple Choice answer sheet and the

Exam Booklet answer script(s).

6. Take care to explain your answers and to write legibly.

Please check your examination paper is complete (14 pages) and indicate you have done

this by signing below.

I have checked the examination paper and affirm it is complete.

Student Signature: Date:

30 Multiple Choice Questions [15 marks total—suggested time approx. 32 minutes].

1. A bank reports that 30% of households have a MasterCard, 20% have an American Express card, and

25% have a Visa card. Eight percent of households have both a MasterCard and an American Express

card. Twelve percent have both a Visa card and a MasterCard. Six percent have both an American

Express card and a Visa card. If a household has a MasterCard, what is the probability it also has a

Visa card?

A) 0.12

B) 0.25

C) 0.40

D) 0.43

E) 0.48

2. Lenovo Group Limited, a Hong Kong IT company, has a 30% share of the Hong Kong PC market.

Suppose 10 new PC buyers are selected at random from the Hong Kong population. What is the

probability that fewer than 3 bought their PC from Lenovo?

A) 0.028

B) 0.121

C) 0.233

D) 0.267

E) 0.382

3. A pair of (fair) dice is rolled once. What is the probability that the sum of the values on the two die

faces is 7?

A) 1/6

B) 7/36

C) 1/2

D) 1/18

E) 1/3

4. Calculate the expected value (mean) of the following discrete probability distribution:

x 1 2 3 4

p(x) 0.16 0.26 0.26 0.32

A) 1

B) 2.5

C) 1.86

D) 2.74

E) 2.0

5. A psychologist wants to test whether the IQ of statisticians is higher than 100. Based on a

random sample of 100 statisticians, the sample mean of IQ is 120. What is the p-value of the test

assuming a population standard deviation of 100?

A) 0.0000

B) 0.0228

C) 0.0456

D) 0.4207

E) 0.8414

Scenario 1 In a marketing research project, a major supermarket wants to study the relationship between the annual consumption of ramen noodles (Y, in number of packs) and the annual income level of consumers (X, in $000s). Based on a random sample of 100 customers, the linear regression model

Yi = β0 + β1Xi + εi

is estimated with the following result:

                           Coefficient   Standard Error
Intercept                  55.4          32.3
Annual income (in $000s)   −0.22         0.1

6. Refer to Scenario 1. What is the predicted annual consumption (in number of packs) of ramen

noodles for a consumer who earns $100,000 a year?

A) 2.2

B) 22.0

C) 33.4

D) 53.2

E) 55.4

7. Refer to Scenario 1. To see whether income level has an effect on the consumption of ramen noodles,

Adam, Simon and Tim consider the following hypotheses:

H0 : β1 = 0    Ha : β1 ≠ 0

They arrive at the following conclusions:

Adam: The null hypothesis is rejected at the 5% significance level.

Simon: The null hypothesis is rejected at the 2% significance level.

Tim: The null hypothesis is rejected at the 1% significance level.

Who is/are correct?

A) Tim only

B) Simon only

C) Adam only

D) Adam and Simon only

E) Adam, Simon and Tim

8. Alcohol content in beer is believed to follow a normal distribution. A chemist takes a sample from

9 bottles of beer and measures the alcohol content, finding a sample mean of 7.5% and a sample

standard deviation of 1%. The chemist wishes to compute a 90% confidence interval for the mean.

However, the chemist mistakenly treats the sample standard deviation as if it were the population

standard deviation. What is the confidence interval constructed by the chemist?

A) (6.647, 8.153)

B) (6.952, 8.048)

C) (6.880, 8.120)

D) (7.073, 7.927)

E) (7.034, 7.966)

9. Suppose the chemist in the previous question realises he has made a mistake. If he corrects his mistake

and recalculates the confidence interval using the same sample, how will the new confidence interval

compare to the previous one?

A) The new interval will be the same width as the previous one and will be shifted to the left to

account for small sample bias.

B) The new interval will be wider than the previous one and will be centered around the same

point estimate.

C) The new interval will be narrower than the previous one and will be centered around the same

point estimate.

D) The new interval will be narrower than the previous one and will be shifted to the left to account

for small sample bias.

E) This cannot be determined from the data given.

10. A lecturer hires a tutor to mark exam papers. To ensure that the tutor is grading correctly, the

lecturer marks a few exam papers herself and compares her mark with the mark given by the tutor.

She chooses these papers by physically going through the pile of exams and pulling out a paper “when

she feels like it”. This corresponds to which form of sampling?

A) Judgement sampling

B) Simple random sampling

C) Systematic sampling

D) Cluster sampling

E) Snowball sampling

11. A pair of (fair) dice is rolled once. What is the probability that the sum of the values on the two die

faces is not a 7?

A) 1/6

B) 7/36

C) 1/3

D) 1/2

E) 5/6

Scenario 2 The following table is derived from the Banerjee et al (2010) study on vaccination rates in

India.

                               Control Group   Treatment Only   Treatment plus Incentive
Children not fully immunised   810             311              234
Children fully immunised       50              68               148

12. Refer to Scenario 2. What is the point estimate of the proportion of children that were fully immunised in villages that received only the treatment and not the additional incentives?

A) 0.613

B) 0.219

C) 0.179

D) 0.821

E) 0.387

13. Refer to Scenario 2. Suppose a researcher wants to test the null hypothesis that the true proportion

of children that were fully immunised in villages that received only the treatment was exactly 0.2 at

a 1% level of significance (99% confidence level). What critical value will the researcher have to look up in the appropriate

statistical table in order to do this?

A) 1.28

B) 1.645

C) 1.96

D) 2.33

E) 2.575

14. Complete the following sentence to arrive at the correct statement of the Central Limit Theorem: “If samples of size n are drawn randomly from a population with mean µ and standard deviation σ . . . ”

A) then repeated observations of the sample mean x̄ will follow a normal distribution with mean µ and standard deviation σ, regardless of the underlying distribution.

B) then if the sample size is sufficiently large (n ≥ 30), repeated observations of the sample mean x̄ will follow a normal distribution with mean µ and standard deviation σ, regardless of the underlying distribution.

C) then if the sample size is sufficiently large (n ≥ 10), repeated observations of the sample mean x̄ will follow a normal distribution with mean µ and standard deviation σ/√n, regardless of the underlying distribution.

D) then if the sample size is sufficiently large (n ≥ 30), repeated observations of the sample mean x̄ will follow a normal distribution with mean µ and standard deviation σ/√n, regardless of the underlying distribution.

E) then if the sample size is sufficiently large (n ≥ 30), repeated observations of the sample mean x̄ will follow a normal distribution with mean µ and standard deviation σ/√n, as long as the underlying distribution is normal.

15. A researcher is interested in the following hypotheses about the mean of a population:

H0 : µ ≤ 2    Ha : µ > 2

Based on a sample of 45 observations, the researcher calculates a t statistic of 2.2. At a 1% level

of significance what is the researcher’s conclusion?

A) The researcher is unable to reject a false null hypothesis.

B) The researcher fails to reject the null hypothesis.

C) The researcher accepts the null hypothesis.

D) The researcher rejects the null hypothesis.

E) The researcher commits a type I error.

Scenario 3 On the first day of class, students in an introductory economics course were asked their sex

and eye color. The results are summarized in the table below.

         Blue   Brown   Green   Hazel   All
Female   24     21      10      11      66
Male     20     17      8       10      55
Total    44     38      18      21      121

16. Refer to Scenario 3. What is the probability that a randomly selected student in the class is a female

or has brown eyes?

A) 0.660

B) 0.860

C) 0.314

D) 0.545

E) 0.686

17. Refer to Scenario 3. What is the probability that a randomly selected student in the class is a female

and has hazel eyes?

A) 0.634

B) 0.091

C) 0.149

D) 0.174

E) 0.545

18. Refer to Scenario 3. What is the probability that a randomly selected student is a male, if we know

that they have hazel eyes?

A) 0.476

B) 0.182

C) 0.083

D) 0.078

E) 0.455

19. An article published in the Canadian Journal of Zoology presented a method for estimating the body

fat percentage of North American porcupines. The method was illustrated with a sample of n = 25

porcupines. Based on this sample, a 95% bootstrap confidence interval for the average body fat

percentage of porcupines is 17.4% to 25.8%. Which of the following null hypotheses would be

rejected based on this confidence interval?

A) H0 : µ = 18.6%.

B) H0 : µ = 26.6%.

C) H0 : µ = 20.0%.

D) H0 : µ = 22.9%.

E) H0 : µ = 24.6%.

Scenario 4 Admissions records at MIT indicate that 6.7% of the graduate students enrolled are from

Canada.

20. Refer to Scenario 4. What is the minimum sample size for which the Central Limit Theorem applies

in this case?

A) n = 30.

B) n = 40.

C) n = 50.

D) n = 100.

E) n = 200.

21. Refer to Scenario 4. Find the mean and standard error of the sample proportion of Canadian students

in random samples of 100 graduate students at MIT.

A) p̂ = 0.067, SE = 0.0625.

B) p̂ = 0.067, SE = 0.006.

C) p̂ = 0.067, SE = 0.025.

D) p̂ = 0.670, SE = 0.250.

E) p̂ = 0.067, SE = 0.0067.

22. Refer to Scenario 4. Roughly what percentage of samples of 100 randomly selected graduate students

at MIT will have at least 10% of students from Canada?

A) 5%.

B) 6.7%.

C) 10%.

D) 18%.

E) 25%.

23. For a N(0, 1) density, what is the area to the left of z = −1.645?

A) 2.5%.

B) 3.5%.

C) 5%.

D) 10%.

E) 11%.

24. For a N(0, 1) density, what is the area outside the interval between z = −2.326 and z = 1.282?

A) 2.5%.

B) 3.5%.

C) 5%.

D) 10%.

E) 11%.

25. A sample of 148 university students reports sleeping an average of 6.85 hours on weeknights. The

sample size is large enough to use the normal distribution, and a bootstrap distribution shows that

the standard error is SE = 0.175. Use a normal distribution to construct a 95% confidence interval

for the mean amount of weeknight sleep students get at this university.

A) 6.68 to 7.03 hours.

B) 6.51 to 7.19 hours.

C) 6.50 to 7.20 hours.

D) 6.52 to 7.21 hours.

E) 6.85 to 7.85 hours.

26. Suppose that a 95% confidence interval for µ is (54.8, 60.8). Which of the following is most likely

the p-value for the test of H0 : µ = 56 versus Ha : µ ≠ 56?

A) 0.031

B) 0.001

C) 0.016

D) 0.231

E) 0.05

27. The randomization distribution for testing the hypotheses H0 : µ1 = µ2 versus Ha : µ1 ≠ µ2 is

provided. The sample statistic is x̄1 − x̄2 = −2.5. Use the provided randomization distribution

(based on 100 samples) to estimate the p-value for this test.

A) 10%

B) 2%

C) 5%

D) 1%

E) 4%

Scenario 5 Refer to the following probability tree diagram to find the requested probabilities. (Round your

answers to two decimal places.)

28. Refer to Scenario 5. What is P(Y | A)?

A) 0.60

B) 0.50

C) 0.40

D) 0.20

E) 0.06

29. Refer to Scenario 5. What is P(A | Y)?

A) 0.82

B) 0.30

C) 0.20

D) 0.18

E) 0.06

30. Refer to Scenario 5. What is P(X)?

A) 0.66

B) 0.50

C) 0.48

D) 0.42

E) 0.44

Problem 1 [20 marks total—suggested time approx. 44 minutes]

Using data from the United States for 1970–2009, a researcher obtains the following regression output for a

model to predict life expectancy based on the total number of vehicles produced (measured in thousands).

Predictor    Coefficient   SE coef.    t stat
Intercept    65.8455       0.2434      270.5326
Vehicles     0.05015       0.001286    38.9868

Regression statistics

R square   0.9756     SD error   0.3311     Observations   40

Analysis of variance

Source       df    SS
Regression   1     166.6377
Residual     38    4.1660
Total        39    170.8037

a) What is the correlation between vehicles produced and life expectancy? [2 marks]

b) Test whether the correlation between vehicles and life expectancy is statistically significant at the 1%

level. Show all your steps. [3 marks]

c) State in words your conclusion from the correlation test of significance. [2 marks]

d) Give an interpretation of the slope coefficient. [2 marks]

e) Test whether the slope coefficient is statistically significant at the 1% level. Show all your steps. [3

marks]

The researcher uses the bootstrap to investigate the regression slope estimate. The following shows the

results from 1,000 bootstrap samples.

f) Briefly explain the purpose of the bootstrap distribution in this context. [2 marks]

g) Use the bootstrap distribution to build a 99% confidence interval for the slope parameter. [2 marks]

h) Comment on your findings in b), e), and g). [2 marks]

i) What do you think about the overall validity of this study? [2 marks]

Problem 2 [20 marks total—suggested time approx. 44 minutes]

An upcoming biology quiz has 10 multiple choice questions, each with 4 choices. Eugene has not studied

for the quiz. In fact, he hasn’t even opened the textbook since the beginning of term. In short, he knows

nothing about biology and will have to guess the answer to every question. As it happens, Eugene is very

good at statistics and he is going to compute the probability that he passes the quiz (5 or more correct

answers). Let X be the number of questions Eugene correctly guesses on the biology quiz.

a) What is the name of the distribution of X? Specify the parameters of X. [2 marks]

b) Compute the mean of X. [1 mark]

c) What is the probability that Eugene gets at least 1 answer correct? [2 marks]

d) What is the probability that Eugene passes the quiz? [3 marks]

Eugene’s biology lecturer Sandy, who is also very good at statistics, wants to evaluate whether the marks

on the quiz have improved since another 10-question quiz carried out earlier in the term. The table below

gives a sample of 10 grades on the two quizzes. Sandy is interested in testing whether the mean mark on

the second quiz is significantly higher than the mean mark on the first quiz.

First quiz    7  9  6  9  8  10  7  7  8  6
Second quiz   8  9  7  9  8   9  9  8  9  7

e) Clearly define your notation and state the null and alternative hypothesis assuming that the marks

from the first quiz come from a random sample of 10 students in the class and the grades on the

second quiz come from a different random sample of 10 students in the class. [2 marks]

f) Complete the test in (e) and clearly state the conclusion. [3 marks]

g) Clearly define your notation and state the null and alternative hypothesis assuming that the marks

recorded for the first quiz and second quiz are from the same 10 students (so that the first student

got 7 on the first quiz and 8 on the second quiz, and so on). [2 marks]

h) Complete the test in (g) and clearly state the conclusion. [3 marks]

i) Why are the hypothesis test results so different? Which is a better way to collect the data to answer

the question of whether grades are higher on the second quiz? [2 marks]

END OF THE EXAM

Formulas, definitions, and distribution tables

Population and sample statistics.

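Statistic            Population    Sample

size                 $N$    $n$

mean                 $\mu = \frac{\sum_{i=1}^{N} x_i}{N}$    $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$

standard deviation   $\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$    $s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$

correlation          $\rho = \frac{1}{N} \sum_{i=1}^{N} \frac{(x_i - \mu_x)}{\sigma_x} \frac{(y_i - \mu_y)}{\sigma_y}$    $r = \frac{1}{n - 1} \sum_{i=1}^{n} \frac{(x_i - \bar{x})}{s_x} \frac{(y_i - \bar{y})}{s_y}$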

Descriptive statistics.

Statistic                 Definition

z-score                   $z_i = \frac{x_i - \bar{x}}{s}$

range                     range = max − min

inter-quartile range      IQR = $Q_3 - Q_1$

outliers                  $x_i < Q_1 - 1.5(\mathrm{IQR})$ or $x_i > Q_3 + 1.5(\mathrm{IQR})$

95% rule                  $\bar{x} \pm 2s$

interval estimate         statistic ± margin of error

95% confidence interval   statistic ± 2 × SE

Standard deviations and standard errors for various statistics.

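Statistic                 Standard deviation    Standard error

$\bar{x}$                 $\frac{\sigma}{\sqrt{n}}$    $\frac{s}{\sqrt{n}}$

$\hat{p}$                 $\sqrt{\frac{p(1 - p)}{n}}$    $\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$

$\bar{x}_1 - \bar{x}_2$   $\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$    $\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$

$\hat{p}_1 - \hat{p}_2$   $\sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}$    $\sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$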
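As an illustration only, the standard error column above could be computed as in the following Python sketch. The function names and the input values are hypothetical and were chosen for easy arithmetic; they are not taken from any question in this paper.

```python
import math


def se_mean(s: float, n: int) -> float:
    """Standard error of a sample mean: s / sqrt(n)."""
    return s / math.sqrt(n)


def se_proportion(p_hat: float, n: int) -> float:
    """Standard error of a sample proportion: sqrt(p_hat * (1 - p_hat) / n)."""
    return math.sqrt(p_hat * (1.0 - p_hat) / n)


def se_diff_means(s1: float, n1: int, s2: float, n2: int) -> float:
    """Standard error of a difference in sample means."""
    return math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)


def se_diff_proportions(p1: float, n1: int, p2: float, n2: int) -> float:
    """Standard error of a difference in sample proportions."""
    return math.sqrt(p1 * (1.0 - p1) / n1 + p2 * (1.0 - p2) / n2)


if __name__ == "__main__":
    # Hypothetical inputs, not data from the exam.
    print(se_mean(2.0, 16))                 # 2 / sqrt(16)
    print(se_proportion(0.5, 100))          # sqrt(0.25 / 100)
    print(se_diff_means(3.0, 9, 4.0, 16))   # sqrt(9/9 + 16/16)
```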

Confidence intervals.

100(1−α)% confidence interval: $\text{statistic} \pm z^*_{\alpha/2} \times SE$ for N(0,1) distribution

$\text{statistic} \pm t^*_{\alpha/2} \times SE$ for t distribution with df = n − 1
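A minimal Python sketch of the two interval formulas above follows. The helper name `confidence_interval`, the point estimate 50.0 and the standard error 2.0 are hypothetical; the critical values are the ones printed in the percentile tables on this formula sheet.

```python
# z* values from the N(0,1) percentile table on this formula sheet,
# keyed by confidence level in percent.
Z_STAR = {80: 1.282, 90: 1.645, 95: 1.960, 98: 2.326, 99: 2.575}


def confidence_interval(statistic: float, se: float, critical_value: float):
    """Return (lower, upper) for statistic +/- critical_value * SE."""
    margin = critical_value * se
    return statistic - margin, statistic + margin


if __name__ == "__main__":
    # Hypothetical point estimate and standard error.
    low, high = confidence_interval(50.0, 2.0, Z_STAR[95])
    print(f"95% z interval: ({low:.2f}, {high:.2f})")

    # Same estimate with the t table value for df = 9, right tail 0.025.
    low, high = confidence_interval(50.0, 2.0, 2.262)
    print(f"95% t interval: ({low:.3f}, {high:.3f})")
```

With the same estimate and standard error, the t interval is wider than the z interval because the t critical value is larger.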

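Null hypothesis          Test statistic

$H_0 : \mu = \mu_0$          $\frac{\bar{x} - \mu_0}{s / \sqrt{n}} \sim t_{n-1}$

$H_0 : p = p_0$              $\frac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}} \sim N(0,1)$

$H_0 : \mu_1 - \mu_2 = 0$    $\frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} \sim t_{n-1}$ where $n = \min(n_1, n_2)$

$H_0 : p_1 - p_2 = 0$        $\frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\left(\frac{1}{n_1} + \frac{1}{n_2}\right)\hat{p}(1 - \hat{p})}} \sim N(0,1)$ where $\hat{p} = \frac{x_1 + x_2}{n_1 + n_2}$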
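The test statistics in the table above can be sketched in Python as follows. This is an illustration under stated assumptions: the function names are hypothetical, the inputs are made-up numbers rather than exam data, and the normal CDF is computed with `math.erf` instead of being read from a table.

```python
import math


def t_stat_mean(x_bar: float, mu0: float, s: float, n: int) -> float:
    """One-sample statistic (x_bar - mu0) / (s / sqrt(n)), df = n - 1."""
    return (x_bar - mu0) / (s / math.sqrt(n))


def z_stat_proportion(p_hat: float, p0: float, n: int) -> float:
    """One-sample proportion statistic, using p0 in the standard error."""
    return (p_hat - p0) / math.sqrt(p0 * (1.0 - p0) / n)


def t_stat_two_means(x1, s1, n1, x2, s2, n2):
    """Two-sample statistic and its df, min(n1, n2) - 1, as on this sheet."""
    t = (x1 - x2) / math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    return t, min(n1, n2) - 1


def normal_cdf(z: float) -> float:
    """Area to the left of z under the N(0,1) density."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


if __name__ == "__main__":
    # Hypothetical inputs chosen for easy arithmetic.
    print(t_stat_mean(52.0, 50.0, 4.0, 16))
    z = z_stat_proportion(0.25, 0.20, 100)
    print(z, 2.0 * (1.0 - normal_cdf(abs(z))))  # statistic, two-sided p-value
    print(t_stat_two_means(12.0, 3.0, 9, 10.0, 4.0, 16))
```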

Selected percentiles from the N(0,1) distribution.

Right-tail probability Confidence level z∗

0.10 80% 1.282

0.05 90% 1.645

0.025 95% 1.960

0.01 98% 2.326

0.005 99% 2.575

Selected percentiles from t distributions with various degrees of freedom.

Right-tail probability

df 0.05 0.025 0.01 0.005

8 1.860 2.306 2.896 3.355

9 1.833 2.262 2.821 3.250

38 1.686 2.024 2.427 2.712

98 1.661 1.984 2.365 2.627

Probability rules.

Conditional probability: $P(A \mid B) = \frac{P(A \text{ and } B)}{P(B)}$

Multiplicative rule: P(A and B) = P(A|B)P(B)

Independence: P(A|B) = P(A)

Mutual exclusion: P(A and B) = 0

Law of total probability.

P(A) = P(A and B) + P(A and (not B))

P(A) = P(A and B1) + P(A and B2) + · · ·+ P(A and Bk) where (B1,B2, . . . ,Bk) are disjoint

Bayes’ rule for two cases.

$P(A \mid B) = \frac{P(B \mid A)P(A)}{P(B \mid A)P(A) + P(B \mid \text{not } A)P(\text{not } A)}$

Bayes’ rule for j = 1, 2, . . . , k.

$P(A_j \mid B) = \frac{P(B \mid A_j)P(A_j)}{P(B \mid A_1)P(A_1) + P(B \mid A_2)P(A_2) + \cdots + P(B \mid A_k)P(A_k)}$ where $(A_1, A_2, \ldots, A_k)$ are disjoint.
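As a sketch of how Bayes' rule combines with the law of total probability, the following Python function takes prior probabilities P(A_j) and likelihoods P(B | A_j) and returns the posteriors P(A_j | B). The function name and the example numbers are hypothetical and do not correspond to any diagram in this paper; the sketch also assumes the priors sum to 1.

```python
def bayes_posteriors(priors, likelihoods):
    """Return [P(A_j | B)] given [P(A_j)] and [P(B | A_j)].

    Assumes the events A_1, ..., A_k are disjoint and their priors sum to 1.
    """
    if len(priors) != len(likelihoods):
        raise ValueError("priors and likelihoods must have the same length")
    if abs(sum(priors) - 1.0) > 1e-9:
        raise ValueError("priors must sum to 1")
    # Numerators: P(B | A_j) * P(A_j) = P(A_j and B).
    joint = [p * l for p, l in zip(priors, likelihoods)]
    # Denominator: P(B), by the law of total probability.
    p_b = sum(joint)
    if p_b == 0.0:
        raise ValueError("P(B) is zero, so the posteriors are undefined")
    return [j / p_b for j in joint]


if __name__ == "__main__":
    # Hypothetical two-case example: P(A) = 0.4, P(not A) = 0.6,
    # P(B | A) = 0.5, P(B | not A) = 0.25.
    print(bayes_posteriors([0.4, 0.6], [0.5, 0.25]))
```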

Population statistics for a discrete random variable X with probability function p(x).

Mean: $\mu = \sum_{i=1}^{n} x_i \, p(x_i)$

Standard deviation: $\sigma = \sqrt{\sum_{i=1}^{n} (x_i - \mu)^2 \, p(x_i)}$
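A short Python sketch of these two formulas, assuming the distribution is given as parallel lists of values and probabilities. The function name and the example distribution are hypothetical.

```python
import math


def discrete_mean_sd(values, probs):
    """Return (mu, sigma) for a discrete distribution p(x_i) over x_i."""
    if len(values) != len(probs):
        raise ValueError("values and probs must have the same length")
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    mu = sum(x * p for x, p in zip(values, probs))
    variance = sum((x - mu) ** 2 * p for x, p in zip(values, probs))
    return mu, math.sqrt(variance)


if __name__ == "__main__":
    # Hypothetical distribution: x = 0, 1, 2 with p(x) = 0.25, 0.5, 0.25.
    mu, sigma = discrete_mean_sd([0, 1, 2], [0.25, 0.5, 0.25])
    print(mu, sigma)
```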

Suppose X follows a binomial distribution with parameters n and p.

Binomial probability: $P(X = k) = \binom{n}{k} p^k (1 - p)^{n-k} = \frac{n!}{k!(n - k)!} p^k (1 - p)^{n-k}$

Expected value: $n \times p$

Standard deviation: $\sqrt{np(1 - p)}$
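The binomial formulas above can be sketched in Python with `math.comb` from the standard library. The function names and the parameters n = 4, p = 0.5 are hypothetical and chosen so the results are easy to check by hand.

```python
import math


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)


def binomial_at_least(k: int, n: int, p: float) -> float:
    """P(X >= k), summing the probability function from k to n."""
    return sum(binomial_pmf(j, n, p) for j in range(k, n + 1))


def binomial_mean_sd(n: int, p: float):
    """Expected value n * p and standard deviation sqrt(n * p * (1 - p))."""
    return n * p, math.sqrt(n * p * (1.0 - p))


if __name__ == "__main__":
    # Hypothetical parameters: n = 4 trials, success probability p = 0.5.
    print(binomial_pmf(2, 4, 0.5))       # C(4, 2) / 2**4
    print(binomial_at_least(1, 4, 0.5))  # 1 - P(X = 0)
    print(binomial_mean_sd(4, 0.5))
```

For "at least one" questions, the complement 1 − P(X = 0) gives the same value with a single term.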

Simple linear regression.

Population regression model: $y = \beta_0 + \beta_1 x + \varepsilon$

Sample regression model: $\hat{y} = b_0 + b_1 x$

100(1−α)% confidence interval for $\beta_k$: $b_k \pm t^*_{df,\alpha/2} \times SE_{b_k}$ for k = 0, 1 with df = n − 2

t statistic for $H_0 : \beta_k = 0$: $t = \frac{b_k}{SE_{b_k}}$ for k = 0, 1 with df = n − 2

t statistic for $H_0 : \rho = 0$: $t = \frac{r\sqrt{n - 2}}{\sqrt{1 - r^2}}$ with df = n − 2

Goodness-of-fit: $R^2 = r^2 = \frac{SSR}{SST}$

Standard deviation of the error: $s_\varepsilon = \sqrt{\frac{SSE}{n - 2}}$
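A minimal Python sketch of the regression formulas above. The function names are hypothetical. The sums of squares in the usage example are the ones printed in the analysis of variance table for Problem 1, so the results can be compared with the "R square" and "SD error" entries of that output; the values r = 0.6, n = 18, b = 1.5 and SE = 0.5 are made up for easy arithmetic.

```python
import math


def r_squared(ssr: float, sst: float) -> float:
    """Goodness-of-fit R^2 = SSR / SST."""
    return ssr / sst


def error_sd(sse: float, n: int) -> float:
    """Standard deviation of the error, sqrt(SSE / (n - 2))."""
    return math.sqrt(sse / (n - 2))


def slope_t_stat(b: float, se_b: float) -> float:
    """t statistic for H0: beta_k = 0, with df = n - 2."""
    return b / se_b


def correlation_t_stat(r: float, n: int) -> float:
    """t statistic for H0: rho = 0, with df = n - 2."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r ** 2)


if __name__ == "__main__":
    # Sums of squares as printed in the Problem 1 output (n = 40).
    print(round(r_squared(166.6377, 170.8037), 4))
    print(round(error_sd(4.1660, 40), 4))

    # Hypothetical values for the two t statistics.
    print(slope_t_stat(1.5, 0.5))
    print(correlation_t_stat(0.6, 18))
```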

14

SEAT NUMBER:

LAST NAME:

FIRST NAME:

SID:

CONFIDENTIAL EXAM PAPER

This paper is not to be removed from the examination room.

ECMT1010

Introduction to Economic Statistics

End of Semester Examination

Semester 1 – 2016

Total Duration: 2 hours and 10 minutes

Writing Time: 2 hours

Reading Time: 10 minutes

INSTRUCTIONS TO CANDIDATES

1. This is a closed book exam. Electronic devices, apart from a simple, non-programmable

calculator, are not permitted. Formulas, definitions, and distribution tables are provided

at the end of the exam.

2. This exam contains 30 Multiple Choice Questions and 2 Problems with multiple parts.

ANSWER ALL QUESTIONS.

3. Multiple Choice Questions are worth 0.5 marks each. Problems are worth 20 marks each.

The marks for each part of the Problems are indicated. Marks total 55.

4. Answer all Multiple Choice Questions on the answer sheet provided for that purpose.

Answer the Problems in the Exam Booklets provided.

5. This question paper must be returned with the Multiple Choice answer sheet and the

Exam Booklet answer script(s).

6. Take care to explain your answers and to write legibly.

Please check your examination paper is complete (14 pages) and indicate you have done

this by signing below.

I have checked the examination paper and affirm it is complete.

Student Signature: Date:

1

CC0294 Semester 1 – 2016 Page 2 of 14

30 Multiple Choice Questions [15 marks total—suggested time approx. 32 minutes].

1. A bank reports that 30% of households have a MasterCard, 20% have an American Express card, and

25% have a Visa card. Eight percent of households have both a MasterCard and an American Express

card. Twelve percent have both a Visa card and a MasterCard. Six percent have both a American

Express card and a Visa card. If a household has a MasterCard, what is the probability it also has a

Visa card?

A) 0.12

B) 0.25

C) 0.40

D) 0.43

E) 0.48

2. Lenovo Group Limited, a Hong Kong IT company, has a 30% share of the Hong Kong PC market.

Suppose 10 new PC buyers are selected at random from the Hong Kong population. What is the

probability that fewer than 3 bought their PC from Lenovo?

A) 0.028

B) 0.121

C) 0.233

D) 0.267

E) 0.382

3. A pair of (fair) dice is rolled once. What is the probability that the sum of the values on the two die

faces is 7?

A) 1/6

B) 7/36

C) 1/2

D) 1/18

E) 1/3

4. Calculate the expected value (mean) of the following discrete probability distribution:

x 1 2 3 4

p(x) 0.16 0.26 0.26 0.32

A) 1

B) 2.5

C) 1.86

D) 2.74

E) 2.0

2

CC0294 Semester 1 – 2016 Page 3 of 14

5. A psychologist is interested to test whether the IQ of statisticians is higher than 100. Based on a

random sample of 100 statisticians, the sample mean of IQ is 120. What is the p-value of the test

assuming a population standard deviation of 100?

A) 0.0000

B) 0.0228

C) 0.0456

D) 0.4207

E) 0.8414

Scenario 1 In a marketing research project, a major supermarket wants to study the relationship between

the annual consumption of ramen noodles (Y , in number of packs) and the annual income level of con-

sumers (X , in $000s). Based on a random sample of 100 customers, the linear regression model

Yi = β0 + β1X i + εi

is estimated with the following result:

Coefficient Standard Error

Intercept 55.4 32.3

Annual income (in $000s) −0.22 0.1

6. Refer to Scenario 1. What is the predicted annual consumption (in number of packs) of ramen

noodles for a consumer who earns $100,000 a year?

A) 2.2

B) 22.0

C) 33.4

D) 53.2

E) 55.4

7. Refer to Scenario 1. To see whether income level has an effect on the consumption of ramen noodles,

Adam, Simon and Tim consider the following hypotheses:

H0 : β1 = 0 Ha : β1 6= 0

They arrive at the following conclusions:

Adam: The null hypothesis is rejected at the 5% significance level.

Simon: The null hypothesis is rejected at the 2% significance level.

Tim: The null hypothesis is rejected at the 1% significance level.

Who is/are correct?

A) Tim only

B) Simon only

C) Adam only

D) Adam and Simon only

E) Adam, Simon and Tim

3

CC0294 Semester 1 – 2016 Page 4 of 14

8. Alcohol content in beer is believed to follow a normal distribution. A chemist takes a sample from

9 bottles of beer and measures the alcohol content, finding a sample mean of 7.5% and a sample

standard deviation of 1%. The chemist wishes to compute a 90% confidence interval for the mean.

However, the chemist mistakenly treats the sample standard deviation as if it were the population

standard deviation. What is the confidence interval constructed by the chemist?

A) (6.647, 8.153)

B) (6.952, 8.048)

C) (6.880, 8.120)

D) (7.073, 7.927)

E) (7.034, 7.966)

9. Suppose the chemist in the previous question realises he has made a mistake. If he correct his mistake

and recalculates the confidence interval using the same sample, how will the new confidence interval

compare to the previous one?

A) The new interval will be the same width as the previous one and will be shifted to the left to

account for small sample bias.

B) The new interval will be wider than the previous one and will be centered around the same

point estimate.

C) The new interval will be narrower than the previous one and will be centered around the same

point estimate.

D) The new interval will be narrower than the previous one and will be shifted to the left to account

for small sample bias.

E) This cannot be determined from the data given.

10. A lecturer hires a tutor to mark exam papers. To ensure that the tutor is grading correctly, the

lecturer marks a few exam papers herself and compares her mark with the mark given by the tutor.

She chooses these papers by physically going through the pile of exams and pulling out a paper “when

she feels like it”. This corresponds to which form of sampling?

A) Judgement sampling

B) Simple random sampling

C) Systematic sampling

D) Cluster sampling

E) Snowball sampling

11. A pair of (fair) die is rolled once. What is the probability that the sum of the values on the two die

faces is not a 7?

A) 1/6

B) 7/36

C) 1/3

D) 1/2

E) 5/6

4

CC0294 Semester 1 – 2016 Page 5 of 14

Scenario 2 The following table is derived from the Banerjee et al (2010) study on vaccination rates in

India.

Control Group Treatment Only Treatment plus

Incentive

Children not fully immunised 810 311 234

Children fully immunised 50 68 148

12. Refer to Scenario 2. What is the point estimate of the proportion of children that were fully immu-

nised in villages that received only the treatment and not the additional incentives?

A) 0.613

B) 0.219

C) 0.179

D) 0.821

E) 0.387

13. Refer to Scenario 2. Suppose a researcher wants to test the null hypothesis that the true proportion

of children that were fully immunised in villages that received only the treatment was exactly 0.2 at

a 99% level of significance. What critical value will the researcher have to look up in the appropriate

statistical table in order to do this?

A) 1.28

B) 1.645

C) 1.96

D) 2.33

E) 2.575

14. Complete the following sentence to arrive at the correct statement of the Central Limit Theorem: “If

samples of size n are drawn randomly from a population with mean µ and standard deviation σ . .

. ”

A) then repeated observations of the sample mean x¯ will follow a normal distribution with mean

µ and standard deviation σ, regardless of the underlying distribution.

B) then if the sample size is sufficiently large (n≥ 30), repeated observations of the sample mean

x¯ will follow a normal distribution with mean µ and standard deviation σ, regardless of the

underlying distribution.

C) then if the sample size is sufficiently large (n≥ 10), repeated observations of the sample mean

x¯ will follow a normal distribution with mean µ and standard deviation σ/

p

n, regardless of

the underlying distribution.

D) then if the sample size is sufficiently large (n≥ 30), repeated observations of the sample mean

x¯ will follow a normal distribution with mean µ and standard deviation σ/

p

n, regardless of

the underlying distribution.

E) then if the sample size is sufficiently large (n≥ 30), repeated observations of the sample mean

x¯ will follow a normal distribution with mean µ and standard deviation σ/

p

n, as long as the

underlying distribution is normal.

5

CC0294 Semester 1 – 2016 Page 6 of 14

15. A researcher is interested in the following hypotheses about the mean of a population:

H0 : µ≤ 2 Ha : µ > 2

Based on a sample of 45 observations and the researcher calculates a t statistic of 2.2. At a 1% level

of significance what is the researcher’s conclusion?

A) The researcher is unable to reject a false null hypothesis.

B) The researcher fails to reject the null hypothesis.

C) The researcher accepts the null hypothesis.

D) The researcher rejects the null hypothesis.

E) The researcher commits a type I error.

Scenario 3 On the first day of class, students in an introductory economics course were asked their sex

and eye color. The results are summarized in the table below.

Blue Brown Green Hazel All

Female 24 21 10 11 66

Male 20 17 8 10 55

Total 44 38 18 21 121

16. Refer to Scenario 3. What is the probability that a randomly selected student in the class is a female

or has brown eyes?

A) 0.660

B) 0.860

C) 0.314

D) 0.545

E) 0.686

17. Refer to Scenario 3. What is the probability that a randomly selected student in the class is a female

and has hazel eyes?

A) 0.634

B) 0.091

C) 0.149

D) 0.174

E) 0.545

18. Refer to Scenario 3. What is the probability that a randomly selected student is a male, if we know

that they have hazel eyes?

A) 0.476

B) 0.182

C) 0.083

D) 0.078

E) 0.455

6

CC0294 Semester 1 – 2016 Page 7 of 14

19. An article published in the Canadian Journal of Zoology presented a method for estimating the body

fat percentage of North American porcupines. The method was illustrated with a sample of n = 25

porcupines. Based on this sample, a 95% bootstrap confidence interval for the average body fat

percentage of porcupines is 17.4% to 25.8%. Which of the following null hypotheses would be

rejected based on this confidence interval?

A) H0 : µ= 18.6%.

B) H0 : µ= 26.6%.

C) H0 : µ= 20.0%.

D) H0 : µ= 22.9%.

E) H0 : µ= 24.6%.

Scenario 4 Admissions records at MIT indicates that 6.7% of the graduate students enrolled are from

Canada.

20. Refer to Scenario 4. What is the minimum sample size for which the Central Limit Theorem applies

in this case?

A) n= 30.

B) n= 40.

C) n= 50.

D) n= 100.

E) n= 200.

21. Refer to Scenario 4. Find the mean and standard error of the sample proportion of Canadian students

in random samples of 100 graduate students at MIT.

A) pˆ = 0.067, SE = 0.0625.

B) pˆ = 0.067, SE = 0.006.

C) pˆ = 0.067, SE = 0.025.

D) pˆ = 0.670, SE = 0.250.

E) pˆ = 0.067, SE = 0.0067.

22. Refer to Scenario 4. Roughly what percentage of samples of 100 randomly selected graduate students

at MIT will have at least 10% of students from Canada?

A) 5%.

B) 6.7%.

C) 10%.

D) 18%.

E) 25%.

23. For a N(0, 1) density, what is the area to the left of z = −1.645.

A) 2.5%.

B) 3.5%.

C) 5%.

D) 10%.

E) 11%.

7

CC0294 Semester 1 – 2016 Page 8 of 14

24. For a N(0, 1) density, what is the area outside of the interval z = −2.326 and z = 1.282.

A) 2.5%.

B) 3.5%.

C) 5%.

D) 10%.

E) 11%.

25. A sample of 148 university students reports sleeping an average of 6.85 hours on weeknights. The

sample size is large enough to use the normal distribution, and a bootstrap distribution shows that

the standard error is SE = 0.175. Use a normal distribution to construct a 95% confidence interval

for the mean amount of weeknight sleep students get at this university.

A) 6.68 to 7.03 hours.

B) 6.51 to 7.19 hours.

C) 6.50 to 7.20 hours.

D) 6.52 to 7.21 hours.

E) 6.85 to 7.85 hours.

26. Suppose that a 95% confidence interval for µ is (54.8,60.8). Which of the following is most likely

the p-value for the test of H0 : µ = 56 versus Ha : µ ≠ 56?

A) 0.031

B) 0.001

C) 0.016

D) 0.231

E) 0.05

27. The randomization distribution for testing the hypotheses H0 : µ1 = µ2 versus Ha : µ1 ≠ µ2 is

provided. The sample statistic is x̄1 − x̄2 = −2.5. Use the provided randomization distribution

(based on 100 samples) to estimate the p-value for this test.

A) 10%

B) 2%

C) 5%

D) 1%

E) 4%

Scenario 5 Refer to the following probability tree diagram to find the requested probabilities. (Round your

answers to two decimal places.)

28. Refer to Scenario 5. What is P(Y |A)?

A) 0.60

B) 0.50

C) 0.40

D) 0.20

E) 0.06

29. Refer to Scenario 5. What is P(A|Y )?

A) 0.82

B) 0.30

C) 0.20

D) 0.18

E) 0.06

30. Refer to Scenario 5. What is P(X )?

A) 0.66

B) 0.50

C) 0.48

D) 0.42

E) 0.44

Problem 1 [20 marks total—suggested time approx. 44 minutes]

Using data from the United States for 1970–2009, a researcher obtains the following regression output for a

model to predict life expectancy based on the total number of vehicles produced (measured in thousands).

Predictor     Coefficient    SE coef.    t stat
Intercept     65.8455        0.2434      270.5326
Vehicles      0.05015        0.001286    38.9868

Regression statistics
R square      0.9756
SD error      0.3311
Observations  40

Analysis of variance
Source        df    SS
Regression     1    166.6377
Residual      38    4.1660
Total         39    170.8037

a) What is the correlation between vehicles produced and life expectancy? [2 marks]

b) Test whether the correlation between vehicles and life expectancy is statistically significant at the 1%

level. Show all your steps. [3 marks]

c) State in words your conclusion from the correlation test of significance. [2 marks]

d) Give an interpretation of the slope coefficient. [2 marks]

e) Test whether the slope coefficient is statistically significant at the 1% level. Show all your steps. [3

marks]

The researcher uses the bootstrap to investigate the regression slope estimate. The following shows the

results from 1,000 bootstrap samples.

f) Briefly explain the purpose of the bootstrap distribution in this context. [2 marks]

g) Use the bootstrap distribution to build a 99% confidence interval for the slope parameter. [2 marks]

h) Comment on your findings in b), e), and g). [2 marks]

i) What do you think about the overall validity of this study? [2 marks]

Problem 2 [20 marks total—suggested time approx. 44 minutes]

An upcoming biology quiz has 10 multiple choice questions, each with 4 choices. Eugene has not studied

for the quiz. In fact, he hasn’t even opened the textbook since the beginning of term. In short, he knows

nothing about biology and will have to guess the answer to every question. As it happens, Eugene is very

good at statistics and he is going to compute the probability that he passes the quiz (5 or more correct

answers). Let X be the number of questions Eugene correctly guesses on the biology quiz.

a) What is the name of the distribution of X? Specify the parameters of X . [2 marks]

b) Compute the mean of X . [1 mark]

c) What is the probability that Eugene gets at least 1 answer correct? [2 marks]

d) What is the probability that Eugene passes the quiz? [3 marks]

Eugene’s biology lecturer Sandy, who is also very good at statistics, wants to evaluate whether the marks

on the quiz have improved since another 10-question quiz carried out earlier in the term. The table below

gives a sample of 10 grades on the two quizzes. Sandy is interested in testing whether the mean mark on

the second quiz is significantly higher than the mean mark on the first quiz.

First quiz 7 9 6 9 8 10 7 7 8 6

Second quiz 8 9 7 9 8 9 9 8 9 7

e) Clearly define your notation and state the null and alternative hypotheses, assuming that the marks

from the first quiz come from a random sample of 10 students in the class and the grades on the

second quiz come from a different random sample of 10 students in the class. [2 marks]

f) Complete the test in (e) and clearly state the conclusion. [3 marks]

g) Clearly define your notation and state the null and alternative hypotheses, assuming that the marks

recorded for the first quiz and second quiz are from the same 10 students (so that the first student

got 7 on the first quiz and 8 on the second quiz, and so on). [2 marks]

h) Complete the test in (g) and clearly state the conclusion. [3 marks]

i) Why are the hypothesis test results so different? Which is a better way to collect the data to answer

the question of whether grades are higher on the second quiz? [2 marks]

END OF THE EXAM

Formulas, definitions, and distribution tables

Population and sample statistics.

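Statistic             Population                                                                                    Sample

size                  $N$                                                                                           $n$

mean                  $\mu = \frac{\sum_{i=1}^{N} x_i}{N}$                                                          $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$

standard deviation    $\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$                                      $s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$

correlation           $\rho = \frac{1}{N} \sum_{i=1}^{N} \frac{(x_i - \mu_x)}{\sigma_x} \frac{(y_i - \mu_y)}{\sigma_y}$    $r = \frac{1}{n - 1} \sum_{i=1}^{n} \frac{(x_i - \bar{x})}{s_x} \frac{(y_i - \bar{y})}{s_y}$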

Descriptive statistics.

Statistic Definition

z-score $z_i = \frac{x_i - \bar{x}}{s}$

range range = max−min

inter-quartile range IQR=Q3 −Q1

outliers x i

95% rule $\bar{x} \pm 2s$

interval estimate statistic±margin of error

95% confidence interval statistic± 2× SE

Standard deviations and standard errors for various statistics.

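Statistic                  Standard deviation                                          Standard error

$\bar{x}$                  $\frac{\sigma}{\sqrt{n}}$                                   $\frac{s}{\sqrt{n}}$

$\hat{p}$                  $\sqrt{\frac{p(1 - p)}{n}}$                                 $\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$

$\bar{x}_1 - \bar{x}_2$    $\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$    $\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$

$\hat{p}_1 - \hat{p}_2$    $\sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}$    $\sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$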
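The standard error formulas in this table can be sketched in Python as follows. This is only an illustration: the function names and the numerical inputs are hypothetical and are not taken from any question in this paper.

```python
import math


def se_mean(s: float, n: int) -> float:
    """Standard error of a sample mean: s / sqrt(n)."""
    return s / math.sqrt(n)


def se_proportion(p_hat: float, n: int) -> float:
    """Standard error of a sample proportion: sqrt(p_hat (1 - p_hat) / n)."""
    return math.sqrt(p_hat * (1 - p_hat) / n)


def se_diff_means(s1: float, n1: int, s2: float, n2: int) -> float:
    """Standard error of a difference in sample means."""
    return math.sqrt(s1**2 / n1 + s2**2 / n2)


def se_diff_proportions(p1: float, n1: int, p2: float, n2: int) -> float:
    """Standard error of a difference in sample proportions."""
    return math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
print(f"{se_mean(2.0, 100):.4f}")
print(f"{se_proportion(0.5, 100):.4f}")
print(f"{se_diff_means(3.0, 9, 4.0, 16):.4f}")
print(f"{se_diff_proportions(0.5, 100, 0.5, 100):.4f}")
```

The population standard deviations in the middle column have the same form, with σ and p in place of s and p̂.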

Confidence intervals.

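100(1−α)% confidence interval:

statistic $\pm\, z^*_{\alpha/2} \times SE$ for N(0,1) distribution

statistic $\pm\, t^*_{\alpha/2} \times SE$ for t distribution with df = n − 1

Null hypothesis             Test statistic

$H_0 : \mu = \mu_0$         $\frac{\bar{x} - \mu_0}{s / \sqrt{n}} \sim t_{n-1}$

$H_0 : p = p_0$             $\frac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}} \sim N(0,1)$

$H_0 : \mu_1 - \mu_2 = 0$   $\frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} \sim t_{n-1}$ where $n = \min(n_1, n_2)$

$H_0 : p_1 - p_2 = 0$       $\frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\left(\frac{1}{n_1} + \frac{1}{n_2}\right) \hat{p}(1 - \hat{p})}} \sim N(0,1)$ where $\hat{p} = \frac{x_1 + x_2}{n_1 + n_2}$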
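As a rough sketch, the four test statistics in this table could be computed in Python as shown below. The function names and all numerical inputs are hypothetical illustrations and do not come from the exam questions. The two-sample t statistic uses the conservative degrees of freedom min(n1, n2) − 1 stated in the table.

```python
import math


def t_one_mean(x_bar: float, mu0: float, s: float, n: int) -> tuple[float, int]:
    """t statistic and degrees of freedom for H0: mu = mu0."""
    return (x_bar - mu0) / (s / math.sqrt(n)), n - 1


def z_one_proportion(p_hat: float, p0: float, n: int) -> float:
    """z statistic for H0: p = p0 (standard error uses p0, not p_hat)."""
    return (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)


def t_two_means(x1_bar: float, s1: float, n1: int,
                x2_bar: float, s2: float, n2: int) -> tuple[float, int]:
    """t statistic and degrees of freedom for H0: mu1 - mu2 = 0."""
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)
    return (x1_bar - x2_bar) / se, min(n1, n2) - 1


def z_two_proportions(x1: int, n1: int, x2: int, n2: int) -> float:
    """z statistic for H0: p1 - p2 = 0 using the pooled proportion."""
    p_pool = (x1 + x2) / (n1 + n2)
    se = math.sqrt((1 / n1 + 1 / n2) * p_pool * (1 - p_pool))
    return (x1 / n1 - x2 / n2) / se


# Hypothetical inputs for illustration only.
t1, df1 = t_one_mean(x_bar=52.0, mu0=50.0, s=4.0, n=16)
t2, df2 = t_two_means(10.0, 3.0, 9, 8.0, 4.0, 16)
print(f"one mean: t = {t1:.3f}, df = {df1}")
print(f"one proportion: z = {z_one_proportion(0.6, 0.5, 100):.3f}")
print(f"two means: t = {t2:.3f}, df = {df2}")
print(f"two proportions: z = {z_two_proportions(30, 100, 20, 100):.3f}")
```

Each statistic is then compared with the relevant critical value from the percentile tables.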

Selected percentiles from the N(0,1) distribution.

Right-tail probability Confidence level z∗

0.10 80% 1.282

0.05 90% 1.645

0.025 95% 1.960

0.01 98% 2.326

0.005 99% 2.575
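The link between the confidence level and z∗ in this table can be illustrated with a short Python sketch that uses the standard library's `statistics.NormalDist`. The helper names and the example statistic and standard error are hypothetical.

```python
from statistics import NormalDist


def z_star(confidence: float) -> float:
    """Critical value z* that leaves (1 - confidence) / 2 in each tail of N(0,1)."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def normal_ci(statistic: float, se: float, confidence: float = 0.95) -> tuple[float, float]:
    """Confidence interval of the form statistic +/- z* x SE."""
    margin = z_star(confidence) * se
    return statistic - margin, statistic + margin


for level in (0.80, 0.90, 0.95, 0.98, 0.99):
    print(f"{level:.0%}: z* = {z_star(level):.3f}")

# Hypothetical statistic and standard error, for illustration only.
low, high = normal_ci(statistic=10.0, se=0.5, confidence=0.95)
print(f"95% CI: ({low:.2f}, {high:.2f})")
```

The computed values agree with the table to three decimal places, except that the 99% value comes out at about 2.576 where the table lists 2.575.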

Selected percentiles from t distributions with various degrees of freedom.

Right-tail probability

df 0.05 0.025 0.01 0.005

8 1.860 2.306 2.896 3.355

9 1.833 2.262 2.821 3.250

38 1.686 2.024 2.427 2.712

98 1.661 1.984 2.365 2.627

Probability rules.

Conditional probability: $P(A|B) = \frac{P(A \text{ and } B)}{P(B)}$

Multiplicative rule: P(A and B) = P(A|B)P(B)

Independence: P(A|B) = P(A)

Mutual exclusion: P(A and B) = 0

Law of total probability.

P(A) = P(A and B) + P(A and (not B))

$P(A) = P(A \text{ and } B_1) + P(A \text{ and } B_2) + \cdots + P(A \text{ and } B_k)$ where $(B_1, B_2, \ldots, B_k)$ are disjoint

Bayes’ rule for two cases.

$P(A|B) = \frac{P(B|A)P(A)}{P(B|A)P(A) + P(B|\text{not } A)P(\text{not } A)}$

Bayes’ rule for j = 1, 2, . . . , k.

$P(A_j|B) = \frac{P(B|A_j)P(A_j)}{P(B|A_1)P(A_1) + P(B|A_2)P(A_2) + \cdots + P(B|A_k)P(A_k)}$ where $(A_1, A_2, \ldots, A_k)$ are disjoint.
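A minimal Python sketch of both forms of Bayes' rule is given below. The function names and the probabilities in the example are hypothetical and are unrelated to the probability tree in Scenario 5.

```python
def bayes_two_cases(p_a: float, p_b_given_a: float, p_b_given_not_a: float) -> float:
    """P(A|B) when the only alternatives are A and not A."""
    numerator = p_b_given_a * p_a
    denominator = numerator + p_b_given_not_a * (1 - p_a)
    return numerator / denominator


def bayes_general(priors: list[float], likelihoods: list[float], j: int) -> float:
    """P(A_j|B) given P(A_1..A_k) and P(B|A_1..A_k); j is a zero-based index."""
    total = sum(lik * prior for lik, prior in zip(likelihoods, priors))
    return likelihoods[j] * priors[j] / total


# Hypothetical probabilities for illustration only.
print(f"{bayes_two_cases(p_a=0.01, p_b_given_a=0.9, p_b_given_not_a=0.1):.4f}")
print(f"{bayes_general([0.5, 0.3, 0.2], [0.1, 0.2, 0.5], j=2):.4f}")
```

The denominator in each function is the law of total probability applied to P(B).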

Population statistics for a discrete random variable X with probability function p(x).

Mean: $\mu = \sum_{i=1}^{n} x_i \, p(x_i)$

Standard deviation: $\sigma = \sqrt{\sum_{i=1}^{n} (x_i - \mu)^2 \, p(x_i)}$
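These two formulas can be sketched in Python as follows. The probability function used in the example is hypothetical and serves only to show the arithmetic.

```python
import math


def discrete_mean(values: list[float], probs: list[float]) -> float:
    """Mean of a discrete random variable: sum of x_i * p(x_i)."""
    return sum(x * p for x, p in zip(values, probs))


def discrete_sd(values: list[float], probs: list[float]) -> float:
    """Standard deviation: square root of the sum of (x_i - mu)^2 * p(x_i)."""
    mu = discrete_mean(values, probs)
    return math.sqrt(sum((x - mu) ** 2 * p for x, p in zip(values, probs)))


# Hypothetical probability function: P(X=0)=0.25, P(X=1)=0.5, P(X=2)=0.25.
values = [0, 1, 2]
probs = [0.25, 0.5, 0.25]
print(f"mean = {discrete_mean(values, probs):.4f}")
print(f"sd   = {discrete_sd(values, probs):.4f}")
```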

Suppose X follows a binomial distribution with parameters n and p.

Binomial probability: $P(X = k) = \binom{n}{k} p^k (1 - p)^{n-k} = \frac{n!}{k!(n - k)!} \, p^k (1 - p)^{n-k}$

Expected value: $n \times p$

Standard deviation: $\sqrt{np(1 - p)}$
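The binomial formulas can be illustrated with a short Python sketch based on `math.comb`. The function names and the parameters n = 5 and p = 0.5 are hypothetical choices made for easy arithmetic.

```python
import math


def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X binomial with parameters n and p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_at_least(k: int, n: int, p: float) -> float:
    """P(X >= k), obtained by summing the probability function from k to n."""
    return sum(binom_pmf(j, n, p) for j in range(k, n + 1))


# Hypothetical parameters for illustration only.
n, p = 5, 0.5
print(f"P(X = 2)  = {binom_pmf(2, n, p):.5f}")
print(f"P(X >= 1) = {binom_at_least(1, n, p):.5f}")
print(f"mean = {n * p:.4f}, sd = {math.sqrt(n * p * (1 - p)):.4f}")
```

For "at least one" questions, the complement 1 − P(X = 0) gives the same result with less arithmetic.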

Simple linear regression.

Population regression model: $y = \beta_0 + \beta_1 x + \varepsilon$

Sample regression model: $\hat{y} = b_0 + b_1 x$

100(1−α)% confidence interval for $\beta_k$: $b_k \pm t^*_{df,\alpha/2} \times SE_{b_k}$ for k = 0, 1 with df = n − 2

t statistic for $H_0 : \beta_k = 0$: $t = \frac{b_k}{SE_{b_k}}$ for k = 0, 1 with df = n − 2

t statistic for $H_0 : \rho = 0$: $t = \frac{r \sqrt{n - 2}}{\sqrt{1 - r^2}}$ with df = n − 2

Goodness-of-fit: $R^2 = r^2 = \frac{SSR}{SST}$

Standard deviation of the error: $s_\varepsilon = \sqrt{\frac{SSE}{n - 2}}$
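As a consistency check on these formulas, the Python sketch below recomputes three quantities that are already printed in the Problem 1 regression output, using only the sums of squares, the slope coefficient, and its standard error from that output. The variable names are illustrative. Because the printed coefficient and standard error are rounded, the recomputed t statistic matches the printed one only approximately.

```python
import math

# Values copied from the Problem 1 regression output.
ssr, sse, sst = 166.6377, 4.1660, 170.8037   # regression, residual, total SS
n = 40                                        # observations
b1, se_b1 = 0.05015, 0.001286                 # slope coefficient and its SE

r_squared = ssr / sst                 # goodness-of-fit, R^2 = SSR / SST
s_eps = math.sqrt(sse / (n - 2))      # standard deviation of the error
t_slope = b1 / se_b1                  # t statistic for H0: beta_1 = 0

print(f"R square = {r_squared:.4f}")
print(f"SD error = {s_eps:.4f}")
print(f"t stat   = {t_slope:.2f}")
```

The first two results reproduce the printed R square and SD error to four decimal places, and the regression and residual sums of squares add up to the total.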
