Date: 2023-03-22
WEEK 1
INTRODUCTION AND GENERAL INFORMATION
ABOUT QUANTITATIVE METHODS 2 (QM2)
ESTIMATION AND HYPOTHESIS TESTING OF A
POPULATION MEAN
References:
S: § 9.3-9.4, 10.1-10.3, 10.5, 12.1-12.4
W: Ch 1-2, § 3.1-3.6, 3.8-3.9
Notes prepared by:
Dr László Kónya
Quantitative Methods 2
ECON 20003
UoM, ECON 20003, Week 1 2
INTRODUCTION AND GENERAL INFORMATION
Subject Coordinator and Lecturer: Dr Mehmet Özmen
Office: 352, 3rd floor, FBE Building
Phone: 9035 8912
Email: mehmet.ozmen@unimelb.edu.au
Consultation hour: WED 14:15 - 15:15 in my office or on Zoom
Lectures: There are two one-hour lectures a week across two streams.
(1) WED 11:00 – 12:00 (Medical C403 – Wright Theatre)
OR WED 15:15 – 16:15 (Medical C403 – Wright Theatre)
AND
(2) THURS 12:00 – 13:00 (LAW GM15 – David P. Derham Theatre)
OR THURS 15:15 – 16:15 (Old Arts 122 – Public Lecture Theatre)
The lectures will be (i) in-person and (ii) recorded and made available on LMS, provided that no technical
problem prevents IT from doing so.
Tutorials: Start in week 1.
Sign up for one and only one tutorial class by the end of
week 1 via the Student Portal.
Subject website: LMS
The subject guide, some review resources, the lecture notes and the
tutorial materials can be downloaded from this website in due time.
Also, from time to time, important messages might be posted on the
subject website, so visit it regularly.
Texts: In this subject you will learn topics from intermediate
statistics and introductory econometrics alike.
For this reason, I cannot endorse a single textbook that
covers all topics. Instead, we have a prescribed text
and a recommended text.
Although the lecture notes are fairly detailed, they are
not meant to substitute for these books.
Recommended text (S):
Selvanathan, E.A. et al. (2021):
Business Statistics, 8th edition,
Cengage Learning Australia.
Prescribed text (W):
Wooldridge, J.M. et al. (2021):
Introductory Econometrics, Asia-Pacific Edition, 2nd edition,
Cengage Learning Australia.
These books can be purchased online. See the details in the Subject Guide.
Software: R, a programming language and software environment for statistical
computing and graphics. It is not a menu-driven program, so you
need to learn to write R code to operate it.
R is a fully functional standalone program, but RStudio, an
integrated development environment for R, can assist in writing,
compiling, debugging and executing R code.
Both programs are free. R can be downloaded from the
Comprehensive R Archive Network (CRAN) website
(https://cran.r-project.org/), while RStudio is available at
https://www.rstudio.com/products/rstudio/download/.
You can find the instructions for downloading and installing R and
RStudio on a Windows computer in the Tutorial 1 handout and in
the first R video. If you have a Mac, you can get help from Student
IT at https://studentit.unimelb.edu.au/.
You will need to use these programs to complete the assignments.
Your knowledge of how to operate these programs will not be
tested on the exam, but some exam questions will be based on R
printouts.
Prerequisites:
The main subject prerequisite for Quantitative Methods 2 is Quantitative
Methods 1 (ECON10005), the first-year quantitative subject taught in the
Faculty of Business and Economics, or an equivalent subject taught at
other faculties of the university or at another tertiary institution.
QM2 students are expected to have mastered the topics discussed in Chapters
1-13 and 17-18 of the S(elvanathan) book, namely:
Graphical descriptive techniques and numerical descriptive statistics;
Joint, marginal and conditional probability;
Random variables and probability distributions (binomial, uniform,
normal and t);
Sampling distribution, estimation and hypothesis testing of one
population mean and the difference between two population means;
Simple linear regression and correlation.
Some but not all of these topics are reviewed in the Revision slides, which
are available on Canvas. You might also look at
https://creativemaths.net/blog/videos-for-teaching-and-learning-statistics/.
In QM2 we have four aims:
• To learn quantitative skills that are essential in the majority of jobs that
business, commerce and economics graduates obtain.
• To implement these skills using examples from accounting, management,
marketing, economics and finance.
• To perform the various statistical procedures using the R statistical
software and RStudio.
• To focus on choosing the appropriate technique for each problem,
implementing it correctly, and interpreting the results.
Subject overview:
Building on QM1 or some equivalent prerequisite unit, QM2 is structured
as follows.
Part 1: Statistical inference about one, two or more populations with
parametric and nonparametric techniques (6 weeks)
Part 2: Regression analysis with cross-sectional data (4 weeks)
Part 3: Regression analysis with time series data (2 weeks)
Part 1
Week 1: Introduction
Review of estimation and hypothesis testing of a single population mean
Weeks 2-6: Desirable properties of point estimators
Parametric and nonparametric techniques
The assumption of normality
Comparing two population means or central locations with parametric and nonparametric techniques
The chi-square, t and F distributions
Inferences about one or two population variances
Inferences about one or two population proportions
Comparing several population means with analysis of variance
Chi-square tests for the analysis of frequencies
…
Part 2
Week 6: …
Measures of association
Week 7: Linear regression: specification, estimation and assessment
Week 8: General F-test
Omitted and irrelevant variables
Alternative functional forms
Multicollinearity
Week 9: Heteroskedasticity
Using the sample regression equation
Dummy independent variables in regression models
Week 10: Dummy dependent variable regression models
Part 3
Week 11: Cross-sectional vs. time-series data
Regression analysis with time series data
Autocorrelation
Week 12: Stationary and non-stationary processes
Spurious regression
Dickey-Fuller unit root tests
Assessment: Tutorial participation and homeworks 10%
Three assignments 15%
Mid-semester test 5%
Final exam 70%
Tutorial classes:
The primary aim of the tutorials is to learn and practice the selection and
the implementation of appropriate statistical techniques both manually
(i.e., with a calculator) and with R in a wide range of examples.
You are supposed to have a Casio FX82 (with any suffix) calculator, or
something similar, and to be able to use it efficiently (including its STAT
mode). If you do not know how to do so, see the manual on Canvas.
Before each tutorial
i. Attend / watch the previous week’s lecture.
ii. Go through the relevant tutorial handout. The handouts are self-explanatory
and sufficiently detailed, so follow the instructions and reproduce
the illustrative example(s).
iii. If you run into some problem and need help, ask your tutor before or
during the tutorial class for assistance, but do not expect your tutor
to cover the entire handout.
After each tutorial but the last, attempt the additional “Exercises for
assessment”, then type up and submit your solutions and answers via Canvas
Quiz by 10am on Wednesday of the following week (see further details in
the Subject Guide).
To get the 10% credit for tutorial participation and homework, in at least 10
weeks you must (i) attend the tutorial and (ii) submit the homework from
the previous tutorial on time.
Three assignments:
There will be three assignments for 5% credit each.
i. Online submission via Canvas.
ii. Students can work alone or in a group of two (not three or four …).
iii. Students in a group must submit a single copy of their assignment
and will get the same assignment marks.
iv. No late assignments are accepted and no extensions will be given.
Mid-semester test:
There will be one online mid-semester test for 5%.
i. The test will be held during week 6 of the semester.
ii. Students can undertake the test online via LMS at any time of their
choosing between 8am and 5pm on 3 April.
iii. The test will cover the material presented during lectures up to the
end of week 4 and in the tutorials up to the end of week 5.
iv. The test will consist of 10 multiple choice questions and 5 true or
false questions.
v. There is a time limit of 30 minutes to complete the test.
vi. To complete the test successfully students will need to have critical
value tables for each of the distributions covered during the lectures,
a formula sheet and a calculator.
Note: Students who lose some internal (i.e., tutorial homework, assignment
or mid-semester test) marks for valid reasons can apply for special
consideration to get the lost marks transferred to the final exam.
Final exam at the end of the semester:
It is worth 70% of the final grade for this subject.
i. It will be a 2-hour in-person exam during the University's normal
end of semester assessment period. The date, time and the exact
format of the exam will be provided by the University's
administration later in the semester.
ii. The exam will cover all materials discussed during lectures and
tutorials throughout the semester. There will be no surprises. The
questions and tasks on the final exam will be similar in terms of style
and difficulty to those in the tutorial problem sets, in the assignments
and in the mid-semester test.
iii. It will be a closed-book exam, so a formula sheet and statistical
tables will be provided with the exam.
iv. On the exam students will neither be asked nor tested on how to use
R / RStudio, but they will need to be familiar with R printouts.
v. Students must pass the exam, i.e. achieve 50% of the total exam
mark, to successfully complete the subject.
vi. A supplementary exam will not be provided in case of absence during
the examination period, unless the absence is due to serious illness or some
other legitimate reason. In those exceptional cases, apply for special
consideration
(https://students.unimelb.edu.au/admin/special-consideration).
If you wish to complete this unit successfully,
1) Watch the lecture videos and participate in the tutorials every week.
2) Read the relevant chapter(s) from the prescribed and/or recommended
textbooks every week.
3) Work on the tutorial problems every week.
4) Cooperate with your fellow students.
5) If you do not understand something, get help as soon as possible
before you miss too much – ask your tutor, your lecturer, or visit the Ed
Discussion Board on LMS.
6) Repeatedly review what you learnt previously in order to reinforce
everything several times.
DESCRIBING A SINGLE POPULATION:
ESTIMATION AND HYPOTHESIS TESTING OF A
POPULATION MEAN
• In QM1 you already learnt how to describe a single population by
estimating its mean and by performing a hypothesis test on it.
You are supposed to be familiar with these topics, so in QM2 we just
briefly review them. If you need more assistance, study the Review 1,
Review 2, …, Review 5 sets of slides and the recommended texts.
• In case of random sampling, any statistic is a random variable because
it is a function of some randomly selected sample items.
The probability distribution of all possible values of a statistic generated
by random samples of the same size is called the sampling distribution
of the given statistic.
Like probability distributions in general, sampling distributions can be
characterized by their mean, variance (standard deviation), and shape.
SAMPLING DISTRIBUTION OF THE SAMPLE MEAN
• Consider a random sample $X_1, X_2, \ldots, X_n$ drawn from population X : (µ ;σ),
so that the observations are identically and independently distributed.
The point estimator of the population mean (µ) is the sample mean:
$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$
It has the following properties.
i) $E(\bar{X}) = \mu$
The expected value of the sample mean is equal to the
population mean.
ii) $Var(\bar{X}) = \sigma_{\bar{X}}^2 = \frac{\sigma^2}{n}$
The variance of the sample mean is equal to the
population variance divided by the sample size
(assuming that the sampled population is infinitely large,
or is finite but sampling is with replacement).
The standard deviation of a statistic is called its standard error.
iii) $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$
The standard error of the sample mean (just
like its variance) is a decreasing function of the
sample size.
iv) Shape / form of the sampling distribution:
a) If the sampled population is normally distributed, X : N(µ ;σ),
then the sample mean is also normally distributed.
b) If the sampled population is not normal but n ≥ 30, then
according to the Central Limit Theorem (CLT), the sample
mean is still approximately normally distributed.
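Properties (i) to (iv) can be checked informally by simulation. The following is a minimal R sketch, not part of the original notes; the population parameters, sample sizes, number of replications and seed are all illustrative assumptions.

```r
# Simulation sketch of the sampling distribution of the sample mean.
# mu, sigma, n, reps and the seed are illustrative assumptions.
set.seed(1)
mu <- 50
sigma <- 10
n <- 25
reps <- 10000

# Draw 'reps' random samples of size n from N(mu, sigma); keep each sample mean
x_bars <- replicate(reps, mean(rnorm(n, mean = mu, sd = sigma)))

mean(x_bars)      # property (i): should be close to mu
sd(x_bars)        # property (iii): should be close to sigma / sqrt(n)
sigma / sqrt(n)   # theoretical standard error

# Property (iv b): a skewed population (exponential with mean 1, sd 1), n = 30
exp_means <- replicate(reps, mean(rexp(30, rate = 1)))
mean(exp_means)   # should be close to 1
sd(exp_means)     # should be close to 1 / sqrt(30)
hist(exp_means)   # roughly bell-shaped despite the skewed population
```

Increasing n in either simulation should shrink the spread of the simulated sample means, in line with property (iii).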
ESTIMATING THE POPULATION MEAN
• Assume that we repeatedly draw random samples of the same size from
a normally distributed population, or that the sampled population is not
normal but we draw reasonably large samples and thus the CLT holds.
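If zα/2 denotes the (1-α/2)×100th percentile of the standard normal
distribution, that is P(Z > zα/2) = α/2, then
Ex ante (before drawing a sample):
$P\left(\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$
Here µ is an unknown constant, while the interval limits are random
(even if n, σ and α are fixed).
Ex post (after drawing a sample):
$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
X-bar is an estimator, while x-bar
is an estimate, i.e. a particular
number, so this is not a random
interval and it either contains µ or
not.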
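The ex ante interpretation can be illustrated by simulation: across many repeated samples, about (1-α)×100% of the computed intervals should contain µ, while any single ex post interval either contains it or does not. The R sketch below is not from the original notes and assumes a normal population with known σ; every numerical value in it is an illustrative assumption.

```r
# Coverage sketch: how often does the known-sigma interval contain mu?
# mu, sigma, n, alpha, reps and the seed are illustrative assumptions.
set.seed(2)
mu <- 50
sigma <- 10
n <- 25
alpha <- 0.05
reps <- 10000

z <- qnorm(1 - alpha / 2)            # z_{alpha/2}
half_width <- z * sigma / sqrt(n)    # fixed, because n, sigma and alpha are fixed

# For each replication, draw a sample and check whether its interval contains mu
covers <- replicate(reps, {
  x_bar <- mean(rnorm(n, mean = mu, sd = sigma))
  (x_bar - half_width < mu) && (mu < x_bar + half_width)
})

mean(covers)   # proportion of intervals containing mu; should be close to 1 - alpha
```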
When σ is unknown but the sample size is large enough to estimate σ
satisfactorily, we can replace σ with its estimate s to obtain an estimate
of the standard error of the sample mean,
$s_{\bar{X}} = \frac{s}{\sqrt{n}}$
When σ is known and the sample mean is at least approximately
normally distributed, or σ is unknown but the sampled population itself
is at least approximately normal, the (1-α)×100% confidence interval
estimate of the population mean is given by
$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$ if σ is known and X-bar ~ N
$\bar{x} \pm t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}}$ if σ is unknown but X ~ N
Note: Commenting on these intervals we cannot talk about ‘probability’.
Instead we use the term ‘confidence’ as we have (1-α) degree of
confidence that the single interval obtained from the sample at hand is
not extreme but indeed contains µ.
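The two interval formulas differ only in the critical value (z or t) and in whether σ or s is used. A minimal R sketch of both cases follows; the helper name `mean_ci` and the numbers passed to it are hypothetical and are not taken from the notes or the textbooks.

```r
# Hypothetical helper: confidence interval for a population mean.
# sd_value is sigma when sigma_known = TRUE, otherwise the sample sd s.
mean_ci <- function(x_bar, sd_value, n, conf = 0.95, sigma_known = FALSE) {
  alpha <- 1 - conf
  # z critical value if sigma is known, t critical value with n - 1 df otherwise
  crit <- if (sigma_known) qnorm(1 - alpha / 2) else qt(1 - alpha / 2, df = n - 1)
  half_width <- crit * sd_value / sqrt(n)
  c(lower = x_bar - half_width, upper = x_bar + half_width)
}

# Illustrative (made-up) numbers: same mean, sd and n, treated both ways
ci_z <- mean_ci(x_bar = 100, sd_value = 15, n = 16, sigma_known = TRUE)
ci_t <- mean_ci(x_bar = 100, sd_value = 15, n = 16, sigma_known = FALSE)
ci_z
ci_t   # wider than ci_z, because the t critical value exceeds the z critical value
```

For the same inputs the t interval is wider, which reflects the extra uncertainty from estimating σ with s.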
Ex 1: (Selvanathan et al., p. 401, ex. 10.58)
A NSW Department of Consumer Affairs officer responsible for enforcing laws
concerning weights and measures routinely inspects containers to determine if
the contents of 10kg bags of potatoes weigh at least 10kg as advertised on the
container. A random sample of 25 bags whose container claims that the net
weight is 10kg yielded the following statistics: x-bar = 10.52, s² = 1.43. Estimate
with 95% confidence the mean weight of a bag of potatoes. Assume that the
weights of 10kg bags of potatoes are normally distributed.
Let X denote the weight of a bag of potatoes. We do not know its population
mean and standard deviation, but we are told that it is normally distributed.
Hence, we can develop the 95% confidence interval using
$\bar{x} \pm t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}}$
From the sample, n = 25, $\bar{x} = 10.52$ and $s = \sqrt{1.43} \approx 1.196$.
From Selvanathan et al. Appendix B, p.1097, tα/2,df=n-1 = t0.025,24 = 2.064.
$10.52 \pm 2.064 \times \frac{1.196}{\sqrt{25}} = 10.52 \pm 0.494$
With 95% confidence the mean weight of a bag of potatoes is
somewhere between 10.03kg and 11.01kg.
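The calculation in Ex 1 can be reproduced in R from the summary statistics given in the exercise (n = 25, x-bar = 10.52, s² = 1.43). This is only a sketch; the variable names are my own, and the limits may differ from hand calculations in the second decimal place because of rounding.

```r
# Ex 1 from summary statistics: 95% t-interval for the mean weight
n <- 25
x_bar <- 10.52
s2 <- 1.43
conf <- 0.95

s <- sqrt(s2)                                  # sample standard deviation
t_crit <- qt(1 - (1 - conf) / 2, df = n - 1)   # t_{0.025, 24}
half_width <- t_crit * s / sqrt(n)             # margin of error

ci <- c(lower = x_bar - half_width, upper = x_bar + half_width)
round(t_crit, 3)   # compare with the table value 2.064
round(ci, 2)       # interval limits in kg
```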
TESTING THE POPULATION MEAN
• There are two types of statistical inference: estimation (point estimation
and interval estimation) and hypothesis testing.
• Hypothesis testing, in general, is a six-step procedure:
1) Set up the null and alternative hypotheses;
2) Determine the test statistic and its sampling distribution;
3) Specify the significance level;
4) Define the decision rule;
5) Take a sample and calculate the value of the test statistic;
6) Make a statistical decision and draw the conclusion.
The details are discussed on the Review 2 slides; here we just consider an illustrative example.
Ex 2: (Selvanathan et al., p. 510, ex. 12.53)
A diet doctor claims that the average Australian is more than 10kg overweight.
To test his claim, a random sample of 100 Australians were weighed, and the
difference between their actual weight and their ideal weight was calculated and
recorded.
a) Do the data allow us to infer at the 5% significance level that the doctor’s
claim is true?
i. Let diff denote the difference between actual weight and ideal weight (kg),
and let µ denote its population mean.
HA : µ > 10 and H0 : µ = 10
ii. Since n is 100, we can rely on the CLT. However, σ is unknown, so let’s
assume that the population of diff is not extremely non-normal.
Given this assumption, the test statistic is
$t = \frac{\bar{X} - \mu_0}{s / \sqrt{n}}$
which follows a t distribution with n - 1 = 99 degrees of freedom if H0 is true.
iii. The significance level is given, α= 0.05.
iv. This is a right-tail test, so the entire rejection region is located under the
right tail of the sampling distribution.
Reject H0 if the value of the test statistic calculated from the sample
is greater than tα,df=n-1 = t0.05,99 ≈ t0.05,100 = 1.660.
v. The sample mean and standard deviation are 12.175 and 7.898,
respectively, so
$t_{obs} = \frac{12.175 - 10}{7.898 / \sqrt{100}} = \frac{2.175}{0.7898} \approx 2.7539$
vi. Since tobs = 2.7539 > 1.660 = tα, we reject H0. Hence, at α= 0.05 there is
enough evidence to conclude that the diet doctor is right, the average
Australian is more than 10kg overweight.
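Steps (iv) to (vi) of part (a) can be retraced in R using only the summary statistics quoted above (n = 100, sample mean 12.175, sample standard deviation 7.898). The sketch below uses variable names of my own choosing and the exact critical value for 99 degrees of freedom rather than the table approximation for 100.

```r
# Ex 2(a) from summary statistics: right-tail t-test of H0: mu = 10 vs HA: mu > 10
n <- 100
x_bar <- 12.175
s <- 7.898
mu0 <- 10
alpha <- 0.05

t_obs <- (x_bar - mu0) / (s / sqrt(n))   # observed test statistic
t_crit <- qt(1 - alpha, df = n - 1)      # critical value t_{0.05, 99}
reject_h0 <- t_obs > t_crit              # decision rule for a right-tail test

round(t_obs, 4)
round(t_crit, 3)
reject_h0
```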
b) Find the p-value of the test. What does it suggest?
The t-table does not show the exact p-value. However, it is certainly smaller
than 0.005, so H0 can be rejected even at the 0.5% significance level.
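Although the t-table only brackets the p-value, R can compute it directly from the t distribution. A minimal sketch, assuming the observed test statistic 2.7539 and 99 degrees of freedom from part (a):

```r
# Ex 2(b): exact right-tail p-value for the observed t statistic
t_obs <- 2.7539
df <- 99

# P(T > t_obs) for a t distribution with 99 degrees of freedom
p_value <- pt(t_obs, df = df, lower.tail = FALSE)

p_value
p_value < 0.005   # consistent with the bound read from the t-table
```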
c) Perform the test in part (a) with R this time.
You will learn in the tutorials how to import the data from an Excel file and how
to run the t-test with R/RStudio. The command
t.test(diff, mu = 10, alternative = "greater")
returns a printout in which R reports the test statistic (t), the
degrees of freedom, the p-value,
the alternative hypothesis, the
95% ‘one-sided’ confidence
interval (do not worry about it)
and the sample mean.
Check whether R performed the required test (i.e. a right-tail t-test this time)
and whether the p-value < α = 0.05. Since p-value ≈ 0.0035, we reject H0.
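The textbook data file is not reproduced in these notes, so the sketch below runs the same `t.test()` call on simulated stand-in data. The simulated values (and therefore the printed statistics) are assumptions for illustration only and will not match the figures quoted for the exercise. The vector is named `diff` to mirror the command above, although this name also belongs to a base R function.

```r
# Simulated stand-in for the 100 weight differences (NOT the textbook data)
set.seed(3)
diff <- rnorm(100, mean = 12, sd = 8)

# Right-tail one-sample t-test of H0: mu = 10 against HA: mu > 10
result <- t.test(diff, mu = 10, alternative = "greater")
result              # full printout: t, df, p-value, alternative, CI, sample mean

# Individual components of the result
result$statistic    # test statistic t
result$parameter    # degrees of freedom
result$p.value      # p-value of the right-tail test
result$conf.int     # 95% one-sided interval (upper limit is Inf)

# Manual check of the test statistic
t_manual <- (mean(diff) - 10) / (sd(diff) / sqrt(length(diff)))
```

Comparing `result$p.value` with α = 0.05 gives the same decision as comparing the test statistic with the critical value.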
• The (single-sample) Z / t test and the corresponding confidence interval
for a population mean are based on the following assumptions:
i. The data is a random sample of independent observations (see
Review 1, slide 3).
ii. The variable of interest is quantitative and continuous (see Review
1, slides 5-6) …
iii. … and is measured on an interval or ratio scale (see Review 1,
slides 7-8).
iv. Either (Z test) the population standard deviation, σ, is known and
the sample mean is at least approximately normally distributed
(because the sampled population itself is normally distributed or
the sample size is large and thus CLT holds),
or (t-test) σ is unknown but the sampled population is normally
distributed (at least approximately).
Note: Despite assumption (ii), in practice the Z / t test can be used for discrete
variables as well, granted that they assume a large number of different
values.
WHAT SHOULD YOU KNOW?
• Important definitions and concepts, like
population, sample, parameter, statistic, descriptive statistics,
inferential statistics, sampling error, non-sampling error, types of
data/variable, measurement scales, estimator, estimate, etc.
• To compute normal probabilities. To use the standard normal and t
tables (Tables 3 and 4 in Appendix B of Selvanathan et al., 2021).
• The sampling distribution of the sample mean.
• To estimate a population mean with a single value and with a confidence
interval.
• To carry out the six steps of hypothesis testing and to apply this
procedure to testing a population mean.
• To estimate a population mean and to run a test for a population mean
with R/RStudio.
• To be able to use your calculator efficiently.