Date: 2023-11-14
INF 1344H
Introduction to Statistics for Data Science
Lecture 9
Hypothesis Test, & t-Test of Sample Mean
Tao Wang, MSc., PhD
Faculty of Information
University of Toronto
Review of Last Week
• Sampling Distribution of Proportion
• Bootstrap
• Confidence Interval
• Hypothesis Testing (I)
Topic of This Week
• Hypothesis Testing (II)
• Comparing proportions or means
– One-Proportion z-Test
– One-Sample t-Test
– Two-Sample t-Test
The Reasoning of Hypothesis Testing (1 of 8)
• There are four basic parts to a hypothesis test:
1. Hypotheses
2. Model
3. Mechanics
4. Conclusion
• Let’s look at each part in detail…
The Reasoning of Hypothesis Testing (2 of 8)
1. Hypotheses
– The null hypothesis: To perform a hypothesis test, we must
first translate our question of interest into a statement about
model parameters
§ In general, we have H0: parameter = hypothesized value
– The alternative hypothesis: The alternative hypothesis, HA,
contains the values of the parameter we consider plausible
when we reject the null
The Reasoning of Hypothesis Testing (3 of 8)
2. Model
– To plan a statistical hypothesis test, specify the model you will use to test
the null hypothesis and the parameter of interest
– All models require assumptions, so state the assumptions and check any
corresponding conditions
– Your plan should end with a statement like
§ Because the conditions are satisfied, I can model the sampling distribution of
the proportion with a Normal model
The Reasoning of Hypothesis Testing (4 of 8)
2. Model
§ Watch out, though
§ It might be the case that your model step ends with “Because the conditions
are not satisfied, I can’t proceed with the test”
§ If that’s the case, stop and reconsider
– Each test we discuss in the book has a name that you should include in
your report
– The test about proportions is called a one-proportion z-test
One-Proportion z-Test
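• The conditions for the one-proportion z-test are the same
as for the one-proportion z-interval. We test the hypothesis
H0: p = p0 using the statistic
$z = \dfrac{\hat{p} - p_0}{SD(\hat{p})}$
where $SD(\hat{p}) = \sqrt{\dfrac{p_0 q_0}{n}}$ and $q_0 = 1 - p_0$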
• When the conditions are met and the null hypothesis is
true, this statistic follows the standard Normal model, so
we can use that model to obtain a P-value
The Reasoning of Hypothesis Testing (5 of 8)
3. Mechanics
– Under “mechanics” we place the actual calculation of our test statistic from
the data
– Different tests will have different formulas and different test statistics
– Usually, the mechanics are handled by a statistics program or calculator,
but it’s good to know the formulas
The Reasoning of Hypothesis Testing (6 of 8)
3. Mechanics
– The ultimate goal of the calculation is to obtain a P-value
– Definition of P-value
§ The P-value is the probability that the observed statistic value (or an even more
extreme value) could occur if the null model were correct
§ If the P-value is small enough, we’ll reject the null hypothesis
§ Note: The P-value is a conditional probability—it’s the probability that the
observed results could have happened if the null hypothesis is true
For Example: Finding A P-Value (1 of 2)
RECAP: The Ministry of Transportation claims that 80% of
candidates pass driving tests, but a survey of 90 randomly
selected local teens who have taken the test found only 61 who
passed.
QUESTION: What’s the P-value for the one-proportion z-test?
ANSWER: I have n = 90, x = 61, and a hypothesized p = 0.80.
For Example: Finding A P-Value (2 of 2)
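$\hat{p} = \dfrac{61}{90} \approx 0.678$
$SD(\hat{p}) = \sqrt{\dfrac{p_0 q_0}{n}} = \sqrt{\dfrac{(0.80)(0.20)}{90}} \approx 0.042$
$z = \dfrac{\hat{p} - p_0}{SD(\hat{p})} = \dfrac{0.678 - 0.800}{0.042} \approx -2.90$
P-value $= P(z < -2.90) \approx 0.002$, calculated from the z-table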
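The driving-test calculation can also be checked in software rather than with a z-table. The following is a minimal Python sketch using only the standard library; the function name `one_proportion_z` is hypothetical, and the lower-tail alternative (HA: p < 0.80) is an assumption based on the survey finding fewer passes than claimed.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z(x, n, p0):
    """Return (p_hat, z) for a one-proportion z-test of H0: p = p0."""
    p_hat = x / n
    sd_p_hat = sqrt(p0 * (1 - p0) / n)  # SD of p-hat, computed under the null
    return p_hat, (p_hat - p0) / sd_p_hat


# Values from the driving-test example: n = 90, x = 61, hypothesized p = 0.80.
p_hat, z = one_proportion_z(x=61, n=90, p0=0.80)

# Assumption: lower-tail alternative HA: p < 0.80, so the P-value is P(Z < z).
p_value = NormalDist().cdf(z)

print(f"p_hat = {p_hat:.3f}, z = {z:.2f}, P-value = {p_value:.4f}")
```

The z statistic comes out near −2.90 and the P-value near 0.002, so a sample proportion this low would be very unusual if the 80% claim were true.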
The Reasoning of Hypothesis Testing (7 of 8)
4. Conclusion
– The conclusion in a hypothesis test is always a statement about the null
hypothesis
– The conclusion must state either that we reject or that we fail to reject the
null hypothesis
– And, as always, the conclusion should be stated in context
The Reasoning of Hypothesis Testing (8 of 8)
4. Conclusion
– Your conclusion about the null hypothesis should never be the end of a
testing procedure
– Often there are actions to take or policies to change
Alternative Alternatives (1 of 3)
• There are three possible alternative hypotheses:
– HA: parameter < hypothesized value
– HA: parameter ≠ hypothesized value
– HA: parameter > hypothesized value
Alternative Alternatives (2 of 3)
• HA: parameter ≠ value is known as a two-sided alternative
because we are equally interested in deviations on either
side of the null hypothesis value.
• For two-sided alternatives, the P-value is the probability of
deviating in either direction from the null hypothesis value.
Alternative Alternatives (3 of 3)
• The other two alternative hypotheses are called one-sided
alternatives
• A one-sided alternative focuses on deviations from the null
hypothesis value in only one direction
• Thus, the P-value for one-sided alternatives is the probability of
deviating only in the direction of the alternative away from the
null hypothesis value
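To make the three alternatives concrete, here is a small Python sketch that turns a z statistic into a P-value for each case. The function name `z_p_value`, the labels "less", "greater" and "two-sided", and the value z = −1.5 are illustrative assumptions, not part of the lecture.

```python
from statistics import NormalDist


def z_p_value(z, alternative):
    """P-value of a z statistic for a 'less', 'greater', or 'two-sided' alternative."""
    std_normal = NormalDist()
    if alternative == "less":       # HA: parameter < hypothesized value
        return std_normal.cdf(z)
    if alternative == "greater":    # HA: parameter > hypothesized value
        return 1 - std_normal.cdf(z)
    if alternative == "two-sided":  # HA: parameter != hypothesized value
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")


z = -1.5  # hypothetical test statistic
for alt in ("less", "two-sided", "greater"):
    print(f"{alt:>9}: P-value = {z_p_value(z, alt):.4f}")
```

For the same statistic, the two-sided P-value is twice the smaller of the two one-sided P-values, because it counts deviations in both directions.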
P-Values and Decisions: What to Tell About a Hypothesis
Test (1 of 2)
• How small should the P-value be in order for you to reject the null hypothesis?
• It turns out that our decision criterion is context-dependent
– When we’re screening for a disease and want to be sure we treat all those who
are sick, we may be willing to reject the null hypothesis of no disease with a fairly
large P-value
– A longstanding hypothesis, believed by many to be true, needs stronger evidence
(& a correspondingly small P-value) to reject it
• Another factor in choosing a P-value threshold is the importance of the issue being tested
P-Values and Decisions: What to Tell About a Hypothesis
Test (2 of 2)
• Your conclusion about any null hypothesis should be accompanied by
the P-value of the test
– If possible, it should also include a confidence interval for the parameter of
interest
• Don’t just declare the null hypothesis rejected or not rejected
– Report the P-value to show the strength of the evidence against the
hypothesis
– This will let each reader decide whether or not to reject the null hypothesis
Summary: One-Proportion z-Test
• Be able to perform a hypothesis test for a proportion
– The null hypothesis has the form H0: p = p0
– We find the standard deviation of the sampling distribution of
the sample proportion by assuming that the null hypothesis
is true: $SD(\hat{p}) = \sqrt{\dfrac{p_0 q_0}{n}}$
– We refer the statistic $z = \dfrac{\hat{p} - p_0}{SD(\hat{p})}$
to the standard Normal model
Why Do We Often State Our Claims as Alternative Hypotheses?
• There is a temptation to state your claim as the null hypothesis
– However, you cannot prove a null hypothesis true
• So, it makes more sense to use what you want to show as the alternative
– This way, when you reject the null, you are left with what you want to show
Alpha Levels and Significance (1 of 5)
• Sometimes we need to make a firm decision about whether or not to
reject the null hypothesis
• When the P-value is small, it tells us that our data are rare given the null
hypothesis
• How rare is “rare”?
Alpha Levels and Significance (2 of 5)
• We can define “rare event” arbitrarily by setting a threshold for our P-value
– If our P-value falls below that point, we’ll reject H0. We call such results
statistically significant
– The threshold is called an alpha level, denoted by α
Alpha Levels and Significance (3 of 5)
• Common alpha levels are 0.10, 0.05, and 0.01
– You have the option—almost the obligation—to consider your alpha level
carefully and choose an appropriate one for the situation
• The alpha level of the test is also called the significance level
– When we reject the null hypothesis, we say that the test is “significant at
that level”
Alpha Levels and Significance (4 of 5)
• What can you say if the P-value does not fall below α?
– You should say that “The data have failed to provide sufficient evidence to
reject the null hypothesis”
– Don’t say that you “accept the null hypothesis”
• Recall that, in a jury trial, if we do not find the defendant guilty, we say
the defendant is “not guilty”—we don’t say that the defendant is
“innocent”
Alpha Levels and Significance (5 of 5)
• The P-value gives the reader far more information than just
stating that you reject or fail to reject the null
• In fact, by providing a P-value to the reader, you allow that
person to make his or her own decisions about the test
– What you consider to be statistically significant might not be
the same as what someone else considers statistically
significant
– There is more than one alpha level that can be used, but
each test will give only one P-value
Practical vs. Statistical Significance
• What do we mean when we say that a test is statistically significant?
– All we mean is that the test statistic had a P-value lower than our alpha
level
• Don’t be lulled into thinking that statistical significance carries with it any
sense of practical importance or impact
• For large samples, even small, unimportant (“insignificant”) deviations
from the null hypothesis can be statistically significant
• In addition to P-value, it’s good practice to report a confidence interval, to
indicate the range of plausible values for the parameter
Critical Values for Hypothesis Tests (1 of 4)
• When we make a confidence interval, we find a critical value, z*, to
correspond to our selected confidence level
• Prior to the use of technology, P-values were difficult to find, and it was easier
to select a few common alpha values and learn the corresponding critical
values for the Normal model
Critical Values for Hypothesis Tests (4 of 4)
• When the alternative is one-sided, the critical value puts all of α on one side
• When the alternative is two-sided, the critical value splits α equally into two tails
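The critical values for the common alpha levels can be computed from the standard Normal model. This is a short Python sketch; the helper name `z_critical` is hypothetical.

```python
from statistics import NormalDist


def z_critical(alpha, two_sided):
    """Critical value z* for a Normal-model test at level alpha."""
    tail_area = alpha / 2 if two_sided else alpha  # split alpha for two-sided tests
    return NormalDist().inv_cdf(1 - tail_area)


print("alpha  one-sided  two-sided")
for alpha in (0.10, 0.05, 0.01):
    one = z_critical(alpha, two_sided=False)
    two = z_critical(alpha, two_sided=True)
    print(f"{alpha:<6} {one:9.3f}  {two:9.3f}")
```

At α = 0.05 this gives the familiar values of about 1.645 (one-sided) and 1.96 (two-sided).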
The Relationship between Intervals and Tests (1 of 2)
• More precisely, a level C confidence interval contains all of the
possible null hypothesis values that would not be rejected by a
two-sided hypothesis test at alpha level 1 − C
– So, a 95% confidence interval matches a 0.05 level test for these data
• Confidence intervals are naturally two-sided, so they match exactly
with two-sided hypothesis tests
– When the hypothesis is one-sided, the corresponding alpha level is (1 − C)/2
Decision Errors (1 of 6)
• Nobody’s perfect—even with lots of evidence we can still make the
wrong decision
• When we perform a hypothesis test, we can make mistakes in two
ways:
I. The null hypothesis is true, but we mistakenly reject it (Type I error)
II. The null hypothesis is false, but we fail to reject it (Type II error)
Decision Errors (2 of 6)
• Which type of error is more serious depends on the situation at hand
• In other words, the gravity of the error is context dependent
Decision Errors (3 of 6)
• How often will a Type I error occur?
– Since a Type I error is rejecting a true null hypothesis, the probability of a
Type I error is our α level
• When you choose level α, you’re setting the probability of a Type I error
to α
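A quick simulation can illustrate that α is the Type I error rate. The sketch below is an assumed setup, not part of the lecture: it repeatedly draws samples from a population where the null hypothesis (μ = 0, with σ = 1 known) really is true, runs a two-sided z-test each time, and counts how often the test rejects.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def type_i_error_rate(alpha, n, trials, seed=0):
    """Fraction of two-sided z-tests that reject a TRUE null (mu = 0, sigma = 1)."""
    rng = random.Random(seed)
    z_star = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(0, 1) for _ in range(n)]
        z = fmean(sample) * sqrt(n)  # (mean - 0) / (sigma / sqrt(n)) with sigma = 1
        if abs(z) > z_star:
            rejections += 1
    return rejections / trials


rate = type_i_error_rate(alpha=0.05, n=25, trials=20000)
print(f"Observed Type I error rate: {rate:.3f}")
```

The observed rejection rate should land close to the chosen α, up to simulation noise.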
Decision Errors (4 of 6)
• When H0 is false and we fail to reject it, we have made a Type II error
– We assign the letter β to the probability of this mistake
– It’s harder to assess the value of β because we don’t know what the value
of the parameter really is
– There is no single value for β - we can think of a whole collection of βs, one
for each incorrect parameter value
Decision Errors (5 of 6)
• We could reduce β for all alternative parameter values by
increasing α
– This would reduce β but increase the chance of a Type I error
– This tension between Type I and Type II errors is inevitable
• The only way to reduce both types of errors is to collect more
data
Decision Errors (6 of 6)
• What we really want is to detect a false null hypothesis
• When H0 is false and we reject it, we have done the right thing
– A test’s ability to detect a false hypothesis is called the power of the test
Power and Sample Size
• The power of a test is the probability that it correctly rejects a false null
hypothesis
• When the power is high, we can be confident that we’ve looked hard
enough at the situation
• The power of a test is 1 − β
Effect Size (1 of 2)
• The distance between the null hypothesis value, p0, and the truth, p, is
called the effect size
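• Cohen’s d is a common measure of effect size for a one-sample t-test:
Cohen’s d $= \dfrac{\bar{y} - \mu_0}{s}$
where $\bar{y}$ is the sample mean, $\mu_0$ is the hypothesized mean, and s is the sample standard deviation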
Effect Size (2 of 2)
Conventional effect sizes for t-tests, proposed by Cohen, are:
0.2 or smaller (small effect)
0.5 (moderate effect)
0.8 or above (large effect)
(Cohen, 1988; Navarro, 2015).
This means that if two groups' means differ by less than
0.2 standard deviations, the difference is trivial,
even if it is statistically significant.
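As a rough illustration of the one-sample case, Cohen's d can be computed directly from a sample. The sketch below uses Python rather than R; the function name `cohens_d_one_sample` and the data values are hypothetical.

```python
from statistics import fmean, stdev


def cohens_d_one_sample(sample, mu0):
    """Cohen's d for a one-sample test: (sample mean - mu0) / sample SD."""
    return (fmean(sample) - mu0) / stdev(sample)


# Hypothetical measurements, compared against a hypothesized mean of 5.0.
sample = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3, 5.5, 4.7]
d = cohens_d_one_sample(sample, mu0=5.0)
print(f"Cohen's d = {d:.2f}")
```

The result is read against the conventional thresholds above: it expresses the gap between the sample mean and the hypothesized mean in units of standard deviations.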
Power and Sample Size (3 of 3)
• The previous figure seems to show that if we reduce Type I error, we
must automatically increase Type II error
• But, we can reduce both types of error by making both curves narrower
• How do we make the curves narrower? Increase the sample size
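The trade-off between α, β, and sample size can be sketched numerically. The example below assumes a one-proportion z-test with a lower-tail alternative and uses hypothetical values (null p0 = 0.80, true proportion 0.70); the function name `power_lower_tail` is also an assumption.

```python
from math import sqrt
from statistics import NormalDist


def power_lower_tail(p0, p_true, n, alpha):
    """Power of a one-proportion z-test of H0: p = p0 against HA: p < p0."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    # Reject H0 when p-hat falls below this cutoff (SD computed under the null).
    cutoff = p0 - z_alpha * sqrt(p0 * (1 - p0) / n)
    # Sampling distribution of p-hat if the true proportion is p_true.
    sampling = NormalDist(mu=p_true, sigma=sqrt(p_true * (1 - p_true) / n))
    return sampling.cdf(cutoff)


# Hypothetical values: null p0 = 0.80, true proportion 0.70, alpha = 0.05.
for n in (25, 50, 100, 200):
    power = power_lower_tail(p0=0.80, p_true=0.70, n=n, alpha=0.05)
    print(f"n = {n:3d}  power = {power:.3f}  beta = {1 - power:.3f}")
```

With α held fixed, β shrinks as n grows, which is the sense in which a larger sample reduces both kinds of error. Raising α for a fixed n also raises the power, at the cost of more Type I errors.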
Summary
• Sometimes P-values are compared to a pre-determined α-level, to
decide whether to reject the null hypothesis
• α is the probability of making a Type I error—rejecting the null hypothesis
when it is, in fact, true
• Statistical significance simply implies strong evidence that the null
hypothesis is false, not that the difference is important
• Know what the power of a test and the effect size are
Comparing Means of Samples
The Sampling Model for the Sample Mean (1 of 4)
• Just as we did before, we will base both our confidence interval and
our hypothesis test on the sampling distribution model
• When a random sample is drawn from any population with mean μ
and standard deviation σ, the sample mean has a sampling
distribution with mean μ and standard deviation $SD(\bar{y}) = \dfrac{\sigma}{\sqrt{n}}$
• The Central Limit Theorem also told us that the sampling
distribution model for the mean of sufficiently large samples is
approximately Normal, no matter what the population distribution is
The Sampling Model for the Sample Mean (2 of 4)
• All we need is a random sample of quantitative data
• And the true population standard deviation, σ
– Well, that’s a problem… We don’t know the value of σ
The Sampling Model for the Sample Mean (3 of 4)
• Proportions have a link between the proportion value and the
standard deviation of the sample proportion; means have no such link
• We’ll do the best we can: estimate the population parameter
σ with the sample statistic s
• Our resulting standard error is $SE(\bar{y}) = \dfrac{s}{\sqrt{n}}$
• Another problem: when n is very small, s can be far from the actual σ
The Sampling Model for the Sample Mean (4 of 4)
• We now have extra variation in our standard error from s, the sample
standard deviation
– We need to allow for the extra variation so that it does not mess up the
margin of error and P-value, especially for a small sample
• And, the shape of the sampling model changes: the model is no longer
Normal
• So, how can we handle this?
Gosset’s t
• William S. Gosset, an employee of the Guinness
Brewery in Dublin, Ireland, worked long and hard to
find out what the sampling model was
• The sampling model that Gosset found has been
known as Student’s t
• The Student’s t-models form a whole family of related
distributions that depend on a parameter known as
degrees of freedom
– We often denote degrees of freedom as df, and the
model as $t_{df}$
A t-Interval for Means (1 of 3)
A practical sampling distribution model for means
When the conditions are met, the standardized sample mean
$t = \dfrac{\bar{y} - \mu}{SE(\bar{y})}$
follows a Student’s t-model with n − 1 degrees of freedom.
We estimate the standard error with $SE(\bar{y}) = \dfrac{s}{\sqrt{n}}$
A t-Interval for Means (2 of 3)
• When Gosset corrected the model for the extra uncertainty, the margin of
error got bigger
– Your confidence intervals will be just a bit wider and your P-values just a bit
larger than they were with the Normal model
• By using the t-model, you’ve compensated for the extra variability in
precisely the right way
A t-Interval for Means (3 of 3)
One-sample t-interval for the mean
• When the conditions are met, we are ready to find the
confidence interval for the population mean, μ.
• The confidence interval is $\bar{y} \pm t^*_{n-1} \times SE(\bar{y})$,
where the standard error of the mean is $SE(\bar{y}) = \dfrac{s}{\sqrt{n}}$
• The critical value $t^*_{n-1}$ depends on the particular confidence level, C,
that you specify and on the number of degrees of freedom, n − 1,
which we get from the sample size
Using the t-Table to Find Critical Values (1 of 3)
• The Student’s t-model is different for each value of degrees of freedom
• Because of this, statistics books usually have one table of t-model
critical values for selected confidence levels
Using the t-Table to Find Critical Values (2 of 3)
• The tables run down the page for as many degrees of freedom as can fit
• For enough degrees of freedom, the t-model gets closer and closer to
the Normal, so the tables give a final row with the critical values from the
Normal model (labelled as infinity)
– This table will not provide the critical values for all the degrees of freedom
– We need to approximate the critical value in such cases or use software
Using the t-Table to Find Critical Values (3 of 3)
• For example, consider df = 39
– The correct value lies between 1.684 and 1.690
– Either be conservative and go with the bigger value,
1.690, or use software
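When a table has no row for the needed degrees of freedom, software can compute the critical value directly. The following is a minimal Python sketch using only the standard library: it integrates the t density numerically (Simpson's rule) and finds t* by bisection. The helper names are hypothetical, and the 90% confidence level is an assumption: the slide does not state the level, but 1.684 and 1.690 are the tabulated values for 40 and 35 degrees of freedom at 90% confidence.

```python
from math import exp, lgamma, pi, sqrt


def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2)
    return exp(log_c) / sqrt(df * pi) * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """CDF via Simpson's rule on [0, |x|], using the symmetry of the t-model."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def t_critical(confidence, df):
    """Two-sided critical value t*, found by bisection on the CDF."""
    target = 1 - (1 - confidence) / 2
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


t_star = t_critical(0.90, 39)  # assumed 90% confidence level
print(f"t* for 90% confidence, df = 39: {t_star:.4f}")
```

The computed value falls between the two table entries, closer to the df = 40 row. In practice a statistics package would be used instead of a hand-rolled integrator; this sketch is only meant to show what the software is doing.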
For Example: A One-Sample T-interval For the Mean (1 of 3)
• In 2004, a team of researchers published a study of contaminants in farmed
salmon. Fish from many sources were analyzed for 14 organic contaminants.
The study expressed concerns about the level of contaminants found.
• One of those was the insecticide mirex, which has been shown to be
carcinogenic and is suspected to be toxic to the liver, kidneys, and endocrine
system. One farm in particular produced salmon with very high levels of mirex.
A Confidence Interval for Means (1 of 2)
• Student’s t-models are unimodal, symmetric, and bell shaped, just like
the Normal
• But t-models with only a few degrees of freedom have much fatter tails
than the Normal
A Confidence Interval for Means (2 of 2)
Figure 18.2: t-model (solid curve) and Normal model (dashed curve)
– As the degrees of freedom increase, the t-models look
more and more like the Normal
– In fact, the t-model with infinite degrees of freedom is
exactly Normal
Assumptions and Conditions (1 of 4)
• Gosset found the t-model by simulation
• Years later, when Sir Ronald A. Fisher showed mathematically that
Gosset was right, he needed to make some assumptions to make the
proof work
• We will use these assumptions when working with Student’s t
Assumptions and Conditions (2 of 4)
• Independence Assumption: The data values should be independent
– Randomization Condition: The data arise from a random sample or suitably
randomized experiment.
§ Randomly sampled data (particularly from an SRS) are ideal
– 10% Condition: When a sample is drawn without replacement, the sample
should be no more than 10% of the population
Assumptions and Conditions (3 of 4)
• Normal Population Assumption:
– We can never be certain that the data are from a population that follows a
Normal model, but we can check the:
– Nearly Normal Condition: The data come from a distribution that is
unimodal and symmetric
§ Check this condition by making a histogram or Normal probability plot
Assumptions and Conditions (4 of 4)
• Nearly Normal Condition:
– The smaller the sample size (n < 15 or so), the more closely the data
should follow a Normal model
– For moderate sample sizes (n between 15 and 40 or so), the t works well
as long as the data are unimodal and reasonably symmetric
– For sample sizes larger than 40 or 50, the t methods are safe to use unless
the data are extremely skewed
Important! When asked to conduct a t-test, first analyze the four
conditions and explain whether they are satisfied.
Make a Picture, Make a Picture, Make a Picture
• Pictures tell us far more about our data set than a list of the data ever
could
• The only reasonable way to check the Nearly Normal Condition is with
graphs of the data
– Make a histogram of the data and verify that its distribution is unimodal and
symmetric with no outliers
– You may also want to make a Normal probability plot to see that it’s
reasonably straight
Summary: One-sample t-test for the mean
• The assumptions and conditions for the one-sample t-test for the
mean are the same as for the one-sample t-interval
• We test the hypothesis H0: μ = μ0 using the statistic $t_{n-1} = \dfrac{\bar{y} - \mu_0}{SE(\bar{y})}$
• The standard error of the sample mean is $SE(\bar{y}) = \dfrac{s}{\sqrt{n}}$
• When the conditions are met and the null hypothesis is true, this
statistic follows a Student’s t model with n − 1 df. We use that
model to obtain a P-value
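The mechanics of the one-sample t-test can be sketched in Python with the standard library alone. The helper names, the two-sided alternative, and the sample values are assumptions for illustration; the t-model CDF is approximated by numerical integration, where a statistics package would normally be used.

```python
from math import exp, lgamma, pi, sqrt
from statistics import fmean, stdev


def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2)
    return exp(log_c) / sqrt(df * pi) * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """CDF via Simpson's rule on [0, |x|], using the symmetry of the t-model."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def one_sample_t_test(sample, mu0):
    """Return (t, df, two-sided P-value) for H0: mu = mu0."""
    n = len(sample)
    se = stdev(sample) / sqrt(n)      # SE of the mean: s / sqrt(n)
    t = (fmean(sample) - mu0) / se
    df = n - 1
    p_two_sided = 2 * (1 - t_cdf(abs(t), df))
    return t, df, p_two_sided


# Hypothetical measurements, tested against a hypothesized mean of 100.
sample = [98.2, 101.5, 99.8, 102.3, 100.9, 97.6, 103.1, 100.4]
t, df, p = one_sample_t_test(sample, mu0=100.0)
print(f"t = {t:.3f}, df = {df}, two-sided P-value = {p:.3f}")
```

The conditions (independence, randomization, 10%, nearly Normal) still have to be checked separately; the code only handles the mechanics step.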
Determining the Sample Size (1 of 2)
• To find the sample size needed for a particular confidence level with a
particular margin of error (ME), solve this equation for n:
$ME = t^*_{n-1} \times \dfrac{s}{\sqrt{n}}$
• The problem with using the equation above is that we
don’t know most of the values
• We can overcome this:
– We can use s from a small pilot study
– We can use z* in place of the necessary t value
Example
A company claims its program will allow your computer to download movies quickly. We’ll test the
free evaluation copy by downloading a movie several times, hoping to estimate the mean download
time with a margin of error of only 4 minutes. We think the standard deviation of download times is
about 5 minutes.
How many trial downloads must we run if we want 95% confidence in our estimate with a margin of
error of only 4 minutes?
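Example (continued)
Using z* = 1.96 as a first approximation, $4 = 1.96 \times \dfrac{5}{\sqrt{n}}$ gives $n \approx 6.0$.
That’s a small sample size, so we’ll use 6 − 1 = 5 degrees of freedom to substitute an appropriate t* value.
At 95%, $t^*_5 = 2.571$. Solving the equation one more time: $4 = 2.571 \times \dfrac{5}{\sqrt{n}}$, so $n \approx 10.33$, which we round up to 11 trial downloads.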
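The two-pass calculation for the download example can be sketched as follows. The helper name `n_for_margin` is hypothetical, and the t* value for 5 degrees of freedom is entered by hand as a standard table value rather than computed.

```python
from math import ceil
from statistics import NormalDist


def n_for_margin(critical_value, s, me):
    """Solve ME = critical_value * s / sqrt(n) for n (not yet rounded)."""
    return (critical_value * s / me) ** 2


s, me = 5, 4  # guessed SD and target margin of error, both in minutes

# Step 1: use z* in place of the unknown t* value.
z_star = NormalDist().inv_cdf(0.975)  # two-sided 95% critical value
n_first = n_for_margin(z_star, s, me)

# Step 2: the first answer is small, so switch to t* with round(n_first) - 1 df.
df = round(n_first) - 1
T_STAR_95_DF5 = 2.571  # two-sided 95% critical value for 5 df, from a t-table
n_second = n_for_margin(T_STAR_95_DF5, s, me)

print(f"first pass: n = {n_first:.2f}, df = {df}")
print(f"second pass: n = {n_second:.2f}, so run {ceil(n_second)} downloads")
```

Rounding up rather than to the nearest integer keeps the margin of error from exceeding the target.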
Determining the Sample Size (2 of 2)
• Sample size calculations are never exact
– The margin of error you find after collecting the data won’t match exactly
the one you used to find n
• The sample size formula depends on quantities you won’t have until you
collect the data, but using it is an important first step
• Before you collect data, it’s always a good idea to know whether the
sample size is large enough to give you a good chance of being able to
tell you what you want to know
What Can Go Wrong? (1 of 3)
• Don’t confuse proportions and means
Ways to Not Be Normal:
• Beware multimodality
– The Nearly Normal Condition clearly fails if a histogram of the data has two or
more modes
• Beware of severely skewed data
– If the data are very skewed, try re-expressing the variable
• Set outliers aside – respectfully
– But remember to report on these outliers individually
Comparing the Means of Independent Samples
Two-Sample t-Test
Should you buy generic rather than brand-name batteries?
• A Statistics student designed a study to test battery life. He wanted to know whether there was
any real difference between brand-name batteries and a generic brand.
• To estimate the difference in mean lifetimes, he kept a battery-powered CD player continuously
playing the same CD, with the volume control fixed at level five, and measured the time until no
more music was heard through the headphones.
• For his trials, he used six sets of AA alkaline batteries from two major battery manufacturers: a
well-known brand name and a generic brand. He measured the time in minutes until the sound
stopped. To account for possible changes in the CD player’s performance over time, he
randomized the run order by choosing sets of batteries at random
Comparing the Means of Independent Samples – Plot the Data
• The natural display for comparing two groups is boxplots of the data
for the two groups, placed side-by-side
• From this plot, we can get some idea about the two sets of data:
their centers, variation, and outliers, if any are present
Figure: Boxplots comparing the brand-name and generic
batteries suggest a difference in duration.
Standard Error and Sampling Model (1 of 3)
• Once we have examined the side-by-side boxplots, we can turn to the
comparison of two means
• This time the parameter of interest is the difference between the two
means, μ1 − μ2
• The statistic of interest is the difference in the two observed means, $\bar{y}_1 - \bar{y}_2$
Standard Error and Sampling Model (2 of 3)
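• Remember that the variance of the sum or difference of
two independent random variables is the sum of their
variances
• So, the standard deviation of the difference between two
sample means is
$$SD(\bar{y}_1 - \bar{y}_2) = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
• We still don’t know the true standard deviations of the two
groups, so we replace them with the sample standard deviations and
estimate the standard error as
$$SE(\bar{y}_1 - \bar{y}_2) = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$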
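A minimal sketch of this estimate, assuming two independent samples held in plain Python lists. The function name `se_difference` and the data values are hypothetical, chosen only to illustrate the calculation.

```python
import math
import statistics

def se_difference(sample1, sample2):
    """Estimated standard error of (mean1 - mean2) for independent samples."""
    var1 = statistics.variance(sample1)  # sample variance s1^2 (n - 1 denominator)
    var2 = statistics.variance(sample2)  # sample variance s2^2
    # Variances of independent quantities add, so add s^2/n for each group.
    return math.sqrt(var1 / len(sample1) + var2 / len(sample2))

# Hypothetical lifetimes in minutes (made up, not the study's data)
group_a = [200.0, 190.0, 210.0, 180.0, 195.0, 205.0]
group_b = [205.0, 215.0, 198.0, 220.0, 210.0, 202.0]
print(se_difference(group_a, group_b))
```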
Standard Error and Sampling Model (3 of 3)
• Because we are working with means and estimating the standard
error of their difference using the data, we shouldn’t be surprised
that the sampling model is a Student’s t
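• So the confidence interval for the difference in two means is
$$(\bar{y}_1 - \bar{y}_2) \pm t^*_{df} \times SE(\bar{y}_1 - \bar{y}_2)$$
where
$$SE(\bar{y}_1 - \bar{y}_2) = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
– The confidence interval we build is called a two-sample t-interval (for
the difference in means)
– The corresponding hypothesis test is called a two-sample t-test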
Sampling Distribution for the Difference Between Two Means
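• When the conditions are met, the standardized
sample difference between the means of two
independent groups
$$t = \frac{(\bar{y}_1 - \bar{y}_2) - (\mu_1 - \mu_2)}{SE(\bar{y}_1 - \bar{y}_2)}$$
can be modeled by a Student’s t-model with a number
of degrees of freedom found with a special formula
• We estimate the standard error with
$$SE(\bar{y}_1 - \bar{y}_2) = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$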
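The standardized difference can be sketched as follows. This is an illustration under stated assumptions: `two_sample_t` is a hypothetical helper, `null_diff` stands for the hypothesized value of μ1 − μ2 (0 by default), and the data are made up. Turning the statistic into a P-value needs a t-distribution from statistical software, which this sketch does not attempt.

```python
import math
import statistics

def two_sample_t(sample1, sample2, null_diff=0.0):
    """Standardized difference between two independent sample means."""
    n1, n2 = len(sample1), len(sample2)
    observed_diff = statistics.mean(sample1) - statistics.mean(sample2)
    # Unpooled standard error: each group contributes its own s^2 / n.
    se = math.sqrt(statistics.variance(sample1) / n1
                   + statistics.variance(sample2) / n2)
    return (observed_diff - null_diff) / se

# Hypothetical lifetimes in minutes (made up for illustration)
group_a = [200.0, 190.0, 210.0, 180.0, 195.0, 205.0]
group_b = [205.0, 215.0, 198.0, 220.0, 210.0, 202.0]
print(two_sample_t(group_a, group_b))
```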
Assumptions and Conditions (1 of 2)
• Independence Assumption (each condition needs to be checked for both
groups): the observations within each sample must be independent of one
another, and the two groups must be independent of each other
• Randomization Condition: Were the data collected with suitable
randomization (representative random samples or a randomized
experiment)?
• 10% Condition: We don’t usually check this condition for differences of
means
§ We will check it for means only if we have a very small population or an
extremely large sample
Assumptions and Conditions (2 of 2)
• Normal Population Assumption:
– Nearly Normal Condition: This must be checked for both groups. A violation
by either one violates the condition
Important! When asked to conduct a t-test, first check the four
conditions and explain whether they are satisfied.
Confidence Interval (1 of 2)
• When the conditions are met, we are ready to find the
confidence interval for the difference between means
of two independent groups.
• The confidence interval is
$$(\bar{y}_1 - \bar{y}_2) \pm t^*_{df} \times SE(\bar{y}_1 - \bar{y}_2)$$
Confidence Interval (2 of 2)
– where the standard error of the difference of the
means is
$$SE(\bar{y}_1 - \bar{y}_2) = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
• The critical value depends on the particular
confidence level, C, that you specify and on the
number of degrees of freedom, which we get from the
sample sizes and a special formula
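A minimal sketch of the two-sample t-interval from summary statistics. The critical value `t_star` is passed in as an argument because it has to come from a t-table or statistical software for the chosen confidence level and degrees of freedom. The function name and the example numbers are hypothetical.

```python
import math

def two_sample_t_interval(mean1, s1, n1, mean2, s2, n2, t_star):
    """Confidence interval for mu1 - mu2 using the unpooled standard error."""
    se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    diff = mean1 - mean2
    margin = t_star * se  # margin of error
    return diff - margin, diff + margin

# Hypothetical summary statistics; t_star = 2.0 is a placeholder, not a
# looked-up critical value.
low, high = two_sample_t_interval(10.0, 3.0, 9, 8.0, 4.0, 16, t_star=2.0)
print(low, high)
```

If the resulting interval excludes 0, the data are not consistent with equal means at the corresponding significance level.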
Degrees of Freedom
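• The special formula for the degrees of freedom for our
t critical value is a bear:
$$df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{1}{n_1 - 1}\left(\frac{s_1^2}{n_1}\right)^2 + \frac{1}{n_2 - 1}\left(\frac{s_2^2}{n_2}\right)^2}$$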
• Because of this, we will let technology calculate
degrees of freedom for us!
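To show what the technology is doing, here is a short sketch of the approximation usually used for the unpooled two-sample t procedures (the Welch-Satterthwaite formula). The function name `welch_df` and the inputs are illustrative assumptions.

```python
def welch_df(s1, n1, s2, n2):
    """Approximate degrees of freedom for the unpooled two-sample t procedures."""
    v1 = s1 ** 2 / n1  # estimated variance of the first sample mean
    v2 = s2 ** 2 / n2  # estimated variance of the second sample mean
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

# Hypothetical sample standard deviations and sizes
print(welch_df(12.0, 6, 9.0, 6))
```

The result is generally not a whole number. It always falls between the smaller of n1 − 1 and n2 − 1 and the pooled value n1 + n2 − 2.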
The Two Sample t-Test (1 of 2)
• The hypothesis test we use is the two-sample t-test for means
• The conditions for the two-sample t-test for the difference between the
means of two independent groups are the same as for the two-sample t-
interval
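The Two Sample t-Test (2 of 2)
• We test the null hypothesis H0: μ1 − μ2 = Δ0, where the hypothesized
difference Δ0 is almost always 0, using the statistic
$$t = \frac{(\bar{y}_1 - \bar{y}_2) - \Delta_0}{SE(\bar{y}_1 - \bar{y}_2)}$$
• When the conditions are met and the null hypothesis is true, this
statistic follows a Student’s t-model with the degrees of freedom given
by the special formula, and we use that model to obtain a P-value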
The Pooled t-Test (1 of 3)
• When testing the null hypothesis that two proportions were equal, we
could assume their variances were equal as well
– This led us to pool our data for the hypothesis test
– For means, the analogous test, which assumes the two groups have equal
variances, is called the pooled t-test
• The assumption of equal variances is a strong one, is often not true, and
is difficult to check
• For these reasons, we recommend that you generally use the (unpooled)
two-sample t-test instead
The Pooled t-Test (2 of 3)
• Equal Variance Assumption states that the variances of the two
populations from which the samples have been drawn are equal
• The corresponding Similar Spreads Condition really just consists of
looking at the boxplots to check that the spreads are not very different
The Pooled t-Test (3 of 3)
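• If we assume that the variances are equal, we can
estimate the common variance from the numbers we
already have:
$$s_{pooled}^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{(n_1 - 1) + (n_2 - 1)}$$
• Substituting into our standard error formula, we get:
$$SE_{pooled}(\bar{y}_1 - \bar{y}_2) = \sqrt{\frac{s_{pooled}^2}{n_1} + \frac{s_{pooled}^2}{n_2}} = s_{pooled}\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
• Our degrees of freedom are now df = n1 + n2 − 2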
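The pooled quantities can be sketched from summary statistics as below. The helper name `pooled_summary` and the inputs are hypothetical, and the sketch assumes the equal variance assumption has already been judged reasonable.

```python
import math

def pooled_summary(s1, n1, s2, n2):
    """Return (pooled variance, pooled SE of the mean difference, df)."""
    df = n1 + n2 - 2
    # Weighted average of the two sample variances, weighted by n - 1.
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    se_pooled = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return pooled_var, se_pooled, df

# Hypothetical sample standard deviations and sizes
print(pooled_summary(12.0, 6, 9.0, 6))
```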
The Pooled t-Test and Confidence Interval
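• The conditions for the pooled t-test and corresponding
confidence interval are the same as for our earlier
two-sample t procedures, with the added assumption that the
variances of the two groups are the same
• For the hypothesis test, our test statistic is
$$t = \frac{(\bar{y}_1 - \bar{y}_2) - \Delta_0}{SE_{pooled}(\bar{y}_1 - \bar{y}_2)}$$
which has df = n1 + n2 − 2, where Δ0 is the hypothesized difference (usually 0)
• Our confidence interval is
$$(\bar{y}_1 - \bar{y}_2) \pm t^*_{df} \times SE_{pooled}(\bar{y}_1 - \bar{y}_2)$$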
Everybody Out of the Pool?
• When should you use pooled-t methods rather than two-sample t
methods?
• Never (well, hardly ever)
• The advantages of pooling are small, and you are allowed to
pool only rarely (when the equal variance assumption is met)
• It’s never wrong not to pool
Pooling (1 of 2)
• Pooled methods show up in several places in Statistics, and often in
more complex situations than a two-sample test
• The most sensitive part of most inference procedures is how we estimate
the standard deviation of the test statistic so we can use it as a ruler to
judge significance or construct a confidence interval
• So if you think the groups you are comparing have the same standard
deviation, you should look for a pooled method that takes advantage of
that
Pooling (2 of 2)
• In a randomized comparative experiment, we start by assigning our
experimental units to treatments at random
• Each treatment group therefore begins with the same population
variance
• In this case assuming the variances are equal is still an assumption, and
there are conditions that need to be checked, but at least it’s a plausible
assumption
Determining the Sample Size
• Optimal sample size depends on:
– Alpha level of the test
– Nature of the alternative hypothesis (the null is a difference of zero)
– Size of the difference you want to detect, if present.
– Your guess at the common sigma
– Desired power
• To control the margin of error, set the appropriate formula equal to the
desired margin of error and solve for sample size
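As a rough sketch of the margin-of-error approach, assume equal group sizes n, a guessed common standard deviation sigma, and a Normal critical value z* in place of t* (t* depends on n, so z* gives only a first approximation). Setting z* · sigma · sqrt(2/n) equal to the desired margin and solving gives n = 2(z* · sigma / margin)². The function name and numbers below are hypothetical.

```python
import math
from statistics import NormalDist

def n_per_group_for_margin(sigma, margin, confidence=0.95):
    """Approximate per-group sample size for a target margin of error."""
    # Two-sided Normal critical value, e.g. about 1.96 for 95% confidence.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(2 * (z_star * sigma / margin) ** 2)

# Hypothetical planning values: sigma guessed at 10 minutes, margin of 5 minutes
print(n_per_group_for_margin(sigma=10.0, margin=5.0))
```

Because z* is smaller than the matching t*, the result is best treated as a lower bound and rechecked with t* once n is known. Planning for a desired power rather than a margin of error needs a separate calculation.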
What Have We Learned? (1 of 4)
• Know how to construct and interpret the two-sample t-interval for the
difference between the means of two populations
– Know the requisite Assumptions and Conditions:
§ (i) Independence and Normality of the individual responses
§ (ii) Independence of the samples
What Have We Learned? (2 of 4)
• Be able to perform and interpret a two-sample t-test of the difference
between the means of two populations
– Understand the relationship between testing and providing a confidence
interval
– The most common null hypothesis is that the means are equal
What Have We Learned? (3 of 4)
• Recognize that in special cases in which it is reasonable to assume
equal variances between the groups, we can also use a pooled t-test for
testing the difference between means
– This may make sense particularly for randomized experiments in which the
randomization has produced groups with equal variance to start with and
the null hypothesis is that a treatment under study has had no effect
What Have We Learned? (4 of 4)
• Keep in mind that these two sample methods require both independence
of observations within each group and independence between the
groups themselves
– These methods are not appropriate for paired or matched data
Review Reading
• De Veaux et al. 2021: chapters 16, 18, and 19
or
• Witte & Witte 2017: chapters 11, 13, and 14
About the Final Exam
• Time: December 9 (Sat), 9am to 11am
Venue: please check your Acorn (McLennan Physical Labs)
• Format: Multiple-choice and short-answer questions (like a longer version of Assignment 5)
• Content: Class 1 to Class 5 (approximately 30%)
Class 6 to Class 11 (approximately 70%)
No test on R coding
• You can bring one A4-size ‘cheatsheet’ for formulas.
– Formulas ONLY. NO written statements of definitions, assumptions, etc. Anyone violating
this will receive a score of zero on the final and be reported to Graduate Studies.

