
CIS 315

Introduction to Business Data Analytics

WEEK 9

MARCH 8, 2021

Course Roadmap

Data Analytics

• Chapter 3: Data Visualization

• Chapter 2: Descriptive Analytics

• Chapter 7: Regression

• Chapter 8: Time Series Analysis & Forecasting

• Chapter 12, 13, & 14: Optimization & Prescriptive Analytics

• Experimental Design

• Chapter 11: Simulation

Today’s Agenda

• Experimental Design

• Sampling

Sampling

• So far you have been given observational data

• Little to no control over variables

• Merely observe their values

• For example, Age, Income, etc.

Sampling

• But if you designed an experiment…

• Then you could control one or more variables

• And observe their effect

Sampling

• Experiment

• Apply treatments to experimental units (such as people, animals, land, etc.) and then observe the effect of the treatments on the experimental units

Sampling

• Observational Studies

• Observe subjects and measure variables of interest without assigning treatments to subjects.

Sampling

Suppose you want to study the effect of smoking on lung capacity in women.

Experiment: How would you design an experiment?

Observational Study: How would you design an observational study?

Sampling

Suppose you want to study the effect of smoking on lung capacity in women.

Experiment

• Find 100 women, age 20, who do not currently smoke.

• Randomly assign half (50) to the smoking treatment and the other half to the no-smoking treatment.

• Those in the smoking treatment should smoke a pack a day for 10 years, while those in the no-smoking treatment should remain smoke free for 10 years.

• After 10 years, measure lung capacity for each of the 100 women.

• Analyze, interpret, and draw conclusions from the data.

Observational Study

• Find 100 women, age 30, of whom 50 have been smoking a pack a day for 10 years while the other 50 have remained smoke free for those 10 years.

• Measure lung capacity for each of the 100 women.

• Analyze, interpret, and draw conclusions from the data.
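The random-assignment step of the experiment can be sketched in a few lines of Python. This is only an illustration: the function name `assign_treatments`, the subject labels, and the seed are hypothetical and not part of the course material.

```python
import random

def assign_treatments(units, treatments, seed=None):
    """Randomly split the units into equal-sized treatment groups."""
    rng = random.Random(seed)          # seeded only so the sketch is repeatable
    shuffled = list(units)
    rng.shuffle(shuffled)
    size = len(shuffled) // len(treatments)
    return {
        treatment: shuffled[i * size:(i + 1) * size]
        for i, treatment in enumerate(treatments)
    }

# 100 hypothetical subject labels standing in for the 100 women.
subjects = [f"subject_{i:03d}" for i in range(1, 101)]
groups = assign_treatments(subjects, ["smoking", "no smoking"], seed=315)

for treatment, members in groups.items():
    print(treatment, len(members))
```

In the observational study there is no such step: the two groups already exist and the researcher only records who belongs to which.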

Sampling

An economist obtains the unemployment rate and gross state product for a sample of states over the past 10 years, with the objective of examining the relationship between the unemployment rate and the gross state product by census region.

Experiment or Observational Study?

Observational Study

Sampling

A psychologist tests the effect of three different feedback programs by randomly assigning five rats to each program and recording their response times at specified intervals during the program.

Experiment or Observational Study?

Experiment

Sampling

Random Experiment

A design in which the treatments are randomly assigned to the experimental units

Is this a good choice?

Sampling

The objective of a randomized design is usually to compare the treatment means.

We want to test the null hypothesis that the treatment means are all equal against the alternative that at least two differ:

$H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_k$

$H_a:$ At least two treatment means differ

Sampling

For example, suppose you randomly selected five males and five females and looked at their SAT scores.

[Dot plot: female and male SAT scores, axis from 450 to 650]

Female Average: 550

Male Average: 590

Can we conclude that there is a difference in test scores between Females and Males?

No, as the difference in the means is dominated by the sampling variability

Sampling

For example, suppose you randomly selected five males and five females and looked at their SAT scores.

[Dot plot: female and male SAT scores, axis from 450 to 650]

Female Average: 550

Male Average: 590

Can we conclude that there is a difference in test scores between Females and Males?

Probably, as the difference in the means is large relative to the sampling variability

Sampling

The key to sampling is to compare the difference between the treatment means with the amount of sampling variability

SST = Sum of Squares for Treatments

$SST = \sum_{i=1}^{k} n_i (\bar{x}_i - \bar{x})^2$

Where $n_i$ is the sample size of the ith treatment, $\bar{x}_i$ is the mean of the ith treatment and $\bar{x}$ is the mean of the overall sample

SSE = Sum of Squares for Error

$SSE = \sum_{j=1}^{n_1} (x_{1j} - \bar{x}_1)^2 + \sum_{j=1}^{n_2} (x_{2j} - \bar{x}_2)^2 + \cdots + \sum_{j=1}^{n_k} (x_{kj} - \bar{x}_k)^2$

Looks complicated, but we can rewrite to…

$SSE = (n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + \cdots + (n_k - 1)s_k^2$

Where $s^2$ is the sample variance, $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$

Sampling

For example, suppose you randomly selected five males and five females and looked at their SAT scores.

[Dot plot: female and male SAT scores, axis from 450 to 650]

$SST = \sum_{i=1}^{k} n_i (\bar{x}_i - \bar{x})^2 = 5(550 - 570)^2 + 5(590 - 570)^2 = 4000$

$SSE = (n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + \cdots + (n_k - 1)s_k^2 = (5 - 1)(2250) + (5 - 1)(2250) = 18000$

Don’t worry about calculating these right now. What we are really after is the MST and MSE…
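As a check on the arithmetic for this SAT example, here is a minimal Python sketch that works from the summary statistics on the slide (group means 550 and 590, sample variance 2250 in each group, five students per group). The function names `sst` and `sse` are illustrative choices, not something from the course.

```python
def sst(means, sizes):
    """Sum of Squares for Treatments from group means and group sizes."""
    grand_mean = sum(m * n for m, n in zip(means, sizes)) / sum(sizes)
    return sum(n * (m - grand_mean) ** 2 for m, n in zip(means, sizes))

def sse(variances, sizes):
    """Sum of Squares for Error from group sample variances and sizes."""
    return sum((n - 1) * s2 for s2, n in zip(variances, sizes))

means = [550, 590]        # female and male averages
variances = [2250, 2250]  # sample variance within each group
sizes = [5, 5]

print(sst(means, sizes))      # 4000.0
print(sse(variances, sizes))  # 18000
```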

Sampling

MST = Mean Square for Treatments (measures the variability among the treatment means)

$MST = \frac{SST}{k - 1}$, where k-1 is the degrees of freedom

$MST = \frac{4000}{2 - 1} = 4000$

MSE = Mean Square for Error (measures the variability within the treatments)

$MSE = \frac{SSE}{n - k} = \frac{18000}{10 - 2} = 2250$

Sampling

Use the SST, SSE, MST, MSE => F-Statistic

$F\text{-statistic} = \frac{MST}{MSE}$

$F\text{-statistic} = \frac{4000}{2250} = 1.78$

The F-statistic determines if the means of the treatment groups are equal (H0) or different (Ha)

Can we reject the null hypothesis that means of the treatments are equal?
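The chain from sums of squares to the F-statistic is short enough to write out. The sketch below assumes the values from the SAT example (SST = 4000, SSE = 18000, k = 2 treatments, n = 10 observations); the function name `f_statistic` is hypothetical.

```python
def f_statistic(sst, sse, k, n):
    """Return (MST, MSE, F) for a one-way randomized design."""
    mst = sst / (k - 1)   # variability among the treatment means
    mse = sse / (n - k)   # variability within the treatments
    return mst, mse, mst / mse

mst, mse, f = f_statistic(sst=4000, sse=18000, k=2, n=10)
print(mst, mse, round(f, 2))  # 4000.0 2250.0 1.78
```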

Sampling

[Figure: F distribution with the rejection region at the designated level of significance]

This graph will change based on the degrees of freedom in the numerator and denominator, but what you want is that your F-statistic is larger than the value of F at the designated level of significance

Sampling

Historically, you would read a table like this with the degrees of freedom of the numerator in the columns and the degrees of freedom of the denominator in the rows for a certain level of significance… but now we have technology that will give you these numbers…

Sampling

Let’s choose a level of significance of α=0.05. For our example, the cut-off value of F0.05 is 5.32

Our F-statistic was 1.78

Can we reject the null hypothesis that means of the treatments are equal?

NO!

Our F-stat = 1.78 < 5.32 = F0.05

We fail to reject the null hypothesis that the means are equal.

Sampling

Suppose you randomly selected five males and five females and looked at their SAT scores.

[Dot plot: female and male SAT scores, axis from 450 to 650]

Let's do the same as before, but with this data.

Sampling

$SSE = (5 - 1)(62.5) + (5 - 1)(62.5) = 500$

$MSE = \frac{SSE}{n - k} = \frac{500}{10 - 2} = 62.5$

$MST = 4000$ (the same as in the previous example)

$F\text{-statistic} = \frac{MST}{MSE} = \frac{4000}{62.5} = 64.0$

Can we reject the null hypothesis that means of the treatments are equal?

YES!

Our F-stat = 64.0 > 5.32 = F0.05

We can reject the null hypothesis that the means are equal.
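The two SAT datasets share the same group means, so only the within-group variance changes the verdict. A small sketch, assuming the sample variances 2250 and 62.5 from the slides and taking the critical value 5.32 as quoted there rather than computing it; the labels and names are illustrative.

```python
F_CRITICAL = 5.32  # F_0.05 for 1 and 8 degrees of freedom, as quoted on the slide
MST = 4000         # identical in both examples because the means are the same

def decide(within_variance, group_size=5, groups=2):
    """Return (F, reject?) when every group has the same sample variance."""
    sse = groups * (group_size - 1) * within_variance
    mse = sse / (groups * group_size - groups)
    f = MST / mse
    return f, f > F_CRITICAL

for label, variance in [("wide spread", 2250), ("narrow spread", 62.5)]:
    f, reject = decide(variance)
    print(label, round(f, 2), "reject H0" if reject else "fail to reject H0")
```

The same 40-point gap between the averages is unconvincing against a variance of 2250 and decisive against a variance of 62.5.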

Sampling

Since we rejected the null hypothesis that the means are equal, we can conclude that the SAT mean score of males differs from that of females.

[Dot plot: female and male SAT scores, axis from 450 to 650]

Sampling

This type of analysis is called ANOVA or Analysis of Variance

Source       df       SS                       MS                   F
Treatments   k - 1    SST                      MST = SST/(k - 1)    MST/MSE
Error        n - k    SSE                      MSE = SSE/(n - k)
Total        n - 1    SS(Total) = SST + SSE

Sampling

Total Sum of Squares: SS(Total), df=n-1

SS(Total) splits into:

• Sum of Squares for Treatments: SST, df=k-1

• Sum of Squares for Error: SSE, df=n-k

Sampling

Example: Find the F-statistic and determine if we can reject the null hypothesis of the means being equal at 0.10 level of significance (F0.10=2.87)

Source       df            SS         MS                   F
Treatments   k - 1 = 3     2794.39    MST = SST/(k - 1)    MST/MSE
Error        n - k = 36    762.30     MSE = SSE/(n - k)
Total        n - 1 = 39    3556.69

Based on the table, tell me something about the experiment…

k=4, so we are comparing 4 different things

n=40, so we have 40 observations

Sampling

$MST = \frac{SST}{k - 1} = \frac{2794.39}{4 - 1} = 931.46$

$MSE = \frac{SSE}{n - k} = \frac{762.30}{40 - 4} = 21.18$

$F = \frac{MST}{MSE} = \frac{931.46}{21.18} = 43.99$
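Filling in the rest of a one-way ANOVA table from its df and SS columns is mechanical, so it is a natural thing to script. The sketch below uses the numbers from this example (SST = 2794.39, SSE = 762.30, k = 4, n = 40); the function name and the dictionary layout are assumptions made for illustration.

```python
def one_way_anova_table(sst, sse, k, n):
    """Complete the MS and F columns of a one-way ANOVA table."""
    mst = sst / (k - 1)
    mse = sse / (n - k)
    return {
        "Treatments": {"df": k - 1, "SS": sst, "MS": mst, "F": mst / mse},
        "Error": {"df": n - k, "SS": sse, "MS": mse},
        "Total": {"df": n - 1, "SS": sst + sse},
    }

table = one_way_anova_table(sst=2794.39, sse=762.30, k=4, n=40)
for source, row in table.items():
    print(source, {name: round(value, 2) for name, value in row.items()})
```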

Sampling

$F = \frac{931.46}{21.18} = 43.99 > 2.87 = F_{0.10}$

Can we reject the null hypothesis that means of the treatments are equal at 0.10 level of significance?

Yes!

Sampling Example

Robotics researchers investigated whether robots could be trained to behave like ants in an ant colony. Robots were trained and randomly assigned to “colonies” (i.e. groups) consisting of 3, 6, 9, or 12 robots. The robots were assigned the tasks of foraging for “food” and recruiting another robot when they identified a resource-rich area. One goal of the experiment was to compare the mean energy expended (per robot) of the four different sizes of colonies.

Sampling Example

1. Experiment or Observational Study? If experiment, what kind?

Randomized Experiment

2. Identify the treatments and the dependent variable.

Treatments: 3, 6, 9, 12 robots & Dependent Variable: Energy Expended

3. Set up the null and alternative hypotheses of the test.

$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$, $H_a:$ At least two treatment means differ

4. The following results were reported:

• F=7.70

• numerator df=3, denominator df=56

• F0.05=2.76

Interpret the results.

Since F = 7.70 > 2.76 = F0.05, reject H0 and conclude that the means differ for the robot treatments

Sampling Example

We rejected H0 that all the means are equal for the robot treatments, but that doesn’t tell you anything about the difference between each pair of treatments

Now we want to test $\mu_1 = \mu_2$, $\mu_1 = \mu_3$, $\mu_1 = \mu_4$, $\mu_2 = \mu_3$, $\mu_2 = \mu_4$, $\mu_3 = \mu_4$

Essentially testing if the mean of the treatment with 3 robots is the same as the mean of the treatment with 6 robots, etc…

When you have equal treatment sample sizes you use the Tukey Method

You don’t want to do this by hand; instead, have the computer calculate the t-statistic and reject that the means are equal if the absolute value of the t-statistic found is larger than the t-statistic critical value (same procedure as when we used the F-statistic)

Sampling Example

A=3 robots, B=6 robots, C=9 robots, D=12 robots

Pair   Mean 1    Mean 2    Variance 1   Variance 2   Observations (each)   df   t Stat   P(T<=t) one-tail   t Critical one-tail
AB     250.78    261.06    22.42        14.95        10                    18   -5.32    0.00               1.33
AC     250.78    269.95    22.42        20.26        10                    18   -9.28    0.00               1.33
AD     250.78    249.32    22.42        27.07        10                    18    0.66    0.26               1.33
BC     261.06    269.95    14.95        20.26        10                    18   -4.74    0.00               1.33
BD     261.06    249.32    14.95        27.07        10                    18    5.73    0.00               1.33
CD     269.95    249.32    20.26        27.07        10                    18    9.48    0.00               1.33

Mean 1 and Mean 2 are the means of the two samples in each pair.

For AB, 5.32 > 1.33 in absolute value, so we can reject the hypothesis that the mean of A is the same as the mean of B

Do this for every combination.

For which combination can you not reject the hypothesis that the means are the same?
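The t Stat row of the table can be reproduced from the means, variances, and sample sizes alone. The sketch below computes an ordinary pooled-variance two-sample t statistic for each pair, which is what the table's numbers correspond to; it is not an implementation of Tukey's procedure, and the names `colonies`, `pooled_t`, and `T_CRITICAL` are illustrative.

```python
from itertools import combinations
from math import sqrt

# (mean, variance, observations) per colony size, copied from the table above.
colonies = {
    "A": (250.78, 22.42, 10),
    "B": (261.06, 14.95, 10),
    "C": (269.95, 20.26, 10),
    "D": (249.32, 27.07, 10),
}
T_CRITICAL = 1.33  # one-tail critical value quoted in the table (df = 18)

def pooled_t(mean1, var1, n1, mean2, var2, n2):
    """Two-sample t statistic using a pooled variance estimate."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    return (mean1 - mean2) / sqrt(pooled_var * (1 / n1 + 1 / n2))

results = {}
for first, second in combinations(colonies, 2):
    results[first + second] = pooled_t(*colonies[first], *colonies[second])

# Pairs whose |t| does not exceed the critical value: equal means not rejected.
not_rejected = [pair for pair, t in results.items() if abs(t) <= T_CRITICAL]

for pair, t in results.items():
    print(pair, round(t, 2))
print("Cannot reject equal means for:", not_rejected)
```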

Sampling

We can also introduce a better experimental design with better controls to help account for variability

Take our SAT score example, what else could we control for?

• School

• GPA

• Classes

• SES

Instead of selecting independent samples, we choose experimental units (students in this example) that are matched sets.

The matched sets are called blocks.

Sampling

Source       df               SS          MS                           F
Treatments   k - 1            SST         MST = SST/(k - 1)            MST/MSE
Blocks       b - 1            SSB         MSB = SSB/(b - 1)            MSB/MSE
Error        n - k - b + 1    SSE         MSE = SSE/(n - k - b + 1)
Total        n - 1            SS(Total)

Sampling

Randomized Block Design

Attempting to reduce the sampling variability of the experimental units in each block, which in turn reduces the MSE

Now again we compare SAT scores of male and female high school seniors, but now we select matched pairs of females and males according to their GPA and school

Sampling

Block                 Female SAT Score   Male SAT Score   Block Mean
School A, 2.75 GPA    540                530              535
School B, 3.00 GPA    570                550              560
School C, 3.25 GPA    590                580              585
School D, 3.50 GPA    640                620              630
School E, 3.75 GPA    690                690              690
Treatment Mean        606                594

Sampling

Follow the same procedure but now with the blocks

Start with the SST, which measures the variation between female and male means

$SST = \sum_{i=1}^{k} b(\bar{x}_{T_i} - \bar{x})^2$

Square the distance between each treatment mean and the overall mean, multiply each squared distance by the number of measurements for the treatment, and then sum over treatments

$\bar{x}_{T_i}$ is the sample mean for the ith treatment, b is the number of blocks, k is the number of treatments

Sampling

$SST = \sum_{i=1}^{k} b(\bar{x}_{T_i} - \bar{x})^2 = 5(606 - 600)^2 + 5(594 - 600)^2 = 360$

Here 5 is the number of blocks, 606 and 594 are the treatment means, and 600 is the overall mean.

Sampling

Now calculate the Sum of Squares for Blocks (SSB)

Measure of variation among the five block means representing different schools and GPA

$SSB = \sum_{i=1}^{b} k(\bar{x}_{B_i} - \bar{x})^2$

Square the differences between each block mean and the overall mean, multiply each squared difference by the number of measurements for each block, and then sum over all blocks

$\bar{x}_{B_i}$ is the sample mean for the ith block, k is the number of treatments

Sampling

$SSB = \sum_{i=1}^{b} k(\bar{x}_{B_i} - \bar{x})^2 = 2(535 - 600)^2 + 2(560 - 600)^2 + 2(585 - 600)^2 + 2(630 - 600)^2 + 2(690 - 600)^2 = 30100$

Here 2 is the number of treatments, 535 through 690 are the block means, and 600 is the overall mean.

Sampling

In a randomized block design, the sampling variability is measured by subtracting the portion attributed to treatments and blocks from the total sum of squares, SS(Total)

$SS(Total) = \sum_{i=1}^{n} (x_i - \bar{x})^2$

Sampling

$SS(Total) = \sum_{i=1}^{n} (x_i - \bar{x})^2 = (540 - 600)^2 + (530 - 600)^2 + \cdots + (690 - 600)^2 = 30600$

Here 600 is the overall mean.

Sampling

In a randomized block design, the sampling variability is measured by subtracting the portion attributed to treatments and blocks from the total sum of squares, SS(Total)

$SSE = SS(Total) - SST - SSB = 30600 - 360 - 30100 = 140$

$SS(Total) = SST + SSB + SSE$

SST is the Sum of Squares for Treatment, SSB is the Sum of Squares for Blocks, and SSE is the Sum of Squares for Error.
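All four sums of squares for the matched SAT data can be recomputed directly from the ten scores. This is a minimal sketch using the block table from the slides; the variable names are illustrative.

```python
# SAT scores per block as (female, male), copied from the block table.
scores = {
    "School A, 2.75 GPA": (540, 530),
    "School B, 3.00 GPA": (570, 550),
    "School C, 3.25 GPA": (590, 580),
    "School D, 3.50 GPA": (640, 620),
    "School E, 3.75 GPA": (690, 690),
}

b = len(scores)  # number of blocks
k = 2            # number of treatments (female, male)

values = [score for pair in scores.values() for score in pair]
grand_mean = sum(values) / len(values)

treatment_means = [sum(pair[j] for pair in scores.values()) / b for j in range(k)]
block_means = [sum(pair) / k for pair in scores.values()]

sst = b * sum((m - grand_mean) ** 2 for m in treatment_means)
ssb = k * sum((m - grand_mean) ** 2 for m in block_means)
ss_total = sum((x - grand_mean) ** 2 for x in values)
sse = ss_total - sst - ssb  # what is left after treatments and blocks

print(sst, ssb, ss_total, sse)  # 360.0 30100.0 30600.0 140.0
```

Almost all of the total variation (30100 of 30600) is explained by the blocks, which is why matching on school and GPA leaves so little in the error term.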

Sampling

Total Sum of Squares: SS(Total), df=n-1

Randomized Design

• Sum of Squares for Treatments: SST, df=k-1

• Sum of Squares for Error: SSE, df=n-k

Randomized Block Design

• Sum of Squares for Treatments: SST, df=k-1

• Sum of Squares for Blocks: SSB, df=b-1

• Sum of Squares for Error: SSE, df=n-b-k+1

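Sampling

$MST = \frac{SST}{k - 1} = \frac{360}{2 - 1} = 360$

$MSE = \frac{SSE}{n - b - k + 1} = \frac{140}{10 - 5 - 2 + 1} = 35$

$F\text{-statistic} = \frac{MST}{MSE} = \frac{360}{35} = 10.29$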

Sampling

Source       df                                 SS       MS                            F
Treatments   k - 1 = 2 - 1 = 1                  360      MST = 360/(2 - 1) = 360       MST/MSE = 360/35 = 10.29
Blocks       b - 1 = 4                          30100    MSB = 30100/4 = 7525          MSB/MSE = 7525/35 = 215
Error        n - k - b + 1 = 10 - 2 - 5 + 1 = 4  140      MSE = 140/(10 - 2 - 5 + 1) = 35
Total        n - 1 = 9                          30600

Use the F for Treatments to test the difference in the means of the treatments

Use the F for Blocks to test the difference in the means of the blocks
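The randomized block ANOVA table can be completed from its three sums of squares. The sketch below assumes one observation per treatment-block combination, so that n = k × b, and uses the SAT values (SST = 360, SSB = 30100, SSE = 140, k = 2, b = 5); the function name `block_anova_table` is hypothetical.

```python
def block_anova_table(sst, ssb, sse, k, b):
    """Complete a randomized block ANOVA table (one observation per cell)."""
    n = k * b
    df_error = n - k - b + 1
    mst = sst / (k - 1)
    msb = ssb / (b - 1)
    mse = sse / df_error
    return {
        "Treatments": {"df": k - 1, "SS": sst, "MS": mst, "F": mst / mse},
        "Blocks": {"df": b - 1, "SS": ssb, "MS": msb, "F": msb / mse},
        "Error": {"df": df_error, "SS": sse, "MS": mse},
        "Total": {"df": n - 1, "SS": sst + ssb + sse},
    }

table = block_anova_table(sst=360, ssb=30100, sse=140, k=2, b=5)
for source, row in table.items():
    print(source, {name: round(value, 2) for name, value in row.items()})
```

The degrees of freedom for treatments, blocks, and error (1, 4, and 4) add up to the total of n - 1 = 9, just as the sums of squares add up to SS(Total).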

Sampling

$F_{0.05} = 7.71$

$F = 10.29 > F_{0.05} = 7.71$

Can we reject the null hypothesis that the mean SAT scores are the same for females and males?

YES! And we can conclude that the mean SAT scores differ for females and males.

Sampling Example

A randomized block design yielded the following results:

Source       df    SS     MS        F
Treatments   4     501    125.25    9.11
Blocks       2     225    112.5     8.18
Error        8     110    13.75
Total        14    836

Sampling Example

Using the randomized block design results above:

1. How many blocks and treatments were used in this experiment?

3 blocks, 5 treatments

2. How many observations were collected in the experiment?

15 observations
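The answers to the first two questions follow from the df column alone. A small sketch, assuming one observation per treatment-block combination; the function name `design_from_df` is hypothetical.

```python
def design_from_df(df_treatments, df_blocks, df_error):
    """Recover (treatments, blocks, observations) from ANOVA degrees of freedom."""
    k = df_treatments + 1   # df for treatments is k - 1
    b = df_blocks + 1       # df for blocks is b - 1
    n = k * b               # assumes one observation per treatment-block cell
    if df_error != n - k - b + 1:
        raise ValueError("error df is inconsistent with one observation per cell")
    return k, b, n

print(design_from_df(df_treatments=4, df_blocks=2, df_error=8))  # (5, 3, 15)
```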

3. Specify the null and alternative hypotheses you would use to compare the treatment means.

$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5$, $H_a:$ At least two treatment means differ

4a. Which test statistic should you use to test the null hypothesis regarding treatment means?

$F\text{-statistic} = \frac{MST}{MSE}$

4b. Which test statistic should you use to test the null hypothesis regarding block means?

$F\text{-statistic} = \frac{MSB}{MSE}$

5. Conduct the test for treatment means against F0.05=3.84 and interpret the results.

$F = 9.11 > F_{0.05} = 3.84$

Reject H0 that the treatment means are equal


Today’s Agenda

• Experimental Design

• Experiment vs Observational

• Random Sampling

• ANOVA

• Randomized Block Design

Next Class

LAB DAY

To Do List

Homework Assignment #6

Due 3/12/2021 by 12:00pm (noon)

学霸联盟

Introduction to Business Data Analytics

WEEK 9

MARCH 8, 2021

2Course Roadmap

Data Analytics

Chapter 3:

Data Visualization

Chapter 2:

Descriptive Analytics

Chapter 7:

Regression

Chapter 8:

Time Series Analysis &

Forecasting

Chapter 12, 13, & 14:

Optimization & Prescriptive

Analytics

Experimental Design

Chapter 11:

Simulation

3Today’s Agenda

• Experimental Design

• Sampling

4Sampling

• So far you have been given

observational data

• Little to no control over variables

• Merely observe their values

• For example, Age, Income,

etc…

5Sampling

• But if you designed an experiment…

• Then you could control one or

more variables

• And observe their effect

6Sampling

• Experiment

• Apply treatments to experimental

units (such as people, animals,

land, etc.) and then observe the

effect of the treatments on the

experimental units

7Sampling

• Observational Studies

• Observe subjects and measure

variables of interest without

assigning treatments to subjects.

8Sampling

Experiment

How would you design an experiment?

Observational Study

How would you design an observational study?

Suppose you want to study the effect of smoking on

lung capacity in women

9Sampling

Experiment

Find 100 women, age 20, who do not currently

smoke

Randomly assign half (50) to the smoking treatment

and the other half to the no smoking treatment

Those in the smoking treatment should smoke a

pack a day for 10 years, while those in the no

smoking treatment should remain smoke free for 10

years.

After 10 years, measure lung capacity for each of

the 100 women

Analyze, interpret, and draw conclusions from the

data.

Observational Study

Find 100 women, age 30, for which 50 have

been smoking a pack a day for 10 years while

the other 50 have remained smoke free for those

10 years

Measure lung capacity for each of the 100

women

Analyze, interpret, and draw conclusions from

the data.

Suppose you want to study the effect of smoking on

lung capacity in women

10

Sampling

An economist obtains the unemployment rate and gross

state product for a sample of states over the past 10

years, with the objective of examining the relationship

between the unemployment rate and the gross state

product by census region.

Experiment or Observational Study?

Observational Study

11

Sampling

A psychologist tests the effect of three different

feedback programs by randomly assigning five rats to

each program and recording their response times at

specified intervals during the program.

Experiment or Observational Study?

Experiment

12

Sampling

A design in which the treatments are randomly

assigned to the experimental unit

Random Experiment

Is this a good choice?

13

Sampling

We want to test the null hypothesis that that

treatment means are all equal against the

alternative that at least two differ

The objective of a randomized design is to

usually compare the treatment means

! = " = # = ⋯ = $% = ℎ

14

Sampling

For example, suppose you randomly selected five males and

five females and looked at their SAT scores.

450 475 500 525 550 575 600 625 650

Female MaleFemale Average: 550

Male Average: 590

Can we conclude that there is a difference in test

scores between Females and Males?

No, as the difference in the means is

dominated by the sampling variability

15

Sampling

For example, suppose you randomly selected five males and

five females and looked at their SAT scores.

Female Average: 550

Male Average: 590

Can we conclude that there is a difference in test

scores between Females and Males?

Probably, as the difference in the means is

large relative to the sampling variability

450 475 500 525 550 575 600 625 650

Female Male

16

Sampling

The key to sampling is to compare the difference between

the treatment means with the amount of sampling variability

SST = Sum of Squares for Treatments

SSE = Sum of Squares for Error =$!"#$ !(̅! − )% Where ! is the sample size of the ith treatment, ̅!is the mean of the treatment and ̅ is the mean of the overall sample = ∑&"#'! (#& − ̅#)% + ∑&"#'" (%& − ̅%)%+ … + ∑&"#'# ($& − ̅$)%

Looks complicated, but we can rewrite to… = (#−1)#% + (%−1)%% +⋯+ ($−1)$%

Where s2 is the sample variance = ∑$%&' ()!*)̅)"'*&

17

Sampling

=$!"#$ !(̅! − )%

= ∑&"#'! (#& − ̅#)% + ∑&"#'" (%& − ̅%)%+ … + ∑&"#'# ($& − ̅$)% = 5 − 1 2250 + 5 − 1 2250 = 18000

But what we are really after is the MST and MSE…

For example, suppose you randomly selected five males and

five females and looked at their SAT scores.

= 5 550 − 570 2 + 5 590 − 570 2 = 4000450 475 500 525 550 575 600 625 650

Female Male

= (#−1)#% + (%−1)%% +⋯+ ($−1)$%

Don’t worry about calculating these right now…

18

Sampling

= (()$*#, where k-1 is the degrees of freedom

= − = 1800010 − 2 = 2250

MST = Mean Square for Treatments

(measures the variability among the treatment means)

MSE = Mean Square for Error

(measures the variability within the treatments)

= 40002 − 1 = 4000

19

Sampling

− =

Use the SST, SSE, MST, MSE => F-Statistic

− = 40002250 = 1.78

The F-statistic determines if the means of the treatment

groups are equal (H0) or different (Ha)

Can we reject the null hypothesis that means of the

treatments are equal?

20

Sampling

This graph will change based on the degrees of freedom in

the numerator and denominator, but what you want is that

your F-statistic is larger than the value of F at the

designated level of significance

21

Sampling

Historically, you would read a

table like this with the degrees

of freedom of the numerator in

the columns and the degrees of

freedom of the denominator in

the rows for a certain level of

significance… but now we

have technology that will give

you these numbers…

22

Sampling

Let’s choose level of significance of α=0.05. For our

example, the cut-off value of F0.05 is 5.32

Our F-statistic was 1.78

Can we reject the null hypothesis that means of the

treatments are equal?

NO!

Our F-stat = 1.78 < 5.31 = F0.05

We fail to reject the null hypothesis that the means are equal.

23

Sampling

Suppose you randomly selected five males and five females

and looked at their SAT scores.

450 475 500 525 550 575 600 625 650

Female Male

Let's do the same as before, but with this data.

24

Sampling = (5 − 1)(62.5) + (5 − 1)(62.5) = 500 = − = 50010 − 2 = 62.5 = 4000( ℎ ℎ ℎ ) − = 400062.5 = 64.0

Can we reject the null hypothesis that means of the

treatments are equal?

YES!

Our F-stat = 64.0 > 5.31 = F0.05

We can reject the null hypothesis that the means are equal.

25

Sampling

Since we rejected the null hypothesis that the means are equal,

we can conclude that the SAT mean score of males differs

from that of females.

450 475 500 525 550 575 600 625 650

Female Male

26

Sampling

This type of analysis is called ANOVA or Analysis of Variance

df SS MS F

Treatments − 1 SST = − 1 Error − SSE = −

Total − 1 SS(Total) = +

27

Sampling

Total Sum of Squares

SS(Total)

df=n-1

Sum of Squares for Treatments

SST

df=k-1

Sum of Squares for Error

SSE

df=n-k

28

Sampling

Example: Find the F-statistic and determine if we can reject

the null hypothesis of the means being equal at 0.10 level of

significance (F0.10=2.87)

df SS MS F

Treatments − 1 = 3 2794.39 = − 1

Error − = 36 762.30 = −

Total − 1 = 39 3556.69

Based on the table, tell me something about the experiment…

k=4, so we are comparing 4

different things

n=40, so we have 40 observations

29

Sampling

df SS MS F

Treatments − 1 = 3 2794.39 = − 1

Error − = 36 762.30 = −

Total − 1 = 39 3556.69

= − 1 = 2794.394 − 1 = 931.46 = − = 762.3040 − 4 = 21.18

= 931.4621.18 = 43.99

30

Sampling

= 931.4621.18 = 43.99 (F0.10=2.87)>

Can we reject the null hypothesis that means of the

treatments are equal at 0.10 level of significance?

Yes!

31

Sampling Example

Robotics researchers investigated whether robots could be

trained to behave like ants in an ant colony. Robots were

trained and randomly assigned to “colonies” (i.e. groups)

consisting of 3, 6, 9, or 12 robots. The robots were assigned

the tasks of foraging for “food” and recruiting another robot

when they identified a resource-rich area. One goal of the

experiment was to compare the mean energy expended (per

robot) of the four different sizes of colonies.

32

Sampling Example

1. Experiment or Observational Study? If experiment, what

kind?

2. Identify the treatments and the dependent variable.

3. Set up the null and alternative hypotheses of the test.

4. The following results were reported:

• F=7.70

• numerator df=3, denominator df=56

• F0.05=2.76

Interpret the results.

Randomized Experiment

Treatments: 3, 6, 9, 12 robots & Dependent Variable: Energy Expended

: = = = , : +

Reject H0 and conclude that the means

differ for the robot treatments

33

Sampling Example

We rejected H0 about the all means being equal for the robot

treatments, but that doesn’t tell you anything about the

difference between each treatment

Now we want to test = = = = = =

Essentially testing if the mean of the treatment with

3 robots is the same as the mean of the treatment

with 6 robots, etc…

When you have equal treatment sample sizes you

use the Tukey Method

You don’t want to do this by hand, instead have

the computer calculate the t-statistic and reject

that the means are equal if the t-statistic found is

larger than the t-statistic critical value (same

procedure as when we used the F-statistic)

34

Sampling Example

AB AC AD BC BD CD

A B A C A D B C B D C D

Mean 250.78 261.06 250.78 269.95 250.78 249.32 261.06 269.95 261.06 249.32 269.95 249.32

Variance 22.42 14.95 22.42 20.26 22.42 27.07 14.95 20.26 14.95 27.07 20.26 27.07

Observations 10.00 10.00 10.00 10.00 10.00 10.00 10.00 10.00 10.00 10.00 10.00 10.00

df 18.00 18.00 18.00 18.00 18.00 18.00

t Stat -5.32 -9.28 0.66 -4.74 5.73 9.48

P(T<=t) one-tail 0.00 0.00 0.26 0.00 0.00 0.00

t Critical one-tail 1.33 1.33 1.33 1.33 1.33 1.33

Means of the two

samples

5.32 > 1.33, we can reject the

hypothesis that the mean of A

is the same as the mean of B

Do this for every

combination.

Which combination can you

not reject the hypothesis that

the means are the same?

A=3 robots, B=6 robots, C=9 robots, D=12 robots

35

Sampling

We can also introduce a better experimental design with

better controls to help account for variability

Take our SAT score example, what else could

we control for?

School GPA

Classes SES

Instead of selecting independent samples, we

choose experimental units (students in this

example) that are matched sets.

The matched sets are called blocks.

36

Sampling

df SS MS F

Treatments − 1 SST = − 1

Blocks − 1 SSB = − 1

Error − − + 1 SSE = − − + 1

Total − 1 SS(Total)

37

Sampling

Randomized Block Design

Attempting to reduce the sampling variability

of the experimental units in each block, which

in turn reduces the MSE

Now again we compare SAT scores of male

and female high school seniors, but now we

select matched pairs of females and males

according to their GPA and school

38

Sampling

Block Female SAT Score Male SAT Score Block Mean

School A, 2.75 GPA 540 530 535

School B, 3.00 GPA 570 550 560

School C, 3.25 GPA 590 580 585

School D, 3.50 GPA 640 620 630

School E, 3.75 GPA 690 690 690

Treatment Mean 606 594

39

Sampling

Follow the same procedure but now with the blocks

Start with the SST, which measures the variation

between female and male means

=CDE"$ (̅F" − ̅)#

Squaring the distance between each treatment mean

and the overall mean, multiplying each squared

distance by the number of measurement for the

treatment, and then summing over treatments

̅"!the sample mean

for the ith treatment,

b is the number of

blocks, k is the

number of treatments

40

Sampling

=CDE"$ (̅F" − ̅)# = 5 606 − 600 # + 5 594 − 600 # = 360

Block Female SAT Score Male SAT Score Block Mean

School A, 2.75 GPA 540 530 535

School B, 3.00 GPA 570 550 560

School C, 3.25 GPA 590 580 585

School D, 3.50 GPA 640 620 630

School E, 3.75 GPA 690 690 690

Treatment Mean 606 594

Number of Blocks Overall mean

41

Sampling

Now calculate the Sum of Squares for Blocks (SSB)

Measure of variation among the five block means

representing different schools and GPA

=CDE"G (̅H" − ̅)#

Squaring the squares of the differences between each

block mean and the overall mean, multiple each

squared difference by the number of measurements

for each block, and then sum over all blocks

̅#!the sample mean

for the ith block, k is

the number of

treatments

42

Sampling

SSB= 2 535 − 600 # + 2 560 − 600 # + 2() 585 −600 # + 2 630 − 600 # + 2 690 − 600 # = 30100

Block Female SAT Score Male SAT Score Block Mean

School A, 2.75 GPA 540 530 535

School B, 3.00 GPA 570 550 560

School C, 3.25 GPA 590 580 585

School D, 3.50 GPA 640 620 630

School E, 3.75 GPA 690 690 690

Treatment Mean 606 594

=CDE"G (̅H" − ̅)#Number of Treatments Overall mean

43

Sampling

In a randomized block design, the sampling

variability is measured by subtracting the portion

attributed to treatments and blocks from the total

sum of squares, SS(Total)() =CDE"I (D − ̅)#

44

Sampling

Block Female SAT Score Male SAT Score Block Mean

School A, 2.75 GPA 540 530 535

School B, 3.00 GPA 570 550 560

School C, 3.25 GPA 590 580 585

School D, 3.50 GPA 640 620 630

School E, 3.75 GPA 690 690 690

Treatment Mean 606 594

= (540 − 600)#+(530 − 600)#+⋯+ 690 − 600 #= 30600

() =:$%&' ($ − ̅), Overall mean

45

Sampling

In a randomized block design, the sampling

variability is measured by subtracting the portion

attributed to treatments and blocks from the total

sum of squares, SS(Total)

= − − = 30600 − 360 − 30100 = 140

= + +

Sum of Squares

for Treatment

Sum of Squares

for Blocks

Sum of Squares

for Error

46

Sampling

Total Sum of Squares

SS(Total)

df=n-1

Sum of Squares for

Treatments

SST

df=k-1

Sum of Squares for

Error

SSE

df=n-k

Sum of Squares for

Blocks

SSB

df=b-1

Sum of Squares for

Error

SSE

df=n-b-k+1

Randomized

Design

Randomized

Block

Design

47

Sampling

= − 1 = 3602 − 1 = 360

= − − + 1 = 14010 − 5 − 2 + 1 = 35

− = = 36035 = 10.29

48

Sampling

df SS MS F

Treatments − 1 = 2 − 1= 1 360 = − 1 = 3602 − 1= 360

=36035 = 10.29

Blocks − 1 = 4 30100 = − 1 = 301004= 7525

=752535 = 215

Error

− − + 1= 10 − 2 − 5 + 1= 4 140

= − − + 1= 14010 − 2 − 5 + 1 = 35

Total 14 30600

Use this to test

the difference in

the means of the

treatments

Use this to test

the difference in

the means of the

blocks

49

Sampling

!.!S = 7.71

= 10.29 > !.!S = 7.71

Can we reject the null hypothesis that the mean

SAT scores are the same for females and males?

YES! And we can conclude that the mean SAT

scores differ for females and males.

50

Sampling Example

A randomized block design yielded the following results:

| Source | df | SS | MS | F |
|---|---|---|---|---|
| Treatments | 4 | 501 | 125.25 | 9.11 |
| Blocks | 2 | 225 | 112.5 | 8.18 |
| Error | 8 | 110 | 13.75 | |
| Total | 14 | 836 | | |

51

Sampling Example

1. How many blocks and treatments were used in this experiment?

3 blocks, 5 treatments

52

Sampling Example

2. How many observations were collected in the experiment?

15 observations
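
Questions 1 and 2 both invert the degrees-of-freedom formulas. A minimal sketch of that reasoning, using a hypothetical helper named `design_from_df`:

```python
def design_from_df(df_treatments, df_blocks, df_total):
    """Recover (k, b, n) from the df column of a block-design ANOVA table."""
    k = df_treatments + 1  # df = k - 1
    b = df_blocks + 1      # df = b - 1
    n = df_total + 1       # df = n - 1
    return k, b, n


k, b, n = design_from_df(df_treatments=4, df_blocks=2, df_total=14)
print(k, b, n)  # 5 3 15

# One observation per block-treatment combination, so n = b * k,
# and the error df follows from n - b - k + 1.
print(n == b * k, n - b - k + 1)  # True 8
```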

53

Sampling Example

3. Specify the null and alternative hypotheses you would use to compare the treatment means.

$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5$, $H_1$: at least two treatment means differ

54

Sampling Example

4a. Which test statistic should you use to test the null hypothesis regarding treatment means? $F\text{-stat} = \frac{MST}{MSE}$

4b. Which test statistic should you use to test the null hypothesis regarding block means? $F\text{-stat} = \frac{MSB}{MSE}$

55

Sampling Example

5. Conduct the test for treatment means against F0.05=3.84 and interpret the results.

$F = 9.11 > F_{0.05} = 3.84$

Reject H0 that the treatment means are equal.
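
The example table can be checked for internal consistency before running the test. The sketch below verifies that the df and SS columns add up, that each MS equals SS / df, and that each F equals MS / MSE to two decimals, then applies the comparison from question 5. The dictionary layout is an illustrative assumption, and 3.84 is the critical value given in the question.

```python
table = {
    "Treatments": {"df": 4, "SS": 501, "MS": 125.25, "F": 9.11},
    "Blocks": {"df": 2, "SS": 225, "MS": 112.5, "F": 8.18},
    "Error": {"df": 8, "SS": 110, "MS": 13.75},
    "Total": {"df": 14, "SS": 836},
}
parts = ("Treatments", "Blocks", "Error")

# The component rows must add up to the Total row.
df_ok = sum(table[p]["df"] for p in parts) == table["Total"]["df"]
ss_ok = sum(table[p]["SS"] for p in parts) == table["Total"]["SS"]

# Each mean square is SS / df.
ms_ok = all(
    abs(table[p]["SS"] / table[p]["df"] - table[p]["MS"]) < 1e-9
    for p in parts
)

# Each F is MS / MSE, matching the table to two decimal places.
mse = table["Error"]["MS"]
f_ok = all(
    abs(table[p]["MS"] / mse - table[p]["F"]) < 0.005
    for p in ("Treatments", "Blocks")
)

# Question 5: compare the treatment F with the given critical value.
reject_h0 = table["Treatments"]["F"] > 3.84

print(df_ok, ss_ok, ms_ok, f_ok, reject_h0)  # True True True True True
```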

56

Today’s Agenda

• Experimental Design

• Experiment vs Observational

• Random Sampling

• ANOVA

• Randomized Block Design

57

Next Class

LAB DAY

To Do List

58

Homework Assignment #6

Due 3/12/2021 by 12:00pm (noon)
