University of Toronto Mississauga

STA310 H5S: Bayesian Statistics in Forensic Science - Winter 2021

Instructor: Dr. Ramya Thinniyam

Midterm Test

(Administered on Quercus)

February 22, 2021

SOLUTIONS

INSTRUCTIONS:

Test Duration/Submission Period:

• The test will be open on Quercus for 24 hours, from Feb 22nd 9:00am EST to Feb 23rd 9:00am EST. You may submit your answers any time during the open period.

• The actual test duration is at most 90 minutes (if you were writing it in person, you would not be given more than 90 mins). I am leaving the test open for 24 hours to accommodate online test writing conditions, time zone differences, technical difficulties, etc.

• Do not leave the test to the last minute. It is your responsibility to make sure you have a stable internet connection and the required materials to complete the test.

Test Policies:

• You will be required to sign an Honour Pledge to confirm that you have maintained academic honesty during this assessment and submitted your own work. You must complete the test individually. You are not allowed to discuss the test questions/content with any student in the course or anyone beyond the course during the open period of the test. You will get randomized questions/output, etc., so do not try to commit an academic offense by sharing your answers with others - anomalies will be investigated.

• You may use a calculator and your notes as aids. You cannot use other resources such as the internet or solutions from this course/other courses to copy answers (this is considered plagiarism).

• You cannot base your justification on any procedure/fact that is not covered in the lectures.

• Questions will be randomized, locked (so you can’t go back to the previous question), and displayed one at a time.

**If you skip/submit a question by mistake or submit the whole test by mistake without finishing, you CANNOT go back. BE CAREFUL! You will get warning messages each time you click Next. You cannot skip questions or go back**.

• For some Short Answer questions, you will be asked to upload your answers from a file. You can either type up your answers or clearly write them using dark pen/pencil and then scan/take photographs. Format your solution neatly, make sure to label the question numbers/part letters (1, 2, 3, a, b, c, etc.), and put all the parts together and then upload ONE file. Your writing/scan/upload should be legible and clear for the marker to read.

• Numerical answers should be rounded to 4 decimal places where appropriate.

TOTAL: 50 marks

BEST WISHES!

STA310 - Winter 2021 Midterm Test Solutions Page 1 of 11

[1 mark - 1m for typing in the pledge statement that was given and including

student’s full name and student number. ]

0. HONOUR PLEDGE

[5 marks - 1m each part. 5 parts randomly assigned ]

1. True/False: If the statement is true under all conditions, select T; otherwise select F.

a) The total sum of squares will change according to the model that is fit. T F

b) The Likelihood is a measure of the relative strength of evidence in favour of one hypothesis

against another. T F

c) The Bayes Factor is a probability. T F

d) The likelihood ratio is equal to the posterior probability when prior odds are 1:1. T F

e) Consider a One-Way ANOVA with only two levels (Group 1, Group 2). Conducting the One-Way ANOVA F-test is equivalent to testing H0 : µ1 = µ2 vs Ha : µ1 ≠ µ2 using a t-test. T F

f) Least Square means are equal to arithmetic means in a One-Way ANOVA model. T F

g) Bonferroni is a more conservative method for pairwise comparisons of group means than Tukey’s method when the design is balanced. T F

h) Tukey can be used for multiple tests that are not pre-planned. T F

[14 marks - 1m each part. 14 parts randomly assigned ]

2. Fill in the Blanks: Refer to Fingerprint Matching based on Minutiae study. Fill in the

blank with correct word/number. For numerical values, use 4 decimal places where appropriate.

a) Fill in the following missing number from the output: (A) = 2

b) Fill in the following missing number from the output: (B) = 21

c) Fill in the following missing number from the output: (C) = 2,931.9

d) Fill in the following missing number from the output: (D) = 12.2789

e) Fill in the following missing number from the output: (E) = 0.0103

f) Fill in the following missing number from the output: (F) = 0.0458

g) Is the design balanced or unbalanced? Balanced [write either ‘balanced’ or

‘unbalanced’ ].

h) What percent of variability in quality of fingerprints is accounted for by the dominant

minutia type used to match them? 53.9 %.

i) Give an unbiased estimate for the common standard deviation of the error terms from

model1: 15.4524 or 15.4525 or 15.45 .

j) Suppose we want to do all pairwise comparisons between minutia types using Bonferroni method. Write the p-value that corresponds to this question of interest: “Do fingerprints that use bifurcations for matching have different quality than those that use island minutia?” 0.0309

k) What p-value corresponds to testing the question of interest? 0.0003 .

l) Based on the analyses conducted, which dominant minutia type is the worst choice for

matching poor quality fingerprints? Island .

m) Suppose we want to do all pairwise comparisons between minutia types using Bonferroni method. Write the p-value that corresponds to this question of interest: “Do fingerprints that use ridge endings for matching have poorer quality than those that use island minutia?” 0.0001

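n) Consider this model: $Y_{gi} = \beta_0 + \beta_1 X_{B,i} + \beta_2 X_{I,i} + e_{gi}$ for $g = 1, 2, 3$, where

$$
X_{B,i} = \begin{cases}
1, & \text{if the } i\text{th fingerprint used bifurcation minutia as dominant type to match} \\
-1, & \text{if the } i\text{th fingerprint used ridge ending minutia as dominant type to match} \\
0, & \text{otherwise}
\end{cases}
$$

and

$$
X_{I,i} = \begin{cases}
1, & \text{if the } i\text{th fingerprint used island minutia as dominant type to match} \\
-1, & \text{if the } i\text{th fingerprint used ridge ending minutia as dominant type to match} \\
0, & \text{otherwise}
\end{cases}
$$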

Give a point estimate for β0. 29.6997

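o) Consider this model: $Y_{gi} = \beta_0 + \beta_1 X_{B,i} + \beta_2 X_{I,i} + e_{gi}$ for $g = 1, 2, 3$, where

$$
X_{B,i} = \begin{cases}
1, & \text{if the } i\text{th fingerprint used bifurcation minutia as dominant type to match} \\
-1, & \text{if the } i\text{th fingerprint used ridge ending minutia as dominant type to match} \\
0, & \text{otherwise}
\end{cases}
$$

and

$$
X_{I,i} = \begin{cases}
1, & \text{if the } i\text{th fingerprint used island minutia as dominant type to match} \\
-1, & \text{if the } i\text{th fingerprint used ridge ending minutia as dominant type to match} \\
0, & \text{otherwise}
\end{cases}
$$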

Give a point estimate for β1. -1.7877

p) Consider this model: $Y_{gi} = \beta_0 + \beta_1 X_{B,i} + \beta_2 X_{I,i} + e_{gi}$ for $g = 1, 2, 3$, where

$$
X_{B,i} = \begin{cases}
1, & \text{if the } i\text{th fingerprint used bifurcation minutia as dominant type to match} \\
-1, & \text{if the } i\text{th fingerprint used ridge ending minutia as dominant type to match} \\
0, & \text{otherwise}
\end{cases}
$$

and

$$
X_{I,i} = \begin{cases}
1, & \text{if the } i\text{th fingerprint used island minutia as dominant type to match} \\
-1, & \text{if the } i\text{th fingerprint used ridge ending minutia as dominant type to match} \\
0, & \text{otherwise}
\end{cases}
$$

Give a practical interpretation for β0 (your answer should be in terms of this particular

case study).

It is the mean quality score of all latent fingerprints regardless of the dominant minutia

used to match them.

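q) Consider this model: $Y_{gi} = \beta_0 + \beta_1 X_{B,i} + \beta_2 X_{I,i} + e_{gi}$ for $g = 1, 2, 3$, where

$$
X_{B,i} = \begin{cases}
1, & \text{if the } i\text{th fingerprint used bifurcation minutia as dominant type to match} \\
-1, & \text{if the } i\text{th fingerprint used ridge ending minutia as dominant type to match} \\
0, & \text{otherwise}
\end{cases}
$$

and

$$
X_{I,i} = \begin{cases}
1, & \text{if the } i\text{th fingerprint used island minutia as dominant type to match} \\
-1, & \text{if the } i\text{th fingerprint used ridge ending minutia as dominant type to match} \\
0, & \text{otherwise}
\end{cases}
$$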

Give a practical interpretation for β1 (your answer should be in terms of this particular

case study).

It is the difference between the mean quality score of latent fingerprints that used bifurcation as dominant minutia and the mean of all fingerprints (regardless of minutia).

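r) Consider this model: $Y_{gi} = \beta_0 + \beta_1 X_{B,i} + \beta_2 X_{I,i} + e_{gi}$ for $g = 1, 2, 3$, where

$$
X_{B,i} = \begin{cases}
1, & \text{if the } i\text{th fingerprint used bifurcation minutia as dominant type to match} \\
-1, & \text{if the } i\text{th fingerprint used ridge ending minutia as dominant type to match} \\
0, & \text{otherwise}
\end{cases}
$$

and

$$
X_{I,i} = \begin{cases}
1, & \text{if the } i\text{th fingerprint used island minutia as dominant type to match} \\
-1, & \text{if the } i\text{th fingerprint used ridge ending minutia as dominant type to match} \\
0, & \text{otherwise}
\end{cases}
$$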

Give a practical interpretation for β2 (your answer should be in terms of this particular

case study).

It is the difference between the mean quality score of latent fingerprints that used island

as dominant minutia and the mean of all fingerprints (regardless of minutia).
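The point estimates asked for in the effect-coded model can be recovered from the treatment-coded fit of model1 in the R output. The following is a minimal sketch in Python, assuming the rounded coefficients printed by summary(model1) and a balanced design (8 prints per group), so that β0 is the unweighted mean of the three group means. The variable names are illustrative and do not appear in the test.

```python
# Group means implied by the treatment-coded fit (B is the baseline level).
intercept, coef_I, coef_R = 27.912, 21.763, -16.400  # rounded values from summary(model1)
mean_B = intercept
mean_I = intercept + coef_I
mean_R = intercept + coef_R

# Effect (sum-to-zero) coding with ridge ending coded as -1:
# beta0 is the mean of the group means; beta1 and beta2 are deviations from it.
beta0 = (mean_B + mean_I + mean_R) / 3
beta1 = mean_B - beta0   # bifurcation effect
beta2 = mean_I - beta0   # island effect

# The ridge ending effect is not a separate parameter: it equals -(beta1 + beta2).
effect_R = mean_R - beta0

print(f"beta0 = {beta0:.4f}, beta1 = {beta1:.4f}, beta2 = {beta2:.4f}")
```

Rounded to 4 decimal places, beta0 and beta1 agree with the point estimates 29.6997 and -1.7877 given above.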

[20 marks]

3. Short Answer: Show your work and explain your answers. Answers (even correct ones) without justification will not receive marks. Refer to Fingerprint Matching based on Minutiae study. Recall that the question of interest is whether the quality of latent fingerprints differs by the dominant minutia type used to match them.

[3m - 1m for indicators and defining them, 1m for proper notation of parameters, 0.5m for response, 0.5m for errors]

a) Write out the theoretical model that is being fitted in model1 in the R output. Define

any variables you include.

The model being fitted in model1 is:

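$$Y_k = \beta_0 + \beta_1 I_{I,k} + \beta_2 I_{R,k} + e_k \quad \text{for } k = 1, 2, \dots, 24$$

where $Y_k$ is the quality score for the $k$th fingerprint, $e_k \sim \text{iid } N(0, \sigma^2)$, and

$$
I_{I,k} = \begin{cases}
1, & \text{if the } k\text{th fingerprint used island as the dominant minutia for matching} \\
0, & \text{otherwise}
\end{cases}
$$

$$
I_{R,k} = \begin{cases}
1, & \text{if the } k\text{th fingerprint used ridge ending as the dominant minutia for matching} \\
0, & \text{otherwise}
\end{cases}
$$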

[2m - 1m for correct values of parameter estimates, 1m for proper notation

with hat and not using error term ]

b) Write out the fitted model from model1 in the R output. Define any new variables you

introduce.

$$\hat{y}_k = 27.912 + 21.763\, I_{I,k} - 16.4\, I_{R,k}$$
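As a quick check of what the fitted model predicts, the sketch below evaluates it for each minutia type. It assumes the rounded coefficients from the R output; the function name fitted_quality is hypothetical and used only for illustration.

```python
def fitted_quality(minutia_type: str) -> float:
    """Fitted mean quality score from the rounded model1 coefficients."""
    is_island = 1 if minutia_type == "I" else 0   # indicator for island
    is_ridge = 1 if minutia_type == "R" else 0    # indicator for ridge ending
    return 27.912 + 21.763 * is_island - 16.4 * is_ridge

# Bifurcation (B) is the baseline: both indicators are 0.
fitted = {g: fitted_quality(g) for g in ("B", "I", "R")}
print(fitted)
```

Because the only predictor is the group indicator, each fitted value is simply the estimated mean quality score of that group.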

[2m - 1m using proper notation and correct statement in H0, 1m for using

proper notation and correct statement in Ha]

c) Write out the appropriate null and alternative hypotheses that would be needed to test

the question of interest using the linear regression model that was fitted in model1. Use

proper notation. (You do not have to conduct the actual test.)

$H_0: \beta_1 = \beta_2 = 0$ vs $H_a$: at least one $\beta_j \neq 0$ for $j = 1, 2$

[6m - 1m for each step- hypotheses, test stat, distribution, conclusion, and

practical conclusion with proper words]

d) Test the question of interest using the ANOVA test at the 5% significance level. Include

all the necessary steps for the hypothesis test (and include a practical conclusion).

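$$H_0: \mu_B = \mu_I = \mu_R \quad \text{vs} \quad H_a: \mu_i \neq \mu_j \text{ for at least one pair } i \neq j, \; i, j \in \{B, I, R\}$$

$$F = \frac{MS_{Reg}}{MSE} = \frac{5863.8/2}{238.78} = 12.2787 \sim F_{2,21} \text{ under } H_0$$

$$p = P(F_{2,21} > 12.2787) = 0.0003 \Rightarrow \text{Reject } H_0$$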

There is very strong evidence to conclude that the quality scores of latent fingerprints vary by the dominant minutia used for matching.

(Hypotheses are in terms of the means, not the regression parameters. In practical conclusion, underlined words or synonyms for them should be used.)
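The ANOVA quantities used in this test can be reproduced from the two sums of squares in the R output. The sketch below is in Python with only the standard library, so the p-value uses the closed-form upper tail of an F distribution with 2 numerator degrees of freedom, P(F > f) = (1 + 2f/df2)^(-df2/2), which applies only in that special case. The helper name f_upper_tail_2df is an assumption made for illustration.

```python
import math

ss_type, ss_resid = 5863.8, 5014.3      # Sum Sq column of anova(model1)
n_groups, n_obs = 3, 24                 # 3 minutia types, 8 prints each

df_type = n_groups - 1                  # numerator df
df_resid = n_obs - n_groups             # denominator df
ms_type = ss_type / df_type
ms_resid = ss_resid / df_resid          # printed as 238.78 in the table

f_unrounded = ms_type / ms_resid        # uses the unrounded residual mean square
f_rounded = ms_type / 238.78            # uses the rounded table value

def f_upper_tail_2df(f_value: float, df2: int) -> float:
    """P(F > f_value) when the numerator df is exactly 2 (closed form)."""
    return (1 + 2 * f_value / df2) ** (-df2 / 2)

p_value = f_upper_tail_2df(f_rounded, df_resid)
r_squared = ss_type / (ss_type + ss_resid)
resid_se = math.sqrt(ms_resid)

print(df_type, df_resid, round(ms_type, 1))
print(round(f_rounded, 4), round(f_unrounded, 4), round(p_value, 6))
print(round(r_squared, 3), round(resid_se, 4))
```

The two F values differ only in the fourth decimal place, depending on whether the rounded or unrounded residual mean square is used, which accounts for 12.2787 and 12.2789 both appearing in these solutions.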

[4m]

e) Based on the Post-Hoc Analysis conducted, what do you conclude? Be specific and

name the methods that you are referring to. Which of the procedures carried out is most

appropriate in this example? Justify. Give an overall conclusion to this study.

• Since the design is balanced, Tukey’s HSD method will be more powerful for pairwise comparisons between group means. (1m)

• There is moderate to strong evidence (p = 0.0268) of a difference between the quality of fingerprints that used island and those that used bifurcation, and very strong evidence (p = 0.0002) of a difference between fingerprints that used island and those that used ridge ending. There is insufficient evidence (p = 0.1093) of a difference between fingerprints that used bifurcation and those that used ridge ending. (2m)

• Specifically, fingerprints that used ridge ending are optimal since they have the lowest quality scores but were still successful in being matched. (1m)
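The comparison between the two post-hoc methods can be made concrete with the p-values in the R output. The sketch below applies the Bonferroni adjustment (multiply each unadjusted p-value by the number of comparisons, capped at 1) and sets the result beside the Tukey-adjusted values. The one-sided adjustment at the end assumes the observed difference lies in the hypothesised direction, which holds here because the R-I difference is negative.

```python
# Unadjusted p-values from pairwise.t.test(..., p.adj = "none")
unadjusted = {"I-B": 0.0103, "R-B": 0.0458, "R-I": 6.9e-05}
# Tukey-adjusted p-values from TukeyHSD
tukey = {"I-B": 0.0267884, "R-B": 0.1093058, "R-I": 0.0001967}

n_comparisons = len(unadjusted)   # all 3 pairwise comparisons
bonferroni = {pair: min(1.0, n_comparisons * p) for pair, p in unadjusted.items()}

# One-sided version for a directional question: halve the two-sided
# p-value first, then apply the Bonferroni multiplier.
bonferroni_one_sided_RI = min(1.0, n_comparisons * unadjusted["R-I"] / 2)

for pair in unadjusted:
    print(pair, round(bonferroni[pair], 4), round(tukey[pair], 4))
print("R-I one-sided:", round(bonferroni_one_sided_RI, 4))
```

For every pair the Tukey-adjusted p-value is smaller than the Bonferroni one, which is the sense in which Tukey's method is more powerful for this balanced design.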

[4m]

f) Does there appear to be any violation of the model assumptions? Discuss in detail with

reference to the appropriate diagnostic plots/summary statistics. If any of the assumptions

may be of concern, pick one and give a practical reason (in terms of this scenario) for why

this may be the case.

• The QQ-plot of the residuals shows a slight curve/S shape, indicating that the residuals are skewed and not normally distributed. (1m)

• There does not seem to be a major violation of the constant variance assumption when looking at the boxplot and the plot of residuals vs fitted values. The ratio of the largest standard deviation to the smallest is $s_I / s_R = 20.78583 / 11.78830 = 1.76 < 2$. (1m)

• The time series plot of the residuals appears to have a pattern (not randomly scattered about 0), so there could be a problem with independence. It is possible for quality scores to be correlated because some fingerprints could be collected together or affected by the collection method, measurement method, laboratory, or collected from similar crime scenes, etc. (2m)
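The constant variance check above can be reproduced from the group standard deviations in the R output. The sketch below also uses the fact that, with equal group sizes, the residual mean square equals the average of the group variances, which gives a consistency check against the ANOVA table. Variable names are illustrative.

```python
group_sd = {"B": 12.05481, "I": 20.78583, "R": 11.78830}  # from tapply(..., sd)

# Rule of thumb used in the solution: largest sd / smallest sd should be below 2.
sd_ratio = max(group_sd.values()) / min(group_sd.values())
constant_variance_ok = sd_ratio < 2

# Balanced design: the pooled variance is the plain average of the group variances.
pooled_variance = sum(s ** 2 for s in group_sd.values()) / len(group_sd)

print(round(sd_ratio, 2), constant_variance_ok, round(pooled_variance, 2))
```

The pooled variance agrees with the residual mean square of 238.78 reported by anova(model1).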

[10 marks]

4. Short Answer: For each of the following questions, show your work and explain your

answers. Answers (even correct ones) without justification will not receive marks.

(Important ideas/parts of the solution are underlined for general solution. Specific examples

will be different but should include parts of solution but with different scenarios and numbers.)

[2m]

a) Explain in YOUR OWN words what “Prosecutor’s Fallacy” means.

The Prosecutor’s Fallacy is usually committed by the prosecution when they wrongfully conclude

that the chance of the suspect being innocent based on the evidence is very small when in

reality it is the probability of the observed evidence if the suspect is innocent that is very small.

[2m]

b) Explain in YOUR OWN words what “Defender’s Fallacy” means.

The Defender’s Fallacy is usually committed by defense lawyers who argue that the evidence

is irrelevant and has no value. If the suspect was identified solely based on the one piece of

forensic evidence that they mention, then their argument is acceptable, but often the suspect

will be found from other evidence as well.

[6m]

c) Make up YOUR OWN ORIGINAL example of a case and show how both of the above

fallacies would be used in your example. Use only ONE example that shows both fallacies

(not two different examples).

**You may NOT use any of the cases we covered in the course (State vs Skipper from

lecture, Murder vs SIDS from lecture, Blood Stain example from HW). You may not copy

an example from the internet; you have to make up your own to demonstrate knowledge of

the concepts.**

To answer this question, include the following:

• Make up a realistic scenario (1m - cannot be same as one of the examples listed above)

• State probabilities (make probabilities realistic/sensible but they don’t have to be accurate) (0.5m)

• Clearly define events/hypotheses (0.5m - use proper notation)

• State what the prosecutor and defense lawyer would say to commit the above fallacies in this scenario (2m - wordings should be accurate and in terms of the case example, not general)

• Rewrite both fallacies as probabilistic statements using the events/hypotheses you defined (1m - use proper notation and distinguish between conditional, inverse conditional, and unconditional probabilities)

• Explain why these fallacies are problematic for a court case (1m - practical reason about court cases should be given such as the reasons listed below)

(Here is an example. Each student’s example will vary but should include relevant numbers

and explanations:)

Scenario: A crime has been committed and a blood stain is left behind at the crime scene. A suspect, whose blood type matches that of the stain found at the crime scene, is arrested. Only 1 in 100,000 of the population has this rare blood type found at the crime scene (and in the suspect).

Hypotheses: Let Hs be the hypothesis that the blood stain came from the suspect and Ho

be the hypothesis that the blood stain was left by someone other than the suspect. Let E

represent the blood test evidence.

Defense Lawyer’s claim: “In a city like this with a population of 30,000,000 people who may

have committed the crime, this blood type would be found in approximately 300 people. So

the evidence merely shows that the suspect is one of 300 people in the city who might have

committed this crime. The blood test evidence has provided a probability of guilt of 1 in

300, which is negligible and cannot prove the suspect is guilty.”

Stated as Probabilities: P(Hs | E) = 1/300 and P(Hs) = 1/100000

Prosecutor’s claim: “The chance of observing this blood type if the blood came from someone other than the suspect is 1 in 100,000. Therefore, the chance that the blood came from someone other than the suspect is only 1 in 100,000. The probability is so small that the defendant must have committed the crime.”

Stated as Probabilities: P(E | Ho) = 0.00001 and P(Ho | E) = 0.00001

The Defender’s fallacy is problematic because it dismisses evidence that is important: the evidence helped narrow down the population of suspects. The Prosecutor’s fallacy is problematic because mathematically it assumes a 50% chance that the suspect is guilty (before the evidence is introduced). This goes against the legal principle of “innocent until proven guilty” that should be followed in our court system.
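The role of the prior in the blood stain example can be checked with Bayes' theorem in odds form (posterior odds = likelihood ratio × prior odds). The sketch below rests on two assumptions that the scenario does not state: that P(E | Hs) = 1, and, for the first calculation, that every person in the city is equally likely to be the source before the evidence is seen. The function name posterior_hs is illustrative.

```python
from fractions import Fraction

population = 30_000_000                  # possible sources in the city (from the scenario)
p_e_given_ho = Fraction(1, 100_000)      # P(E | Ho): frequency of the blood type
p_e_given_hs = Fraction(1)               # ASSUMPTION: the suspect's own blood always matches

likelihood_ratio = p_e_given_hs / p_e_given_ho

def posterior_hs(prior_hs: Fraction) -> Fraction:
    """P(Hs | E) from Bayes' theorem in odds form."""
    prior_odds = prior_hs / (1 - prior_hs)
    posterior_odds = likelihood_ratio * prior_odds
    return posterior_odds / (1 + posterior_odds)

# ASSUMPTION: uniform prior over everyone in the city.
post_uniform = posterior_hs(Fraction(1, population))
# Prior implied by the prosecutor's statement: even odds before the evidence.
post_even = posterior_hs(Fraction(1, 2))

print(float(post_uniform))      # P(Hs | E) under the uniform prior
print(float(1 - post_even))     # P(Ho | E) under the 50% prior
```

Under the uniform prior the posterior is roughly 1 in 301, close to the figure in the defense lawyer's claim. The prosecutor's value of about 0.00001 for P(Ho | E) arises only when the prior odds are 1:1, which is the 50% prior described above.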

Fingerprint Matching based on Minutiae:

Latent fingerprints (fingerprints left at a crime scene) are often used to identify suspects by matching them to digital prints from a database. There are different features called “minutiae” on latent prints that can provide useful information to a forensic examiner. There are several types of minutiae (see photos). A new technology has been proposed to rate the quality of fingerprints based on gradient quality (by converting the latent fingerprint into a digital copy and then examining pixels around the image). The quality score ranges from 0 to 100, where higher scores indicate higher quality. For the purposes of this test, you do not need to know the specific technical details of this algorithm. This study considered only prints of poor quality that are damaged by substances from crime scenes and so are naturally harder to match. Amongst these poor quality prints, only ones that were matched properly to digital prints of suspects based on one dominant minutia (either ridge ending, bifurcation, or island) were analyzed. The question of interest is whether the quality of latent fingerprints differs by the dominant minutia type used to match them. Refer to the R output.

The variables are:

‘quality score’ (score from 0 to 100)

‘minutia type’ (R-ridge ending, B-bifurcation, I-island)

Fingerprint Matching based on Minutiae: R OUTPUT

> tapply(quality_score,minutia_type,sd)
       B        I        R
12.05481 20.78583 11.78830
> tapply(quality_score,minutia_type,length)
B I R
8 8 8

> model1 <- lm(quality_score ~ minutia_type)
> summary(model1)

Call:
lm(formula = quality_score ~ minutia_type)

Residuals:
    Min      1Q  Median      3Q     Max
-28.875  -8.238  -1.994  11.797  29.725

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)     27.912      5.463   5.109 4.63e-05 ***
minutia_typeI   21.763      7.726   2.817   0.0103 *
minutia_typeR  -16.400      7.726  -2.123   0.0458 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 15.45
Multiple R-squared: 0.539, Adjusted R-squared: 0.4951

> anova(model1)
Analysis of Variance Table

Response: quality_score
              Df Sum Sq Mean Sq F value   Pr(>F)
minutia_type (A) 5863.8     (C)     (D) 0.000294 ***
Residuals    (B) 5014.3  238.78
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

> bonf = pairwise.t.test(quality_score, minutia_type, p.adj = "none")
> bonf

Pairwise comparisons using t tests with pooled SD

data: quality_score and minutia_type

  B   I
I (E) -
R (F) 6.9e-05

P value adjustment method: none

> tukeyCIs = TukeyHSD(aov(model1),factor="minutia_type")
> tukeyCIs
Tukey multiple comparisons of means
95% family-wise confidence level

Fit: aov(formula = model1)

$minutia_type
        diff        lwr        upr     p adj
I-B  21.7625   2.288026  41.236974 0.0267884
R-B -16.4000 -35.874474   3.074474 0.1093058
R-I -38.1625 -57.636974 -18.688026 0.0001967
