Midterm Information
Date & Time:
Tutorial Groups 1A-1D: March 16, 10:10 AM - 12 PM ET
Tutorial Groups 2A-2D: March 17, 3:10 - 5 PM ET
Midterm Coverage:
The midterm covers all topics up to and including confidence intervals
(including material covered in the synchronous lecture during the week of
March 7th). This does not include hypothesis testing. See the accompanying
learning outcomes for each topic!
You won't be tested on writing scripts in R, but you're expected to know how
to perform and describe the steps of a simulation in R, and to anticipate what
a script presented to you might output.
1. Exploratory Data Analysis:
Numerical and Graphical summaries (e.g. five number summary,
percentiles, mean, standard deviation, variance, MAD, correlation
coefficient, boxplots, density histograms, kernel density estimation,
scatterplots)
Selecting the appropriate tools to represent data sets (e.g. selecting the
right type of graphs based on the data type
(categorical/discrete/continuous) AND based on the features of the data
you want to examine/show).
Making comparative statements between two populations based on
EDA. Using EDA to make observations about the distribution of the
data.
2. LLN, CLT, Chebyshev's Inequality:
Their statement and implications (e.g. How LLN justifies/gives us
methods to estimate probability mass/probability density from data),
Assumptions that need to be verified prior to use (such as sample size
checks in CLT),
Tradeoff between the simplicity and accuracy of Chebyshev's inequality.
3. Estimator Properties & Analysis:
Properties of unbiasedness, consistency, mean squared error, and
estimator variability.
Be able to explain that no one of these properties is sufficient to decide
"goodness" of estimators!
Be able to determine unbiasedness explicitly or using tools like Jensen's
inequality,
Verify consistency graphically or for the special case of unbiased
estimators,
Identify the most efficient estimator among unbiased estimators
4. Methods of Estimation:
Method of moment estimation, maximum likelihood estimation,
Motivations for each method,
Advantages and disadvantages (especially for MLE)
5. Sampling Distribution vs Bootstrapping:
Define sampling distribution. Determine when the t, N(0, 1), or chi-squared
distributions are appropriate for use,
Select the appropriate sampling distribution for the type of data and
estimator under study
Recognize when bootstrapping (empirical or parametric) is appropriate
for a data set based on available information (e.g. when should
bootstrapping be used in place of simulation? When should empirical
bootstrap be used over parametric? etc.)
Verify normality through density histogram AND normal QQ plots
Describe in detail the steps to implement bootstrapping or simulation
methods for approximating sampling distributions, or for estimation
purposes.
6. Confidence Intervals:
Construction and interpretation.
Selecting the most appropriate construction based on the appearance
of our data (e.g. checking for normality, large enough sample size, etc.).
Includes bootstrapped confidence intervals using studentized means or
ratios of variances.
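The bootstrap items above can be illustrated with a short R sketch. It is a
minimal example under stated assumptions, not an official course procedure:
the data vector x is simulated purely for illustration, the number of
replicates B = 2000 is an arbitrary choice, and the interval shown is an
empirical-bootstrap 95% confidence interval for the mean based on the
studentized mean.

```r
# Hypothetical data: 30 simulated values (illustration only, not course data)
set.seed(1)
x <- rexp(30, rate = 0.5)

n    <- length(x)
xbar <- mean(x)   # sample mean of the original data
s    <- sd(x)     # sample standard deviation of the original data

B <- 2000         # number of bootstrap replicates (arbitrary choice)

# Empirical bootstrap: resample the data with replacement and compute the
# studentized mean of each resample, centred at the original sample mean
t_star <- replicate(B, {
  xs <- sample(x, size = n, replace = TRUE)
  (mean(xs) - xbar) / (sd(xs) / sqrt(n))
})

# Bootstrap critical values: 2.5% and 97.5% quantiles of the studentized means
q <- quantile(t_star, probs = c(0.025, 0.975))

# 95% interval: the upper critical value gives the lower endpoint
ci <- c(lower = xbar - q[[2]] * s / sqrt(n),
        upper = xbar - q[[1]] * s / sqrt(n))
print(ci)
```

A parametric bootstrap would follow the same steps, except that each resample
is generated from the fitted model distribution instead of from the data.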
Format for the 2 Hours:
Part 1 (10:10-11:40 AM / 3:10-4:40 PM): A short-answer document with 4-5
questions will be distributed via Crowdmark.
Please print it or load it onto a tablet and answer in the space provided, OR
complete your work on a separate sheet.
Ensure your work is legible and organized, and that you upload your
files to the correct question on Crowdmark!
Part 2 (11:40 AM - 12 PM / 4:40-5:00 PM): Upload time.
Upload your short answers onto Crowdmark (.jpg, .pdf, .png files only).
Plan beforehand how you will do this (e.g. how will you get
the images/scans onto your computer? Are your devices charged? Will
you be somewhere with reliable internet? Do you have sufficient
storage?)
Apple users: images are saved in HEIC format by default. Options:
(Easiest, Recommended) Change your settings: Settings -> Camera
-> Formats -> "Most Compatible" to save as JPEG. This will save all
future images as JPEG, so if you prefer HEIC, you'll have to manually
change the setting back afterwards.
You could try any of the free "convert heic to jpg" websites, but use
at your own discretion.
Submissions after 12 PM/5 PM will not be accepted. Ensure that we
have received your work by the end of your time slot (12 PM or 5 PM).
It is strongly recommended that you transfer and upload your short
answers as you complete them so you can ensure your completed work
will be submitted.
Allowable Resources
The midterm is an individual and independent open-book test. You have
access to resources within our course ONLY:
Course notes
Course Textbooks
Scientific calculator
Pencil, eraser, ruler (or something that can draw a straight edge)
Probability calculations in R (not mandatory) or the following probability
tables:
NormalDistributionTable.pdf
(https://q.utoronto.ca/courses/253127/files/18378454?wrap=1)
(https://q.utoronto.ca/courses/253127/files/18378454/download?download_frd=1)
tDistributionTable.pdf
(https://q.utoronto.ca/courses/253127/files/19350013?wrap=1)
(https://q.utoronto.ca/courses/253127/files/19350013/download?download_frd=1)
Chi-square-table.pdf
(https://q.utoronto.ca/courses/253127/files/19350019?wrap=1)
(https://q.utoronto.ca/courses/253127/files/19350019/download?download_frd=1)
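For the optional R route, the table lookups correspond to R's built-in
distribution functions. A brief sketch follows; the values 1.96 and 0.975 and
the 9 degrees of freedom are arbitrary illustrative choices, and you should
check which tail the posted tables tabulate before comparing results.

```r
# Standard normal: left-tail probability and a quantile (critical value)
p_z    <- pnorm(1.96)           # P(Z <= 1.96), about 0.975
z_crit <- qnorm(0.975)          # about 1.96

# t distribution with 9 degrees of freedom (df chosen for illustration)
t_crit <- qt(0.975, df = 9)     # about 2.262
p_t    <- pt(t_crit, df = 9, lower.tail = FALSE)  # right-tail probability, 0.025

# Chi-squared distribution with 9 degrees of freedom
chi_crit <- qchisq(c(0.025, 0.975), df = 9)       # about 2.70 and 19.02

print(c(p_z = p_z, z_crit = z_crit, t_crit = t_crit, p_t = p_t))
print(chi_crit)
```

The p-functions return probabilities and the q-functions return quantiles, so
a table lookup in either direction has a direct counterpart.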
Using any other resources beyond our course is considered an unauthorized
aid. This includes but is not limited to:
Posting on discussion board or elsewhere for help (e.g., Slack, Discord,
Facebook, etc.)
Working or discussing with classmates (this is an individual and
independent test!). The work you submit must be entirely your own.
Receiving assistance, tips, hints, or otherwise from someone else
Submitting any work that is not your own
Searching for answers on other platforms (e.g. external textbooks, TAs,
Google, etc.)
Extra Practice
A mixed bag of practice problems will be posted here shortly. It is not an
exhaustive list of questions covering every topic listed; rather, it offers a
randomized selection of problems so that you can practise thinking carefully
about which tool to use for each one. Tool selection is generally the number
one difficulty students encounter on a test: without a structured textbook
indicating which tools and concepts to apply, a test really examines how
comfortable you are with the various concepts covered and how to use them
appropriately.
Mixed Bag of Practice Problems.pdf
(https://q.utoronto.ca/courses/253127/files/19766015?wrap=1)
(https://q.utoronto.ca/courses/253127/files/19766015/download?download_frd=1)