ETF2100/5900 Introductory Econometrics
Assignment 1: A Case Study on House Prices in Stockton, California
Important notes:
1. This is an individual assignment. This assignment is worth 20% of this unit’s total mark.
Marks will be deducted for late submission on the following basis: 10% for each day late,
up to a maximum of 3 days. Assignments more than 3 days late will not be marked.
2. The submission deadline for coursework is 11:55pm on Thursday of Week 7 (i.e., 14/Apr/2022).
Please submit a soft copy through Moodle. Name the soft copy as follows: student
ID Name.doc (or .pdf). A PDF file is preferred, but a Word file is also acceptable. On the
title page, please make sure you provide your student ID and name correctly. Please save
and submit your R script as well.
3. Notation used in the assignment must be typed correctly and consistently. Incorrect (or
inconsistent) notation will be treated as a wrong answer.
Please pay attention to the words in bold.
The file “hedonic1.xls” contains 6660 observations on houses sold between 1999 and 2002 in
Stockton, California. Use only the data from 2000 to estimate the following linear model and
answer the associated questions below.
SP = β1 + β2 SFLA + u, (1)
where u is an error term. Note that the subscript i of each variable has been suppressed in the
above equation. SP is the selling price, which is modelled as a function of SFLA (the size of
the living area in square feet).
Questions: (20 marks in total)
1. (a). Consider the data from year 2000 only, and generate the descriptive statistics for SP
and SFLA (i.e., 2 VARIABLES IN TOTAL), and report them in a table. (2 points — 1
point for each variable)
(b). Using the data from year 2000, plot SP (y-axis) against SFLA (x-axis). Do you
observe any pattern? (2 points — 1.5 points for the plot, and 0.5 point for the comment)
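As a rough illustration of the workflow in question 1(a), the sketch below filters a handful of made-up records to the year 2000 and summarizes the two variables. It is written in Python purely for illustration (the assignment itself asks for an R script). The field names `year`, `SP`, and `SFLA` and all values are hypothetical and are not taken from hedonic1.xls.

```python
import statistics

# Hypothetical records; the real column names in hedonic1.xls may differ.
records = [
    {"year": 1999, "SP": 120000.0, "SFLA": 1400.0},
    {"year": 2000, "SP": 150000.0, "SFLA": 1600.0},
    {"year": 2000, "SP": 210000.0, "SFLA": 2200.0},
    {"year": 2000, "SP": 180000.0, "SFLA": 1900.0},
    {"year": 2001, "SP": 250000.0, "SFLA": 2500.0},
]


def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "sd": statistics.stdev(values),  # sample standard deviation (n - 1)
        "min": min(values),
        "max": max(values),
    }


# Keep only the houses sold in 2000.
sample_2000 = [r for r in records if r["year"] == 2000]

# One row of statistics per variable, ready to be laid out as a table.
summary = {name: describe([r[name] for r in sample_2000]) for name in ("SP", "SFLA")}

for name, stats in summary.items():
    print(name, stats)
```

The same two steps (subset by year, then summarize each variable) carry over to whichever tool is used for the actual data.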
2. Estimate the above hedonic model for the houses sold in Stockton, California. Write down
the estimated model (including estimates of the coefficients and the associated standard
deviations), and comment on the estimation result using Goodness of fit. (4 points — 3
points for reporting the results properly, and 1 point for commenting on Goodness of fit
correctly)
3. At the 5% significance level, test whether SFLA has a POSITIVE impact on SP. Keep two
decimal places in the calculations involved. (4 points)
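The mechanics of a right-tailed test of this kind can be sketched as follows, again in Python for illustration only. The data are synthetic, and the critical value uses the standard normal approximation to the t distribution. That approximation is an assumption that is reasonable only for large samples; with a small sample, the t critical value with N − 2 degrees of freedom should be used instead.

```python
import math
import random
from statistics import NormalDist

random.seed(0)

# Synthetic data generated from SP = b1 + b2 * SFLA + u; all numbers are made up.
n = 500
x = [random.uniform(1000.0, 3000.0) for _ in range(n)]
y = [20000.0 + 100.0 * xi + random.gauss(0.0, 20000.0) for xi in x]

x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b2 = sxy / sxx            # OLS slope estimate
b1 = y_bar - b2 * x_bar   # OLS intercept estimate

# Residual variance with N - 2 degrees of freedom, then the standard error of b2.
ssr = sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(x, y))
sigma2_hat = ssr / (n - 2)
se_b2 = math.sqrt(sigma2_hat / sxx)

# H0: beta2 = 0 against H1: beta2 > 0 (right-tailed test at the 5% level).
t_stat = b2 / se_b2
critical = NormalDist().inv_cdf(0.95)  # normal approximation to the t critical value
reject_h0 = t_stat > critical

print(f"b2 = {b2:.2f}, se = {se_b2:.2f}, t = {t_stat:.2f}, reject H0: {reject_h0}")
```

Because the alternative is one-sided, the whole 5% rejection region sits in the right tail, so the test statistic is compared with the 95th percentile rather than the 97.5th.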
4. The model (1) can be written in a sample version as follows:
$$y_i = \beta_1 + \beta_2 x_{i2} + u_i = x_i'\beta + u_i, \qquad (2)$$
where $i = 1, \ldots, N$, $\beta = (\beta_1, \beta_2)'$, $x_i = (x_{i1}, x_{i2})' = (1, SFLA_i)'$, and the definition of $y_i$
should be obvious. Further define $X = (x_1, \ldots, x_N)'$ in case this notation is needed
to answer the following questions.
(a). To obtain the OLS estimate of β in (2), we need to minimize an objective function.
Please write down the correct functional form of the objective function. (1 point)
(b). Describe the basic assumptions of the classical linear regression model using the
notation of (2). (2 points)
(c). Provided that these assumptions hold, what conclusions can you draw about the OLS
estimator? (1 point)
5. Provide detailed steps to prove that by minimizing the objective function in question (4),
your OLS estimate has the form
$$\hat{\beta} = \left( \sum_{i=1}^{N} x_i x_i' \right)^{-1} \sum_{i=1}^{N} x_i y_i.$$
We assume that $\sum_{i=1}^{N} x_i x_i'$ is invertible. (4 points)
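The summation form of the estimator can be checked numerically, although this is not a substitute for the proof. The sketch below, in Python for illustration only, builds the sum of x_i x_i' and the sum of x_i y_i for a small made-up data set with x_i = (1, x_i2)', inverts the resulting 2×2 matrix explicitly, and compares the result with the familiar slope and intercept formulas for simple regression. All data values and variable names are hypothetical.

```python
# Made-up data; x2 plays the role of SFLA and y the role of SP.
x2 = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(y)

# A = sum_i x_i x_i' (a symmetric 2x2 matrix) with x_i = (1, x_i2)'.
a11 = float(n)
a12 = sum(x2)
a22 = sum(v * v for v in x2)

# b = sum_i x_i y_i (a 2x1 vector).
b1_sum = sum(y)
b2_sum = sum(v * w for v, w in zip(x2, y))

# Explicit inverse of the 2x2 matrix A; det != 0 is the invertibility assumption.
det = a11 * a22 - a12 * a12
beta1_hat = (a22 * b1_sum - a12 * b2_sum) / det
beta2_hat = (a11 * b2_sum - a12 * b1_sum) / det

# Cross-check against the simple-regression formulas.
x_bar = a12 / n
y_bar = b1_sum / n
slope = (
    sum((v - x_bar) * (w - y_bar) for v, w in zip(x2, y))
    / sum((v - x_bar) ** 2 for v in x2)
)
intercept = y_bar - slope * x_bar

print(beta1_hat, beta2_hat)
print(intercept, slope)
```

Both routes give the same pair of estimates up to floating-point rounding, which is what the closed-form expression predicts when the matrix is invertible.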