ECOM30002/90002 Assignment 1

Date: 2025-08-18

ECOM30002/90002 Econometrics 2, Semester 2, 2025
Capstone Project Assignment 1 (Proposal)
1 Overview
The capstone project combines knowledge in econometrics and economics with practical
data analysis skills. You will use the 2018-19 wave of the “Burkina Faso Enquête
Harmonisée sur le Conditions de Vie des Ménages (Harmonized Survey on Household
Living Standards)” dataset to develop a unique research question and undertake
empirical analyses across three Capstone Project Assignments, of which this is the first.
• The capstone project is a group project.
– Groups must consist of 3 students, all from the same tutorial. Groups will
be the same for the three assignments of the capstone project throughout the
entire semester.
– Students may form their groups independently and register their group with
their tutor during the tutorials in Weeks 1–3.
– During the Week 3 tutorial, the tutor will form groups of any remaining stu-
dents who are not already in a group.
• This assignment (Assignment 1 – Proposal) is worth 5% towards the final mark and
has two components:
1. Oral defence
Takes place in person during Week 4 (18–22 August). Details in section 2.1
below.
2. Proposal report
Submission of the report must be online through the LMS assignment portal
and is due Friday, 29 August, 11:59 p.m. Late submissions will not be
accepted. Details in sections 2.2 and 2.3 below.
• Marks:
– Each group member will receive equal marks.
– The oral defence is marked between 0.4 and 1, where 1 is awarded for a satis-
factory oral defence.
– The submitted proposal report is marked between 0 and 5.
– The total marks for “Assignment 1 – Proposal” are obtained as
“total marks” = “marks oral defence” × “marks submitted proposal report”.
– For groups that do not book or show up to the oral defence, 0.4 marks will be
given for the oral defence.
– In the case that not all members are able to attend the oral defence, 0.4
marks will be awarded for the oral defence unless the group submits a signed
statement of cooperation (template available on the LMS) to “Project proposal
oral defence (statement)” on Canvas.
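As a worked illustration of the marking formula above, the following R sketch computes the total for two hypothetical cases. The helper name total_marks and the example marks are assumptions made for illustration only; they are not part of the official marking scheme.

```r
# Total marks for Assignment 1 = oral defence mark x report mark.
# total_marks() is a hypothetical helper written only for this illustration.
total_marks <- function(oral, report) {
  # The oral defence is marked between 0.4 and 1, the report between 0 and 5.
  stopifnot(oral >= 0.4, oral <= 1, report >= 0, report <= 5)
  oral * report
}

# A group with a satisfactory oral defence (1) and a report mark of 4:
full_oral <- total_marks(oral = 1, report = 4)

# The same report mark when the oral defence is missed (0.4):
missed_oral <- total_marks(oral = 0.4, report = 4)

print(c(full_oral = full_oral, missed_oral = missed_oral))
```

In this example, a report marked 4 yields a total of 4 with a satisfactory oral defence and 1.6 if the oral defence is missed.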
2 Assignment tasks
2.1 Oral defence
The oral defence is a required component of the first capstone assessment and forms
an essential part of the feedback process that helps you towards a successful proposal
submission and capstone project delivery.
• Groups book a 10-minute oral defence via the LMS. Details on how to book will be
announced through the LMS.
• Penalties may be imposed for being late and/or exceeding the time limit.
• All group members attend the booked oral defence.
• During the oral defence, the group members show a written full-length draft of their
Proposal report and discuss it with the capstone tutor: The group presents their
proposed research question (or questions) and the variables they have identified
as suitable to answer the question. The group shows/discusses some descriptive
statistics and at least one regression result for these variables, and answers questions
regarding the preparation of the analysis and the results produced.
2.2 Written assessment: Research proposal report
• Download the data set following the instructions in section 3.
• Develop an interesting research question: Identify variables within the “Burkina
Faso Enquête Harmonisée sur le Conditions de Vie des Ménages (Harmonized
Survey on Household Living Standards)” dataset that are related and have a plausible
causal link. This will form the basis of your research question. Briefly motivate
your research question based on economic arguments and previous literature.
• Pin down the key causal relationship: Construct a causal linear regression model.
Clearly define your outcome (dependent) variable and primary explanatory (inde-
pendent) variable, explaining the expected direction and nature of their relation-
ship. Provide a rationale for the hypothesized causal relationship.
• Selecting key variables: Choose an outcome/dependent variable that reflects the
impact or effect you are interested in examining. Identify a treatment/independent
variable that represents the cause or intervention you believe affects the outcome
variable. Include a few additional regressors that can control for other factors influ-
encing the relationship between your primary independent and dependent variables,
enhancing the robustness of your findings. Remember, the selection of variables
should be driven by your research question. This approach aids in the clarity and
precision of your analysis.
• You may choose up to two dependent variables that are relevant for your research
question, with one key explanatory variable and up to four further control variables
relating to each dependent variable. If you have two dependent variables, they
should reflect the same causal relationship of interest.
• Present a table of descriptive statistics for the selected variables. The table should
show the mean, standard deviation, minimum, and maximum for each variable.
Briefly interpret the statistics.
• Present a table of regression results for the selected model and variables. No
discussion of these results is needed.
• Report submission guidelines and formal requirements are detailed in section 2.3.
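The two tables requested above can be sketched in base R as follows. This is a minimal illustration on simulated data: the variable names (log_income, years_schooling, age, female) and all values are invented stand-ins, not variables from the survey, and the layout is only one possible way to build such tables.

```r
# Simulated stand-in data: none of these variables come from the survey.
set.seed(1)
n <- 500
toy <- data.frame(
  years_schooling = rpois(n, lambda = 5),
  age             = sample(18:65, n, replace = TRUE),
  female          = rbinom(n, size = 1, prob = 0.5)
)
toy$log_income <- 8 + 0.08 * toy$years_schooling + 0.01 * toy$age -
  0.2 * toy$female + rnorm(n, sd = 0.5)

# Descriptive statistics: one row per variable, one column per statistic.
desc_table <- t(sapply(toy, function(x) {
  c(Mean = mean(x), SD = sd(x), Min = min(x), Max = max(x))
}))
print(round(desc_table, 3))

# Linear regression of the outcome on the key regressor and the controls.
fit <- lm(log_income ~ years_schooling + age + female, data = toy)
reg_table <- coef(summary(fit))  # estimates, std. errors, t and p values
print(round(reg_table, 4))
cat("Observations:", nobs(fit), "\n")
```

Printing the number of observations alongside the coefficients makes it easy to confirm that the descriptive statistics and the regression are based on the same sample.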
2.3 Proposal report submission
• The proposal report submission is due Friday 29 August at 23:59 (11:59 PM).
• The report must be submitted online as a single PDF file.
• The proposal report must start with a cover page that contains a preliminary title
and the names and student numbers of all group members.
• Reports need to use a font size of 12, line spacing of 1.5, and margins of at least 2
cm.
• Maximum word count: 200 words. Tables and graphs are exempted from the word
count. Penalties may be imposed for exceeding the word count by 50% or more.
There are no penalties for reports under the word limit.
• Any data analysis has to be performed in R/RStudio, and the R-code used to
produce all results in the report must be submitted as an appendix. R-code does
not count towards the word count limit. The R code should include clear comments
explaining the purpose of major steps.
• You are required to keep a copy of your submission after it has been submitted.
3 The dataset
• Overview of the data
The World Bank owns a library of survey datasets, many of which are freely avail-
able for public, non-commercial use. For this capstone project, you will utilize one
of these datasets, the “Burkina Faso Enquête Harmonisée sur le Conditions de Vie
des Ménages (Harmonized Survey on Household Living Standards) 2018-2019”, to
formulate your own research question and conduct empirical analyses.
This is a nationally representative survey that interviewed 7,010 households
and 45,612 individuals in Burkina Faso during 2018–2019. Surveyed topics
include living conditions, health, education, food consumption, and other
characteristics. The full survey dataset contains community-, household-, and
individual-level data.
You may initially focus on the individual-level datasets to explore variables you
may be interested in.
• Downloading the data
To access the data, download the “Burkina Faso Enquête Harmonisée sur le
Conditions de Vie des Ménages (Harmonized Survey on Household Living Standards)
2018-2019” from the World Bank Microdata Library at
https://microdata.worldbank.org/index.php/catalog/4290/get-microdata.
You will need to sign up for a free account. The steps to sign up are as
follows:
– Click on the “Get Microdata” tab of the page to download the data, accepting
the terms and conditions first.
– Click the “Register” button.
– Fill in the user registration information. Once you register, a confirmation
email will be sent to the address you provided.
– Log in.
– Return to the page of the dataset and click on “Get Microdata”.
– You will be redirected to a data use application, where you only need to enter
a brief description of your intended use of the data. A short sentence, such as
“Use for a university project,” should suffice.
– Accepting the terms and conditions and submitting the application will redi-
rect you to the data files.
Apart from the link to download the data, the webpage contains a description
of the survey, a description of the data, and other documentation such as the
questionnaires used in the survey, which will be indispensable for working on your
project.
Key information on the webpage:
– Variable descriptions: DATA DESCRIPTION > data file
– Questionnaires: DOCUMENTATION > Questionnaires
– Data files: GET MICRODATA
Note that since the survey was conducted in French, the documentation does not
provide English translations of the questions corresponding to each variable. How-
ever, automatic translation functions available in most browsers, such as Google
Chrome, Safari and Microsoft Edge, can be used for quick translations.
• Data formats
There are several data formats available for download. You may use the csv
format as in the tutorials. Alternatively, the data is available in Stata format
(dta), which can be imported into R via the navigation menu by selecting File >
Import Dataset > From Stata. This method attaches descriptive labels to the
variables, potentially facilitating data exploration.
• Data files
The download package contains 50 data files (and a further zip file with more
data files about consumption) with varying levels of granularity (i.e., community,
household, or individual-level).
To use information from different sections, you need to merge the corresponding
data files in R into one single dataframe that you can use for your econometric
analysis. See ?merge in R.
• Merging dataframes: Example
Across datasets, each individual can be identified using their cluster (grappe),
household (menage), and individual (s01q00a) identifiers. Combining these generates a
unique ID for each individual, which is required to merge data files.
Below is sample R code for merging datafiles:
Load the stringr package (which provides str_pad) and the data files:
library(stringr)
df1 <- read.csv("s03_me_bfa2018.csv")
df2 <- read.csv("s04_me_bfa2018.csv")
Create an individual ID (iid) for df1 and df2 by concatenating the grappe, menage, and
s01q00a variables, zero-padding menage and s01q00a to a fixed width:
df1$iid <- paste0(df1$grappe, str_pad(df1$menage, 3, pad = "0"),
                  str_pad(df1$s01q00a, 2, pad = "0"))
df2$iid <- paste0(df2$grappe, str_pad(df2$menage, 3, pad = "0"),
                  str_pad(df2$s01q00a, 2, pad = "0"))
Merge the data files by iid:
df_merged <- merge(df1, df2, by = "iid")
• Further considerations on the data
– You are not limited to variables from two data files. By repeating the merge
procedure, you can merge any number of them.
– Be sure to read the documentation carefully to understand how each variable
is represented (e.g., missing values may be indicated as “NA”, “9999”, or
merely left blank). Binary variables may also be stored as 1 or 2 as opposed
to 0 or 1.
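The considerations above (merging more than two files, and recoding missing-value codes and 1/2 binaries) can be sketched as follows. This is an illustration under stated assumptions: the three data frames are tiny invented stand-ins that reuse only the survey's identifier names; the other column names, the meaning of 9999, and the 1 = yes / 2 = no coding are assumptions that must be checked against the questionnaire. Merging directly on the three identifier columns is shown as an alternative to building a concatenated ID.

```r
# Toy stand-ins for three survey files. Only the key column names come from
# the survey; all other columns and values are invented.
keys <- data.frame(grappe = c(1, 1, 2), menage = c(1, 1, 3), s01q00a = c(1, 2, 1))
df_a <- cbind(keys, attended_school = c(1, 2, 2))    # assumed 1 = yes, 2 = no
df_b <- cbind(keys, hours_worked = c(40, 9999, 35))  # 9999 assumed non-response
df_c <- cbind(keys[1:2, ], clinic_visits = c(0, 3))  # one individual absent

id_vars <- c("grappe", "menage", "s01q00a")

# The key combination should identify exactly one row per individual.
stopifnot(anyDuplicated(df_a[id_vars]) == 0,
          anyDuplicated(df_b[id_vars]) == 0,
          anyDuplicated(df_c[id_vars]) == 0)

# Merge any number of files by applying merge() pairwise.
merged <- Reduce(function(x, y) merge(x, y, by = id_vars),
                 list(df_a, df_b, df_c))

# Recode the assumed sentinel value to NA and the 1/2 coding to 1/0.
merged$hours_worked[merged$hours_worked == 9999] <- NA
merged$attended_school <- ifelse(merged$attended_school == 1, 1, 0)

print(merged)
```

By default merge() keeps only individuals present in every file, so the individual missing from df_c is dropped here; passing all = TRUE would keep unmatched rows and fill the gaps with NA.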
4 Further instructions for working with the data
• Ensure all your regressions and descriptive statistics are obtained using the same
estimation sample, a practice known as “complete case analysis”. This means all re-
gressions and descriptive statistics should include the same number of observations.
Due to missing values in many variables (e.g., from non-response or irrelevance),
the estimation sample size may vary based on the variables included in a regression.
After determining which variables you will use, select your estimation sample for
the entire assignment based on the subset of observations with non-missing values
across these variables. Functions like na.omit(), subset() or complete.cases()
in R can assist in this process.
• When selecting variables for your analysis, you may need to transform them to
better fit your analysis. This could involve converting a categorical variable into
dummy variables, applying logarithmic transformations to continuous variables,
etc. If you transform any variables, explain why. Also, check for and exclude
unreasonable values (e.g., “999” for non-response or blank values). If observations
are excluded, justify your reasoning. Avoid using variables with very low response
rates to maintain an estimation sample size of at least 300 observations.
• Use clear and interpretable names for variables in your text and tables, such as
“birth year” instead of codenames like s01q03c. Mention the original codename
only once, briefly, when you first refer to the variable in your text.
• Number tables and provide descriptive titles, ensuring they are self-explanatory, free
of unnecessary information, and formatted simply. Tables should be understandable
on their own, typically requiring 2–4 decimal places for interpretation. Include any
additional notes for clarification below the table. For examples of well-formatted
tables, refer to the American Economic Review. R packages such as stargazer can
generate publication-quality tables; see the cheatsheet at
https://www.jakeruss.com/cheatsheets/stargazer/ for a quick start.
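One way to fix a single estimation sample before running any regression is sketched below in base R. The data frame, the variable names (wage, hours, region), and the use of 999 as a non-response code are invented for illustration, and the toy sample is far smaller than the 300 observations required for the assignment.

```r
# Invented example data; variable names and the 999 code are assumptions.
raw <- data.frame(
  wage   = c(1200, 800, 950, 1500, 700, NA, 1100, 1300),
  hours  = c(40, 35, NA, 48, 999, 20, 45, 30),
  region = c("Centre", "Nord", "Est", "Centre", "Nord", "Est", "Centre", "Est")
)

# Treat the assumed non-response code as missing.
raw$hours[which(raw$hours == 999)] <- NA

# Keep only observations with non-missing values in every model variable.
model_vars <- c("wage", "hours", "region")
est <- raw[complete.cases(raw[, model_vars]), model_vars]

# Transformations: log of a positive continuous variable; factor() makes
# lm() expand the categorical variable into dummy variables.
est$log_wage <- log(est$wage)
est$region   <- factor(est$region)

# Both regressions use `est`, so they share the same observations.
fit_short <- lm(log_wage ~ hours, data = est)
fit_long  <- lm(log_wage ~ hours + region, data = est)
stopifnot(nobs(fit_short) == nrow(est), nobs(fit_long) == nrow(est))

cat("Estimation sample size:", nrow(est), "\n")
```

Selecting the sample once and passing the same data frame to every summary and regression call is what keeps the number of observations identical across tables.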
5 Suggested reading
The following chapters from the subject’s recommended references contain helpful general
information on carrying out an empirical project and thinking critically about research
studies:
Wooldridge, Jeffrey M. (2019), “How to carry out an empirical project,” Chapter 19,
Introductory Econometrics: A Modern Approach, 7th Edition, Cengage.
Stock, James H., and Mark W. Watson (2015), “Assessing Studies Based on Multiple
Regression,” Chapter 9, Introduction to Econometrics, 3rd Edition, Pearson.
