Date: 2022-12-09
PPGA 503 – Take Home Final 2022
Due: 11:59 pm on Dec 15

This is the take home component of your PPGA 503 Final Exam.
You may brainstorm with your classmates when beginning your work, but
all of the analysis and write-up are to be completed individually.
Please upload three deliverables to Canvas:
• Annotated .do file you used to complete your analysis
• Technical write-up
• 1-page memo

For the take home portion of your final, you will again use a modified
version of the World Bank’s Indonesia Database for Policy and Economic
Research (INDO-DAPOER) dataset. You
shouldn’t need it, but you can access the background material on the
dataset in the following places: here and here. The dataset and a
list of variables are available on Canvas (under final exam). As you
know, the dataset includes a significant number of variables covering
health, education, governance, economic, development, and natural
resource attributes at the district level in Indonesia. There are
slightly over 500 districts, each of one of two types: kota (city)
districts are more urbanized, while kabupaten (regency) districts are
typically more rural. Provinces, of which there are just under 40, form
the meso (middle) tier of government.

You previously focused on health
outcomes. For this assignment, you will choose one of the following
three prompts, each with a different DV. You will complete the
necessary data management and analysis to answer your chosen prompt.
We are largely leaving the analysis to you, though the write-up
instructions (together with HW 4 and 5) will be useful as a guide.
Prompts (please choose one of the following three)

1. Educational performance: Educational performance is critical for
development, as it is directly empowering for local populations and has
all manner of spillover effects (for instance, in attracting new
investment and economic opportunity). Our dataset has a district-level
variable called Average National Exam Score: Senior Secondary Level,
which captures performance in a nationally administered exam. The score
is out of 100 points. You are asked to provide insights (1) on variation
in exam performance (i.e., why some districts do well and others do
not), and (2) on how (based on those insights) districts might improve
their performance. Remember that some district-level attributes are
fixed (a district can’t change its location or demographic structure,
for example, but it can build more schools, raise/lower taxes, etc…).

2. Inequality: Inequality receives a lot of attention in popular circles
and among policy makers, as it has important practical and normative
implications. The Gini Index is a frequently used measure of inequality
(higher values denote higher levels of inequality). Our dataset includes
a district-level Gini coefficient. You are asked to provide insights (1)
on variation in the Gini coefficient (i.e., why some districts are
highly unequal, while others are more equal), and (2) on how (based on
those insights) districts might reduce inequality levels. Remember that
some district-level attributes are fixed (a district can’t change its
location, demographic structure, or natural resource endowment, for
example, but it can make a number of potentially relevant policy
changes).

3. Household access to electricity: Access to electricity is seen as
important for quality of life and for empowering populations, as it has
benefits for education, health, and productivity. Electrification
(Household Access to Electricity: Total (in % of total households))
varies considerably across Indonesian districts. You are asked to
provide insights (1) on variation in household electrification rates
(i.e., why some districts have high rates and others low rates), and (2)
on how (based on those insights) districts might increase
electrification rates. Remember that some district-level attributes are
fixed (a district can’t change its location, demographic structure, or
natural resource endowment, for example, but it can make a number of
potentially relevant policy changes).

Your technical write-up should answer the following questions. You are
writing this for the instructional team, so you are free to use precise
and technical language.

1. Theoretical/Conceptual

a. Once you have selected your prompt, think carefully about the outcome
you are trying to explain. From a theoretical/conceptual perspective
(don’t think about specific variables yet), what do you think is
important for understanding variation in the DV? In the real world, of
course, everything matters. But some things matter more than others.
Here we’d like you to focus on the most important factors (or groups of
factors). Please write one paragraph summarizing, from a
theoretical/conceptual perspective, what you believe is important for
explaining variation in your DV. Don’t go overboard (we don’t want 30
distinct factors), but do try to be reasonably comprehensive. You might
also distinguish conceptually between core and peripheral (or
subsidiary) factors.

b. Now let’s think about conceptualizing and operationalizing those
factors. Look at the variables available in the dataset. Are you able to
capture all of the factors you identified in 1a with the available
variables? How comfortable are you with those variables? In other words,
do you think the available data can effectively capture the factors
you’ve identified, or do you have concerns? Recall that you can always
construct new variables based on existing ones. Please write one
paragraph that discusses how well you think the available data can
capture the theoretically relevant factors you previously identified,
making sure to note specific concerns.

2. Models

a. Please do any necessary data preparation (re-scaling, renaming,
constructing new variables, etc) and then start constructing your model.
We’d like you to proceed in three stages. The first (model 1) should
have the core independent variables (somewhere between 2 and 4). The
second (model 2) and third (model 3) should make incremental
refinements. This will primarily be the inclusion of additional
variables, but you may also include other refinements such as
interaction effects, functional form transformations, etc, if you think
they are necessary (in other words, don’t include them just for the sake
of being fancier!). Please write at least four paragraphs: in the 1st,
briefly describe any data preparation you did. In the 2nd, discuss how
useful the models are overall, being sure to address how they change
through the refinements; in the next paragraph(s), interpret with some
precision the independent variables, focusing on both statistical and
substantive importance (you may use more than one paragraph if you feel
it necessary); in the final paragraph, please briefly discuss any
important decisions you made in the analysis, and note any remaining
concerns you have.

b. Please look for and address remaining issues with the models (some of
which you may have noted in the previous paragraph). This might include
things like violations of the Gauss-Markov assumptions, the strong
effect of outliers, etc. If you make any corrections, estimate the
regression again and include the results as model 4. Please write one
paragraph that discusses what you did and whether it changed your
overall conclusions in any meaningful way.

c. Please construct a (nicely edited) regression table containing all
four models, and include it in your write-up. Remember that outreg2 (or
similar) can save you from doing this manually!

3. Reflections
a. Now that you have completed the full analysis, step back to reflect
on the findings. Do they support or contradict your initial
expectations? Is there anything you find surprising? Are there any major
omitted variables that you haven’t accounted for yet? Please discuss
these (or related) issues in one or two paragraphs.

b. Finally, we care about causality because it is key to changing
outcomes via policy. Think critically about your findings, especially
the relationships between key IVs and the DV. Are we picking up
correlations? Causation? An endogenous relationship? Think about these
issues, and then discuss in one paragraph.

Finally, complete your 1-page memo. Think of this as a briefing for a
policy maker, addressing the questions posed in the prompt. You should
provide basic contextual information (data source, descriptive
statistics, etc), an overview of your conclusions, and a short
discussion of implications, particularly in terms of potential policy
interventions. You should write this for the policy maker (not for the
instructional team), so be sure that your language is precise but
accessible, and that the analysis is grounded in evidence. You may
structure the memo however you’d like (including using visualizations),
but you are limited to one page (to be clear, one side, not both!).
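
For the modelling workflow in part 2 of the technical write-up (data preparation, three staged models, diagnostics, a corrected model 4, and a regression table), a do-file skeleton might look roughly like the sketch below. It is only an illustration of the mechanics. It assumes Stata's bundled auto dataset as a stand-in, so every variable name here (price, mpg, weight, foreign, length) is a placeholder and not a variable from the INDO-DAPOER file. The choice of robust standard errors for model 4 is likewise just one possible correction, not a recommendation for any particular prompt.

```stata
* Sketch only: the bundled auto data stand in for the INDO-DAPOER file.
* price plays the role of the DV; mpg, weight, foreign, length play the IVs.
* None of these names come from the assignment dataset.
sysuse auto, clear

* Data preparation: re-scale a variable and label the result
generate weight_1000 = weight / 1000
label variable weight_1000 "Weight (1000s of lbs)"

* Model 1: core independent variables
regress price mpg weight_1000
estimates store m1

* Model 2: first incremental refinement (one added variable)
regress price mpg weight_1000 foreign
estimates store m2

* Model 3: second refinement (another added variable; an interaction or
* a log transformation could go here instead, if theory calls for it)
regress price mpg weight_1000 foreign length
estimates store m3

* Diagnostics on model 3
estat hettest                  // Breusch-Pagan test for heteroskedasticity
estat vif                      // variance inflation factors
predict cooksd, cooksd         // Cook's distance for each observation
count if cooksd > 4/e(N) & !missing(cooksd)   // 4/n rule of thumb

* Model 4: same specification with heteroskedasticity-robust SEs
regress price mpg weight_1000 foreign length, vce(robust)
estimates store m4

* Side-by-side comparison in the Results window
estimates table m1 m2 m3 m4, b(%9.3f) se(%9.3f) stats(N r2)

* With the user-written outreg2 (ssc install outreg2), each regress
* command can instead be followed by a line such as:
*   outreg2 using models.doc, replace ctitle(Model 1)
*   outreg2 using models.doc, append ctitle(Model 2)
```

Storing each model with estimates store keeps all four sets of results available for comparison, which makes it easier to describe how coefficients and fit change across the refinements.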