Stata assignment writing service - SOST70032 - Assignment 20
Date: 2021-05-19
COMPLEX SURVEY DESIGNS AND ANALYSIS - SOST70032
Assignment 2021

The assessment for this module is based on 85% for this three-part assignment, and the remaining 15% on an online quiz to be held on the 15th April 2021. This document specifies the particular requirements for the assignment. The total marks available for the assignment are set to 100.

Part A: An outline proposal

A maximum of 2-page (maximum 1000 words) summary of a proposal for research on a topic of your interest implementing a complex survey design.

Note: You are also allowed to have up to two pages of appendices, including figures, tables, and questionnaires.

This brief summary proposal should cover the following:
- Background, aims and objectives, and justification of your research questions.
- Description of your methodology with emphasis on your target population, survey design, sampling, mode of data collection, instrumentation and any piloting.
- Proposed methods of analysis, and implications of the complex survey design.
- Expected main contributions and outcomes of this work.

Total points for Part A: 35

Part B: Practical Exercise

The dataset assignmentCSDA2021.dta is a Stata-format dataset based on data from the Brazilian Family Budgets Survey 2002/2003 (BFBS). The study was designed to be a nationally representative survey of Brazilian households, covering 26 States and the Federal District of Brasilia. Fieldwork took place between July 2002 and June 2003 (12 months). The survey aimed to get State and National level estimates for: (a) income and expenditure variables, (b) anthropometric measurements of population, and (c) a large number of socio-economic indicators.
The Sample Design involved a stratified, two-stage sampling of households, with the following details:
- Stratification of PSUs by
  - State
  - Education of heads of households (average at PSU level)
- Primary sampling units are census enumeration areas
  - PSUs sampled with probability proportional to size (PPS – size), where the size variable is the number of HHs in the census
- Secondary sampling units are households
  - SSUs sampled with SRS within each PSU

The achieved sample sizes are summarised in the following table:

| Items       | National | Rio de Janeiro |
|-------------|----------|----------------|
| Strata      | 443      | 30             |
| PSUs        | 4,000    | 117            |
| Households  | 48,470   | 1,218          |
| Families    | 48,568   | 1,222          |
| Individuals | 182,333  | 3,917          |

Data were collected at 3 levels with 6 CAPI questionnaires, as detailed below:

Household level:
- List of all usual residents, including those not present
- Household structure, tenure and utilities / services
- Existence of and number of appliances and cars
- Household related expenditures

Family (consumption unit):
- Family level expenditures
- Family subjective assessment of living conditions

Individual level:
- Age, sex, ethnicity, religion, education, height, weight
- Jobs held, and status, occupation, activity & income in each job
- Other individual income and expenditure

The dataset for analysis represents only the Rio de Janeiro area. Variables have been pre-coded and labelled, but you will need to do some recoding for some of them. The variable description and codebook is attached at the end of the assignment. The variables indicating stratification and clustering, as well as the weights, are also listed there (Appendix 1).

There are two potentially interesting response variables:
- The variable hlthins is a 2-category outcome indicating whether or not the person has health insurance.
- The variable crdcard is also a 2-category outcome indicating whether or not the person has a credit card.

You will need to choose one of these two variables as your outcome of interest in order to perform the tasks in this assignment.
Within these guidelines we will call this variable the ‘response’ variable. It is of interest to find out how the chance of a person having health insurance or a credit card is associated with other variables in the dataset, including their education (education), gender (sex) and ethnic group (ethngrp). Please note that you may need to perform some data recoding (for both dependent and independent variables) during the analysis.

Given this 2-stage stratified design, use Stata to carry out the analysis detailed next. It is advised that you present and discuss your results under the four headings used below. However, it is also advised that this should be in the form of a coherent essay-style report (and not simply a list of answers to these questions or copied outputs from the software):

Part 1: Descriptive analysis [5 points]

Q1. Obtain the overall proportion (mean) of the response variable, and its standard error assuming a SRS had been taken. Briefly comment.

Q2. Briefly state which variables you expect to significantly affect the response variable and give some descriptives for them. Explain any recoding you performed and the reasoning for this.

Part 2: Model based inference [25 points]

Q3. Using an appropriate model, estimate (without considering the survey design yet):
a) The overall chance that the person has health insurance/credit card.
b) The association between the chance that the person has health insurance/credit card and their age-group, gender, education and ethnic group. Which, if any, of these variables have a significant effect?
c) Extend or adapt the model in Q3.b to include other variables of interest (as you discussed in Q2 – this will be defined as “your model” next).

Q4. Take the stratification by state into account in your model and estimate:
a) The association between the chance that the person has health insurance/credit card and their age-group, gender, education and ethnic group. Which, if any, of these variables have a significant effect?
b) Do the same for “your model”.

Q5. Take the multistage clustering and stratification by state into account in your model and estimate:
a) The extent of within-PSU similarity of the response variable before adding the stratification variable and any other explanatory variables to the model.
b) The association between the chance that the person has health insurance/credit card and their age-group, gender, education and ethnic group. Which, if any, of these variables have a significant effect?
c) Do the same for “your model”.
d) For either “your model” or the resulting model in Q5.b, discuss whether there are any differences between the conclusions from this model and the corresponding conclusions of the models from Q2, Q3 and Q4. Also discuss the extent of within-PSU similarity of the response variable once the explanatory variables have been added to the model.

Part 3: Design based inference [15 points]

Q6. Use the appropriate commands in Stata to set up the complex survey design used in the study. Then estimate:
a) The proportion of people who have health insurance/credit card.
b) The association between the chance that the person has health insurance/credit card and their age-group, gender, education and ethnic group. Which, if any, of these variables have a significant effect?
c) Do the same for “your model”.

Part 4: Discussion [15 points]

Discuss the results in the context of the debate between design based and model based approaches. This should include a discussion of how the results from the design based approach in Part 3 compare with those from the model based approach in Part 2. This section needs to include a review of some relevant methodological literature, and a critical evaluation of the ‘debate’ between the two approaches with references. Please note that this section should not exceed 500 words.

Important Note: The results from the practical part of the assignment need to be written and presented in essay format.
Copied and pasted outputs from the software are not allowed.

Total points for Part B: 60

Total points for Parts A and B = 95. 5 points are awarded for layout and clarity of the report, including referencing.

The combined word count for Parts A and B should not exceed 3000.

The submission deadline for this work is May 21st 2021, 3pm via Turnitin. Penalties are applied to work that is submitted late without a valid excuse. Please bear in mind that other modules may also have the same deadline as this one, so please manage your workload accordingly. Please note the Assessment Pledge applies to this course. Beyond this, extensions are granted only by the MSc Director and must be sought prior to the deadline. To ensure fairness, extensions to the deadline will only be granted in exceptional cases. Pressure of work will not be seen as a valid reason for extension and students should plan their workloads accordingly. The deadline has been set as late as possible in the semester and leaves the minimum time for marking ahead of the exam board.

Plagiarism

Plagiarism is presenting the ideas, work or words of other people without proper, clear and unambiguous acknowledgement. It also includes ‘self plagiarism’ (which occurs where, for example, you submit work that you have presented for assessment on a previous occasion), and the submission of material from ‘essay banks’ (even if the authors of such material appear to be giving you permission to use it in this way). Obviously, the most blatant example of plagiarism would be to copy another person’s or student’s work.

Plagiarism is a serious offence and will always result in imposition of a penalty. In deciding upon the penalty the University will take into account factors such as the year of study, the extent and proportion of the work that has been plagiarised and the apparent intent of the student.
The penalties that can be imposed range from a minimum of a zero mark for the work (with or without allowing resubmission) through the downgrading of degree class, the award of a lesser qualification (e.g. a pass degree rather than honours, a certificate rather than diploma) to disciplinary measures such as suspension or expulsion.

The University of Manchester is committed to combating plagiarism. In the School of Social Sciences a percentage of all work submitted for assessment can be submitted for checking electronically for plagiarism. This may be done in two ways:
(i) Phrases or sentences in your assessed work may be checked against material accessible on the world wide web, using commonly available search tools. You will not be informed before this checking is to be carried out;
(ii) The University subscribes to an online plagiarism detection service specifically designed for academic purposes.

See SRMS Booklet for more information.

Appendix 1: Codebook

| Variable Name | Variable Description | Code | Code Description |
|---------------|----------------------|------|------------------|
| stratum  | Selection stratum | | |
| psu      | Identification of Primary Sampling Unit | | |
| hhold    | Number of Household in PSU | | |
| family   | Family / consumption unit in household | | |
| person   | Number of resident within Household | | |
| hhcalwgt | Calibrated household weight | | |
| nmembers | Number of household members | | |
| sex      | Sex | 1 | Male |
|          |     | 2 | Female |
| ethngrp  | Ethnic group | 1 | White |
|          |              | 2 | Black |
|          |              | 3 | Asian, indigenous |
|          |              | 4 | Mixed race |
| rel      | Relation to reference person | 1 | Reference person |
|          |                              | 2 | Spouse |
|          |                              | 3 | Son/daughter |
|          |                              | 4 | Other relative |
|          |                              | 5 | Aggregate |
| age      | Age in years | | |
| yrsstudy | Years of study | | |
| crdcard  | Indicator of having credit card | 1 | Yes |
|          |                                 | 2 | No |
| hlthins  | Indicator of health insurance | 1 | Yes |
|          |                               | 2 | No |
| n_benef  | Number of beneficiaries of health insurance | | Only for those with HlthIns=1 |
| bodymass | Weight or body mass in Kg | | |
| height   | Height in cm | | |
| educatio | Classes of years of study | 1 | YrsStudy = 0 |
|          |                           | 2 | 1 <= YrsStudy <= 3 |
|          |                           | 3 | 4 <= YrsStudy <= 7 |
|          |                           | 4 | 8 <= YrsStudy <= 10 |
|          |                           | 5 | 11 <= YrsStudy <= 14 |
|          |                           | 6 | YrsStudy >= 15 |
| radio    | Number of radios available in HH | | |
| fridgfrz | Number of fridge-freezers available in HH | | |
| vcr      | Number of VCR available in HH | | |
| washmash | Indicator for Washing Machine in HH | | |
| microwav | Indicator for microwave in HH | | |
| computer | Indicator for computer in HH | | |
| tv       | Number of TVs available in HH | | |
| auto     | Indicator for car in HH | | |
| aircond  | Indicator for air conditioner in HH | | |
| hhtype   | Household type | 1 | House |
|          |                | 2 | Flat |
|          |                | 3 | Single room in dwelling |
| nrooms   | Number of rooms in household | 1 | 1 to 3 rooms in HH |
|          |                              | 2 | 4 rooms in HH |
|          |                              | 3 | 5 rooms in HH |
|          |                              | 4 | 6 rooms in HH |
|          |                              | 5 | 7 or more rooms in HH |
| nbedrms  | Number of bedrooms in household | 1 - 4 | Actual number of rooms in HH |
|          |                                 | 5 | 5 or more bedrooms in HH |
| watsuppl | Type of water supply | 1 | General public supply |
|          |                      | 2 | Well or natural source |
|          |                      | 3 | Other form |
| nbathrms | Number of bathrooms in household | 0 | No bathroom in HH |
|          |                                  | 1 | One bathroom |
|          |                                  | 2 | Two bathrooms |
|          |                                  | 3 | Three bathrooms |
|          |                                  | 4 | Four or more bathrooms |
| mjobtype | Job type in main occupation | 1 | Domestic employee |
|          |                             | 2 | Employee |
|          |                             | 3 | Employer |
|          |                             | 4 | Self-employed |
|          |                             | 5 | Apprentice, trainee or unpaid worker |
|          |                             | 6 | Self-consumption only |
|          |                             | 7 | No job held |
| income   | Monthly Income | | Income for each individual, in Brazilian Real at prices of January 2003. |
| deductns | Monthly Deductions | | |
| othincom | Other Monthly Income | | |
| othdeduc | Other Monthly Deductions | | |
| totincom | Total Monthly Income | | |
| totdeduc | Total Monthly Deductions | | |
| hhincome | Household Monthly Income | | Income for the household, in Brazilian Real at prices of January 2003. Values are repeated for all household members. |
| hhdeduct | Household Monthly Deductions | | |
| hhothinc | Household Other Monthly Income | | |
| hhothded | Household Other Monthly Deductions | | |
| hhtotinc | Total Monthly Household Income | | |
| hhtotded | Total Monthly Household Deductions | | |
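
As an illustration only, the design variables listed in the codebook (stratum, psu, hhcalwgt) could be declared in Stata along the following lines. This is a minimal sketch and not a model answer. It rests on several assumptions: hasins and hasins_lbl are hypothetical names that do not exist in the dataset, hhcalwgt is assumed to be usable as the sampling weight for person-level analysis, and only the first sampling stage (PSUs within strata) is declared. The choice of covariates is a placeholder.

```stata
* Load the assignment dataset (file name as given in the brief)
use "assignmentCSDA2021.dta", clear

* Recode the outcome from 1 = Yes / 2 = No into a 0/1 indicator.
* "hasins" and "hasins_lbl" are hypothetical names chosen for this sketch.
generate byte hasins = (hlthins == 1) if !missing(hlthins)
label define hasins_lbl 0 "No" 1 "Yes"
label values hasins hasins_lbl

* Estimate that ignores the design, i.e. treats the data as an SRS
mean hasins

* Declare the design: PSUs as clusters, nested in strata, with weights.
* Assumption: the calibrated household weight applies to each person.
svyset psu [pweight = hhcalwgt], strata(stratum)

* Design-based mean (proportion) and its design effect
svy: mean hasins
estat effects

* Design-based logistic regression on a placeholder set of covariates
svy: logit hasins i.sex i.educatio i.ethngrp
```

Comparing the standard error from `mean` with the one from `svy: mean` gives a first impression of how much the stratification, clustering and weighting change the precision of the estimate.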