Date: 2023-05-21
SOST30062 Data Science Modelling – Final report
Semester 2, 2022/23
Administrative arrangements
The final report contributes 60% of your total mark for SOST30062 Data Science
Modelling.
Please submit your work using the link to the TurnItIn anti-plagiarism service on the
course Blackboard. The submission link will be placed on the “Assessment /
Assignments, tasks” page of Blackboard, but you will be notified about the exact
location before submission opens. You can also find detailed instructions there on
how to upload your assignment to TurnItIn.
The report is due by 2pm (UK time) on Tuesday, 23rd of May, 2023.
Your report should not be longer than 2,000 words. (Figure/table captions, the
bibliography, and appendices do NOT count towards the word limit.)
You are advised to keep a copy of the work you hand in. Your attention is drawn to
the sections regarding late submissions, mitigating circumstances and plagiarism in
the Course Outline (available on the “Essential Information / Course outline” page in
Blackboard).
If you have any questions concerning this assignment, you should email Tanja
at tatjana.kecojevic@manchester.ac.uk.
Description of the task
We provide you with a large social science dataset, which you will need to analyse in
your report. You can download the dataset and additional description from the
“Assessment / Assignments, tasks” page of the course Blackboard.
Using the dataset, you are asked to
1. select one variable, Y, to be explained by the other variables, the Xs, in the
dataset;
2. motivate and formulate a research question about Y and the Xs (we present
many examples of possible research questions throughout the course);
3. apply an unsupervised learning technique (lecture week 9), such as PCA or
clustering, to explore the dataset (Method I – exploration);
4. choose a supervised learning technique (lecture weeks 3-5 and 8) that is
appropriate for Y and answers your research question, such as regression
models, splines, LDA, or tree-based methods (Method II – inference);
5. apply the method selected in the previous step within a suitable advanced
analytic approach (lecture weeks 6-7), such as subset selection, ridge
regression, cross-validation, bootstrap, or LASSO (Method III – the “twist”);
6. write a report on the above steps, producing no more than one descriptive
figure or table and one inferential figure or table that sum up your results.
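To illustrate how Methods I-III can fit together, here is a minimal sketch in base R. It is only one possible combination (PCA, linear regression, 5-fold cross-validation), not a required or recommended one. It uses the built-in `swiss` dataset as a stand-in because the course dataset is not reproduced here; the choice of `Fertility` as Y, the seed, and all object names are assumptions made for the example.

```r
# Stand-in data: `swiss` ships with base R; it is NOT the course dataset.
data(swiss)
y_name  <- "Fertility"                     # hypothetical choice of Y
x_names <- setdiff(names(swiss), y_name)   # all remaining variables as Xs

# Method I (exploration): PCA on the standardised Xs
pca <- prcomp(swiss[, x_names], center = TRUE, scale. = TRUE)
var_explained <- pca$sdev^2 / sum(pca$sdev^2)  # share of variance per component

# Method II (inference): linear regression of Y on all Xs
fit <- lm(Fertility ~ ., data = swiss)

# Method III (the "twist"): 5-fold cross-validation of the same model
set.seed(30062)                            # arbitrary seed, for reproducibility
k <- 5
folds <- sample(rep(seq_len(k), length.out = nrow(swiss)))
cv_mse <- sapply(seq_len(k), function(i) {
  train <- swiss[folds != i, ]
  test  <- swiss[folds == i, ]
  m <- lm(Fertility ~ ., data = train)     # refit on the training folds only
  mean((test$Fertility - predict(m, newdata = test))^2)
})
cv_rmse <- sqrt(mean(cv_mse))              # out-of-sample prediction error
```

Here `summary(fit)` would support the inferential reading of the coefficients, while `cv_rmse` estimates how well the same model predicts observations it has not seen.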
Overview of the task
Structure of the report
We suggest the following section structure for your report (you may choose to
structure your report differently):
1. Introduction: briefly describe the empirical context and present the dataset
2. Research question: motivate and formulate your research question
3. Methods: briefly present your method choices (Methods I-III) and the steps of
your analysis
4. Results: present and interpret your results, with two key tables/figures
5. Conclusion: briefly discuss what the results imply for your research question,
discuss one or two key limitations of your analysis
6. Appendices: add any important additional figures/tables, include R code (so
that your analyses can be reproduced)
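As a hedged illustration of what reproducible appendix code can involve: any step that relies on random numbers (for example cross-validation folds or bootstrap samples) gives the same result on every run only if the random seed is fixed first. The sketch below shows this with base R; the seed value and object names are arbitrary choices for the example.

```r
# Fix the seed before any random step so the result can be reproduced.
set.seed(2023)            # arbitrary seed value
idx_first <- sample(10)   # a random permutation of 1..10

# Re-running with the same seed gives exactly the same permutation.
set.seed(2023)
idx_second <- sample(10)
same_draw <- identical(idx_first, idx_second)

# Recording the R version and loaded packages helps others rerun the code.
info <- sessionInfo()
```

Printing `info` at the end of the appendix script documents the software environment in which the results were produced.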
Marking criteria
20% – Data presentation and research question (sections 1-2 from the above
structure)
• Do you present the context of the data (e.g. topic) and the dataset (e.g.
types of variables, number of observations you use) clearly?
• Do you formulate your research question clearly?
• Do you explain why you think the research question is interesting and
relevant?
• Can you answer your research question using the available variables?
20% – Method choices (section 3)
• Are the chosen methods appropriate for your Y and X variables?
• Do you motivate their use (explain why they are appropriate for Y and the
Xs)?
• Do you explain the steps of your analysis clearly?
• Can you answer your research question with your planned analysis?
20% – Data analysis (section 4, appendix R code)
• Is your application of the methods to the data technically correct (e.g. are
the variables transformed if needed and used in the correct role in the
analysis)?
• Do you use the R packages and functions most appropriate for your
analysis?
• Do you provide reproducible R code in the appendix?
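One common instance of "transformed if needed and used in the correct role" is a categorical variable that is stored as numbers. The sketch below is an illustration only, using the built-in `mtcars` data as a stand-in for the course dataset: it compares treating the number of cylinders as a numeric predictor with treating it as a factor. Object names are hypothetical.

```r
# Stand-in data: `mtcars` ships with base R; it is NOT the course dataset.
data(mtcars)

# Treated as numeric, `cyl` gets a single slope (a linear effect is assumed).
fit_numeric <- lm(mpg ~ cyl, data = mtcars)

# Converted to a factor, each cylinder group (4, 6, 8) gets its own level,
# so the model estimates one coefficient per non-reference category.
mtcars$cyl_f <- factor(mtcars$cyl)
fit_factor <- lm(mpg ~ cyl_f, data = mtcars)

n_coef_numeric <- length(coef(fit_numeric))  # intercept + 1 slope
n_coef_factor  <- length(coef(fit_factor))   # intercept + 2 group contrasts
```

Which coding is appropriate depends on the variable and the research question; the point is that the choice changes what the model estimates.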
20% – Interpretation of results (sections 4-5)
• Are the interpretations of your results correct and clearly explained?
• Do you discuss what the results imply for your research question?
• Is your conclusion about your question correct in light of the results?
• Do you discuss one or two important limitations of your approach in
answering the research question?
20% – Visual presentation of results (section 4)
• Are the tables and figures readable and clear (e.g. they are clearly
annotated, do not have overlapping labels)?
• Are they appropriate representations of your findings?
• Do you interpret them correctly in the text?
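As a small, hedged example of a clearly annotated figure, the sketch below draws a scatter plot with base R graphics, with explicit axis labels, a title, and a fitted line. It again uses the built-in `swiss` data as a stand-in for the course dataset; the file name, dimensions, and label wording are assumptions made for the example.

```r
# Stand-in data: `swiss` ships with base R; it is NOT the course dataset.
data(swiss)

# Write the figure to a temporary PDF so the example runs without a display.
out_file <- tempfile(fileext = ".pdf")
pdf(out_file, width = 6, height = 4)

plot(swiss$Education, swiss$Fertility,
     xlab = "Education (% beyond primary school)",
     ylab = "Fertility (standardised measure)",
     main = "Fertility and education, Swiss provinces, 1888",
     pch = 19)

# Add the least-squares line so the direction of the association is visible.
abline(lm(Fertility ~ Education, data = swiss), lty = 2)

dev.off()  # close the device so the file is written to disk
```

Explicit `xlab`, `ylab`, and `main` arguments replace the default labels, which would otherwise show raw expressions such as `swiss$Education`.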
Additional notes on marking
• We will value original and well-motivated research questions.
• We will also reward thoughtful ideas about the limitations of your analysis (and
how one could possibly overcome them).
• Though your R code will not be assessed, we appreciate it if you provide clear
and understandable code in the appendix.
• The marking weight of your two main figures/tables is quite high (20%), so we will
evaluate these critically.
• We will provide you with good examples and further guidelines for the different
elements of the report throughout the semester (for example, about writing
clean R code and making readable figures).