ST308: Assessed Coursework - Project
You will undertake a project that counts for 20 per cent of your final mark for
the course. The project will require you to analyse one or more real-world
datasets of your choice. You can use the OpenML website, the UCI repository or
any other publicly available dataset that was not analysed during the course and
is suitable for the analyses described below.
The project will consist of analysing the data using the following techniques
covered in the course:
1. Regression (linear) or Classification (logistic regression): where the
problem consists of a continuous or a binary response variable.
2. Hierarchical / Multi-level models.
The above tasks should be implemented with Markov Chain Monte Carlo
methods. Material from the computer classes can be used for loading the data
and doing the analysis.
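
As a rough illustration of what fitting a linear regression by MCMC can look
like, the sketch below runs a random-walk Metropolis sampler in base R. It is
not the course's prescribed implementation: the data are simulated, the priors
are chosen only for illustration, and the function names log_post and
rw_metropolis are hypothetical. The computer class material may use a different
sampler or package.

```r
# Illustrative sketch only: simulated data stand in for a real dataset.
set.seed(308)
n <- 200
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n, sd = 0.5)

# Log posterior for theta = (intercept, slope, log(sigma)).
# Priors (assumed for illustration): N(0, 10^2) on the coefficients
# and N(0, 2^2) on log(sigma).
log_post <- function(theta, x, y) {
  mu <- theta[1] + theta[2] * x
  sigma <- exp(theta[3])
  sum(dnorm(y, mean = mu, sd = sigma, log = TRUE)) +
    sum(dnorm(theta[1:2], mean = 0, sd = 10, log = TRUE)) +
    dnorm(theta[3], mean = 0, sd = 2, log = TRUE)
}

# Random-walk Metropolis sampler with a symmetric Gaussian proposal.
rw_metropolis <- function(n_iter, init, step, x, y) {
  draws <- matrix(NA_real_, nrow = n_iter, ncol = length(init),
                  dimnames = list(NULL, c("intercept", "slope", "log_sigma")))
  current <- init
  current_lp <- log_post(current, x, y)
  accepted <- 0
  for (i in seq_len(n_iter)) {
    proposal <- current + rnorm(length(current), sd = step)
    proposal_lp <- log_post(proposal, x, y)
    # Accept with probability min(1, posterior ratio), computed on the log scale.
    if (log(runif(1)) < proposal_lp - current_lp) {
      current <- proposal
      current_lp <- proposal_lp
      accepted <- accepted + 1
    }
    draws[i, ] <- current
  }
  list(draws = draws, accept_rate = accepted / n_iter)
}

fit <- rw_metropolis(n_iter = 10000, init = c(0, 0, 0), step = 0.05,
                     x = x, y = y)
post <- fit$draws[-(1:2000), ]  # discard the first 2000 draws as burn-in

# Posterior means and 95% credible intervals for each parameter.
round(colMeans(post), 2)
t(apply(post, 2, quantile, probs = c(0.025, 0.975)))
```

For a binary response, the same sampler could be reused by swapping the normal
likelihood in log_post for a Bernoulli one, for example with dbinom and plogis.
Trace plots of the columns of post and the acceptance rate in fit$accept_rate
are simple first checks of whether the chain has mixed reasonably.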
You will be expected to present the empirical problem, then consider and implement
competing methods that use the available data to address it. The output from these
techniques should be described in non-technical language targeting people with a
minimal quantitative background.
The results of the project should be presented in an 8-page article in A4 format. The
8-page limit includes figures and tables but excludes the title page, table of
contents and references. In addition to the 8-page article, which should be
submitted as a soft copy, your R code should also be submitted, with appropriate
comments and description, as an R Markdown notebook.
The project is due at noon on Thursday, May 6th.
You may also include the following topics in your analysis if they are
relevant, but this is entirely optional.
1. Cluster Analysis: if it is of interest to identify homogeneous population
groups in the context of the empirical application.
2. Gaussian processes: if some of the associations are non-linear.