Technique Spotting and Choice of P4
We have discussed previously that when approaching a project, we might usefully
apply the UPSTART procedure. Here we give some additional comments about
identifying the Statistical Technique to adopt and how this affects your choice of
P4.
1. Recall the elements of UPSTART:
Understand the immediate question and its underlying
Purpose (NB be open to the idea that the question asked is not well considered.
Do you need to look at or research the context?).
Identify relevant Statistical Technique(s) (of course, more than one might be needed!)
to address the question, bearing in mind the data provided (or available; do you need
to update the data or find different data?).
Analyze (including recalling how to carry out the necessary data handling and
use of R)
Review whether you are happy that your analysis gives a complete, correct
solution. Consider carefully any limitations or assumptions affecting the
suitability of your solution.
Translate your statistical information into a suitable form for reporting to your
client, bearing in mind their requirements and statistical sophistication.
2. What Statistical Techniques are available to you?
Remember that it is unethical to claim expertise that you do not have, but you have
met many methods during your studies to date. It is probably helpful to compile a list
(this will be useful here, but it might be particularly valuable before going to an
interview; it should help bolster your confidence to know how many things you can
do!). You could do this by looking at the syllabi for all the courses you have taken/are
taking. I have given some suggestions in the separate document ‘Familiar Techniques’,
but it would be good to produce your own first.
3. How do I choose between techniques?
Remember that each technique is designed to address a particular type of problem. Usually
you will have seen this in a simple/sanitized form; you will need to consider the UP
elements of UPSTART carefully here to see what is relevant. Don’t forget that the
problem may be multi-faceted, so you may need to consider sub-parts separately.
Techniques make certain assumptions. They are only applicable to certain types of
data (for example, some tests can only be applied to continuous data, others to counts
or categorized data) and often have certain other limitations (e.g. observations must be
independent). You will need to check carefully whether these hold for your situation.
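As one illustration of checking an assumption in R, the sketch below looks at the
expected counts behind a chi-squared test of independence. The 2x2 table is invented
for illustration and does not come from any project; the threshold of 5 is a common
rule of thumb rather than a strict requirement.

```r
# Hypothetical 2x2 table of counts, invented for illustration only
tab <- matrix(c(12, 5, 7, 16), nrow = 2,
              dimnames = list(group = c("A", "B"),
                              outcome = c("yes", "no")))

# Chi-squared test of independence (suitable for counts, not continuous data)
chi_result <- chisq.test(tab)

# Expected counts under independence; very small values cast doubt on the
# chi-squared approximation
chi_result$expected

# Rule-of-thumb check: are all expected counts at least 5?
all(chi_result$expected >= 5)
```

If the check fails, an alternative such as fisher.test() may be worth considering.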
You will need to be able to apply the technique and explain it (to some extent). Do you
know how to apply the technique you are considering? Are you familiar with suitable
software and is it available to you? You may have both a simple graphical technique
and a more formal hypothesis test available (e.g. a Normal probability plot and a
Kolmogorov-Smirnov test); do you need a formal procedure, or is a graphical one
adequate?
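To make the graphical versus formal comparison concrete, here is a minimal sketch in R
using simulated data (the sample, its size and its parameters are assumptions chosen
purely for illustration). It draws a Normal probability plot and then runs a
Kolmogorov-Smirnov test on the same values.

```r
# Simulated data, purely for illustration (not from any project brief)
set.seed(1)
x <- rnorm(50, mean = 10, sd = 2)

# Graphical check: Normal probability (Q-Q) plot with a reference line
qqnorm(x)
qqline(x)

# Formal check: Kolmogorov-Smirnov test against a Normal distribution.
# Caution: the mean and sd are estimated from the same data, so the
# reported p-value is only approximate (it tends to be too large).
ks_result <- ks.test(x, "pnorm", mean = mean(x), sd = sd(x))
ks_result$p.value
```

The plot shows where any departure from Normality occurs (for example, in the tails),
which the single p-value from the test cannot tell you.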
It would be good to review the list you came up with in Section 2, adding brief notes on
purpose, assumptions, data requirements, key R command and so on.
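One possible way to keep such notes is as a small table in R itself. The sketch below
is only a starting point: the three techniques and the wording of the notes are
illustrative choices, not a recommended or complete list, and you should replace them
with your own.

```r
# Illustrative notes on three familiar techniques; extend with your own
techniques <- data.frame(
  technique = c("Two-sample t-test",
                "Chi-squared test of independence",
                "Simple linear regression"),
  purpose = c("Compare the means of two groups",
              "Test association between two categorical variables",
              "Model a linear relationship between two variables"),
  assumptions = c("Independent observations; roughly Normal data or large samples",
                  "Independent observations; expected counts not too small",
                  "Linear relationship; independent errors with constant variance"),
  data_needed = c("Continuous response, two-level grouping variable",
                  "Counts in a contingency table",
                  "Continuous response, numeric explanatory variable"),
  r_command = c("t.test()", "chisq.test()", "lm()")
)

# Quick look-up of technique against key R command
print(techniques[, c("technique", "r_command")])
```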
4. How do I use this to pick my P4 choices?
You should quickly scan all the projects, seeing if you can get a quick understanding
of what is likely to be a useful approach. This initial pass need not be complete; use it
to cut down options.
When you only have a few left, start to consider more carefully whether your initial
thoughts would address the whole problem, whether assumptions are likely to be valid,
whether more data might be needed, and so on.
It is good to consider your own interests, but do not be too heavily swayed by them.
They may give you useful background context (almost ‘for free’), but if you have no
idea of how to approach a project, they will be irrelevant and picking such a project
could harm your chances of a good mark. Don’t forget, everyone has the chance to
propose their own project under Pv, but you need to think of an interesting and viable
question; you cannot just say you want to follow up on some hobby.
5. Example
Look at the example brief and data (this is a project we have often used as a P4 option).
Taking no more than 10 minutes, try to identify:
a) the types of techniques which might be useful
b) assumptions of these methods
c) whether the data look amenable to such an approach, and any issues you can foresee
d) any relevant background (no research, just ideas related to the topic which come
to mind) which might influence your approach, other questions you might consider, and so on
Then (not before!) look at the ‘Comments on Example’ sheet.