COMP226 Assignment 2: Strategy Development
Continuous Assessment Number: 2 (of 2)
Weighting: 15%
Assignment Circulated: Monday 19 April 2021
Deadline: 17:00 Friday 7th May 2021
Submission Mode: Submit up to two files to the CodeGrade assignment on Canvas:
strategy.R (required to get marks) and results.yaml (optional).
Learning Outcomes Assessed: This assignment addresses the following learning outcomes:
• Understand the spectrum of computer-based trading applications
and techniques, from profit-seeking trading strategies to execution
algorithms.
• Be able to design trading strategies and evaluate critically their
historical performance and robustness.
• Understand the common pitfalls in developing trading strategies
with historical data.
• Understand methods for measuring risk and diversification at the
portfolio level.
Summary of Assessment:
• The goal is to implement and optimize a well-defined trading
strategy within the backtester_v5.5 framework.
• Marks are available for the correct implementation of 4 functions in
strategy.R (70%). Further marks (that depend on a correct
implementation in strategy.R) are available for the results of a
cross-validated optimisation that you can include in results.yaml
(30%).
• The expected input and output behaviour of each function, and form
of results.yaml, is fully specified in this document.
Submission necessary to pass module: No
Late Submission Penalty: Standard UoL policy; resubmissions after the deadline may not be
considered.
Expected time taken: Roughly 8-12 hours
Before you move on and read more about the assignment and start working on it, please
make sure you have worked through "backtester.pdf", which is an intro to the
backtester_v5.5 framework. Only return to this document when you already have
backtester_v5.5 up and running.
First, let's recall the contents of the backtester_v5.5.zip:
backtester_v5.5
├── DATA
│   ├── A2
│   │   ├── 01.csv
│   │   ├── 02.csv
│   │   ├── 03.csv
│   │   ├── 04.csv
│   │   ├── 05.csv
│   │   ├── 06.csv
│   │   ├── 07.csv
│   │   ├── 08.csv
│   │   ├── 09.csv
│   │   └── 10.csv
│   └── EXAMPLE
│       ├── 01.csv
│       ├── 02.csv
│       ├── 03.csv
│       ├── 04.csv
│       └── 05.csv
├── a2_main_template.R
├── a2_periods.R
├── a2_test_getTMA.R
├── a2_yamls
│   ├── x1xxx
│   │   └── results.yaml
│   ├── x1yyy
│   │   └── results.yaml
│   └── x1zzz
│       └── results.yaml
├── example_strategies.R
├── framework
│   ├── backtester.R
│   ├── data.R
│   └── processResults.R
├── main.R
└── strategies
    ├── a2_strategy_template.R
    ├── bbands_contrarian.R
    ├── bbands_holding_period.R
    ├── bbands_trend_following.R
    ├── copycat.R
    ├── fixed.R
    └── rsi_contrarian.R
9 directories, 33 files
In the above listing, the following files/directories are specifically there for assignment 2:
• a2_main_template.R
• a2_periods.R
• a2_test_getTMA.R
• strategies/a2_strategy_template.R
• a2_yamls
• DATA/A2
The relevance of these files and directories will be explained below. The rest of the document
is split into three parts: Part 1 describes the 4 functions needed to fully complete
strategy.R; Part 2 describes how to create (the optional) results.yaml; Part 3 describes
submission via CodeGrade and the available pre-deadline tests. Beyond these tests on
Canvas, example outputs are provided (in this document and as files) so that you can
test whether you have implemented things correctly. Note that the pre-deadline tests do not
deal with the correctness of results.yaml at all, so you should use the examples provided
(which are in the subdirectory a2_yamls).
Part 1: strategy implementation (70%)
The trading strategy you should implement is a triple moving average (TMA) momentum
strategy. The specification of the strategy and the functions that it should comprise are given
in full detail, so the correctness of your code can and will be checked automatically.
Two template files are provided to get you started:
• strategies/a2_strategy_template.R, which should become the file strategy.R that
you eventually submit;
• a2_main_template.R, which uses DATA/A2 and strategies/a2_strategy_template.R.
If you source a2_main_template.R with no edits to these two files you get an error:
Error in if (store$iter > params$lookbacks$long) { :
argument is of length zero
This is because the strategy requires a parameter called lookbacks that you will need to
pass in from a2_main_template.R. Read on to see what form this parameter should take,
and, more generally, how you should be editing these two files.
a2_strategy_template.R contains 4 incomplete functions that you need to complete:
1. getTMA
2. getPosSignFromTMA
3. getPosSize
4. getOrders
The main strategy logic of the TMA strategy that you will implement will be contained in
getOrders, which will use the first 3 functions.
The TMA momentum strategy that you should implement is Example 1 in slides 4.7 (but we
give full details here). It uses three moving averages with different lookbacks (window
lengths). The short lookback should be smaller than the medium one, which in turn should
be smaller than the long lookback. In every trading period, the strategy will compute the
value of these three moving averages. You will achieve this by completing the
implementation of the function getTMA.
The following table indicates the position that the strategy will take depending on the relative
values of the three moving averages (MAs). You will compute this position (sign, but not
size) by completing the function getPosSignFromTMA. The system is out of the market (i.e.,
flat) when the relationship between the short moving average and the medium moving
average does not match the relationship between the medium moving average and the long
moving average.
MA           MA             MA         Position
short MA  <  medium MA  <   long MA    short
short MA  >  medium MA  >   long MA    long
The function getPosSignFromTMA takes the output of getTMA. The position size, i.e., the
number of units to be long or short, is determined by getPosSize. As for all strategies in the
backtester framework, the positions are given to the backtester in getOrders. Here are the
detailed specification and marks available for these 4 functions.
Function name: getTMA
Input parameters: prices; lookbacks. The specific form that these arguments should take is
specified in the template code via the 6 checks that you need to implement.
Expected behaviour: First implement the checks described in the template. Hints are given
below. The function should return a list with three named elements (named short, medium,
and long). Each element should be equal to the value of a simple moving average with the
respective window size as defined by lookbacks. The windows should all end in the same
period, the final row of prices.
Marks available for a correct implementation: 18% for the checks (3% per check); 12% for a
correct return; 30% overall

Function name: getPosSignFromTMA
Input parameters: tma_list is a list with three named elements, short, medium, and long.
These correspond to the simple moving averages as returned by getTMA.
Note: You do not need to check the validity of the function argument in this case, or for
the remaining functions either.
Expected behaviour: This function should return either 0, 1, or -1. If the short value of
tma_list is less than the medium value, and the medium value is less than the long value, it
should return -1 (indicating short). If the short value of tma_list is greater than the
medium value, and the medium value is greater than the long value, it should return 1
(indicating long). Otherwise, the return value should be 0 (indicating flat).
Marks available for a correct implementation: 10%

Function name: getPosSize
Input parameters: current_close: this is the current close for one of the series.
constant: this argument should have a default value of 1000.
Expected behaviour: The function should return (constant divided by current_close) rounded
down to the nearest integer.
Marks available for a correct implementation: 5%

Function name: getOrders
Input parameters: The arguments to this function are always the same for all strategies
used in the backtester framework.
Expected behaviour: This function should implement the strategy outlined below in "Strategy
specification".
Marks available for a correct implementation: 25%
Strategy specification
The strategy should apply the following logic independently to every series.
The strategy does nothing until there have been params$lookbacks$long-many
periods.
In the (params$lookbacks$long+1)-th period, and in every period after, the strategy
computes three simple moving averages with window lengths equal to:
• params$lookbacks$short
• params$lookbacks$medium
• params$lookbacks$long
The corresponding windows always end in the current period. The strategy should in
this period send market orders to assume a position (make sure you take into
account positions from earlier) according to getPosSignFromTMA and getPosSize.
(Limit orders are not required at all, and can be left as all zero.)
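The instruction to take earlier positions into account can be read as follows: the market order for a series is the difference between the position the strategy wants and the position it already holds. The following is a minimal sketch of that arithmetic only. The helper name marketOrdersFromTarget and the example values are hypothetical and are not part of backtester_v5.5; how current positions are obtained inside getOrders is described in the template and in "backtester.pdf".

```r
# Hypothetical helper (not part of backtester_v5.5): converts target positions
# into the market orders needed to reach them from the current positions.
marketOrdersFromTarget <- function(target_pos, current_pos) {
  # Order size = where we want to be minus where we already are.
  target_pos - current_pos
}

# Made-up values for three series, for illustration only.
current_pos <- c(9, 0, -4)   # units currently held in each series
target_pos  <- c(9, -5, 4)   # sign * size computed for this period

market_orders <- marketOrdersFromTarget(target_pos, current_pos)
print(market_orders)
```

In this sketch the first series needs no order because the target equals the current holding, while the third series needs an order large enough to close the short position and open the long one.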
Hints
You can develop the first 3 functions without running the backtester.
For the checks for getTMA you may find the following functions useful:
• The operator ! means not, and can be used to negate a boolean.
• sapply allows one to apply a function element-wise to a vector or list (e.g., to
c("short","medium","long")).
• all is a function that checks if all elements of a vector are true (for example,
it can be used on the result of sapply).
• %in% can be used to check if an element exists inside a vector.
To compute the moving average in getTMA you can use SMA from the TTR package.
For getPosSize, you can use the function floor.
For getOrders, some instructions are given as comments in a2_strategy_template.R.
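To make the hints above concrete, here is a small sketch of how !, sapply, all, and %in% can be combined, followed by a moving average over the last n values. It is an illustration only: the checks shown are examples and not the 6 checks listed in the template, the data are made up, and smaLast is a hypothetical base-R helper used here so that the sketch runs without the TTR package.

```r
# Illustration only: the exact checks required are listed in the template.
lookbacks <- list(short = 2L, medium = 3L, long = 5L)
required <- c("short", "medium", "long")

# %in% tests membership; all() is TRUE only if every required name is present.
has_all_names <- all(required %in% names(lookbacks))

# sapply applies a function to each element; here: is each lookback an integer?
all_integer <- all(sapply(required, function(nm) is.integer(lookbacks[[nm]])))

# ! negates a boolean, so this branch runs only when the check fails.
if (!has_all_names) stop("illustrative error: a required name is missing")

# Hypothetical helper: simple moving average of the last n values of x.
smaLast <- function(x, n) mean(tail(x, n))

# Made-up closing prices; all three windows end at the final value.
closes <- c(10, 11, 12, 13, 14, 15)
tma <- lapply(lookbacks, function(n) smaLast(closes, n))
str(tma)
```

Because lapply keeps the names of the list it iterates over, the result is a list with elements named short, medium, and long.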
Example output for strategy.R
We now give some ways for you to test the correctness of your implementations of getTMA,
getPosSignFromTMA, getPosSize, and getOrders.
getTMA. You can use the provided a2_test_getTMA.R to do one test for each of the 7 cases
(6 error cases, and the normal behaviour). Make sure that you have sourced your
implementation of getTMA. Then, for the first 6 error cases, if your code is correct it will give
the expected error, for example:
> test_getTMA('E01')
Read 10 series from DATA/A2
Error in getTMA(prices, lookbacks_no_names) :
E01: At least one of 'short', 'medium', 'long' is missing from names(lookbacks)
Note: the unedited template will always give 'E01'!
For the "normal" case with correct input arguments, you should get the following output if
your implementation of getTMA is correct:
> test_getTMA('normal')
Read 10 series from DATA/A2
$short
[1] 960.05
$medium
[1] 964.15
$long
[1] 964.7875
If you want to do further testing, you can use the pre-deadline tests, or you can extend
a2_test_getTMA.R by adding alternative examples yourself.
Warning
Because of the sequential nature of the checks, where any single check can halt
execution, it is possible for a bad implementation of one check to preclude getting
further in the code.
You may lose credit for a correct check this way.
Hint: You can always delete a check (just comment out the if and stop) if you know
that it doesn't work.
getPosSignFromTMA. Here is one example input for each of the three possible outputs:
> getPosSignFromTMA(list(short=10,medium=20,long=30))
[1] -1
> getPosSignFromTMA(list(short=10,medium=30,long=20))
[1] 0
> getPosSignFromTMA(list(short=30,medium=20,long=10))
[1] 1
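The rule behind these three outputs can be sketched as two chained comparisons with a fall-through to flat. This is one possible way to write the logic, not a reference solution; the function name signFromTMA is hypothetical and is used here to keep the sketch separate from the template.

```r
# Sketch of the sign rule: -1 when short < medium < long,
# 1 when short > medium > long, and 0 in every other case.
signFromTMA <- function(tma_list) {
  if (tma_list$short < tma_list$medium && tma_list$medium < tma_list$long) {
    return(-1)  # short
  }
  if (tma_list$short > tma_list$medium && tma_list$medium > tma_list$long) {
    return(1)   # long
  }
  0             # flat
}

print(signFromTMA(list(short = 10, medium = 20, long = 30)))
print(signFromTMA(list(short = 10, medium = 30, long = 20)))
print(signFromTMA(list(short = 30, medium = 20, long = 10)))
```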
getPosSize. Here are two examples of correct outputs:
> current_close <- 100.5
> getPosSize(current_close)
[1] 9
> getPosSize(current_close,constant=100.4)
[1] 0
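Both outputs follow from dividing constant by current_close and rounding down: 1000 / 100.5 is roughly 9.95, and 100.4 / 100.5 is just under 1. A minimal sketch of that calculation, using the hypothetical name sizeFromClose so that it is not mistaken for the template function:

```r
# Sketch: number of units = constant / current_close, rounded down.
sizeFromClose <- function(current_close, constant = 1000) {
  floor(constant / current_close)
}

current_close <- 100.5
print(sizeFromClose(current_close))                    # 1000 / 100.5, rounded down
print(sizeFromClose(current_close, constant = 100.4))  # 100.4 / 100.5, rounded down
```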
getOrders. The following table gives the "PDratio" for a correct implementation for three
different time periods and the parameter combination:
params$lookbacks <- list(short=as.integer(5),
medium=as.integer(50),
long=as.integer(100))
start period   end period   PDratio
1              884          2.37
1              819          2.3
1              817          2.26
The three examples of results.yaml (details below) can also be used to further establish the
correctness of getOrders.
Part 2: cross-validation (30%)
Warning
This last part of the assignment requires getOrders to be correct; otherwise you will
get 0 marks for results.yaml.
In this part of the assignment you are asked to do a cross-validated parameter optimization
of the PDratio ("fitAgg").
Every student has their own in-sample and out-of-sample periods based on their MWS
username. This ensures that different results.yaml files are correct for different students.
To get your in-sample and out-of-sample periods, use a2_periods.R as follows. Source it
and run the function getPeriods with your MWS username as per the following example
(where we use the fake username "x1xxx"). Use startIn, endIn, startOut, and endOut as
the start and end of the in-sample and out-of-sample periods respectively.
> source('a2_periods.R')
> getPeriods('x1xxx')
$startIn
[1] 1
$endIn
[1] 884
$startOut
[1] 885
$endOut
[1] 2000
You will do two parameter sweeps: one on your in-sample period, and one on your
out-of-sample period. Each sweep will be over three parameters: the short, medium, and long
lookbacks. (You should not optimize the constant used with getPosSize, and leave it as
1000 as defined in the template code.) The parameter combinations of lookbacks that you
should use are defined by two things: parameter ranges and a further restriction. Make sure
you correctly use both to produce the correct set of parameter combinations. The ranges
are:
Parameter         Minimum value   Increment   Maximum value
short lookback    5               5           10
medium lookback   50              25          100
long lookback     100             50          200
You should further restrict the parameter combinations as follows:
• The medium lookback should always be strictly greater than the short lookback.
• The long lookback should always be strictly greater than the medium lookback.
The correct resulting number of parameter combinations is 16.
Hint
Here are two ways to generate the parameter combinations:
• Use three nested for loops and within the innermost loop ensure that it is a valid
combination that meets the further restriction before proceeding.
• Use expand.grid to create all combinations based on the ranges and then
remove rows from the data.frame if they do not meet the further restriction.
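The second approach can be sketched as follows, using the ranges from the table above. The variable names grid and valid are illustrative choices. The full grid has 2 x 3 x 3 = 18 rows, and the restriction removes the two rows where the medium and long lookbacks are both 100.

```r
# All combinations from the three parameter ranges.
grid <- expand.grid(short  = seq(5, 10, by = 5),
                    medium = seq(50, 100, by = 25),
                    long   = seq(100, 200, by = 50))

# Keep only rows with short < medium < long (strict inequalities).
valid <- grid[grid$short < grid$medium & grid$medium < grid$long, ]
rownames(valid) <- NULL

print(nrow(grid))   # size of the unrestricted grid
print(nrow(valid))  # number of valid combinations
```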
The following information is needed in results.yaml:
1. The parameter combination that gives the best PDratio on the in-sample period; the
corresponding PDratio.
2. The parameter combination that gives the best PDratio on the out-of-sample period; the
corresponding PDratio.
3. rank_on_out: The rank (an integer between 1 and 16) that describes where the
parameter combination from 1. ranks on the out-of-sample period.
4. rank_on_in: The rank (an integer between 1 and 16) that describes where the
parameter combination from 2. ranks on the in-sample period.
There will never be ties for the first place in these rankings, so the correct parameter
combinations for 1. and 2. are always unique.
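One way to obtain the two ranks is to rank the PDratio values in descending order on each period and then look up the winner of the other period. The sketch below uses four made-up combinations labelled a to d with made-up PDratio values, purely to show the mechanics; in the assignment there are 16 combinations and the values come from the backtester.

```r
# Made-up PDratio values for four hypothetical parameter combinations.
pd_in  <- c(a = 1.2, b = 3.1, c = 2.4, d = 0.9)  # in-sample
pd_out <- c(a = 2.0, b = 1.5, c = 2.8, d = 0.7)  # out-of-sample

# Rank 1 is the highest PDratio, so rank the negated values.
rank_in  <- rank(-pd_in,  ties.method = "min")
rank_out <- rank(-pd_out, ties.method = "min")

best_in  <- names(which.max(pd_in))   # best combination in-sample
best_out <- names(which.max(pd_out))  # best combination out-of-sample

# Where does each period's winner rank on the other period?
rank_on_out <- unname(rank_out[best_in])
rank_on_in  <- unname(rank_in[best_out])

print(c(rank_on_out = rank_on_out, rank_on_in = rank_on_in))
```

With these made-up numbers, the in-sample winner b is only third out-of-sample, and the out-of-sample winner c is second in-sample.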
Interpretation
An ideal scenario is for the best in-sample parameter combination to also be the best
out-of-sample parameter combination. In practice, this is often not the case, as we
have seen in the slides. Here, as we did in the slides, we are exploring the difference
between parameter combination performance on in-sample and out-of-sample
periods, where "a good outcome" is for the rank_on_out and rank_on_in to both be
close to 1 (where 1 is ideal).
Example output for results.yaml
In the a2_yamls subdirectory, three examples of results.yaml are provided for the fake
usernames "x1xxx", "x1yyy", and "x1zzz". For "x1xxx", the yaml file contents are:
ins:
  short: 5.0
  medium: 50.0
  long: 150.0
  PDratio: 3.16
  rank_on_out: 3.0
out:
  short: 5.0
  medium: 50.0
  long: 100.0
  PDratio: 4.1
  rank_on_in: 4.0
Once you have a working implementation of getOrders and the code to do the parameter
sweep and ranking you can use these three examples to test your output.
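If you want to produce the file from R without an extra package, the text can be assembled line by line. The sketch below reproduces the "x1xxx" values shown above. The helper yamlBlock is hypothetical, and the two-space indentation and the "5.0" number style are assumptions made to mirror the example; this document does not state which variations the format check accepts, so compare your file against the provided examples and the pre-deadline format test.

```r
# Hypothetical helper (not part of the framework): one YAML block as text lines.
# Indentation and number style are assumptions that mirror the x1xxx example.
yamlBlock <- function(name, lookbacks, pd_ratio, rank_name, rank) {
  c(paste0(name, ":"),
    sprintf("  short: %.1f", lookbacks$short),
    sprintf("  medium: %.1f", lookbacks$medium),
    sprintf("  long: %.1f", lookbacks$long),
    paste0("  PDratio: ", round(pd_ratio, 2)),
    sprintf("  %s: %.1f", rank_name, rank))
}

yaml_lines <- c(
  yamlBlock("ins", list(short = 5, medium = 50, long = 150), 3.16, "rank_on_out", 3),
  yamlBlock("out", list(short = 5, medium = 50, long = 100), 4.1, "rank_on_in", 4)
)

# Written to a temporary directory here; use your own path when submitting.
out_file <- file.path(tempdir(), "results.yaml")
writeLines(yaml_lines, out_file)
cat(readLines(out_file), sep = "\n")
```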
Marks breakdown for results.yaml
Note that the marks for results.yaml are only available if getOrders gives the expected
output. Moreover, the yaml must have the right format -- the pre-deadline tests check for
this to help you.
Here is an example blank results.yaml, shown with additional line numbers:
 1 ins:
 2   short:
 3   medium:
 4   long:
 5   PDratio:
 6   rank_on_out:
 7 out:
 8   short:
 9   medium:
10   long:
11   PDratio:
12   rank_on_in:
Note that the line numbers on the left are not part of the file; they are shown since they are
used in the following table that describes the available marks:
Field(s)                     Line numbers in example   Marks
In-sample best params        2-4                       2.5
In-sample best PDratio       5                         2.5
Out-of-sample best params    8-10                      2.5
Out-of-sample best PDratio   11                        2.5
rank_on_out                  6                         10
rank_on_in                   12                        10
Part 3: submission and pre-deadline tests
You need to submit strategy.R to have your submission marked, i.e., you will get 0 if you
only submit results.yaml; moreover, to get marks for results.yaml, your submitted
getOrders function must give correct output. Submission of one or both files should be done
via the CodeGrade assignment on Canvas.
Pre-deadline tests are provided to help check correctness for getTMA, getPosSignFromTMA,
getPosSize, and getOrders.
For results.yaml, the pre-deadline tests only check that the format of the yaml file is correct.
Use the example yamls described above to check correctness.