Date: 2021-04-20

Main Examination Period 2020
ECOM135 Machine Learning with Applications in Finance Duration: 3 hours
ANSWER FOUR QUESTIONS
The first four questions that you submit will be marked. Cross out or
delete any answers that you do not wish to be marked.
COMPLETED PAPERS SHOULD BE SUBMITTED VIA QMPLUS AND ALSO E-MAILED TO
ECOM135-exam@qmul.ac.uk.
THIS IS AN OPEN BOOK EXAMINATION TO BE CONDUCTED ONLINE. YOU MAY REFER TO
ANY OF THE COURSE MATERIALS, OR ANY OTHER SOURCE OF INFORMATION. YOU MAY
ALSO USE A SPREADSHEET OR CALCULATOR.
THE SHARING OF THIS EXAMINATION PAPER IS AN EXAMINATION OFFENCE.
YOU ARE REQUIRED TO TYPE YOUR ANSWERS, HANDWRITTEN ANSWERS ARE NOT
PERMITTED.
PLEASE ENSURE THAT YOUR WORKING IS CLEARLY SHOWN WITH ALL STEPS OF YOUR
CALCULATION INCLUDED IN YOUR ANSWER DOCUMENT, INCLUDING ANY FORMULA USED.
When writing formulas, please note the following:
· It is acceptable to use the standard alphabet in place of Greek letters. The following are
recommended: a for α, b for β, d for δ, D for Δ, l (lowercase l) for λ, m for μ, v for ν (to avoid
confusion with n), s for σ, S for Σ.
· Use + for addition, - for subtraction, * for multiplication and / for division.
· Where appropriate use an underscore to indicate a subscript, e.g. x_i for $x_i$.
· Use the ^ character for power, e.g. x^2 for $x^2$, x^0.5 for $\sqrt{x}$.
· When referring to the following functions use log(x) or ln(x) for $\log_e x$, logb(x) for $\log_b x$,
exp(x) for $e^x$, cos(x) for $\cos x$.
· Use infty for ∞.
· Use Sum to denote summation of terms, e.g. Sum_(i=1)^n x_i for $\sum_{i=1}^{n} x_i$.
· Use Prod to denote product of terms, e.g. Prod_(i=1)^n x_i for $\prod_{i=1}^{n} x_i$.
· Use D for derivative, e.g. D(x^2) = 2x.
· Use Int for integral, e.g. Int_a^b (x) dx for $\int_a^b x \, dx$.
· Use cap for ∩ and cup for ∪ when referring to sets.
· Where it is not obvious that an estimate is implied then state this in full, e.g. ‘a suitable
estimate of b is 0.125’ or more simply (and equally acceptable) ‘est.b = 0.125’.
· Use brackets as necessary. To make your answer clearer use different types of bracket pairs where
appropriate, e.g. (), [], {}.
Use obvious choices for any other mathematical symbols not listed above that you may require.
Examiner: Dr R.A. Saldanha
© Queen Mary University of London, 2020
Question 1
a) What is a naïve Bayes classifier? Explain the probability model and the decision rule.
[10 marks]
b) Consider the following training dataset which gives selected firm type, size, UK domicile and
whether or not the firm has been granted a margin trading account.
Type             Size    UK   Margin
Hedge Fund       Medium  no   yes
Asset Manager    Large   yes  yes
Investment Bank  Medium  no   no
Asset Manager    Small   yes  yes
Investment Bank  Medium  no   no
How would a naïve Bayes classifier determine whether a new hedge fund that is small in size and
domiciled in the UK would be given a margin trading account?
[8 marks]
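The arithmetic behind Part b) can be sketched as follows. This is a minimal illustration rather than a model answer: it computes the unnormalised score (class prior times the product of per-feature conditional frequencies) for each Margin class, using the five rows of the table. The function and variable names are illustrative, and the sketch assumes no Laplace smoothing; with smoothing, the zero count of hedge funds in the "no" class would no longer force that class's score to zero.

```python
from fractions import Fraction

# Training rows from the table: (type, size, uk, margin)
rows = [
    ("Hedge Fund", "Medium", "no", "yes"),
    ("Asset Manager", "Large", "yes", "yes"),
    ("Investment Bank", "Medium", "no", "no"),
    ("Asset Manager", "Small", "yes", "yes"),
    ("Investment Bank", "Medium", "no", "no"),
]
new_firm = ("Hedge Fund", "Small", "yes")  # the firm described in the question


def naive_bayes_scores(rows, x):
    """Return prior * product of conditional frequencies for each class (no smoothing)."""
    scores = {}
    for label in sorted({r[-1] for r in rows}):
        in_class = [r for r in rows if r[-1] == label]
        score = Fraction(len(in_class), len(rows))  # class prior P(class)
        for j, value in enumerate(x):
            matches = sum(1 for r in in_class if r[j] == value)
            score *= Fraction(matches, len(in_class))  # P(feature_j = value | class)
        scores[label] = score
    return scores


scores = naive_bayes_scores(rows, new_firm)
total = sum(scores.values())
posterior = {label: s / total for label, s in scores.items()}
print(scores)
print(posterior)
```

Exact fractions are used so that each factor can be checked by hand against the table.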
c) You work in the credit risk department of a prime broker. A potential hedge fund client appears to
have a poor credit score. However, such a score is quite prevalent, around four out of five hedge
funds have much the same score. You also know that historically the probability of such a score
given observed defaults is also high at around 40%. The actual probability of hedge fund default is,
however, much lower around 7%. Your firm only accepts clients who have less than a 5% risk of
default. Based on the available evidence, do you recommend giving this client a trading account?
[7 marks]
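One way to organise the calculation in Part c) is through Bayes' theorem, P(default | score) = P(score | default) P(default) / P(score). The sketch below assumes that "four out of five hedge funds" is read as P(score) = 0.8 and that the other quoted figures are used as stated; the variable names are illustrative.

```python
# Figures quoted in the question, read as probabilities (an interpretation)
p_score = 4 / 5               # P(poor score) among hedge funds
p_score_given_default = 0.40  # P(poor score | default)
p_default = 0.07              # P(default)
threshold = 0.05              # firm's maximum acceptable default risk

# Bayes' theorem: P(default | poor score)
p_default_given_score = p_score_given_default * p_default / p_score
accept = p_default_given_score < threshold
print(round(p_default_given_score, 4), accept)
```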
Question 2
The Poisson distribution with parameter λ has density
$$P(X = x) = \frac{\lambda^x e^{-\lambda}}{x!} \quad \text{for } x = 0, 1, \ldots$$
a) Explain generally how this particular statistical distribution arises.
[4 marks]
b) i. Under what conditions does the Poisson distribution serve as an approximation to the
binomial distribution?
ii. Under what conditions does the normal distribution serve as an approximation to the Poisson
distribution?
[4 marks]
c) High-frequency algorithmic trading errors for a particular hedge fund occur on only two trading
desks. The first desk experiences an average of one error every five weeks. The second desk
experiences an average of one error every eight weeks.
i. What would be a suitable combined probability model for these data?
ii. Calculate the probability of three or more high-frequency algorithmic trading errors occurring
for the hedge fund during a particular week.
[8 marks]
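A minimal sketch of the calculation in Part c), assuming the two desks generate errors independently so that the weekly total is Poisson with rate equal to the sum of the two desk rates. The independence assumption and the helper name `poisson_pmf` are not given in the question.

```python
import math

rate_desk_1 = 1 / 5   # errors per week, first desk
rate_desk_2 = 1 / 8   # errors per week, second desk
combined_rate = rate_desk_1 + rate_desk_2  # sum of independent Poisson rates


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson random variable with mean lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


# P(X >= 3) = 1 - P(X = 0) - P(X = 1) - P(X = 2)
p_at_least_three = 1 - sum(poisson_pmf(k, combined_rate) for k in range(3))
print(combined_rate, p_at_least_three)
```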
d) A particular stock exchange experiences the following monthly main hardware computer failures
over a period of 20 years:
Number of failures   Number of months
0                    169
1                    62
2                    7
3                    2
4+                   0
Do these computer failures appear to occur randomly in time? Explain your assumptions,
calculations and conclusion carefully.
[9 marks]
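One common way to approach Part d) is a chi-square goodness-of-fit test against a Poisson distribution whose mean is estimated from the data. The sketch below is one possible set-up, not the required one: pooling the cells for two or more failures (so that every expected count is at least 5) and testing at the 5% level are choices made here, and 3.841 is the standard 95% point of the chi-square distribution with 1 degree of freedom.

```python
import math

observed = {0: 169, 1: 62, 2: 7, 3: 2}  # failures per month -> number of months
n_months = sum(observed.values())
lam_hat = sum(k * c for k, c in observed.items()) / n_months  # sample mean


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson random variable with mean lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


# Pool counts of 2 or more so that every expected count is at least 5
obs_cells = [observed[0], observed[1], observed[2] + observed[3]]
exp_cells = [n_months * poisson_pmf(0, lam_hat), n_months * poisson_pmf(1, lam_hat)]
exp_cells.append(n_months - sum(exp_cells))  # expected months with 2+ failures

chi_sq = sum((o - e) ** 2 / e for o, e in zip(obs_cells, exp_cells))
dof = len(obs_cells) - 1 - 1  # cells minus 1, minus one estimated parameter
critical_5pct = 3.841         # chi-square 95% point for 1 degree of freedom
print(lam_hat, exp_cells, chi_sq, chi_sq < critical_5pct)
```

A statistic below the critical value would mean the data give no evidence against the Poisson model at that level.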
Question 3
a) Why is Principal Component Analysis (PCA) sometimes described as an unsupervised machine
learning technique?
[2 marks]
b) For what reasons might you apply PCA?
[3 marks]
c) The following sample variance-covariance matrix S has been calculated using the daily log returns
(×100%) for four stocks from the Nasdaq 100 index over one year:
        AAPL   CSCO   GOOGL  NVDA
AAPL    2.75
CSCO    1.36   2.41
GOOGL   1.39   0.93   2.23
NVDA    2.49   1.95   1.75   6.62
What is the corresponding sample correlation matrix R? What conclusions (if any) do you draw
from R?
[7 marks]
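The conversion asked for in Part c) uses r_ij = s_ij / sqrt(s_ii * s_jj). A minimal sketch, using only the lower triangle of S printed above and filling in the upper triangle by symmetry (variable names are illustrative):

```python
import math

names = ["AAPL", "CSCO", "GOOGL", "NVDA"]
# Lower triangle of S as printed in the question
lower = [
    [2.75],
    [1.36, 2.41],
    [1.39, 0.93, 2.23],
    [2.49, 1.95, 1.75, 6.62],
]
n = len(names)

# Fill in the full symmetric matrix from the lower triangle
S = [[lower[max(i, j)][min(i, j)] for j in range(n)] for i in range(n)]
sd = [math.sqrt(S[i][i]) for i in range(n)]  # sample standard deviations

# Correlation: covariance divided by the product of standard deviations
R = [[S[i][j] / (sd[i] * sd[j]) for j in range(n)] for i in range(n)]

for name, row in zip(names, R):
    print(f"{name:6s}", " ".join(f"{r:6.3f}" for r in row))
```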
d) A PCA analysis was carried out using exactly the same data used to compute S. The following
computer output was obtained:
Loadings:
Comp.1 Comp.2 Comp.3 Comp.4
AAPL 0.421 0.456 0.153 0.769
CSCO 0.337 0.415 -0.800 -0.272
GOOGL 0.312 0.487 0.580 -0.574
NVDA 0.782 -0.618 0.032 -0.068
Importance of components:
Comp.1 Comp.2 Comp.3 Comp.4
Standard deviation 3.077 1.443 1.182 1.007
Proportion of Variance 0.678 0.149 0.100 0.073
How valid is this analysis? What are your conclusions?
[9 marks]
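A few consistency checks on the printed output can inform the discussion in Part d). The sketch below recomputes the proportions of variance from the standard deviations, compares the total component variance with the trace of S, and checks that the first loading vector has roughly unit length. A total close to the trace of S (about 14) rather than 4 would suggest the analysis was run on the covariance matrix and not the correlation matrix. Any small gap between the two totals may reflect rounding or a different divisor in the software; the question does not say.

```python
std_devs = [3.077, 1.443, 1.182, 1.007]  # "Standard deviation" row of the output
variances = [s ** 2 for s in std_devs]   # component variances (eigenvalues)
total_variance = sum(variances)
proportions = [round(v / total_variance, 3) for v in variances]

trace_S = 2.75 + 2.41 + 2.23 + 6.62      # sum of the diagonal of S

loadings_comp1 = [0.421, 0.337, 0.312, 0.782]     # Comp.1 column of the loadings
norm_comp1 = sum(w ** 2 for w in loadings_comp1)  # should be close to 1

print(proportions)
print(total_variance, trace_S)
print(norm_comp1)
```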
e) How might you use the results from PCA in other analyses?
[4 marks]
Question 4
a) Why is it often necessary to use numerical optimization to estimate the parameters of a machine
learning model? Give two examples to support your answer.
[3 marks]
b) The following fictitious ordinal integer credit score data y is displayed as a histogram in Figure 1
below. The higher the score (max 999) the better the credit quality. A reasonable credit score is
considered to be a score higher than 721. What do you observe about these data? What type of
model for these data would you consider fitting?
[7 marks]
Figure 1: Credit score data (350 observations).
c) The following log-likelihood function was chosen as a suitable model for the data displayed above in
Part b):
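$$l(p, \mu_1, \sigma_1, \mu_2, \sigma_2; y) = \sum_{i=1}^{n} \log\left\{ \frac{p}{\sigma_1}\,\phi\!\left(\frac{y_i - \mu_1}{\sigma_1}\right) + \frac{1 - p}{\sigma_2}\,\phi\!\left(\frac{y_i - \mu_2}{\sigma_2}\right) \right\}$$
where $p$ is a proportion, $0 \le p \le 1$; $n$ is the number of observations; $\mu_1, \sigma_1$ and $\mu_2, \sigma_2$ are the means
and standard deviations for two distinct populations; and $\phi$ is the standard normal density
function. Explain what this model is attempting to do.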
[5 marks]
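The log-likelihood in Part c) can be written as a short function, which is the form a derivative-free optimiser such as Nelder–Mead would work with (applied to its negative). The sketch below is a direct transcription using only the standard library. The function names and the `toy_scores` values are made up for illustration and are not the data behind Figure 1.

```python
import math


def std_normal_pdf(z):
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def mixture_loglik(params, y):
    """Log-likelihood of a two-component normal mixture.

    params = (p, mu1, s1, mu2, s2), with 0 <= p <= 1 and s1, s2 > 0.
    """
    p, mu1, s1, mu2, s2 = params
    total = 0.0
    for yi in y:
        density = ((p / s1) * std_normal_pdf((yi - mu1) / s1)
                   + ((1.0 - p) / s2) * std_normal_pdf((yi - mu2) / s2))
        total += math.log(density)
    return total


# Made-up scores purely for illustration; NOT the data behind Figure 1
toy_scores = [560, 590, 610, 800, 850, 870, 900, 930]
well_placed = mixture_loglik((0.4, 585.0, 30.0, 870.0, 50.0), toy_scores)
badly_placed = mixture_loglik((0.4, 700.0, 30.0, 750.0, 50.0), toy_scores)
print(well_placed, badly_placed)
```

Parameter values whose two means sit near the two clusters of scores give a higher log-likelihood than values that place both means between the clusters.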
d) i. Describe the Nelder–Mead optimization method for estimating the parameters for the model
given in Part c).
ii. Parameter estimates for the model given in Part c) were obtained as $\hat{p} = 0.31$, $\hat{\mu}_1 = 581.31$,
$\hat{\sigma}_1 = 45.14$, $\hat{\mu}_2 = 849.53$ and $\hat{\sigma}_2 = 78.54$. How reasonable do these estimates appear to you?
[10 marks]
Question 5
a) What is generally meant by an ensemble method in machine learning?
[3 marks]
b) What is k-fold cross-validation? How does cross-validation differ from the bootstrap?
[4 marks]
c) What do the following terms mean in machine learning:
i. bagging
ii. boosting
iii. stacking
[6 marks]
d) i. What is gradient boosting and what are the method’s aims?
ii. How does gradient boosting differ from random forest?
iii. Why might k-fold cross-validation form part of a gradient boosting fitting algorithm?
[9 marks]
e) What is semi-supervised machine learning? What assumptions might be appropriate to employ with
such data?
[3 marks]
Question 6
a) Give a general description of a neural network.
[4 marks]
b) Give four examples of a neural network in use.
[2 marks]
c) The following questions are on the McCulloch & Pitts (1943) model of a biological neuron.
i. Why was the model thought to be a reasonable mathematical representation of brain biology
(albeit a highly simplified representation)?
ii. What are the main disadvantages of the model?
iii. Why might a sigmoid function be better than the model’s activation function?
[5 marks]
d) What is deep learning?
[2 marks]
e) In what situations might deep neural networks be preferred to shallow neural networks?
[3 marks]
f) What is gradient descent and what is its relevance in a neural network context?
[5 marks]
g) In what situations would you standardize the inputs to your neural network? Why is this necessary?
[4 marks]
Question 7
The following dataset, consisting of 8 observations in 2 dimensions x1 and x2, is available. There is an
associated red cross or blue dot group label for each observation.
x1   x2   group
-2   2    red cross
-4   6    red cross
-3   4    red cross
-3   2    red cross
-3   6    blue dot
-2   4    blue dot
-4   8    blue dot
-2   5    blue dot
Figure 2: Data plot and maximal margin classifier for the observations given in the table above.
a) The following questions (i.–vii.) all relate to the table and graph (Figure 2) shown on the previous
page.
i. Explain why a maximal margin classifier can be applied to these data.
[3 marks]
ii. Explain what the line shown in the graph as b) is.
[2 marks]
iii. What do the green dashed lines labelled in the graph as c) depict?
[2 marks]
iv. State the classification rule for these data.
[3 marks]
v. How many support vectors for the maximal margin classifier are there?
[2 marks]
vi. State which vectors are not support vectors.
[2 marks]
vii. Use the classification rule derived from the maximal margin classifier to determine the groups
for the following two new row vectors (-2.5, 2) and (-3.75, 7).
[4 marks]
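Figure 2 is not reproduced here, so the sketch below works from the table alone. Three red crosses satisfy 2*x1 + x2 = -2 and three blue dots satisfy 2*x1 + x2 = 0, which suggests the candidate boundary 2*x1 + x2 + 1 = 0 midway between them. That boundary is found by inspection and is an assumption, not a reading of the figure; the code checks that it separates the training data, lists the points lying on the two margin lines, and applies the resulting rule to the two new vectors in vii.

```python
red = [(-2, 2), (-4, 6), (-3, 4), (-3, 2)]
blue = [(-3, 6), (-2, 4), (-4, 8), (-2, 5)]


def f(x1, x2):
    # Candidate boundary 2*x1 + x2 + 1 = 0, found by inspection (an assumption)
    return 2 * x1 + x2 + 1


def classify(x1, x2):
    """Classification rule: positive side is blue dot, negative side is red cross."""
    return "blue dot" if f(x1, x2) > 0 else "red cross"


# Every training point should fall on the correct side of the boundary
separates = all(f(*pt) < 0 for pt in red) and all(f(*pt) > 0 for pt in blue)

# Points with |f| = 1 lie on the margin lines 2*x1 + x2 = -2 and 2*x1 + x2 = 0
support = [pt for pt in red + blue if abs(f(*pt)) == 1]
others = [pt for pt in red + blue if abs(f(*pt)) != 1]
margin_width = 2 / (2 ** 2 + 1 ** 2) ** 0.5  # distance between the two margin lines

print(separates, support, others, margin_width)
print(classify(-2.5, 2), classify(-3.75, 7))
```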
b) Explain why the maximal margin classifier is unlikely to be of much use in most practical problems.
What adaptations to this method allow for more real world application?
[7 marks]
End of Paper