ACTL1101 Introduction to Actuarial Studies

1. Part One: Australian Schools Data
1.1 Context
In this part of the assignment, you will revisit a dataset on Australian Schools, which you
encountered before in ‘R Questions on Week 4’. This dataset is available on Moodle under
the name school data2019.csv. On Moodle there is also a file ‘Description of Variables’,
which briefly describes what each variable represents.
In this part of the assignment, you will create some visualisations to deepen your understanding
of this data. Note that, compared to the dataset encountered previously, school data2019.csv
differs in two ways:
• It has more variables: in particular, it contains data about the average taxable income and tax
paid (from tax year 2017-2018) for people living within the postcode where each school
is located. Those additional income variables are called:
– Taxable.Income: average taxable income, within the postcode
– Net.Tax: average net tax paid, within the postcode
– Salary.or.Wages: average salary/wages, within the postcode
• A few schools have been removed (because the income data for the postcodes of
those schools was not available).
For your information, the school data is publicly available here, while the income data is
available here (with its license found here).¹
¹ We must disclose that, compared to the original income/tax data found here, we have modified the income
data to obtain averages by individual (as opposed to the total amounts by postcode found in the original data).
ACTL1101 Introduction to Actuarial Studies 2020 Main Assignment
1.2 Your Tasks
1. (2pts) Produce a visualisation of the distribution of variable ICSEA across all schools.
Briefly describe this distribution.
Note: You may want to review the definition of ICSEA found here.
2. (2pts) Produce a visualisation that illustrates, by State, the proportion of schools with
their School.Sector equal to either ‘Government’, ‘Catholic’, or ‘Independent’. Briefly
discuss what you observe.
3. (2pts) Add a variable called Income.Bracket to this dataset. This new variable should
be based on variable Taxable.Income, and be equal to:
• ‘Low’ if Taxable.Income is below its 20th percentile.
• ‘Medium’ if Taxable.Income is equal to or above its 20th and below its 90th percentile.
• ‘Rich’ if Taxable.Income is equal to or above its 90th and below its 99th percentile.
• ‘Very Rich’ if Taxable.Income is equal to or above its 99th percentile².
What is the average Taxable.Income within each Income.Bracket?
4. (2pts) Produce a visualisation which illustrates the relationship between ICSEA and the
new variable Income.Bracket, and briefly discuss what you observe.
5. (2pts) Open Question: use any variable(s) you want in this dataset to tell a brief story
about the data. This can be anything you find relevant, but you must include at
least one visualisation to support your ‘story’. Example: an interesting/surprising link
between variables, an insight that could help public policy, a finding that is the starting
point for new research, etc.
² Hint: consider using the function quantile().
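The bracket definition in Task 3 can be illustrated on made-up data. The sketch below is only one possible approach: the data frame `schools` and its simulated `Taxable.Income` values are hypothetical stand-ins for the real dataset, and the use of `cut()` together with `quantile()` is an assumption, not a prescribed method.

```r
# Synthetic stand-in for the real data (hypothetical values, for illustration only)
set.seed(1)
schools <- data.frame(Taxable.Income = rlnorm(1000, meanlog = 11, sdlog = 0.4))

# 20th, 90th and 99th percentiles of Taxable.Income
cutoffs <- quantile(schools$Taxable.Income, probs = c(0.20, 0.90, 0.99))

# right = FALSE gives left-closed intervals [a, b), matching
# "equal to or above the lower cutoff and below the upper cutoff"
schools$Income.Bracket <- cut(
  schools$Taxable.Income,
  breaks = c(-Inf, cutoffs, Inf),
  labels = c("Low", "Medium", "Rich", "Very Rich"),
  right  = FALSE
)

# Average Taxable.Income within each bracket
bracket.means <- tapply(schools$Taxable.Income, schools$Income.Bracket, mean)
print(bracket.means)
```

Using `right = FALSE` is the design choice that matters here: with the default `right = TRUE`, a value exactly equal to a cutoff would fall into the lower bracket rather than the upper one.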
2. Part Two: Utility of an Insurer
2.1 Context
In this second part of the assignment (which is totally unrelated to the first part), you will work
on an insurance problem that would be difficult to tackle without programming. Note that you
must use the numerical values of w0, n and r0, given in the file parameters by student on
Moodle (look for the values that correspond to your zID). The context is as follows.
An insurance company (insurer) has a current capital (initial wealth) of w0. It can choose
to insure a new group of n people. The individuals in this group are independent, and
they all have similar characteristics, so that each individual has a probability of incurring
a loss within the next year equal to q. The coverage offered is only for the next year. The
payout by the insurance company to each individual will be:
• 1 in the case of a loss.
• 0 otherwise.
We can model the total future payout made by the insurer by:
$$S = X_1 + X_2 + \dots + X_n,$$
where each $X_i \sim \text{Bernoulli}(q)$, $i = 1, 2, \dots, n$, is the individual payout on the policy of
individual $i$.
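Since the individual payouts are independent and identically distributed Bernoulli variables, the total payout follows a binomial distribution. The standard facts below are not stated in the assignment but follow directly from the model above:

```latex
S \sim \text{Binomial}(n, q), \qquad
\Pr(S = s) = \binom{n}{s} q^{s} (1-q)^{n-s}, \quad s = 0, 1, \dots, n,
\qquad
\mathbb{E}[S] = nq, \qquad
\operatorname{Var}(S) = nq(1-q).
```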
Based on its risk aversion policy, this insurer makes financial decisions using a utility
function given by:
v(w) = log(w), for w > 0,
where log() is the natural logarithm, i.e. such that log(e) = 1. The premium charged by
the insurance company to every individual is set to
Premium = c · q
where c > 0 is a ‘loading factor’: a bigger value of c implies a bigger margin for a profit.
Note: Be careful that the utility function v(w) is not defined for negative values of w. Hence,
in your R programming, be wary of not computing v(·) on amounts that are negative, which
could result in error messages.
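The warning about negative wealth can be made concrete with a small sketch. It assumes (this is an interpretation, not something the assignment states) that the insurer's end-of-year wealth is w0 + n·c·q − S, and that the Principle of Zero Utility means choosing c so that expected utility with the new business equals v(w0). The function names `expected.utility` and `find.c.sketch`, the argument name `loading` (standing for c), the search interval and the parameter values in the usage line are all hypothetical.

```r
# Expected log-utility of end-of-year wealth, assuming the insurer covers all
# n individuals and collects n * loading * q in premiums (an assumption).
expected.utility <- function(loading, q, w0, n) {
  s      <- 0:n                               # possible values of S
  p      <- dbinom(s, size = n, prob = q)     # P(S = s), with S ~ Binomial(n, q)
  wealth <- w0 + n * loading * q - s
  keep   <- p > 0                             # skip outcomes whose probability underflows to 0
  # log() is undefined for non-positive wealth: treat such a deal as unacceptable
  if (any(wealth[keep] <= 0)) return(-Inf)
  sum(p[keep] * log(wealth[keep]))
}

# Loading factor that leaves the insurer indifferent (zero-utility reading).
# The interval c(0.5, 5) is an illustrative guess and may need adjusting.
find.c.sketch <- function(q, w0, n) {
  gap <- function(loading) expected.utility(loading, q, w0, n) - log(w0)
  uniroot(gap, interval = c(0.5, 5), tol = 1e-10)$root
}

# Usage with made-up parameters (not the values from Moodle)
find.c.sketch(q = 0.10, w0 = 100, n = 10)
```

Returning `-Inf` instead of calling `log()` on a non-positive amount avoids `NaN` warnings, but `uniroot()` needs finite values of opposite sign at both ends of the interval, so the interval has to be chosen where wealth stays positive.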
2.2 Your Tasks
1. (3pts) Assume that the insurer uses the Principle of Zero utility to fix c. Write a
function called find.c(), which returns the numerical value of c. Your function should
have q as an argument. Using this function, find the value of c for q = 0.01 and for
q = 0.10.
2. (2pts) Use your function find.c() from Q1 to create a graph which illustrates the
relationship between c and q. Analyse and interpret this relationship.
3. (3pts) Now assume that the insurer can choose to insure, or not, one specific individual
in that group. Further, assume that this individual has the exact same utility function
as the insurer and an initial wealth of r0. Then, we can show that, based on its utility
function, the individual will be willing to buy insurance if:
$$r_0 > (r_0 - 1)^q \cdot (r_0)^{1-q} + cq.$$
On the other hand, the insurer will be willing to take on this risk (sell the insurance) if¹
$$w_0 < (w_0 + cq - 1)^q \cdot (w_0 + cq)^{1-q}.$$
An interesting point is that there is a set of values of q and c such that both the individual
and the insurer would be willing to enter the insurance deal.
Your task: Produce a 13 × 13 table, where columns represent increasing values of c,
from 0.9 to 2.1 (left to right) in steps of 0.1. Rows represent increasing values of q,
from q = 0.01 to q = 0.25 (top to bottom) in steps of 0.02. Inside the table,
each cell displays a number, either:
• 1: if only the insurer is happy with that particular combination of {q, c}.
• 0: if both the insurer and individual are happy with that {q, c}.
• -1: if only the individual is happy with that {q, c}.
4. (2pts) Analyse and explain your results from Question 3. In particular, explain why it is
possible for two parties with the exact same utility function to happily stand on opposite
sides of a financial transaction (in this case, the exchange of an insurance risk).
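The grid of {q, c} combinations described in Task 3 can be evaluated without loops. The following is a minimal sketch under stated assumptions: the wealth values `w0 = 100` and `r0 = 10` are made up (the real ones come from the Moodle file), the helper names `buyer.ok` and `seller.ok` are hypothetical, `loading` stands for c, and cells where neither party is willing are left as `NA` because the task statement does not define a code for that case.

```r
w0 <- 100   # hypothetical insurer wealth (not a value from Moodle)
r0 <- 10    # hypothetical individual wealth (not a value from Moodle)

c.values <- seq(0.9, 2.1, by = 0.1)     # columns: 13 loading factors
q.values <- seq(0.01, 0.25, by = 0.02)  # rows: 13 loss probabilities

# The two willingness conditions from the task statement
buyer.ok  <- function(q, loading) r0 > (r0 - 1)^q * r0^(1 - q) + loading * q
seller.ok <- function(q, loading) {
  w0 < (w0 + loading * q - 1)^q * (w0 + loading * q)^(1 - q)
}

# outer() evaluates each condition on every (q, c) pair
buy  <- outer(q.values, c.values, buyer.ok)
sell <- outer(q.values, c.values, seller.ok)

deal <- matrix(NA_integer_, nrow = length(q.values), ncol = length(c.values),
               dimnames = list(paste0("q=", q.values), paste0("c=", c.values)))
deal[sell & !buy] <- 1L    # only the insurer is happy
deal[sell & buy]  <- 0L    # both are happy
deal[!sell & buy] <- -1L   # only the individual is happy
print(deal)
```

Both conditions raise a wealth amount to a fractional power, so they are only meaningful when `r0 - 1` and `w0 + loading * q - 1` are positive; that holds for the illustrative values used here.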
¹ It is not required here, but for extra practice for the Final Exam: try to prove those inequalities. It is not too
hard :-).
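One possible route to the two willingness conditions of Task 3 is sketched below. It assumes each party compares expected log-utility with and without the deal, and that the premium cq is paid up front; this is a reading of the setup rather than an official solution.

```latex
\begin{align*}
\text{Individual buys if}\quad
  \log(r_0 - cq) &> q \log(r_0 - 1) + (1-q)\log(r_0) \\
\iff\quad r_0 - cq &> (r_0 - 1)^{q}\,(r_0)^{1-q}
  \quad\text{(exponentiating both sides)} \\
\iff\quad r_0 &> (r_0 - 1)^{q}\,(r_0)^{1-q} + cq. \\[6pt]
\text{Insurer sells if}\quad
  \log(w_0) &< q \log(w_0 + cq - 1) + (1-q)\log(w_0 + cq) \\
\iff\quad w_0 &< (w_0 + cq - 1)^{q}\,(w_0 + cq)^{1-q}.
\end{align*}
```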
3. Format Requirements
Here we explain the format requirements you must satisfy for this assignment. Heavy
penalties will be applied for non-compliance, and in particular we will not mark any material
that exceeds the page limit.
• You must submit your assignment on Turnitin (under section ‘Main Assignment’ in
Moodle). The deadline is November 20th at 13:00.
• You must submit two files:
– a .pdf file: contains your answers to all questions.
– a .R file: contains all the R codes you used to produce your answers.
• About the .pdf file:
– It includes a title page with: student name, student zID.
– The page format is A4 (this is the standard Australian format).
– The minimum font size used is an equivalent of ‘Times New Roman’ size 11.
– The minimum line spacing used is 1.15.
– The margins should not be narrower than the ‘narrow’ option in Word (which is
0.5 inches every side).
– The answers (including sub-parts) are numbered in the same way they are numbered
in the statement of the questions.
– Your answers to Part One (including all plots) must fit on 2 pages.
– Your answers to Part Two (including the plot) must fit on 1 page.
– All R codes necessary to produce your results (i.e. the content of the .R file)
must also be placed in an Appendix to your .pdf file. There is no page limit on
this Appendix but the efficiency of your code will be graded (see Marking Criteria).
• About the .R file:
– Your R codes must run as they are (and produce exactly the results in your
assignment). If we cannot run your code, you will lose ALL marks associated
with R code (C1 and C2 in the Marking Criteria, see Section 4).
– Your R codes must contain ALL steps necessary to answer the questions in this
assignment. To be specific: you are NOT allowed to do any data manipulation in
Excel, or any other software.
4. Marking Criteria
Each individual Question gets attributed a fixed number of marks. To assess your answers,
we will use a series of criteria. Those criteria are stated below, with a brief description that
corresponds to a ‘HD mark’. Not all criteria are relevant to every sub-question: find a detailed
mapping below.
• C1: Code Correctness: Your R codes, functions and algorithms produce exactly the
desired results, and do not produce any irrelevant/superfluous results.
• C2: Code Efficiency: Your R codes are extremely efficient, without sacrificing readability.
Your R codes are extremely well organised and easy to follow.
• C3: Analysis: Your analysis is insightful and accurate. Your interpretation of your
results is correct, clear, precise and shows a great depth of understanding and critical
thinking. Your writing is concise, fluent and devoid of typos, grammatical and syntactical errors.
• C4: Choice of Visualisation: Your choice of which visualisation to use is excellent: it
conveys all (and only) the appropriate information.
• C5: Presentation: The formatting and presentation of your results and/or visualisations
is impeccable: clear, readable and aesthetic.
4.1 Part One
For each sub-question (Q1, Q2, Q3, Q4, Q5) in Part One, the relevant marking criteria are:
• Q1: C1, C3, C4, C5
• Q2: C1, C3, C4, C5
• Q3: C1, C2
• Q4: C1, C3, C4, C5
• Q5: C3, C4, C5
4.2 Part Two
For each sub-question (Q1, Q2, Q3, Q4) in Part Two, the relevant marking criteria are:
• Q1: C1, C2
• Q2: C1, C3, C4, C5
• Q3: C1, C2, C5
• Q4: C3
4.3 Plagiarism Awareness
This is an individual assignment. While we have no problem with students discussing
assignment problems if they wish, the material each student submits must be their own
individual work. Students should make sure they understand what plagiarism is.
In particular, any R code you present must be from your own computer, and developed by
you alone. While some small elements of code are likely to be similar with 265 students
performing the same task, big patches of identical code (even with different variable names,
layout, or comments) will be considered as plagiarism. Turnitin picks this up easily, so cases
of plagiarism have a very high probability of being discovered. The best strategy to avoid
any problem is to never share bits and pieces of code with other students.
5. Answering Students’ Questions
Any question or clarification about the assignment must be posted on the Ed Forum, under
category ‘Main Assignment’. We do not plan to give out many additional hints, but if we were
to do so, we want everyone to benefit from them.
Important Note: The deadline for submission of this assignment is Friday November 20th
at 13:00. However, we will stop answering any questions about the assignment on Monday
November 16th at 13:00. The rationale for this is twofold:
• we want to incentivise students to start the assignment early
• we want to be fair to assiduous students who decide to submit their assignment ahead
of time. Were we to give hints right before the deadline, those students would be
penalised for their earliness.