ATHK1001 ANALYTIC THINKING: ASSIGNMENT 1, 2021
Due date: 11:59pm Friday, April 23rd (Week 7). Late penalty of 5% per calendar day applies.
Online submission: All submissions are to be made online on the ATHK1001 Canvas website.
Submissions will be checked for plagiarism.
Incorrect submissions: If you discover before the closing date that the file you submitted on
Turnitin was incorrect, and let us know, you may be given the option to resubmit a corrected
version which will incur a 50% penalty or the relevant lateness penalty, whichever is greater.
Word length: 750 words across all questions (excluding references in Question 12). A penalty of
10% will apply to papers that exceed this limit by more than 10%, a 20% penalty to papers that exceed
it by more than 20%, and a 30% penalty to papers that exceed it by more than 30%.
Total marks: 60 (17.5% of total grade for class)
Background and Aims
Sometimes people do not have access to the data they need, so they have to make informed
estimates. However, there can be biases in people’s estimates. Tversky and Kahneman (1974) identified
one such bias they called “anchoring” (amongst other biases in decision making). They showed that when
people try to estimate the numerical answer to a question they do not know, they can be influenced by
a number they have just seen. For example, if people estimate the proportion of African countries in the
United Nations, then they give a higher estimate if they first had to say whether the answer was higher or lower
than 65 rather than 10. This was true even though they were told that the number was randomly
generated. The first number appeared to anchor their estimate and drag the estimate towards the anchor.
The experiment you did during Week 2 tutorials explored the anchoring bias in estimation.
Although participants in Tversky and Kahneman’s (1974) study were told that the number was
randomly generated, perhaps they did not believe the experimenter and instead thought the number they
were given was actually related to the true answer, so rather than the number they were given being an
anchor it may have been regarded as useful information. To test this possibility, in our experiment for one
of our tasks we had participants generate a number based on their own phone number, so they knew it
was unrelated to the question. Participants then said whether Attila the Hun was defeated at the
Battle of Chalons before or after the year equal to their phone number (plus 100). They then
estimated the true answer. If a knowingly random number can anchor an estimate, then the higher a
participant’s phone number the higher the estimate should be.
The anchoring bias seems to imply that any random number we are exposed to could influence us
whenever we have to try to estimate a numerical answer. So an important question is how wide is the
scope of the anchoring effect? We tested this using a task based on that used by Strack and Mussweiler
(1997). We gave participants a set of pairs of questions. The first question in the pair asked them whether
the answer to the question was higher or lower than a given answer (i.e., the anchor); the second question
in the pair asked them to give a numerical answer either to the same question or to a different question.
The anchor was either substantially higher than the true answer or substantially lower than it. If an
anchor must be directly related to the number being estimated in order to be influential, then when the anchor is high
estimates should be higher when the second question in the pair is the same than when it is different, and
when the anchor is low estimates should be lower when the second question in the pair is the same than
when it is different.
In the class experiment we examined hypotheses about both of these tasks.
Method
Participants
A total of 294 students from the Analytic Thinking course (ATHK1001) participated as part of a class
experiment. Of these, 177 were female and 117 were male, and they had a mean age of 19.5 years.
Additional students participated but either did not complete the experiment or did not consent
to having their data analysed.
Materials
For the Phone task participants answered three questions:
“Think of the last 3 digits of your telephone number, now add 100 to its value, then write the
answer in the box below” (adding 100 was a way to make participants focus on the number).
“Looking at the number you created above did the following event occur before or after this date
AD: Attila the Hun was defeated at the Battle of Chalons”
“Provide your estimate of what year AD the event occurred” (true answer is 451)
An example of the questions asked in the Paired task was the following: “Please indicate whether
you think the true answer for the quantity is higher or lower than the random number in blue. The
population of New York City in 2019 was [anchor number in blue]”. They then answered one of the
following two questions:
“What is the population of New York City in 2019?” [Same question condition]
“What was the number of babies born in the USA in 2018?” [Different question condition]
The eight other questions in the Same question condition were:
What was the total livestock population of ducks in France in 2008?
In 2014, what was the Gross National Debt of the Republic of Congo (US$)?
What is the length of time an American person spends eating dinners per year in minutes?
What was the weight of King Henry 8th of England (in pounds)?
What is the total area (square kilometres) of Chile?
What is the height of the mountain K2 (in feet)?
What is the annual consumption of dairy products per person (in pounds)?
What was the total worldwide gross of the film 'How to Train Your Dragon II' (US$)?
The eight other questions in the Different question condition were:
In 2012, what was the annual consumption of electricity (megawatt hours) in Kazakhstan?
What was the lowest daily value of shares traded on New York Stock Exchange in Year 2003
(US$)?
What is the average time spent by a person per year on social networking online (minutes)?
How much does a typical refrigerator weigh (in pounds)?
What is the population of Suriname?
What is the average distance a car travels per year (in km)?
What is the average annual water consumption per household (in Liters)?
What were the global iPhone sales during 2018 (units)?
The high anchors were randomly generated but were always one order of magnitude greater than the true
answer. The low anchors were randomly generated but were always one order of magnitude less than the true
answer.
Design and Procedure
The experiment had two independent variables: Question condition, either same or different; and
Anchor condition, either high or low. These independent variables were varied between participants: each
participant received all their paired questions in either the same or different condition, and they received
either all high anchors or all low anchors.
During tutorials for the class Analytic Thinking at the University of Sydney participants
completed the experiment individually on computers in class or online. The experiment proceeded
in a set of steps. First, participants answered some demographic questions, then they did the
Paired task, and then the Phone task. After completing the experiment participants indicated whether or
not they consented to having their data included in the data set.
Hypotheses
We proposed four hypotheses related to anchoring effects. First, we will test whether there is an
anchoring effect when participants can know for certain that the anchor is random and completely
unrelated to the question being asked. We predict that participants with relatively high phone numbers
will produce higher estimates than those with relatively low phone numbers.
Hypothesis 1: Participants with phone numbers above the median produced higher Battle of
Chalons estimates than those with phone numbers below the median.
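As a rough illustration of how a median split can be compared with a t-test, the sketch below uses Python rather than the Excel tools used in this course. Everything named in it is an assumption made for illustration: the function names, the small data lists (which are invented, not taken from the class data set), the choice of Welch's unequal-variance t-test, and the one-tailed p-value. Check which test type and tail your tutorial asks for before relying on any of these choices.

```python
import math
import statistics


def median_split(anchors, estimates, median):
    """Split estimates by whether the paired anchor is <= or > the median.

    Pairs with a missing (None) anchor or estimate are skipped.
    """
    pairs = [(a, e) for a, e in zip(anchors, estimates)
             if a is not None and e is not None]
    low = [e for a, e in pairs if a <= median]
    high = [e for a, e in pairs if a > median]
    return low, high


def welch_t(high, low):
    """Welch's t statistic and degrees of freedom for mean(high) - mean(low)."""
    vh = statistics.variance(high) / len(high)  # sample variance (n - 1)
    vl = statistics.variance(low) / len(low)
    t = (statistics.mean(high) - statistics.mean(low)) / math.sqrt(vh + vl)
    df = (vh + vl) ** 2 / (vh ** 2 / (len(high) - 1) + vl ** 2 / (len(low) - 1))
    return t, df


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log(1 + x * x / df))


def upper_tail(t, df, steps=2000):
    """One-tailed P(T > t), using Simpson's rule on [0, |t|] (steps is even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 - area if t >= 0 else 0.5 + area


# Invented example values, NOT the class data.
phone = [212, 305, 480, 611, 702, 845, 930, 1050]
estimate = [400, 900, 650, 1200, 800, 1100, 700, 1300]

low, high = median_split(phone, estimate, statistics.median(phone))
t, df = welch_t(high, low)
print("low group mean, SD:", statistics.mean(low), statistics.stdev(low))
print("high group mean, SD:", statistics.mean(high), statistics.stdev(high))
print("t, df, one-tailed p:", t, df, upper_tail(t, df))
```

The standard deviations here use the sample formula (dividing by n - 1). Dropping pairs with missing values before splitting avoids treating a blank estimate as a zero.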
Another way to test whether people’s phone numbers influence their estimates is to calculate the
correlation between them.
Hypothesis 2: Participants’ phone numbers and their estimates for the Battle of Chalons will have a
positive correlation.
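A correlation of this kind can be sketched in a few lines of Python (the course itself uses Excel). The function names and the example numbers below are assumptions for illustration only and are not taken from the class data. The cut-off of .122 is the value this assignment gives for a sample of n=261 in Question 3; it would not apply to a sample of a different size.

```python
import math


def complete_pairs(xs, ys):
    """Keep only pairs where both values are present (not None)."""
    return [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def is_significant_positive(r, critical=0.122):
    """True if r is positive and larger than the critical value.

    The default critical value is the one quoted in this assignment
    for n=261 at the p<.05 level.
    """
    return r > critical


# Invented example values, NOT the class data.
phone = [212, 305, None, 611, 702, 845]
estimate = [400, 900, 650, 1200, None, 1100]

pairs = complete_pairs(phone, estimate)
r = pearson_r([p for p, _ in pairs], [e for _, e in pairs])
print("r =", r, "significant and positive:", is_significant_positive(r))
```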
For the Paired task, if the anchoring effect depends on the similarity of the question introducing the
anchor and the question asking for the estimate, then we would expect a bigger impact of the anchor for
participants in the same question condition than the different question condition.
Hypothesis 3: In the Paired task, participants in the High anchor condition will produce higher
aggregated estimates if they were also in the same question condition than if they were in the
different question condition.
For the Paired task, we would expect a result consistent with that of Hypothesis 3 for participants in the
low anchor condition.
Hypothesis 4: In the Paired task, participants in the Low anchor condition will produce lower
aggregated estimates if they were also in the same question condition than if they were in the
different question condition.
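Hypotheses 3 and 4 each compare two of the four groups formed by crossing Anchor condition with Question condition. The Python sketch below shows one way to get a mean and standard deviation for each group. It is only an illustration: the function name, the condition labels ("high", "low", "same", "different") and the scores are assumptions, and the labels used in the actual data file may differ.

```python
import statistics
from collections import defaultdict


def summarise_by_condition(rows):
    """Return {(anchor, question): (mean, sample SD)} for each group.

    rows is an iterable of (anchor_condition, question_condition, score).
    Each group needs at least two scores for the SD to be defined.
    """
    groups = defaultdict(list)
    for anchor, question, score in rows:
        groups[(anchor, question)].append(score)
    return {
        key: (statistics.mean(scores), statistics.stdev(scores))
        for key, scores in groups.items()
    }


# Invented example rows, NOT the class data.
rows = [
    ("high", "same", 0.9), ("high", "same", 0.5),
    ("high", "different", 0.2), ("high", "different", -0.1),
    ("low", "same", -0.7), ("low", "same", -0.4),
    ("low", "different", -0.2), ("low", "different", 0.1),
]

for (anchor, question), (mean, sd) in summarise_by_condition(rows).items():
    print(anchor, question, "mean:", mean, "SD:", sd)
```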
Results
The data set for our class can be found on the Canvas site for ATHK1001 under “Assignment 1”.
This assignment description can be found there as well as an Excel file called “Assignment1_dataset.xls”.
This Excel file contains all the data for the assignment and has 294 data lines, one for each participant.
Each participant has values for 6 variables, and the values of each variable are in a single column of the
file.
The first variable is an id number. There are two variables related to the Phone task, the variable
“phone_number” which is the number participants entered for their phone number (plus 100), and the
variable “Chalons_estimate” which shows participants’ estimates for the year of the Battle of Chalons.
There are only 261 participants with values for these variables because this task came at the end of the
experiment, so some participants ran out of time. Excel formulas can sometimes behave nonintuitively when
presented with blank cells, so be aware of this.
The need to make this data public for the assignment created a privacy issue due to the
“phone_number” variable because it potentially includes part of each participant’s phone number. Some
of the values for this variable are unique, so if a student has friends in the class who know their phone
number, then these friends may be able to identify the student’s data. This could not be done with
complete confidence because only about one-third of the class has data for this variable and participants
could have deliberately or accidentally entered incorrect numbers. However, participants were told that
their data would be kept private so we decided we needed to make identifying a participant much more
difficult. We did this by randomly adding a number between 0 and 100 to each value for the
phone_number variable. The data file you have access to contains this modified data. We checked all the
analyses affected by this variable that you are requested to carry out and found that there was no material
impact on the results, so you can ignore this change to the data when answering the questions below.
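For anyone curious, the kind of masking described above can be sketched in Python as follows. This is not the procedure we actually ran: the function name, the use of whole-number offsets drawn uniformly from 0 to 100 inclusive, and the example values are all assumptions for illustration.

```python
import random


def mask_values(values, max_offset=100, seed=None):
    """Add an independent random whole number in [0, max_offset] to each value.

    A seed can be given to make the result repeatable; missing (None)
    values are left as they are.
    """
    rng = random.Random(seed)
    return [v if v is None else v + rng.randint(0, max_offset) for v in values]


# Invented example values, NOT the class data.
original = [212, 305, None, 611, 1050]
masked = mask_values(original, seed=2021)
print(masked)
```

Because every offset is small relative to the spread of the numbers, the ordering of participants changes little, which is consistent with the masking having no material impact on the analysis.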
There are three variables related to the Paired task. “Anchor_condition” says whether the
participant was given high or low anchors, and “Question_condition” says whether the participant received
the same question or a different question to that used to present the anchor. “Aggregated_estimates”
represents how a participant’s estimates compared overall to the mean estimates, with negative
numbers indicating they tended to be below the mean of everyone’s estimates and positive numbers
indicating they tended to be above.
The calculation of the Aggregated estimates variable is somewhat complicated and to interpret it
you do not need to understand how it was calculated, just what it represents. If you are interested though,
this is what we did. We could not just take the average of participants’ estimates because the questions had
very different answers, meaning they were on different scales. To put them onto the same scale we
converted them into standard scores (standard scores are calculated by first calculating the sample’s mean
and standard deviation for a variable, then calculating the difference between each participant’s estimate
for that variable and the variable’s mean, and finally dividing this difference by the sample’s standard
deviation). We then calculated each participant’s mean standard score across their estimates, which is
what Aggregated_estimates are.
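The steps just described can be sketched in Python as below. This is an illustration of the idea, not the script used to build the data file: the function names, the two-question example and its numbers are assumptions, and we assume here that the standard deviation is the sample version (dividing by n - 1).

```python
import statistics


def standard_scores(values):
    """Convert raw values to standard scores: (value - mean) / SD."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD; an assumption in this sketch
    return [(v - mean) / sd for v in values]


def aggregated_estimates(estimates_by_question):
    """Mean standard score per participant across all questions.

    estimates_by_question maps each question to a list of estimates,
    one per participant, with participants in the same order in every list.
    """
    z_by_question = [standard_scores(v) for v in estimates_by_question.values()]
    # zip(*...) regroups the scores so each tuple belongs to one participant.
    return [statistics.mean(zs) for zs in zip(*z_by_question)]


# Invented example: three participants, two questions on very different scales.
estimates = {
    "K2 height (feet)": [20000, 28000, 36000],
    "Chile area (sq km)": [500000, 700000, 1200000],
}
print(aggregated_estimates(estimates))
```

A participant who is consistently above the sample mean across questions ends up with a positive aggregated score, and one who is consistently below ends up with a negative score, regardless of the units of each question.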
WHAT YOU WILL WRITE
Your task is to analyse the data in order to test the four hypotheses proposed above. You will do this by
addressing each of the following twelve questions. Answer all questions with complete sentences, not
with just numbers, notes or tables. Do not include the text of the questions in your assignment (this
will trigger a plagiarism warning), but you should include the number of the question being addressed.
1) For the Phone task the median phone number (modified) was 611. For participants with phone
numbers below or equal to the median calculate and state the mean and standard deviation of their
estimate for the Battle of Chalons. Do the same for participants with phone numbers above the median. (4
marks)
2) Based on the means you calculated in Question 1 use a t-test to test Hypothesis 1, that participants with
phone numbers above the median produced higher Battle of Chalons estimates than those with phone
numbers below the median. Report the p-level for the t-test and state clearly whether or not Hypothesis 1
was supported, and why. (Note that we will be discussing hypothesis testing in lectures in Week 4 and
practicing using Excel to test hypotheses in tutorials in Week 5. So you may need to wait to answer this
question until we have covered the relevant material in class.)
(4 marks)
3) Test Hypothesis 2 by calculating the correlation between participants’ phone numbers and their
estimates for the Battle of Chalons. To test the statistical significance of the correlation you can use the
fact that for a sample n=261 any correlation with greater magnitude than .122 is statistically significant at
the p<.05 level. State whether Hypothesis 2 was supported, and why. (4 marks)
4) Present a scatter graph to show the relationship between participants’ phone numbers and their estimates for the
Battle of Chalons. Based on your analysis in Questions 1-4, how strong does the influence of the anchor
on the estimated answers appear to be? Does testing Hypothesis 1 or testing Hypothesis 2 better convey the
nature of this influence? (6 marks)
5) For the Paired task calculate four means and standard deviations for aggregated estimates: for
participants in the high anchor condition and the same question condition, for participants in the high
anchor condition and the different question condition, for participants in the low anchor condition and the
same question condition, for participants in the low anchor condition and the different question condition.
(8 marks)
6) Based on the means calculated in Question 5, test Hypothesis 3 that participants in the High anchor
condition will produce higher aggregated estimates if they were also in the same question condition than
if they were in the different question condition. Report the p-level for the t-test and state clearly whether
or not Hypothesis 3 was supported, and why. (4 marks)
7) Based on the means calculated in Question 5, test Hypothesis 4 that participants in the Low anchor
condition will produce lower aggregated estimates if they were also in the same question condition than if
they were in the different question condition. Report the p-level for the t-test and state clearly whether or
not Hypothesis 4 was supported, and why. (4 marks)
8) Are your conclusions for Hypothesis 3 and Hypothesis 4 consistent? Explain your answer and any
implications. (3 marks)
9) Identify three different issues with the way we collected data which could limit our ability to draw
conclusions from it. These issues could relate to one or more of the hypotheses. Clearly differentiate the
three issues as “Issue 1”, “Issue 2” and “Issue 3” and explain how each of these issues relates to a data
collection consideration raised in ATHK1001 lectures. For each issue suggest a way it might be resolved
in future research on this topic or if it cannot be resolved then explain why. (12 marks)
10) Summarize what YOUR data analysis tells us about anchoring effects. What do our results add to
Tversky and Kahneman’s (1974) discussion of anchoring, particularly on page 1128 of their article?
Explain your answers with reference to the results of your testing of the hypotheses and possibly the
issues you raised in Question 9. (6 marks)
11) What do you think is the single most important finding from our experiment? Why? (3 marks)
12) Include a reference section which lists the full reference for any paper you have cited when
addressing these questions. You must cite Tversky and Kahneman (1974) in Question 10, and include
citations wherever appropriate. You should use APA style for citations and references, but we will
accept other standard journal article referencing formats. (2 marks)
References
Judgement under uncertainty. Author(s): Amos Tversky and Daniel Kahneman. Science, volume 185,
pages 1124-1131
THIS IS NOT IN A STANDARD REFERENCING FORMAT, YOU WILL NEED TO REFORMAT
THIS FOR Q.12.
Strack, F., & Mussweiler, T. (1997). Explaining the Enigmatic Anchoring Effect: Mechanisms of
Selective Accessibility. Journal of Personality and Social Psychology, 73, 437-446.
[NOTE THAT YOU DO NOT HAVE TO READ THIS PAPER, BUT YOU CAN IF YOU WISH TO]
Note that you do not have to use other sources for answering these questions, but if you do then you must
correctly cite and reference these sources.
Formatting Recommendations
Our preferences
Use the font “Times New Roman”, 12-point size, and double-space all the lines.
Indent the beginning of each paragraph using one tab space.
Use APA referencing style.