PSYC20008 | Calculating Chi-squared
Page 1 of 15

PSYC20008
Lab and Statistics Resource
Calculating the Chi-Squared
Statistic
A resource guide
17 March 2023
Table of Contents
About this Guide
Ahead of the Analysis
  Obtaining the CSV file
  Obtaining JASP
Familiarising yourself with JASP
  Opening the dataset in JASP
  Understanding your columns and rows
  Adjusting JASP to suit your preferences
Running the chi-squared test of independence for Hypothesis 1
  Selecting a subset of the sample
  Cross tabulating data
  Running the chi-squared test
  Calculating the standardised residuals
  Writing up the findings to address the first hypothesis
  Saving your data & analysis outputs
Running the chi-squared test of independence for Hypothesis 2
  Selecting a subset of the data for your second Hypothesis
  Testing your second Hypothesis
Finding information for the Method section
Editing the dataset in Excel
  Opening the spreadsheet in Excel
About this Guide
This guide has been developed to help students of PSYC20008 Developmental Psychology to conduct a chi-squared test of
independence in JASP. This guide is not for publication outside of the subject. Please do not share this guide on third party
websites (e.g., Chegg, Coursehero, StudentVIP, Studocu, and many more).
The guide offers step-by-step instructions to produce a contingency table with observed counts and expected counts, the chi-
squared statistic, degrees of freedom, and p-value, and calculate the standardised residuals. For guidance on what all
these numbers mean, please refer to the notes for PSYC20008 Lecture 4: Calculating Chi-Squared.
The example used throughout this guide is a chi-squared test of independence for the first hypothesis of the lab report, which
is testing the association between two categorical variables, among a specific subset (group) in the sample:
Variable 1: Mid/late childhood psychosocial crisis (three crisis profiles [3 categories]: industry, balanced LC*, inferiority)
Variable 2: Adolescence psychosocial crisis (three crisis profiles [3 categories]: identity, balanced Ad*, role confusion)
Subset of the sample: Participants who identify as adolescents.
Putting all of this together, the first hypothesis is:
Among those who self-identify as adolescents, there will be a statistically significant association between the psychosocial
crises of mid/late childhood and adolescence. More/Fewer participants with ________ crisis profiles will have ______ crisis
profiles than expected by chance. (Students can change the underlined text to match their prediction of what the data will
show).
By following this guide, students will have the statistics they need to address this first hypothesis and an understanding of what
those numbers mean. After completing these steps to address the first hypothesis, students can follow the steps a second
time, making amendments as they require, to examine their second hypothesis. The second hypothesis will be developed
according to students’ interests, using any subset of the data to investigate the association between any two variables and
predicting outcomes for respective levels within those variables.
About the survey responses
During the 5-week window of data collection (starting two weeks before the semester, and closing in Week 3), the survey was
accessed 1,105 times. We removed 344 cases from the data file, including duplicate cases, cases with
missing data, cases that were completed too quickly to reasonably say they were genuine (i.e., survey duration was less
than 3 minutes), and “test” or “pilot” cases created by the teaching team.
On average, participants took approximately 8 minutes to complete the survey (SD = 3.2 minutes); with a Median of 7 minutes
and a Mode of 5 minutes.
Two variables in the dataset (Wellbeing, and Age) could only be created once we had the complete dataset and knew the
distribution of their respective continuous variables.
• Wellbeing: Participants’ wellbeing scores were calculated from the WEMWBS (Tennant et al., 2007). The scores were
normally distributed with the Median wellbeing score (43) and Mean score (M = 43, SD = 8.86) both slightly higher
than the scale’s midpoint (42). We used these midpoints to create a categorical variable (“Wellbeing (categorical)”)
with three levels of measurement:
o “Lower wellbeing” (scores ranging from 14 to 37);
o “Middle wellbeing” (scores ranging from 38 to 46); and
o “Higher wellbeing” (scores ranging from 47 to 70).
• Age: The age range for the sample was positively skewed, with the Mode (19) being slightly lower than the Median
(20) and Mean (21.25, SD = 5.26). We used this distribution to create a categorical variable (“Age (categorical)”) with
three levels of measurement:
o “17 to 19” (ages ranging from 17 to 19);
o “20 to 21” (ages ranging from 20 to 21); and
o “22 to 57” (ages ranging from 22 to 57).
* NOTE: Balanced LC = Balanced late childhood profile, Balanced Ad = Balanced adolescence profile
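The wellbeing cut-offs listed above can be written as a small lookup. The sketch below is an illustration in Python only (the guide itself uses JASP and Excel); the function name is hypothetical, and the score ranges are copied from the list above.

```python
def wellbeing_category(score):
    """Map a WEMWBS total score to one of the three levels listed above."""
    if not 14 <= score <= 70:
        raise ValueError("The categories above cover scores from 14 to 70")
    if score <= 37:
        return "Lower wellbeing"
    if score <= 46:
        return "Middle wellbeing"
    return "Higher wellbeing"


# Made-up scores, for illustration only.
for score in (30, 43, 55):
    print(score, wellbeing_category(score))
```

The same pattern would apply to any other continuous variable that you decide to recode into categories for your second hypothesis.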
Ahead of the Analysis
Obtaining the CSV file
1. Navigate to the Week 5 Activities page (or Week 4 Activities page) on the PSYC20008 Canvas site.
2. Download the document “2023 PSYC20008 lab report data.csv”
3. Save the file somewhere that you can access it again (e.g., save it to a USB drive).
Obtaining JASP
4. Navigate to https://jasp-stats.org.
5. Click the orange button that says “Download JASP”.
a) Click the orange button that is most appropriate for your device: Windows, macOS, or Linux.
b) While the file is downloading, check out the links to “Getting Started” and “How to Use JASP” pages. Note that on the
“how to use JASP” page, there is a link to the YouTube video for “Contingency Tables”. If you are new to JASP, you can
return to these resources at any time.
c) Once the file has been downloaded, double click on the file to run the installation software. Follow the prompts to
install JASP.
d) Once the setup wizard has finished, click Finish to close the wizard and launch JASP.
Familiarising yourself with JASP
When JASP opens, you will see a screen similar to Figure 1.
Figure 1. The JASP home screen.
Opening the dataset in JASP
6. To open your dataset:
a) Click the menu button (three blue lines) in the top-left-hand corner of the window.
b) Select Open.
c) Select Computer. This is shown in Figure 2.
d) Select Browse. Navigate to the .csv file of the dataset that you saved earlier.
Figure 2. Opening a dataset in JASP.
JASP should now display the spreadsheet, similar to Figure 3.
Figure 3. The spreadsheet in JASP.
NOTE: When viewing the spreadsheet in JASP, you cannot adjust the cell values, delete cases, or create new variables. Most
students will not need to do these things. If you want to edit the spreadsheet, follow the steps in the final section of this
guide, called “Editing the dataset in Excel”.
Understanding your columns and rows
7. The columns represent the variables. Read through the columns and check that you know their meaning:
• Response ID: Each participant has a separate identifying number. This information is valuable when checking
details about your file.
• Age (Continuous): The age of the participant, in years. This is the only continuous variable in the dataset.
Presented like this, this variable cannot be used as a variable within the chi-square test; however, it can be used as
a filter or as a way to create other categorical age variables (in Excel).
• Age (Categorical): The age of participants, transformed into a categorical variable with three levels of
measurement: “17 to 19”, “20 to 22”, and “23 to 57”.
• Life Period: A categorical variable recording the self-identified life period of each participant. This variable has five
levels of measurement: Adolescence, Young Adulthood, Mid Adulthood, Late Adulthood, and Other.
• Gender: A categorical variable recording the self-nominated gender of each participant. This variable has five
levels of measurement: Male, Female, Non-binary, Genderqueer, and Prefer not to say.
• Enrolment Status: A categorical variable recording each participant’s enrolment status. This variable has two levels
of measurement: Domestic student and International student.
• Course: A categorical variable recording each participant’s course. This variable has three levels of measurement:
Bachelor degree, Graduate Diploma of Psychology, and Other Course.
• Early Childhood Crisis: A categorical variable recording how each participant has resolved (or is resolving) their
early childhood psychosocial crisis of “Initiative & Guilt”. For more detail about this crisis, revisit Lecture 2 or the
Hoffnung et al. (2019) textbook. This variable has three levels of measurement: Initiative, Balanced EC*, and Guilt.
• Mid/Late Childhood Crisis: A categorical variable recording how each participant has resolved (or is resolving)
their mid/late childhood psychosocial crisis of “Industry & Inferiority”. For more detail about this crisis, revisit
Lecture 2 or the Hoffnung et al. (2019) textbook. This variable has three levels of measurement: Industry,
Balanced LC*, and Inferiority.
• Adolescence Crisis: A categorical variable recording how each participant has resolved (or is resolving) their
adolescence psychosocial crisis of “Identity & Role Confusion”. For more detail about this crisis, revisit Lecture 2 or
the Hoffnung et al. (2019) textbook. This variable has three levels of measurement: Identity, Balanced Ad*, and
Role confusion.
• Young Adulthood Crisis: A categorical variable recording how each participant has resolved (or is resolving) their
young adulthood psychosocial crisis of “Intimacy & Isolation”. For more detail about this crisis, revisit Lecture 2 or
the Hoffnung et al. (2019) textbook. This variable has three levels of measurement: Intimacy, Balanced YA*, and
Isolation.
• Wellbeing (categorical): A categorical variable reporting each participant’s level of psychological wellbeing,
relative to the wellbeing of the rest of the sample. This variable has three levels of measurement: Higher
Wellbeing, Middle Wellbeing, and Lower Wellbeing.
8. Each row represents a different participant in the sample. Scroll down the dataset until you reach the bottom case. The
number in the row tells you how many participants are in the sample.
a) Record the total number of participants in the space below:
The total number of participants in the sample is:
______________________________________________
* NOTE: Balanced EC = Balanced early childhood profile, Balanced LC = Balanced late childhood profile, Balanced Ad =
Balanced adolescence profile, Balanced YA = Balanced young adulthood profile
Adjusting JASP to suit your preferences
Before running any analyses, you can use this window to adjust some settings of how JASP will read, present, and analyse the
dataset.
9. Click the menu button (three blue lines) in the top-left-hand corner of the window.
10. Select Preferences. The settings for four aspects of JASP are displayed:
a) Data preferences allow you to change how JASP is reading the .csv file.
• Ensure that Synchronise automatically on data file save is ticked. This means that if you make any edits to the .csv
file (e.g., in Excel) and save the .csv file, those changes will automatically be updated in JASP as well. This is handy
if you decide to create new variables or refresh the data.
• Ensure that Use default spreadsheet editor is ticked. This means that your experience in JASP will look similar to
the figures in this guide and to our video. If you are already quite proficient in JASP and you prefer a
different editor, feel welcome to use that. For most people (including us), the default editor is fine.
b) Results preferences allow you to change how the statistical analyses will be presented.
• Ensure that Display exact p-values is ticked. This will give you the p-value in the form that you need when writing
up the chi-squared statistics.
• Adjust the number of decimals to your preference: 0, 1, 2, or 3 decimal places.
c) For our purposes, you can leave the Interface and Advanced preferences set to the default.
Running the chi-squared test of independence for Hypothesis 1
Selecting a subset of the sample
Before you create your contingency table (i.e., cross-tabulation), you need to tell JASP who to include in that table. By default,
JASP will include all sample participants in the analysis. These next two steps show you how to ask JASP to focus on
adolescents only.
11. In the data view (e.g., Figure 3), double click on the heading for the variable Life Period.
a) A space will appear at the top of the window, presenting the categories within that variable. Next to each category is a
tick. These ticks are telling JASP to include this category in any subsequent analysis.
b) Click on the ticks next to Young Adulthood, Late Adulthood, Mid Adulthood, and Other.
The ticks will become crosses.
You have now told JASP that for all future analyses (or, until you tell it otherwise), it should only include participants
who are in Adolescence (i.e., the one category that is still ticked) and to exclude participants who are in Young
Adulthood, Mid Adulthood or Late Adulthood, or Other (i.e., the categories with crosses).
This YouTube video shows you an example of selecting a subset of the data (watch from the 45-second mark).
12. To check that the data have been selected: scroll through the dataset. All participants who are in Young Adulthood, Mid
Adulthood, Late Adulthood, or Other are presented in transparent text, with no box around them.
This is shown in Figure 4.
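For readers who want to see the logic behind the ticks and crosses, the sketch below filters a handful of rows in Python. It is an illustration only, not part of the JASP workflow: the rows are made up, and the helper name select_subset is hypothetical. Only the variable name Life Period and its categories come from this guide.

```python
# Made-up rows for illustration; the real file has many more columns and cases.
rows = [
    {"Response ID": "1", "Life Period": "Adolescence"},
    {"Response ID": "2", "Life Period": "Young Adulthood"},
    {"Response ID": "3", "Life Period": "Adolescence"},
    {"Response ID": "4", "Life Period": "Other"},
]


def select_subset(rows, variable, keep):
    """Return only the rows whose value for `variable` is one of `keep`."""
    return [row for row in rows if row[variable] in keep]


# Equivalent to leaving only "Adolescence" ticked in JASP.
adolescents = select_subset(rows, "Life Period", {"Adolescence"})
print(len(adolescents), "of", len(rows), "rows kept")
```

The excluded rows are not deleted; as in JASP, the full list stays available so that the filter can be changed later.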
Figure 4. Selecting a subset of the sample.
Cross tabulating data
13. Along the top of the JASP window, above the spreadsheet, are seven icons representing different types of analyses. Click
on Frequencies.
a) From the dropdown menu that appears, click on Classical Contingency Tables.
NOTE: There are two Contingency table options in the list: One (in the top half of the list) is a Classical Contingency
Table, the other (in the bottom half of the list) is a Bayesian Contingency Table. Click on the Classical contingency
table. We are not using Bayesian statistics in this subject.
14. JASP now presents two new tabs: the Analysis tab (centre) and the Results tab (right). These are separated by a vertical
bar, as shown in Figure 5.
a) You can adjust the size of the tabs at any time by sliding the bar from left to right.
b) You can also bring back the spreadsheet tab by sliding a second vertical bar, located on the left-hand-side of the
window.
Figure 5. Selecting a subset of the sample. (Labels in the figure: Frequencies; Contingency table; Analysis Tab; Results Tab;
Spreadsheet; Vertical bars to adjust tabs.)
15. To find the observed counts, in the Analysis tab:
a) Locate the variable Mid/Late Childhood Crisis in the list. Move this variable from the list into the Rows box.
b) Locate the variable Adolescence Crisis in the list. Move this variable from the list into the Columns box.
As you do this, you will see that the contingency table in the Results tab populates with numbers.
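The observed counts in a contingency table are simply tallies of how many participants fall into each combination of row and column category. A minimal Python sketch of that tallying is shown here; the participants are made up, and the code only illustrates what JASP is counting.

```python
from collections import Counter

# Made-up participants: (Mid/Late Childhood Crisis, Adolescence Crisis).
participants = [
    ("Industry", "Identity"),
    ("Industry", "Identity"),
    ("Industry", "Balanced Ad"),
    ("Balanced LC", "Balanced Ad"),
    ("Inferiority", "Role confusion"),
    ("Inferiority", "Role confusion"),
]

observed = Counter(participants)                      # count for each cell
row_totals = Counter(row for row, _ in participants)  # count for each row category
col_totals = Counter(col for _, col in participants)  # count for each column category

print("Cell count:", observed[("Industry", "Identity")])
print("Row total:", row_totals["Industry"])
print("Column total:", col_totals["Identity"])
```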
16. Take a moment to read through the contingency table. Make some notes about the following in the spaces provided:
a) Are the Mid/Late Childhood Crisis categories evenly balanced or unevenly balanced?
Find the cell that you have described in your hypothesis. For example, if the second sentence of your hypothesis is:
“More participants with balanced LC crisis profiles will have balanced Ad crisis profiles than expected by chance”, find
the cell in the row “Balanced LC” and the column “Balanced Ad”.
Compare its “row total” to the other row totals. Is the row total similar to other rows? Is it much higher, or much
lower?
b) Are the Adolescence Crisis categories evenly balanced or unevenly balanced?
Find the cell that you have described in your hypothesis. Compare its “column total” to the other column totals.
Is the column total similar to other columns? Is it much higher, or much lower?
You can use the notes that you have made above to inform your descriptive statistics, in the Results section of your
lab report.
17. To calculate the expected counts, in the Analysis tab:
a) Click the down arrow next to Cells.
b) From the options that appear, select Expected. This is asking JASP to calculate the Expected counts for each cell.
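In a test of independence, the expected count for a cell is usually defined as (row total × column total) / n. The sketch below applies that definition in Python to made-up counts, so that you can see where the Expected numbers come from. It is an illustration under that assumption, not a description of JASP's internals.

```python
def expected_counts(observed):
    """Expected count per cell: row total * column total / n."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Made-up 3 x 3 table of observed counts (rows by columns).
observed = [
    [20, 10, 10],
    [10, 20, 10],
    [10, 10, 20],
]

expected = expected_counts(observed)
for row in expected:
    print([round(value, 2) for value in row])
```

Note that the expected counts in each row add up to the same row total as the observed counts; only the spread across cells differs.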
18. Take a moment to read through the observed and expected counts in the contingency table.
Find the cell that you have described in your hypothesis and consider how similar, or how different, the observed count and
the expected count are to each other.
a) Do the observed and expected counts in that cell look very similar or very different? Which is higher? What does that
suggest to you?
The Chi-squared test and the standardised residuals will help you confirm the notes that you made above.
Running the chi-squared test
19. To calculate the chi-squared statistic, the degrees of freedom, and the p-value, in the Analysis tab:
a) Click the down arrow next to Statistics.
b) From the options that appear, select χ². A new table will appear in the Results tab, with the heading “Chi-Squared
statistics”.
20. Remember from Lecture 4 that these statistics each tell you something different about the relationship between the two
categorical variables. Take a moment to locate each statistic in the table and reflect on what that means.
• The degrees of freedom (df) gives an indication of the size of the table (i.e., how many cells in the table are free to
vary).
• The n value is the size of the sample included in the analysis (Note: when reporting statistics, N refers to the whole
population, n refers to the sample size. JASP is showing you n in this table, despite its use of the symbol N).
• The Chi-Squared statistic (χ²) summarises all the residuals in the table (i.e., the differences between observed
and expected counts).
• The p value is the probability of seeing a pattern at least this strong if the two variables were not associated
(i.e., by chance alone). p-values less than .05 indicate a significant relationship.
b) Record your Chi-Squared statistics in the space below, in the form χ²(__, n = __) = ___, p = ____
c) Is this indicating a significant association between your two variables or a non-significant association?
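To make the three statistics concrete, the sketch below computes them in Python for a made-up 3 × 3 table. It assumes the standard definitions: χ² is the sum of (observed − expected)² / expected over all cells, and df = (rows − 1) × (columns − 1). The p-value helper uses a closed form that is valid only when df is even (a 3 × 3 table has df = 4). The function names are hypothetical, and JASP remains the tool to use for the lab report.

```python
import math


def chi_squared_test(observed):
    """Return (chi-squared, degrees of freedom, n) for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n   # expected count
            chi2 += (o - e) ** 2 / e
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, df, n


def p_value_even_df(chi2, df):
    """Upper-tail probability; this closed form holds only for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("This shortcut only works for positive, even df")
    half = chi2 / 2
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))


# Made-up observed counts, for illustration only.
observed = [
    [20, 10, 10],
    [10, 20, 10],
    [10, 10, 20],
]

chi2, df, n = chi_squared_test(observed)
p = p_value_even_df(chi2, df)
p_text = f"{p:.3f}".lstrip("0")   # drop the leading zero when reporting p
print(f"χ²({df}, n = {n}) = {chi2:.2f}, p = {p_text}")
```

The printed line follows the reporting form given in step 20, which makes it easy to compare against what you copy out of the JASP table.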
Calculating the standardised residuals
As shown in Lecture 4, the standardised residual for each cell is the difference between its observed and expected counts,
divided by the square root of its expected count:
\[ \frac{O_{ij} - E_{ij}}{\sqrt{E_{ij}}} \]
21. Use Table 1 of this guide (or create one of your own that is similar) to calculate the standardised residuals of
the cells in the contingency table for your first hypothesis.
NOTE: An example of how to calculate the standardised residuals is shown in slide 23 of Lecture 4.
Table 1: Calculating the standardised residuals for Hypothesis 1
| ij | Oij | Eij | Oij − Eij | √Eij | (Oij − Eij) / √Eij |
| In each cell below, record: | Observed count for each cell | Expected count for each cell | Observed − Expected count | Square root of Expected count | Column 4 divided by Column 5 |
| Industry - Identity | | | | | |
| Industry - Balanced Ad | | | | | |
| Industry - Role Confusion | | | | | |
| Balanced LC - Identity | | | | | |
| Balanced LC - Balanced Ad | | | | | |
| Balanced LC - Role Confusion | | | | | |
| Inferiority - Identity | | | | | |
| Inferiority - Balanced Ad | | | | | |
| Inferiority - Role Confusion | | | | | |
NOTE: Balanced LC = Balanced late childhood profile, Balanced Ad = Balanced adolescence profile
22. Which of these cells has a standardised residual of greater magnitude than ±1.96? Is the number positive or negative?
What does that suggest about those cells?
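If you would like to check your hand calculations in Table 1, the sketch below works through the same columns in Python: expected count, observed minus expected, and that difference divided by the square root of the expected count. The counts are made up and the function name is hypothetical. It then lists the cells whose residual is larger in magnitude than 1.96.

```python
import math


def standardised_residuals(observed):
    """(O - E) / sqrt(E) for every cell, with E = row total * column total / n."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    residuals = []
    for i, row in enumerate(observed):
        residual_row = []
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n
            residual_row.append((o - e) / math.sqrt(e))
        residuals.append(residual_row)
    return residuals


# Made-up observed counts, for illustration only.
observed = [
    [30, 10, 10],
    [10, 20, 10],
    [10, 10, 20],
]

residuals = standardised_residuals(observed)

# Cells (row index, column index) whose residual exceeds ±1.96.
flagged = {
    (i, j)
    for i, row in enumerate(residuals)
    for j, z in enumerate(row)
    if abs(z) > 1.96
}
for i, j in sorted(flagged):
    direction = "more" if residuals[i][j] > 0 else "fewer"
    print(f"Cell ({i}, {j}): {residuals[i][j]:.2f} ({direction} than expected)")
```

A positive residual means the observed count is higher than expected by chance; a negative residual means it is lower.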
Writing up the findings to address the first hypothesis
Lecture 8 offered an example of how to report a chi-squared test, shown in the blue box below.
23. Use the space below to follow the above example and write up your findings for the chi-squared test that you have just
conducted:
Saving your data & analysis outputs
24. There are a few ways to save your output window so that you can return to the findings as you write up your Results
section. Here are two ways:
a) To copy one table straight into a document for editing:
• In the Results tab, hover the mouse over the contingency table.
• A down arrow will appear next to the table heading. Click on that arrow.
• Click Copy.
• Navigate to a Word document.
• Paste the table into the Word document.
• Adjust the formatting of the table to be consistent with APA formatting (advice on APA formatting of tables can be
found at this link: https://apastyle.apa.org/style-grammar-guidelines/tables-figures/tables).
• Save the Word document somewhere that you can find it later (e.g., to USB, Dropbox, Google Doc, or OneDrive)
b) To save the results as a PDF or HTML for viewing later:
• Click the menu button (three blue lines) in the top-left-hand corner of the window.
• Click Export Results, then click Computer, then click Browse.
• Navigate to somewhere that you will be able to find the file later (e.g., to USB, Dropbox, Google Doc, or OneDrive)
• Give the file a name that you will recognise later.
• Click Save.
Running the chi-squared test of independence for Hypothesis 2
The lab report requires that you have two hypotheses. Your second hypothesis can be for any two categorical variables in the
dataset and for any subset of the sample. You can now run your chi-squared test for your second hypothesis.
25. Write your hypothesis in the space below and work out how to conduct the chi-squared test to address that hypothesis.
Subset of sample:
Variable 1:
Variable 2:
Prediction of what will happen to the levels of measurement within those variables:
Full hypothesis:
Check that your hypothesis:
a) Clarifies who is included in the analysis (i.e., which subset of the sample; Step 11 in this Guide),
b) Identifies two variables, which will be cross-tabulated (Step 13 in this Guide),
c) Clarifies that you predict a significant association between those variables (Steps 19-20 in this Guide), and
d) Predicts what will happen in one cell of the resulting contingency table, regarding whether the observed count
will be higher or lower than expected by chance (Step 21).
Selecting a subset of the data for your second Hypothesis
26. In the dataset view (e.g., Figure 4), double-click on the heading for Life Period.
a) Click on the crosses next to Young Adulthood, Mid Adulthood, Late Adulthood, and Other.
b) Make sure all categories in Life Period have a tick instead of a cross. Any analysis conducted once you do this will
include the entire sample.
27. Find the variable that you are using to describe the subset of the sample for your second Hypothesis. Repeat Steps 11 and
12 using the categories in the relevant variable to make sure that your analysis is focused on only those people who are in
your subset of interest.
Testing your second Hypothesis
28. Repeat steps 13 to 23 using the variables of interest for your second hypothesis.
Using your response in Step 23 as an example, write up your findings for the chi-squared test that you have just conducted
in the space below:
Finding information for the Method section
29. In the dataset view (e.g., Figure 4), make sure that any categories that were previously excluded now have ticks (e.g., Step
26). Any analysis conducted once you do this will include the entire sample.
30. Along the top of the JASP window, above the dataset, are seven icons representing different types of analyses. Click on
Descriptives.
A new set of options will appear in the Analysis tab. You can return to the contingency table analysis at any time by clicking
the down arrow next to the words “Contingency tables”.
31. To find the distribution of ages in the sample, in the Analysis tab:
a) Locate the variable Age (Continuous) in the list. Move Age (Continuous) from the list into the Variables box.
As you do this, you will see that the descriptive statistics table in the Results tab populates with numbers.
b) Click the down arrow next to Statistics.
• By default, the following should already be selected: Mean, Maximum, Minimum, Standard Deviation. As you tick
and untick different options in this section, the Results tab will update accordingly.
c) Make note of the sample’s distribution of age:
• Minimum age:
• Maximum age:
• Mean age:
• Standard Deviation:
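The four numbers above can be cross-checked outside JASP. The sketch below uses Python's standard statistics module on a made-up list of ages. It assumes the sample standard deviation (dividing by n − 1); compare the result against your JASP output before relying on it.

```python
import statistics

# Made-up ages, for illustration only.
ages = [18, 19, 19, 20, 21, 23, 30]

summary = {
    "minimum": min(ages),
    "maximum": max(ages),
    "mean": statistics.mean(ages),
    "sd": statistics.stdev(ages),   # sample SD (n - 1 in the denominator)
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```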
32. In your method section, you will also need to report the distribution of categorical variables, including the number of
people with different life periods, genders, and other variables that are relevant for how you define the subset of your
sample for the second hypothesis. To do this:
a) From the top bar of seven types of analyses, click on Descriptives.
b) In the new Descriptives analysis that appears in the Analysis tab, check the box next to Frequency tables.
c) Locate the variables that you need to describe your participants (e.g., gender, life period, course, enrolment status,
and any variable that you use to define the subset for your second hypothesis). Move those variables from the list into
the Variables box.
d) In the Tables option of the analysis tab, make sure that frequency tables is selected (ticked).
e) Repeat Step 24 to save this information for your report.
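A frequency table is a count of how many participants fall into each category, often reported alongside a percentage. The short Python sketch below shows that calculation on made-up values; the helper name frequency_table is hypothetical, and the category labels are taken from the Gender variable described earlier in this guide.

```python
from collections import Counter

# Made-up responses, for illustration only.
genders = ["Female", "Female", "Male", "Non-binary", "Female", "Male"]


def frequency_table(values):
    """Return {category: (count, percentage of all values)}."""
    counts = Counter(values)
    total = len(values)
    return {level: (count, 100 * count / total) for level, count in counts.items()}


table = frequency_table(genders)
for level, (count, percent) in table.items():
    print(f"{level}: n = {count} ({percent:.1f}%)")
```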
Editing the dataset in Excel
If you want to edit the dataset in any way (e.g., create a new variable, delete missing cases, sort variables), you can do so by
opening the file in Excel instead of JASP. Most students will not need to do this, because we have cleaned and organised
the dataset for you already.
Opening the spreadsheet in Excel
33. There are a few ways to open the data file in Excel:
a) If Excel is the default spreadsheet program on your computer, you can double click on the file to open it. The file will
automatically open with Excel.
b) For PC: Right-click on the file, select open with, then select Excel.
c) For Mac: Hold down control and click on the file. Select open with and select Excel.
The open Excel file should look like Figure 6. The first row comprises the names of each variable. Each subsequent row
represents one person in the dataset. Along the row for any one participant, each cell represents one level of measurement
(e.g., “Identity”) for its respective variable (e.g., “Adolescence Crisis”).
When you scroll to the bottom of the dataset, you will see there is one more row in Excel than JASP. This is because Excel
counts Row 1 as the row displaying the column names, whereas JASP counts Row 1 as the first line of data (i.e., the first
participant). Consequently, the number of participants is identical to the number of rows counted in the JASP dataset, and
is one less than the number of rows in Excel.
Figure 6. Viewing the data in Excel.
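The one-row difference between Excel and JASP comes from the header row. The Python sketch below demonstrates this with a tiny made-up CSV held in memory; the function name count_rows is hypothetical. To try it on the real file, you could pass open("2023 PSYC20008 lab report data.csv", newline="") instead of the in-memory sample, assuming the file is in your working folder.

```python
import csv
import io


def count_rows(csv_file):
    """Count all rows (as Excel numbers them) and data rows (as JASP numbers them)."""
    rows = list(csv.reader(csv_file))
    data_rows = rows[1:]   # everything after the header row of variable names
    return {"excel_rows": len(rows), "participants": len(data_rows)}


# Made-up CSV content, for illustration only.
sample = io.StringIO(
    "Response ID,Life Period\n"
    "1,Adolescence\n"
    "2,Young Adulthood\n"
    "3,Other\n"
)

counts = count_rows(sample)
print(counts)
```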
34. If you have the file open in JASP at the same time, and you ensured that JASP was syncing with the file (Step 10), then each
time you save the file in Excel, the spreadsheet in JASP will update with any changes that you have made.
35. When you have finished viewing the file in Excel (and created new variables, if required for your second hypothesis), save
the .csv file and close Excel.
Dr Abi Brooker
Melbourne School of Psychological Sciences
email: brookera@unimelb.edu.au