What Are Pearson Residuals? (Definition & Example)


Pearson residuals are used in a Chi-Square Test of Independence to analyze the difference between observed cell counts and expected cell counts in a contingency table.

The formula to calculate a Pearson residual is:

rij = (Oij – Eij) / √Eij

where:

  • rij: The Pearson residual for the cell in the ith row and jth column
  • Oij: The observed value for the cell in the ith row and jth column
  • Eij: The expected value for the cell in the ith row and jth column

A similar metric is the Standardized (adjusted) Pearson residual, which is calculated as:

rij = (Oij – Eij) / √(Eij(1 – pi+)(1 – p+j))

where:

  • rij: The standardized Pearson residual for the cell in the ith row and jth column
  • Oij: The observed value for the cell in the ith row and jth column
  • Eij: The expected value for the cell in the ith row and jth column
  • pi+: The total for row i divided by the grand total
  • p+j: The total for column j divided by the grand total

If the two variables are independent, standardized Pearson residuals approximately follow a standard normal distribution (mean of 0 and standard deviation of 1). A standardized Pearson residual with an absolute value above a chosen threshold (e.g. 2 or 3) indicates a lack of fit in that cell.

The following example shows how to calculate Pearson residuals in practice.

Example: Calculating Pearson Residuals

Suppose researchers want to use a Chi-Square Test of Independence to determine whether or not gender is associated with political party preference.

They decide to take a simple random sample of 500 voters and survey them on their political party preference.

The following contingency table shows the results of the survey:

         Republican   Democrat   Independent   Total
Male            120         90            40     250
Female          110         95            45     250
Total           230        185            85     500

Before we calculate the Pearson residuals, we must first calculate the expected counts for each cell in the contingency table. We can use the following formula to do so:

Expected value = (row sum * column sum) / table sum.

For example, the expected value for Male Republicans is: (230*250) / 500 = 115.

We can repeat this formula to obtain the expected value for each cell in the table:

         Republican   Democrat   Independent   Total
Male            115       92.5          42.5     250
Female          115       92.5          42.5     250
Total           230        185            85     500
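The expected-count step can also be reproduced with a few lines of Python. This is only a minimal sketch using the standard library; the function name expected_counts and the list-of-lists layout (rows for gender, columns for party) are illustrative choices, not part of the original example.

```python
# Observed counts from the survey.
# Rows: Male, Female. Columns: Republican, Democrat, Independent.
observed = [
    [120, 90, 40],
    [110, 95, 45],
]

def expected_counts(table):
    """Expected count per cell: (row sum * column sum) / table sum."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    total = sum(row_sums)
    return [[r * c / total for c in col_sums] for r in row_sums]

expected = expected_counts(observed)
for label, row in zip(["Male", "Female"], expected):
    print(label, row)
```

Because both rows have the same total (250), the expected counts are identical for Male and Female.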

Next, we can calculate the Pearson residual for each cell in the table.

For example, the Pearson residual for the cell that contains Male Republicans would be calculated as:

  • rij = (Oij – Eij) / √Eij
  • rij = (120 – 115) / √115
  • rij = 0.466

We can repeat this formula to obtain the Pearson residual for each cell in the table:

         Republican   Democrat   Independent
Male          0.466     -0.260        -0.383
Female       -0.466      0.260         0.383
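As a check on the hand calculation, the formula (Oij – Eij) / √Eij can be applied to every cell in a short Python sketch. The helper name pearson_residuals is hypothetical, and the expected counts are typed in directly from the expected-value table rather than recomputed.

```python
import math

# Observed and expected counts.
# Rows: Male, Female. Columns: Republican, Democrat, Independent.
observed = [[120, 90, 40], [110, 95, 45]]
expected = [[115.0, 92.5, 42.5], [115.0, 92.5, 42.5]]

def pearson_residuals(obs, exp):
    """Pearson residual per cell: (observed - expected) / sqrt(expected)."""
    return [
        [(o - e) / math.sqrt(e) for o, e in zip(obs_row, exp_row)]
        for obs_row, exp_row in zip(obs, exp)
    ]

residuals = pearson_residuals(observed, expected)
for label, row in zip(["Male", "Female"], residuals):
    print(label, [round(r, 3) for r in row])
```

A positive residual means the cell holds more observations than independence would predict; a negative residual means fewer.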

Next, we can calculate the Standardized Pearson residual for each cell in the table.

For example, the Standardized Pearson residual for the cell that contains Male Republicans would be calculated as:

  • rij = (Oij – Eij) / √(Eij(1 – pi+)(1 – p+j))
  • rij = (120 – 115) / √(115(1 – 250/500)(1 – 230/500))
  • rij = 0.897

We can repeat this formula to obtain the Standardized Pearson residual for each cell in the table:

         Republican   Democrat   Independent
Male          0.897     -0.463        -0.595
Female       -0.897      0.463         0.595
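The standardized version can be sketched the same way. The function below is an illustrative helper (its name and the list-of-lists layout are assumptions, not from the original example); it derives the expected counts and the row and column proportions from the observed table and then divides by √(Eij(1 – pi+)(1 – p+j)).

```python
import math

# Observed counts.
# Rows: Male, Female. Columns: Republican, Democrat, Independent.
observed = [[120, 90, 40], [110, 95, 45]]

def standardized_residuals(table):
    """Standardized Pearson residual for every cell of a contingency table."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    total = sum(row_sums)
    result = []
    for i, row in enumerate(table):
        out_row = []
        for j, o in enumerate(row):
            e = row_sums[i] * col_sums[j] / total   # expected count
            p_row = row_sums[i] / total             # row proportion
            p_col = col_sums[j] / total             # column proportion
            out_row.append((o - e) / math.sqrt(e * (1 - p_row) * (1 - p_col)))
        result.append(out_row)
    return result

std_residuals = standardized_residuals(observed)
for label, row in zip(["Male", "Female"], std_residuals):
    print(label, [round(r, 3) for r in row])
```

Dividing by the extra (1 – pi+)(1 – p+j) factor makes each standardized residual larger in absolute value than the corresponding Pearson residual.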

We can see that none of the Standardized Pearson residuals have an absolute value greater than 2, let alone 3, which indicates that no individual cell shows a significant lack of fit.

If we use the Chi-Square Test of Independence Calculator to perform the test, we’ll find that the p-value is 0.649198.

Since this p-value is not less than .05, we do not have sufficient evidence to say that there is an association between gender and political party preference.
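The p-value can also be checked without a calculator. The sketch below relies on two facts: the Chi-Square test statistic equals the sum of the squared Pearson residuals, and for exactly 2 degrees of freedom the upper-tail probability has the closed form exp(-x/2). That shortcut is an assumption specific to this 2 x 3 table; for other table sizes a statistics library would be needed instead.

```python
import math

# Observed counts.
# Rows: Male, Female. Columns: Republican, Democrat, Independent.
observed = [[120, 90, 40], [110, 95, 45]]

row_sums = [sum(row) for row in observed]
col_sums = [sum(col) for col in zip(*observed)]
total = sum(row_sums)

# Test statistic: sum over cells of (O - E)^2 / E,
# i.e. the sum of squared Pearson residuals.
chi_square = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_sums[i] * col_sums[j] / total
        chi_square += (o - e) ** 2 / e

df = (len(row_sums) - 1) * (len(col_sums) - 1)
if df != 2:
    raise ValueError("the exp(-x/2) shortcut is only valid for 2 degrees of freedom")

# Upper-tail probability of a Chi-Square distribution with 2 degrees of freedom.
p_value = math.exp(-chi_square / 2)

print(round(chi_square, 4), df, round(p_value, 6))
```

The resulting p-value agrees with the 0.649198 reported above.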

Additional Resources

The following tutorials explain how to perform a Chi-Square Test of Independence using different statistical software:

An Introduction to the Chi-Square Test of Independence
How to Perform a Chi-Square Test of Independence in Excel
How to Perform a Chi-Square Test of Independence in R
Chi-Square Test of Independence Calculator
