How to Interpret Chi-Square Test Results in R


In statistics, there are two different types of Chi-Square tests:

1. The Chi-Square Goodness of Fit Test – Used to determine whether or not a categorical variable follows a hypothesized distribution.

2. The Chi-Square Test of Independence – Used to determine whether or not there is a significant association between two categorical variables.

You may often need to perform one of these tests using the R programming language.

This tutorial explains how to interpret the results of both tests using step-by-step examples.

Example 1: Interpret Chi-Square Goodness of Fit Test Results in R

Suppose a store owner believes that an equal number of customers come into his shop each day from Monday through Friday.

To test this hypothesis, he records the number of customers that come into the shop in a given week and finds the following:

  • Monday: 50 customers
  • Tuesday: 60 customers
  • Wednesday: 40 customers
  • Thursday: 47 customers
  • Friday: 53 customers

We can perform a Chi-Square goodness of fit test in R to determine if the data is consistent with the store owner’s claim.

To perform this test in R, we can use the chisq.test() function, which uses the following syntax:

chisq.test(x, p) 

where:

  • x: A numerical vector of observed frequencies.
  • p: A numerical vector of expected proportions.
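Note that the p argument is optional. If it is omitted, chisq.test() assumes equal expected proportions by default, so for this particular hypothesis the two calls in the following sketch should produce the same result:

#vector of observed frequencies
observed <- c(50, 60, 40, 47, 53)

#explicitly supply equal expected proportions
chisq.test(x=observed, p=c(.2, .2, .2, .2, .2))

#omit p and rely on the default of equal proportions
chisq.test(observed)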

The following code shows how to perform this test in practice:

#create vector of observed frequencies and vector of expected proportions
observed <- c(50, 60, 40, 47, 53) 
expected <- c(.2, .2, .2, .2, .2)

#perform Chi-Square Goodness of Fit Test
chisq.test(x=observed, p=expected)

	Chi-squared test for given probabilities

data:  observed
X-squared = 4.36, df = 4, p-value = 0.3595

Here is how to interpret the results of the test:

  • The Chi-Square test statistic is 4.36.
  • The corresponding p-value is 0.3595.

Since the p-value (0.3595) is not less than 0.05, we fail to reject the null hypothesis.

In the context of this example, it means we do not have sufficient evidence to say that the true distribution of customers is different from the distribution that the shop owner claimed.
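If you'd like to work with these values programmatically rather than read them off the printed output, you can store the result of chisq.test() in an object and access its components. The following is a minimal sketch (the object name results is just an illustrative choice):

#store the test results in an object
results <- chisq.test(x=observed, p=expected)

#extract the test statistic and the p-value
results$statistic
results$p.value

#check whether the p-value falls below a significance level of 0.05
results$p.value < 0.05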

Example 2: Interpret Chi-Square Test of Independence Results in R

Suppose researchers want to know whether or not gender is associated with political party preference.

They take a simple random sample of 500 voters and survey them on their political party preference:

             Republican   Democrat   Independent   Total
Male             120          90           40        250
Female           110          95           45        250
Total            230         185           85        500

We can use the following syntax to perform a Chi-Square Test of Independence in R to determine if gender is associated with political party preference:

#create table to hold survey data
data <- matrix(c(120, 90, 40, 110, 95, 45), ncol=3, byrow=TRUE)
colnames(data) <- c("Rep","Dem","Ind")
rownames(data) <- c("Male","Female")
data <- as.table(data)

#perform Chi-Square Test of Independence
chisq.test(data)

	Pearson's Chi-squared test

data:  data
X-squared = 0.86404, df = 2, p-value = 0.6492

Here is how to interpret the results of the test:

  • The Chi-Square test statistic is 0.86404.
  • The corresponding p-value is 0.6492.

Since the p-value (0.6492) of the test is not less than 0.05, we fail to reject the null hypothesis.

In the context of this example, this means we do not have sufficient evidence to say that there is an association between gender and political party preference.
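If you'd like to look more closely at the result, the object returned by chisq.test() also stores the expected counts under the null hypothesis of independence and the Pearson residuals, which can help reveal which cells (if any) contribute most to the test statistic. A minimal sketch:

#store the test results in an object
results <- chisq.test(data)

#expected counts under the null hypothesis of independence
results$expected

#Pearson residuals: (observed - expected) / sqrt(expected)
results$residuals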

Additional Resources

The following tutorials explain how to perform other common tasks in R:

How to Perform a Chi-Square Test of Independence in R
How to Calculate the P-Value of a Chi-Square Statistic in R
How to Find the Chi-Square Critical Value in R
