Chi-Square Test for Independence

This lesson explains how to conduct a chi-square test for independence.

When to Use a Chi-Square Test for Independence

We use a chi-square test for independence when we want to formally test whether or not there is a significant association between two categorical variables from a single population.

Checking Conditions

Before we can conduct a chi-square test for independence, we first need to make sure the following conditions are met to ensure that our test will be valid:

  • Random: A random sample or random experiment should be used to collect the data.
  • Categorical: The variables we are studying should be categorical.
  • Size: The expected number of observations in each cell of the contingency table should be at least 5.

If these conditions are met, we can then conduct the test. The following example shows how to conduct a chi-square test for independence.

Example: Chi-Square Test for Independence

We want to know whether or not gender is associated with political party preference. We take a simple random sample of 500 voters and survey them on their political party preference. Here are the results:

          Republican   Democrat   Independent   Total
Male             120         90            40     250
Female           110         95            45     250
Total            230        185            85     500


Does gender seem to be associated with political party preference? Use a 0.05 level of significance.

Step 1. State the hypotheses. 

The null hypothesis (H0): Gender and political party preference are independent.

The alternative hypothesis (Ha): Gender and political party preference are not independent.

Step 2. Determine a significance level to use.

The problem tells us that we are to use a .05 level of significance.

Step 3. Find the test statistic.

The test statistic is $X^2 = \sum_i (O_i - E_i)^2 / E_i$

Here the summation symbol Σ means "sum", $O_i$ is the observed frequency in cell i of the table, and $E_i$ is the expected frequency in cell i of the table.

Notice that we surveyed an equal number of males and females. This means that if there is no association between gender and political party preference, we can expect each party to be split 50/50 between males and females.

For example, we would expect 50% of the people who said they were Republican to be female: .50 * 230 = 115. Likewise, we would expect .50 * 230 = 115 males. Let's find the expected and observed number of people for each political party:

Expected:

          Republican   Democrat   Independent   Total
Male             115       92.5          42.5     250
Female           115       92.5          42.5     250
Total            230        185            85     500

Observed:

          Republican   Democrat   Independent   Total
Male             120         90            40     250
Female           110         95            45     250
Total            230        185            85     500
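
The 50/50 shortcut works here only because the two row totals are equal. More generally, the expected count for a cell is (row total * column total) / grand total. The following is a minimal Python sketch of that calculation using the observed counts from this example; the variable names are illustrative and not part of the original lesson. It also checks the size condition (every expected count at least 5).

```python
# Observed counts from the survey.
# Rows: Male, Female. Columns: Republican, Democrat, Independent.
observed = [
    [120, 90, 40],
    [110, 95, 45],
]

row_totals = [sum(row) for row in observed]          # [250, 250]
col_totals = [sum(col) for col in zip(*observed)]    # [230, 185, 85]
grand_total = sum(row_totals)                        # 500

# Expected count under independence:
# (row total * column total) / grand total for each cell.
expected = [
    [r * c / grand_total for c in col_totals]
    for r in row_totals
]

# Size condition: every expected count should be at least 5.
size_ok = all(e >= 5 for row in expected for e in row)

print(expected)  # [[115.0, 92.5, 42.5], [115.0, 92.5, 42.5]]
print(size_ok)   # True
```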

Lastly, calculate the chi-square test statistic:

$X^2 = (120 - 115)^2 / 115 + (110 - 115)^2 / 115 + (90 - 92.5)^2 / 92.5 + (95 - 92.5)^2 / 92.5 + (40 - 42.5)^2 / 42.5 + (45 - 42.5)^2 / 42.5 = .864$
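
As a check on the arithmetic, the same sum can be sketched in a few lines of Python. The function name chi_square_statistic is a hypothetical helper introduced here for illustration; the counts are the observed and expected values from the tables in this example, listed cell by cell.

```python
# Observed and expected counts, listed cell by cell in the same order:
# Male/Rep, Female/Rep, Male/Dem, Female/Dem, Male/Ind, Female/Ind.
observed = [120, 110, 90, 95, 40, 45]
expected = [115, 115, 92.5, 92.5, 42.5, 42.5]


def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all cells (hypothetical helper)."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


x2 = chi_square_statistic(observed, expected)
print(round(x2, 3))  # 0.864
```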

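Use the Chi-Square Calculator with degrees of freedom = (r-1)*(c-1) = (2-1)*(3-1) = 2 (where r = number of rows and c = number of columns, not counting the totals) and a chi-square value of .864. Note that .864 is our test statistic, not a critical value. Clicking “Calculate p-value” returns .35079. This is the cumulative probability to the left of .864, so the p-value is the area to the right: 1 – .35079 = .649.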
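
The p-value can also be reproduced without a calculator. The sketch below relies on a shortcut that holds only when the degrees of freedom equal 2: in that case the upper-tail probability of the chi-square distribution is exactly exp(-x/2). For other degrees of freedom a statistics library would be needed instead. Variable names are illustrative.

```python
import math

rows, cols = 2, 3                      # table dimensions, excluding totals
df = (rows - 1) * (cols - 1)           # degrees of freedom

x2 = 0.864                             # test statistic from Step 3

# Assumption: this closed form is valid ONLY for df = 2.
if df != 2:
    raise ValueError("exp(-x/2) shortcut applies only when df == 2")

p_value = math.exp(-x2 / 2)            # area to the right of x2
cumulative = 1 - p_value               # area to the left of x2

print(df)                    # 2
print(round(cumulative, 5))  # 0.35079
print(round(p_value, 3))     # 0.649
```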

Step 4. Reject or fail to reject the null hypothesis.

Since the p-value (.649) is not less than our significance level of .05, we fail to reject the null hypothesis.
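
The decision rule can be sketched in Python as follows. The p-value comparison is the one used in this step. The critical-value comparison is an equivalent check added here for illustration, and it assumes 2 degrees of freedom, where the critical value has the closed form -2 * ln(alpha).

```python
import math

alpha = 0.05       # significance level from Step 2
p_value = 0.649    # p-value from Step 3
x2 = 0.864         # test statistic from Step 3

# Decision by p-value: reject H0 only if the p-value is below alpha.
reject_by_p = p_value < alpha

# Equivalent decision by critical value (closed form valid for df = 2 only):
# reject H0 only if the test statistic exceeds the critical value.
critical_value = -2 * math.log(alpha)
reject_by_critical = x2 > critical_value

print(reject_by_p)                # False
print(round(critical_value, 3))   # 5.991
print(reject_by_critical)         # False
```

Both checks agree: we fail to reject the null hypothesis.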

Step 5. Interpret the results. 

Since we failed to reject the null hypothesis, we do not have sufficient evidence to state that there is an association between gender and political party preference.
