Fisher’s Exact Test is used to determine whether or not there is a significant association between two categorical variables.
It is typically used as an alternative to the Chi-Square Test of Independence when one or more of the cell counts in a 2×2 table is less than 5.
This tutorial explains how to perform Fisher’s Exact Test in Python.
Example: Fisher’s Exact Test in Python
Suppose we want to know whether or not gender is associated with political party preference at a particular college.
To explore this, we randomly poll 25 students on campus. The number of students who are Democrats or Republicans, based on gender, is shown in the table below:
Gender | Democrat | Republican |
---|---|---|
Female | 8 | 4 |
Male | 4 | 9 |
To determine if there is a statistically significant association between gender and political party preference, we can use the following steps to perform Fisher’s Exact Test in Python:
Step 1: Create the data.
First, we will create a table to hold our data:
data = [[8, 4], [4, 9]]
Step 2: Perform Fisher’s Exact Test.
Next, we can perform Fisher’s Exact Test using the fisher_exact function from the SciPy library, which uses the following syntax:
fisher_exact(table, alternative='two-sided')
where:
- table: A 2×2 contingency table
- alternative: Defines the alternative hypothesis. Default is 'two-sided', but you can also choose 'less' or 'greater' for one-sided tests.
The following code shows how to use this function in our specific example:
import scipy.stats as stats
print(stats.fisher_exact(data))
(4.5, 0.1152)
The first value in the output, 4.5, is the sample odds ratio for the table. The second value, 0.1152, is the p-value for the test.
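To see where the 4.5 comes from, here is a minimal sketch that recomputes it by hand as the cross-product ratio of the four cell counts, (8 × 9) / (4 × 4). The variable names are illustrative only and are not part of SciPy:

```python
data = [[8, 4], [4, 9]]

# Unpack the four cells of the 2x2 table:
# a = female Democrats, b = female Republicans,
# c = male Democrats,   d = male Republicans
(a, b), (c, d) = data

# Sample odds ratio: (odds of Democrat among females) / (odds of Democrat among males),
# written as a cross-product so the arithmetic stays exact
odds_ratio = (a * d) / (b * c)

print(odds_ratio)  # 4.5
```

An odds ratio above 1 means that, in this sample, the odds of preferring the Democratic party are higher for females than for males. The p-value tells us whether that difference is larger than we would expect by chance.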
Fisher’s Exact Test uses the following null and alternative hypotheses:
- H_{0}: (null hypothesis) The two variables are independent.
- H_{1}: (alternative hypothesis) The two variables are not independent.
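The p-value can also be reproduced by hand, which shows why the test is called "exact". Under the null hypothesis, with the row and column totals held fixed, the count in the top-left cell follows a hypergeometric distribution. The sketch below uses only the standard library and assumes one common two-sided convention: add up the probabilities of every possible table that is no more likely than the observed one. The helper name `table_probability` is hypothetical, chosen for illustration:

```python
from math import comb

data = [[8, 4], [4, 9]]
(a, b), (c, d) = data

row1 = a + b        # number of females (12)
col1 = a + c        # number of Democrats (12)
n = a + b + c + d   # number of students polled (25)

def table_probability(x):
    """Probability that the top-left cell equals x when all margins are fixed."""
    return comb(row1, x) * comb(n - row1, col1 - x) / comb(n, col1)

# Range of top-left counts that are possible given the fixed margins
smallest = max(0, row1 + col1 - n)
largest = min(row1, col1)

p_observed = table_probability(a)

# Two-sided p-value: total probability of all tables that are no more
# likely than the observed one (the small relative tolerance guards
# against floating-point ties)
p_value = sum(
    table_probability(x)
    for x in range(smallest, largest + 1)
    if table_probability(x) <= p_observed * (1 + 1e-7)
)

print(round(p_value, 4))  # 0.1152
```

This matches the p-value reported by fisher_exact for the same table.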
Since this p-value is not less than 0.05, we do not reject the null hypothesis.
Thus, we don’t have sufficient evidence to say that there is a significant association between gender and political party preference.
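In other words, the data are consistent with gender and political party preference being independent, although failing to reject the null hypothesis does not prove that they are.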
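If we had a directional hypothesis before collecting the data (for example, that the odds of being a Democrat are higher for females than for males), we could use the alternative argument described earlier. The following is a minimal sketch that runs the test under each of the three options; the loop and the `p_values` dictionary are illustrative choices, not required by SciPy:

```python
from scipy import stats

data = [[8, 4], [4, 9]]

# Run Fisher's Exact Test under each alternative hypothesis
p_values = {}
for alternative in ("two-sided", "less", "greater"):
    odds_ratio, p_value = stats.fisher_exact(data, alternative=alternative)
    p_values[alternative] = p_value
    print(f"{alternative}: odds ratio = {odds_ratio}, p-value = {p_value:.4f}")
```

Here 'greater' tests whether the true odds ratio is greater than 1, and 'less' tests whether it is less than 1. A one-sided test should only be chosen when the direction was specified in advance, not after looking at the data.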