An **F-test** is used to test whether two population variances are equal. The null and alternative hypotheses for the test are as follows:

**$H_0$:** $\sigma_1^2 = \sigma_2^2$ (the population variances are equal)

**$H_1$:** $\sigma_1^2 \neq \sigma_2^2$ (the population variances are *not* equal)

This tutorial explains how to perform an F-test in Python.

**Example: F-Test in Python**

Suppose we have the following two samples:

    x = [18, 19, 22, 25, 27, 28, 41, 45, 51, 55]
    y = [14, 15, 15, 17, 18, 22, 25, 25, 27, 34]

We can use the following function to perform an F-test to determine if the two populations these samples came from have equal variances:

    import numpy as np
    import scipy.stats

    #define F-test function
    def f_test(x, y):
        x = np.array(x)
        y = np.array(y)
        f = np.var(x, ddof=1)/np.var(y, ddof=1) #calculate F test statistic
        dfn = x.size-1 #define degrees of freedom numerator
        dfd = y.size-1 #define degrees of freedom denominator
        p = 1-scipy.stats.f.cdf(f, dfn, dfd) #find p-value of F test statistic
        return f, p

    #perform F-test
    f_test(x, y)

    # (4.38712, 0.019127)

The F test statistic is **4.38712** and the corresponding p-value is **0.019127**. Since this p-value is less than 0.05, we would reject the null hypothesis. This means we have sufficient evidence to say that the two population variances are *not* equal.
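The same decision can also be reached by comparing the test statistic to a critical value instead of reading the p-value. The following is a minimal sketch of that comparison, assuming a one-tailed test at a significance level of 0.05. The names `alpha`, `f_stat`, and `f_crit` are chosen here for illustration; `scipy.stats.f.ppf` is the inverse CDF of the F distribution.

```python
import numpy as np
from scipy import stats

x = [18, 19, 22, 25, 27, 28, 41, 45, 51, 55]
y = [14, 15, 15, 17, 18, 22, 25, 25, 27, 34]

alpha = 0.05  # assumed significance level (illustrative choice)

# F test statistic: ratio of the two sample variances (ddof=1)
f_stat = np.var(x, ddof=1) / np.var(y, ddof=1)

# numerator and denominator degrees of freedom
dfn = len(x) - 1
dfd = len(y) - 1

# upper-tail critical value of the F distribution at level alpha
f_crit = stats.f.ppf(1 - alpha, dfn, dfd)

# reject the null hypothesis when the statistic exceeds the critical value
reject = f_stat > f_crit
print(f_stat, f_crit, reject)
```

If `f_stat` exceeds `f_crit`, the null hypothesis is rejected at the chosen level, which agrees with the p-value approach.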

**Notes**

- The F test statistic is calculated as $s_1^2 / s_2^2$. By default, numpy.var calculates the population variance. To calculate the sample variance, we need to specify **ddof=1**.
- The p-value corresponds to 1 - cdf of the F distribution with numerator degrees of freedom = $n_1 - 1$ and denominator degrees of freedom = $n_2 - 1$. This is the upper-tail (one-tailed) probability. For the two-sided alternative stated above it is commonly doubled, which here gives about 0.038 and is still below 0.05.
- This function only works when the first sample variance is larger than the second sample variance. Thus, pass the sample with the larger variance as the first argument.
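The last two notes can be addressed together. The following is a minimal sketch, not the function from this tutorial, of a two-sided version that works regardless of which sample has the larger variance. The name `f_test_two_sided` is hypothetical, and the two-sided p-value is taken as twice the smaller tail probability, which is one common convention.

```python
import numpy as np
from scipy import stats

def f_test_two_sided(x, y):
    """Two-sided F-test for equal variances (illustrative helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # ratio of sample variances; may be below 1 if y is more variable
    f = np.var(x, ddof=1) / np.var(y, ddof=1)
    dfn = x.size - 1  # numerator degrees of freedom
    dfd = y.size - 1  # denominator degrees of freedom

    # twice the smaller of the lower-tail and upper-tail probabilities
    lower = stats.f.cdf(f, dfn, dfd)
    upper = stats.f.sf(f, dfn, dfd)
    p = min(2 * min(lower, upper), 1.0)
    return f, p

x = [18, 19, 22, 25, 27, 28, 41, 45, 51, 55]
y = [14, 15, 15, 17, 18, 22, 25, 25, 27, 34]

print(f_test_two_sided(x, y))
print(f_test_two_sided(y, x))  # same p-value with the samples swapped
```

Because both tails are considered, the order of the arguments changes the F statistic (it becomes its reciprocal) but not the p-value.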

**When to Use the F-Test**

The F-test is typically used to answer one of the following questions:

**1.** Do two samples come from populations with equal variances?

**2.** Does a new treatment or process reduce the variability of some current treatment or process?
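The second question is one-sided: the alternative hypothesis is that the new process has a *smaller* variance than the current one. The following is a minimal sketch of such a lower-tailed test. The function name `f_test_reduction` is hypothetical, and the samples from the example above are reused purely for illustration, treating `x` as the current process and `y` as the new one (an assumption, not something stated about the data).

```python
import numpy as np
from scipy import stats

def f_test_reduction(new, current):
    """Lower-tailed F-test: is var(new) smaller than var(current)?"""
    new = np.asarray(new, dtype=float)
    current = np.asarray(current, dtype=float)

    # ratio of sample variances, new process in the numerator
    f = np.var(new, ddof=1) / np.var(current, ddof=1)
    dfn = new.size - 1      # numerator degrees of freedom
    dfd = current.size - 1  # denominator degrees of freedom

    # small ratios support a reduction, so use the lower tail
    p = stats.f.cdf(f, dfn, dfd)
    return f, p

# illustrative assignment of the example samples (assumption)
current = [18, 19, 22, 25, 27, 28, 41, 45, 51, 55]
new = [14, 15, 15, 17, 18, 22, 25, 25, 27, 34]

f, p = f_test_reduction(new, current)
print(f, p)
```

A small p-value here would suggest that the new process is less variable than the current one.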
