How to Perform a Jarque-Bera Test in Python


The Jarque-Bera test is a goodness-of-fit test that determines whether or not sample data have skewness and kurtosis that matches a normal distribution.

The test statistic of the Jarque-Bera test is always a positive number and the further it is from zero, the more evidence that the sample data does not follow a normal distribution.

This tutorial explains how to conduct a Jarque-Bera test in Python.

How to Perform a Jarque-Bera test in Python

To conduct a Jarque-Bera test in Python we can use the jarque_bera function from the Scipy library, which uses the following syntax:

jarque_bera(x)

where:

  • x: an array of observations

This function returns a test statistic and a corresponding p-value.

Example 1

Suppose we perform a Jarque-Bera test on a list of 5,000 values that follow a normal distribution:

import numpy as np
import scipy.stats as stats

#generate array of 5000 values that follow a standard normal distribution
np.random.seed(0)
data = np.random.normal(0, 1, 5000)

#perform Jarque-Bera test
stats.jarque_bera(data)

(statistic=1.2287, pvalue=0.54098)

The test statistic is 1.2287 and the corresponding p-value is 0.54098. Since this p-value is not less than .05, we fail to reject the null hypothesis. We don’t have sufficient evidence to say that this data has skewness and kurtosis that is significantly different from a normal distribution.

This result shouldn’t be surprising since the data that we generated is composed of 5000 random variables that follow a normal distribution.

Example 2

Now suppose we perform a Jarque-Bera test on a list of 5,000 values that follow a uniform distribution:

import numpy as np
import scipy.stats as stats

#generate array of 5000 values that follow a uniform distribution
np.random.seed(0)
data = np.random.uniform(0, 1, 5000)

#perform Jarque-Bera test
stats.jarque_bera(data)

(statistic=300.1043, pvalue=0.0)

The test statistic is 300.1043 and the corresponding p-value is 0.0. Since this p-value is less than .05, we reject the null hypothesis. Thus, we have sufficient evidence to say that this data has skewness and kurtosis that is significantly different from a normal distribution.

This result also shouldn’t be surprising since the data that we generated is composed of 5000 random variables that follow a uniform distribution, which should have skewness and kurtosis that are much different than a normal distribution.

When to Use the Jarque-Bera Test

The Jarque-Bera Test is typically used for large datasets (n > 2000) in which other normality tests (like the Shapiro-Wilk test) are unreliable.

This is an appropriate test to use before you perform some analysis in which it’s assumed that the dataset follows a normal distribution. A Jarque-Bera test can tell you whether or not this assumption is satisfied.

Leave a Reply

Your email address will not be published.