How to Perform Runs Test in Python


A Runs test is a statistical test used to determine whether or not a dataset comes from a random process.

The null and alternative hypotheses of the test are as follows:

H0 (null): The data was produced in a random manner.

Ha (alternative): The data was not produced in a random manner.
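
The test is based on counting runs: maximal stretches of consecutive values that fall on the same side of a cutoff such as the mean. Far fewer or far more runs than chance would produce points toward a non-random process. The counting step can be sketched as follows, where count_runs is a hypothetical helper written for illustration (it is not part of any library) and values equal to the cutoff are assumed to count as "above":

```python
from statistics import mean

def count_runs(values, cutoff=None):
    """Count runs of consecutive values on the same side of the cutoff."""
    if not values:
        return 0
    if cutoff is None:
        cutoff = mean(values)
    # True when a value is at or above the cutoff (tie handling is an assumption)
    above = [v >= cutoff for v in values]
    # The first value starts a run; every change of side starts another
    return 1 + sum(a != b for a, b in zip(above, above[1:]))

# Mean is 5, so the sides are: low, low, high, high, low, high
print(count_runs([1, 2, 9, 8, 1, 9]))  # 4
```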

This tutorial explains how to perform a Runs test in Python.

Example: Runs Test in Python

We can perform Runs test on a given dataset in Python by using the runstest_1samp() function from the statsmodels library, which uses the following syntax:

runstest_1samp(x, cutoff='mean', correction=True)

where:

  • x: Array of data values
  • cutoff: The cutoff used to split the data into large and small values. The default is 'mean', but you can also specify 'median'.
  • correction: If True (the default) and the sample size is below 50, the function applies a correction of 0.5 to the test statistic. Specify False to turn this correction off.

This function produces a z-test statistic and a corresponding p-value as the output.

The following code shows how to perform a Runs test using this function in Python:

from statsmodels.sandbox.stats.runs import runstest_1samp

# create dataset
data = [12, 16, 16, 15, 14, 18, 19, 21, 13, 13]

# perform Runs test (correction turned off)
runstest_1samp(data, correction=False)

(-0.6708203932499369, 0.5023349543605021)

The z-test statistic turns out to be -0.67082 and the corresponding p-value is 0.50233. Since this p-value is not less than α = .05, we fail to reject the null hypothesis. We do not have sufficient evidence to say that the data was produced in a non-random manner.
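
To see where these numbers come from, the uncorrected statistic can be reproduced by hand using only the standard library. This is a minimal sketch of the usual Runs test formulas, not the statsmodels source code; it assumes the cutoff is the mean and that values equal to the cutoff count as "above" (no value in this dataset ties with the mean, so that choice does not matter here).

```python
import math

data = [12, 16, 16, 15, 14, 18, 19, 21, 13, 13]

cutoff = sum(data) / len(data)         # mean of the data, 15.7
above = [x >= cutoff for x in data]    # which side of the cutoff each value is on

n_pos = sum(above)                     # number of values at or above the cutoff
n_neg = len(data) - n_pos              # number of values below the cutoff
n = n_pos + n_neg

# A run ends wherever two neighboring values fall on different sides
runs = 1 + sum(a != b for a, b in zip(above, above[1:]))

# Mean and variance of the number of runs under the null hypothesis
expected = 2 * n_pos * n_neg / n + 1
variance = 2 * n_pos * n_neg * (2 * n_pos * n_neg - n) / (n ** 2 * (n - 1))

z = (runs - expected) / math.sqrt(variance)

# Two-sided p-value from the standard normal distribution
p_value = math.erfc(abs(z) / math.sqrt(2))

print(runs, expected, round(z, 5), round(p_value, 5))
# 5 6.0 -0.67082 0.50233
```

The data contains 5 runs against an expected 6, which is well within what a random process would produce.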

Note: For this example we turned off the correction when calculating the test statistic. This matches the formula that is used to perform a Runs test in R, which does not use a correction when performing the test.
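
To illustrate what a correction of this kind does, the sketch below moves the gap between the observed and expected number of runs 0.5 closer to zero before standardizing. This is an assumed form of the continuity correction written for illustration, not a copy of the statsmodels implementation, and runs_z is a hypothetical helper. The inputs (5 observed runs, 6 expected runs, variance 20/9) are the values that the standard Runs test formulas give for the dataset in this example.

```python
import math

def runs_z(runs, expected, variance, correction=True):
    """Hypothetical helper: z statistic with an optional 0.5 continuity correction."""
    diff = runs - expected
    if correction:
        # Shrink the difference by 0.5 toward zero, never past it (assumed form)
        diff = math.copysign(max(abs(diff) - 0.5, 0.0), diff)
    return diff / math.sqrt(variance)

print(round(runs_z(5, 6, 20 / 9, correction=False), 5))  # -0.67082
print(round(runs_z(5, 6, 20 / 9, correction=True), 5))   # -0.33541
```

A corrected statistic is closer to zero, so the test becomes more conservative for small samples.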
