How to Calculate a Binomial Confidence Interval in Python


A confidence interval for a binomial probability is calculated using the following formula:

Confidence Interval = p ± z * √(p(1-p) / n)

where:

  • p: proportion of “successes”
  • z: the chosen z-value
  • n: sample size
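
As a rough illustration of the formula above, the interval can be computed by hand with only the Python standard library. This is a minimal sketch: the function name normal_approx_interval is a hypothetical helper (not part of any package), and the z-value is derived from a significance level alpha rather than looked up in a table.

```python
from math import sqrt
from statistics import NormalDist


def normal_approx_interval(successes, n, alpha=0.05):
    """Normal-approximation interval for a binomial proportion (illustrative helper)."""
    p = successes / n                         # sample proportion of successes
    z = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided z-value, about 1.96 when alpha = 0.05
    margin = z * sqrt(p * (1 - p) / n)        # z times the standard error of p
    return p - margin, p + margin


# hypothetical sample: 56 successes in 100 trials
lower, upper = normal_approx_interval(56, 100)
print(round(lower, 4), round(upper, 4))
```

For 56 successes in 100 trials this prints 0.4627 0.6573.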

The easiest way to calculate this type of confidence interval in Python is to use the proportion_confint() function from the statsmodels package:

proportion_confint(count, nobs, alpha=0.05, method='normal')

where:

  • count: Number of successes
  • nobs: Total number of trials
  • alpha: Significance level (default is 0.05)
  • method: Method to use for confidence interval (default is “normal”)

The following example shows how to use this function in practice.

Example: Calculate Binomial Confidence Interval in Python

Suppose we want to estimate the proportion of residents in a county that are in favor of a certain law.

We decide to select a random sample of 100 residents and find that 56 of them are in favor of the law.

We can use the proportion_confint() function to calculate the 95% confidence interval for the true proportion of residents who support this law in the entire county:

from statsmodels.stats.proportion import proportion_confint

#calculate 95% confidence interval with 56 successes in 100 trials
proportion_confint(count=56, nobs=100)

(0.4627099463758483, 0.6572900536241518)

The 95% confidence interval for the true proportion of residents in the county that support the law is [.4627, .6573].

By default, this function uses the asymptotic normal approximation to calculate the confidence interval. However, we can use the method argument to use a different method.

For example, the prop.test() function in the R programming language calculates a binomial confidence interval using the Wilson Score Interval (with a continuity correction applied by default).

We can use the following syntax to specify this method when calculating the confidence interval in Python:

from statsmodels.stats.proportion import proportion_confint

#calculate 95% confidence interval with 56 successes in 100 trials
proportion_confint(count=56, nobs=100, method='wilson')

(0.4622810465167698, 0.6532797336983921)

This tells us that the 95% confidence interval for the true proportion of residents in the county that support the law is [.4623, .6533].

This confidence interval differs only slightly from the one calculated using the normal approximation.
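
To see where the Wilson numbers come from, here is a minimal sketch of the standard Wilson score formula using only the Python standard library. The function name wilson_interval is a hypothetical helper for illustration, and the sketch applies no continuity correction.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, n, alpha=0.05):
    """Wilson score interval without continuity correction (illustrative helper)."""
    p = successes / n                         # sample proportion of successes
    z = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided z-value
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom     # midpoint is pulled toward 0.5
    half_width = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# 56 successes in 100 trials, as in the example
lower, upper = wilson_interval(56, 100)
print(round(lower, 4), round(upper, 4))
```

For 56 successes in 100 trials this prints 0.4623 0.6533, in line with the method='wilson' result above. Unlike the normal approximation, the interval is not centered on the sample proportion: its midpoint is shifted toward 0.5.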

Note that we can also adjust the alpha value to calculate a different confidence interval.

For example, we can set alpha to be 0.10 to calculate a 90% confidence interval:

from statsmodels.stats.proportion import proportion_confint

#calculate 90% confidence interval with 56 successes in 100 trials
proportion_confint(count=56, nobs=100, alpha=0.10, method='wilson')

(0.47783814499647415, 0.6390007285095451)

This tells us that the 90% confidence interval for the true proportion of residents in the county that support the law is [.4778, .6390].
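
The link between alpha and the width of the interval runs through the z-value: a smaller alpha means a higher confidence level, a larger z-value, and therefore a wider interval. The sketch below shows that mapping with the Python standard library; z_critical is a hypothetical helper name used only for illustration.

```python
from statistics import NormalDist


def z_critical(alpha):
    """Two-sided z-value for a given significance level (illustrative helper)."""
    return NormalDist().inv_cdf(1 - alpha / 2)


# z-values for 90%, 95%, and 99% confidence levels
for alpha in (0.10, 0.05, 0.01):
    print(f"alpha = {alpha:.2f} -> z = {z_critical(alpha):.3f}")
```

This prints z-values of 1.645, 1.960, and 2.576 respectively.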

Note: The complete documentation for the proportion_confint() function is available in the statsmodels documentation.

Additional Resources

The following tutorials explain how to perform other common operations in Python:

How to Plot a Confidence Interval in Python
How to Use the Binomial Distribution in Python
