A confidence interval for a binomial probability is calculated using the following formula:
Confidence Interval = p ± z * √(p(1 - p) / n)
where:
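- p: sample proportion of "successes" (number of successes divided by n)
- z: the z critical value for the chosen confidence level (for example, 1.96 for 95%)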
- n: sample size
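To make the formula concrete, here is a minimal sketch that applies it by hand using only the Python standard library. The function name normal_approx_interval is hypothetical (it is not part of statsmodels), and the inputs of 56 successes in 100 trials are illustrative values.

```python
from math import sqrt
from statistics import NormalDist

def normal_approx_interval(count, nobs, alpha=0.05):
    """Normal-approximation interval for a binomial proportion (illustrative helper)."""
    p = count / nobs                          # sample proportion of successes
    z = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided z critical value
    margin = z * sqrt(p * (1 - p) / nobs)     # z times the standard error of p
    return p - margin, p + margin

lower, upper = normal_approx_interval(56, 100)
print(round(lower, 4), round(upper, 4))  # 0.4627 0.6573
```

The margin of error shrinks as n grows, since n sits in the denominator under the square root.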
The easiest way to calculate this type of confidence interval in Python is to use the proportion_confint() function from the statsmodels package:
proportion_confint(count, nobs, alpha=0.05, method='normal')
where:
- count: Number of successes
- nobs: Total number of trials
- alpha: Significance level (default is 0.05)
- method: Method to use for confidence interval (default is “normal”)
The following example shows how to use this function in practice.
Example: Calculate Binomial Confidence Interval in Python
Suppose we want to estimate the proportion of residents in a county that are in favor of a certain law.
We decide to select a random sample of 100 residents and find that 56 of them are in favor of the law.
We can use the proportion_confint() function to calculate the 95% confidence interval for the true proportion of residents who support this law in the entire county:
from statsmodels.stats.proportion import proportion_confint

#calculate 95% confidence interval with 56 successes in 100 trials
proportion_confint(count=56, nobs=100)

(0.4627099463758483, 0.6572900536241518)
The 95% confidence interval for the true proportion of residents in the county that support the law is [.4627, .6573].
By default, this function uses the asymptotic normal approximation to calculate the confidence interval. However, we can use the method argument to use a different method.
For example, the prop.test() function in the R programming language reports a Wilson score interval (with a continuity correction by default).
We can use the following syntax to specify this method when calculating the confidence interval in Python:
from statsmodels.stats.proportion import proportion_confint

#calculate 95% confidence interval with 56 successes in 100 trials
proportion_confint(count=56, nobs=100, method='wilson')

(0.4622810465167698, 0.6532797336983921)
This tells us that the 95% confidence interval for the true proportion of residents in the county that support the law is [.4623, .6533].
This interval differs only slightly from the one calculated using the normal approximation: it is a little narrower and is centered slightly closer to 0.5.
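For readers curious about what the Wilson method computes, the following is a minimal standard-library sketch of the usual Wilson score formula without a continuity correction. The function name wilson_interval is hypothetical, and this is not a copy of the statsmodels implementation, only an illustration that is expected to agree with it to several decimal places.

```python
from math import sqrt
from statistics import NormalDist

def wilson_interval(count, nobs, alpha=0.05):
    """Wilson score interval without continuity correction (illustrative helper)."""
    p = count / nobs                          # sample proportion of successes
    z = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided z critical value
    denom = 1 + z**2 / nobs
    # The center is pulled from p toward 0.5
    center = (p + z**2 / (2 * nobs)) / denom
    half_width = z * sqrt(p * (1 - p) / nobs + z**2 / (4 * nobs**2)) / denom
    return center - half_width, center + half_width

lower, upper = wilson_interval(56, 100)
print(round(lower, 4), round(upper, 4))  # 0.4623 0.6533
```

Because the center is shifted toward 0.5, the Wilson interval is not symmetric around the sample proportion, unlike the normal approximation.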
Note that we can also adjust the alpha value to calculate a different confidence interval.
For example, we can set alpha to be 0.10 to calculate a 90% confidence interval:
from statsmodels.stats.proportion import proportion_confint

#calculate 90% confidence interval with 56 successes in 100 trials
proportion_confint(count=56, nobs=100, alpha=0.10, method='wilson')

(0.47783814499647415, 0.6390007285095451)
This tells us that the 90% confidence interval for the true proportion of residents in the county that support the law is [.4778, .6390].
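The link between alpha and the width of the interval comes from the z critical value: a smaller alpha means a higher confidence level, a larger z, and a wider interval. As a small sketch using only the standard library (the helper name z_critical is hypothetical), the z-values for a few common alpha levels can be computed like this:

```python
from statistics import NormalDist

def z_critical(alpha):
    """Two-sided z critical value for significance level alpha (illustrative helper)."""
    return NormalDist().inv_cdf(1 - alpha / 2)

for alpha in (0.10, 0.05, 0.01):
    print(f"{1 - alpha:.0%} confidence: z = {z_critical(alpha):.4f}")

# 90% confidence: z = 1.6449
# 95% confidence: z = 1.9600
# 99% confidence: z = 2.5758
```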
Note: You can find the complete documentation for the proportion_confint() function in the statsmodels documentation.