When constructing confidence intervals, it’s important that certain assumptions are met. If these assumptions are violated, then the confidence interval can become unreliable.
Here are the six assumptions you should check when constructing a confidence interval:
Assumption #1: Random Sampling
The data should be collected using a random sampling method (a method in which each individual in a population is equally likely to be included in the sample) so that the sample data you’re working with is representative of the overall population of interest.
Assumption #2: Independence
Each observation in the sample data should be independent of every other observation. This means that no two observations in a sample are related to each other or affect each other in any way.
If you use a random sampling method to collect the data, this assumption is typically met.
Assumption #3: Large Sample
In order to apply the Central Limit Theorem, our sample size must be sufficiently large. In general, we consider “sufficiently large” to be 30 or larger. However, this number can vary based on the underlying shape of the population distribution.
- If the population distribution is symmetric, sometimes a sample size as small as 15 is sufficient.
- If the population distribution is skewed, generally a sample size of at least 30 is needed.
- If the population distribution is extremely skewed, then a sample size of 40 or higher may be necessary.
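As a rough illustration, the guidelines above can be encoded as a small lookup. This is only a sketch: the names `MIN_SAMPLE_SIZE` and `is_large_enough` are hypothetical, and the thresholds are the rules of thumb listed above, not hard cutoffs.

```python
# Rule-of-thumb minimum sample sizes, taken from the guidelines above.
# These are rough guides, not strict requirements.
MIN_SAMPLE_SIZE = {
    "symmetric": 15,
    "skewed": 30,
    "extremely skewed": 40,
}


def is_large_enough(n, shape="skewed"):
    """Return True if n meets the rule-of-thumb minimum for the given shape."""
    return n >= MIN_SAMPLE_SIZE[shape]


print(is_large_enough(20, "symmetric"))  # True
print(is_large_enough(20, "skewed"))     # False
```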
Assumption #4: The 10% Condition
When sampling without replacement, the sample size should be less than or equal to 10% of the population size. This helps ensure that the observations in the data are approximately independent.
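A minimal sketch of this check, assuming the population size is known. The function name `meets_ten_percent_condition` is made up for illustration.

```python
def meets_ten_percent_condition(sample_size, population_size):
    """Return True if the sample is at most 10% of the population."""
    # Integer arithmetic avoids floating-point rounding at the boundary.
    return 10 * sample_size <= population_size


print(meets_ten_percent_condition(400, 5000))  # True
print(meets_ten_percent_condition(600, 5000))  # False
```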
Assumption #5: The Success / Failure Condition
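When working with confidence intervals that involve proportions, there should be at least 10 expected successes and 10 expected failures in a sample in order to use the normal distribution as an approximation. In other words, for a sample of size n with sample proportion p, both np and n(1 - p) should be at least 10.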
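The condition can be checked with a few lines of code. The sketch below assumes the expected counts are computed from the sample size and the sample proportion; the function name and the parameters `n`, `p_hat`, and `minimum` are hypothetical.

```python
def meets_success_failure_condition(n, p_hat, minimum=10):
    """Return True if expected successes and failures both reach the minimum."""
    expected_successes = n * p_hat
    expected_failures = n * (1 - p_hat)
    return expected_successes >= minimum and expected_failures >= minimum


# 100 observations with a sample proportion of 0.3:
# 30 expected successes and 70 expected failures.
print(meets_success_failure_condition(100, 0.3))  # True

# 50 observations with a sample proportion of 0.1:
# only about 5 expected successes.
print(meets_success_failure_condition(50, 0.1))   # False
```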
Assumption #6: Homogeneity of Variances
When working with confidence intervals that involve two samples, it’s assumed that the two populations that the samples came from have equal variances.
As a rule of thumb, if the ratio of the larger variance to the smaller variance is less than 4, then we can assume the variances are approximately equal and use the pooled (equal-variance) two-sample t procedure to construct the interval.
For example, if sample 1 has a variance of 24.5 and sample 2 has a variance of 15.2, then the ratio of the larger sample variance to the smaller would be calculated as 24.5 / 15.2 ≈ 1.61.
Since this ratio is less than 4, we could assume that the variances between the two groups are approximately equal.
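The same calculation can be written as a short script. This is a sketch of the rule of thumb only, with a hypothetical helper name `variance_ratio`; it is not a formal test for equal variances.

```python
def variance_ratio(var_a, var_b):
    """Return the ratio of the larger variance to the smaller variance."""
    larger, smaller = max(var_a, var_b), min(var_a, var_b)
    return larger / smaller


# Sample variances from the example above.
ratio = variance_ratio(24.5, 15.2)

print(round(ratio, 2))  # 1.61
print(ratio < 4)        # True, so the variances are treated as roughly equal
```

Because the helper always divides the larger value by the smaller one, the order in which the two variances are passed does not matter.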