Why is Sample Size Important? (Explanation & Examples)


Sample size refers to the total number of individuals involved in an experiment or study.

Sample size is important because it directly affects how precisely we can estimate population parameters.

To understand why this is the case, it helps to have a basic understanding of confidence intervals.

A Brief Explanation of Confidence Intervals

In statistics, we’re often interested in measuring population parameters – numbers that describe some characteristic of an entire population.

For example, we might be interested in measuring the mean height of all individuals in a certain city.

However, it’s often too costly and time-consuming to collect data on every individual in a population, so we typically take a random sample from the population instead and use data from the sample to estimate the population parameter.

For example, we might collect data on the height of 100 random individuals in the city. We can then calculate the mean height of the individuals in the sample. However, we can’t be certain that the sample mean exactly matches the population mean.

To account for this uncertainty, we can create a confidence interval. A confidence interval is a range of values that is likely to contain a population parameter with a certain level of confidence.

The formula to calculate a confidence interval for a population mean is:

Confidence Interval = x +/- z*(s/√n)

where:

  • x: sample mean
  • z: the chosen z-value
  • s: sample standard deviation
  • n: sample size

The z-value that you will use is dependent on the confidence level that you choose. The following table shows the z-value that corresponds to popular confidence level choices:

Confidence Level    z-value
0.90                1.645
0.95                1.96
0.99                2.58
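The z-values in this table can be reproduced from the standard normal distribution. The following is a minimal sketch using Python's standard library; the function name z_value is chosen here for illustration and is not part of the original tutorial.

```python
from statistics import NormalDist


def z_value(confidence_level: float) -> float:
    """Return the two-sided critical z-value for a confidence level such as 0.95."""
    # The probability left outside the interval is split equally between both tails.
    tail = (1 - confidence_level) / 2
    # inv_cdf gives the point on the standard normal curve with that much area to its left.
    return NormalDist().inv_cdf(1 - tail)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.2f} -> {z_value(level):.3f}")
```

To three decimal places the 0.99 value is 2.576, which the table rounds to 2.58.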

The Relationship Between Sample Size & Confidence Intervals

Suppose we want to estimate the mean weight of a population of turtles. We collect a random sample of turtles with the following information:

  • Sample size n = 25
  • Sample mean weight x = 300 pounds
  • Sample standard deviation s = 18.5 pounds

Here is how to calculate the 90% confidence interval for the true population mean weight:

90% Confidence Interval: 300 +/- 1.645*(18.5/√25) = [293.91, 306.09]

We are 90% confident that the true mean weight of the turtles in the population is between 293.91 and 306.09 pounds.
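The calculation for the 25 turtles can be checked with a few lines of Python. This is only a sketch of the formula given earlier; the function name mean_confidence_interval is an illustrative choice, not something defined in the original tutorial.

```python
import math


def mean_confidence_interval(mean, sd, n, z):
    """Return (lower, upper) bounds of a z-based confidence interval for a mean."""
    # Margin of error: z * (s / sqrt(n))
    margin = z * sd / math.sqrt(n)
    return mean - margin, mean + margin


# Values from the turtle sample: n = 25, x = 300, s = 18.5, z = 1.645 for 90% confidence
lower, upper = mean_confidence_interval(mean=300, sd=18.5, n=25, z=1.645)
print(f"[{lower:.2f}, {upper:.2f}]")  # prints [293.91, 306.09]
```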

Now suppose instead of 25 turtles, we actually collect data for 50 turtles. 

Here is how to calculate the 90% confidence interval for the true population mean weight:

90% Confidence Interval: 300 +/- 1.645*(18.5/√50) = [295.70, 304.30]

Notice that this confidence interval is narrower than the previous confidence interval. This means our estimate of the true population mean weight of turtles is more precise.

Now suppose we instead collected data for 100 turtles. 

Here is how to calculate the 90% confidence interval for the true population mean weight:

90% Confidence Interval: 300 +/- 1.645*(18.5/√100) = [296.96, 303.04]

Notice that this confidence interval is even narrower than the previous confidence interval.
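To see the effect of sample size in one place, the same formula can be evaluated for each of the three sample sizes. The sketch below assumes the sample mean and standard deviation stay at 300 and 18.5 for every sample size, as in the examples above; the helper name interval_width is illustrative.

```python
import math


def interval_width(sd, n, z):
    """Width of a z-based confidence interval for a mean: 2 * z * (s / sqrt(n))."""
    return 2 * z * sd / math.sqrt(n)


mean, sd, z = 300, 18.5, 1.645  # turtle example values, 90% confidence

widths = {}
for n in (25, 50, 100):
    widths[n] = interval_width(sd, n, z)
    half = widths[n] / 2  # the margin of error is half the width
    print(f"n = {n:>3}: [{mean - half:.2f}, {mean + half:.2f}]  width = {widths[n]:.2f}")
```

The width depends on the sample size only through √n, so the interval for n = 100 is half as wide as the interval for n = 25.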

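The following table summarizes each of the confidence interval widths (upper bound minus lower bound, which equals 2 * 1.645 * (18.5/√n)):

Sample Size    90% Confidence Interval Width
25             12.17
50             8.61
100            6.09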

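Here’s the takeaway: The larger the sample size, the more precisely we can estimate a population parameter. Because the margin of error is proportional to 1/√n, quadrupling the sample size (from 25 to 100 turtles in this example) cuts the width of the confidence interval in half.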

Additional Resources

The following tutorials provide other helpful explanations of confidence intervals and sample size.

An Introduction to Confidence Intervals
4 Examples of Confidence Intervals in Real Life
Population vs. Sample: What’s the Difference?
