# Undercoverage Bias: Explanation, Examples, & How to Prevent It

This tutorial explains what undercoverage bias is, walks through several examples, and shows how it can be prevented in studies.

## What is Undercoverage Bias?

Undercoverage bias is the bias that occurs when some members of a population are inadequately represented in the sample. This type of bias often occurs in convenience sampling, in which you collect a sample that is easy to obtain but is often prone to undercoverage of certain members of a population.

## Why is Undercoverage Bias a Problem?

Undercoverage bias is a problem because it causes the sample to be unrepresentative of the population. The point of collecting data for a sample is to obtain data in a way that is quicker and easier than collecting data for an entire population, and to be able to extrapolate the findings from the sample to the larger population.

In order to extrapolate the findings, though, the sample needs to be representative of our population as a whole. Ideally we would like our sample to be a “mini” version of the population. Unfortunately, undercoverage bias can cause the people in our sample to be systematically different from the people in the larger population.

For example, suppose researchers want to know what citizens in a certain city think of a potential new law. To collect data, they go to a nearby library and ask people who walk in what they think of the potential new law. Although this is a convenient way to gather data, the researchers risk undercoverage of several types of people, including:

• People who are housebound
• People who simply don’t like visiting the library
• People who go to a different library in a different part of the city

Because this study excludes certain types of people, the results of the study are unlikely to be representative of the population. For example, suppose the people who go to this particular library are far more likely to be supportive of the potential new law compared to the rest of the population. This means that when the results of the survey are in, it will appear that a large percentage of citizens in this city support the potential new law, when in fact most of the citizens do not.

The visual below illustrates this problem: suppose the green circles represent people who are in favor of the new law while the red circles represent people who are opposed to the new law:

Notice how most of the people who are in favor of the new law are included in the sample, yet the sample is not representative of the larger population. The results of the survey would show that most people are in favor of the new law, when in fact this is not true.
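The library scenario can also be sketched as a small simulation. The numbers here are hypothetical and chosen only for illustration: a city of 10,000 citizens in which 1,000 visit the library (80% of them support the law) and 9,000 do not (30% of them support it). The `support_rate` helper is likewise just a name made up for this sketch.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical city: 1,000 library visitors (80% support the law)
# and 9,000 non-visitors (30% support the law).
population = (
    [{"visits_library": True, "supports": i < 800} for i in range(1000)]
    + [{"visits_library": False, "supports": i < 2700} for i in range(9000)]
)

def support_rate(people):
    """Share of people in the list who support the law."""
    return sum(p["supports"] for p in people) / len(people)

# Convenience sample: only people who visit the library can be selected.
library_visitors = [p for p in population if p["visits_library"]]
convenience_sample = random.sample(library_visitors, 200)

# True support is (800 + 2,700) / 10,000 = 35% by construction.
print(f"True support in population: {support_rate(population):.1%}")
print(f"Support in library sample:  {support_rate(convenience_sample):.1%}")
```

Under these assumptions the library sample reports support far above the true 35%, because nobody outside the library had any chance of being asked.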

## Examples of Undercoverage Bias

The following examples illustrate several cases in which undercoverage bias can occur.

### Example 1

Researchers want to learn what citizens in a certain city think of having a new park built. In order to collect data, researchers attend a local town meeting and ask people there about their thoughts. Unfortunately, this form of convenience sampling is likely to suffer from undercoverage of the following groups:

• People who have no access to transportation to go to the town meetings
• People who aren’t even aware of the fact that town meetings take place
• People who work in the evenings and are simply unable to attend town meetings

Thus, the opinions of these people will not be included in the results of the study. Because of this undercoverage of these specific groups, the sample is unlikely to be representative of the larger population.

### Example 2

Researchers want to know how many hours per day people watch TV in a particular county. To collect data for the study, they randomly pick names from a local phonebook and call people to ask them about their TV consumption. Although the names are picked at random, the phonebook does not list every resident of the county, so the sample is likely to suffer from undercoverage of the following groups:

• Very wealthy people who do not list their phone numbers in local phonebooks
• Young people who only use cellphones and do not have their numbers listed in local phonebooks

Thus, very wealthy people and young people will be underrepresented in this study, and so will their TV habits. Because of this undercoverage of these specific groups, the sample is unlikely to be representative of the larger population.
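The phonebook example shows that picking names at random does not help when the list being sampled from leaves people out. The sketch below illustrates this with invented figures: it assumes that 6,000 listed residents watch about 4 hours of TV per day and 4,000 unlisted residents watch about 2 hours. Neither number comes from real data.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical county (all figures invented for illustration):
# listed residents average about 4 hours of TV per day,
# unlisted residents average about 2 hours per day.
listed = [max(0.0, random.gauss(4, 1)) for _ in range(6000)]
unlisted = [max(0.0, random.gauss(2, 1)) for _ in range(4000)]
population = listed + unlisted

# Names are picked at random, but only from the phonebook,
# so unlisted residents can never be selected.
phonebook_sample = random.sample(listed, 500)

print(f"Population mean:       {mean(population):.2f} hours")
print(f"Phonebook sample mean: {mean(phonebook_sample):.2f} hours")
```

Under these assumptions the phonebook sample overstates average TV time, since the group that watches less is missing from the list entirely.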

### Example 3

Researchers want to know what citizens in a particular city think of a new traffic law, so they give out a questionnaire to people who walk by at a local mall. This is a form of convenience sampling and it’s likely to suffer from undercoverage of the following groups:

• People who have no access to transportation to go to the mall (and thus may be affected by traffic laws differently than drivers are)
• People who don’t like going to the mall (and thus may choose not to drive in busy areas)
• People who go to a different mall in a different city

Thus, the opinions of these people will not be included in the results of the study. Because of this undercoverage of these specific groups, the sample is unlikely to be representative of the larger population.

## How to Prevent Undercoverage Bias

Undercoverage bias often occurs as a result of convenience sampling. To eliminate (or at least minimize) undercoverage bias, a better approach is to draw a simple random sample from a list that covers the entire population. In this type of sample, every member of the population has an equal chance of being selected.

The benefit of this approach is that simple random samples are usually representative of the population we’re interested in, especially when the sample is reasonably large, since every member has an equal chance of being included. When we use this approach instead of convenience sampling, we can be more confident in our ability to extrapolate the findings from the sample to the larger population, since it’s likely that members from every (or nearly every) group in the population are included in the sample.
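As a minimal sketch of simple random sampling, the snippet below uses Python's built-in `random.sample` to draw from a complete list of a hypothetical population of 10,000 citizens, 35% of whom support a new law. The population size, the support rate, and the sample size of 500 are all assumptions made for illustration.

```python
import random

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical population of 10,000 citizens; True means "supports the law".
# Exactly 35% support it by construction.
population = [True] * 3500 + [False] * 6500

# Simple random sample: every citizen on the full list
# has the same chance of being selected.
srs = random.sample(population, 500)

population_rate = sum(population) / len(population)
sample_rate = sum(srs) / len(srs)

print(f"Population support:           {population_rate:.1%}")
print(f"Simple random sample support: {sample_rate:.1%}")
```

The sample estimate will not match the population value exactly, but it should land close to it, and the gap tends to shrink as the sample size grows. The key requirement is that `population` really does contain everyone; sampling at random from an incomplete list brings undercoverage back.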