# Control Selection Bias: Definition & Examples

In statistics, a case-control study is a type of study that seeks to understand factors associated with a particular disease.

This type of study uses a group of cases, which are individuals who have the particular disease of interest.

Researchers then attempt to create a group of controls, which are individuals who are similar to the individuals in the case group but don’t have the particular disease of interest.

However, one common type of bias that can occur in this type of study is known as control selection bias.

This occurs when the individuals in the control group are not representative of the population that produced the cases.

This typically happens because the way the controls are recruited makes them more (or less) likely than that population to have been exposed to the factor of interest, that is, the factor that may affect the probability of developing the disease.

## Example: Control Selection Bias in a Lung Cancer Study

Suppose that a medical researcher is interested in studying the association between smoking and lung cancer.

For the case group, the researcher recruits 100 individuals who currently have lung cancer.

For the control group, suppose the researcher simply goes around a local hospital and collects a simple random sample of 100 patients who do not have lung cancer.

Because the researcher recruited the controls from a hospital, these individuals are much more likely to have a history of smoking than individuals in a simple random sample drawn from the local community.

This means that the control group is unlikely to be representative of the target population.

## Problems Caused by Control Selection Bias

When control selection bias occurs in a study, it means that the results of the study are unreliable.

In particular, the odds ratio calculated by a researcher is likely to be inaccurate because the actual odds of a patient being exposed to some factor (like smoking) in the control group are not representative of the odds that an individual is exposed to this factor in the target population.

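In the previous example, the odds of an individual being a smoker in the control group are likely to be higher than the odds of an individual being a smoker in the general population. This makes smoking seem more prevalent among people without lung cancer than it is in the real world, which pulls the odds ratio toward 1 and understates the association between smoking and lung cancer.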
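To make the effect on the odds ratio concrete, here is a minimal sketch in Python. All of the counts are hypothetical and chosen only for illustration: 60 of the 100 cases smoke, 20 of 100 community controls smoke, and 40 of 100 hospital controls smoke. The helper `odds_ratio` is likewise just an illustrative name.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Cross-product odds ratio for a 2x2 case-control table.

    Equivalent to (odds of exposure among cases) / (odds of exposure among controls).
    """
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)


# Hypothetical counts, invented for illustration only.
smoking_cases, nonsmoking_cases = 60, 40

# Controls drawn from the community: 20 smokers, 80 non-smokers.
or_community = odds_ratio(smoking_cases, nonsmoking_cases, 20, 80)

# Controls drawn from a hospital: 40 smokers, 60 non-smokers.
or_hospital = odds_ratio(smoking_cases, nonsmoking_cases, 40, 60)

print(or_community)  # 6.0
print(or_hospital)   # 2.25
```

With these made-up numbers, the same set of cases yields an odds ratio of 6.0 against community controls but only 2.25 against hospital controls, so the association looks much weaker purely because of where the controls came from.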

In medical studies, odds ratios are one of the main ways that researchers are able to assess the association of some lifestyle factor (such as smoking) with the probability of developing some disease.

When control selection bias is present in a study, the odds ratios are distorted, and so are any conclusions drawn from them.

The main way to prevent control selection bias is to select controls using a sampling method that is likely to produce a sample representative of the population that produced the cases.
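As a rough illustration of how much the sampling method can matter, the sketch below builds an entirely hypothetical population (every probability in it is invented) and draws 100 controls in two ways: a simple random sample of everyone without lung cancer, and a sample restricted to hospital patients. The field names and the `smoking_share` helper are assumptions made for this example.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Hypothetical source population; all probabilities are invented for illustration.
population = []
for person_id in range(20_000):
    in_hospital = rng.random() < 0.05
    smoker = rng.random() < (0.45 if in_hospital else 0.15)
    has_lung_cancer = rng.random() < (0.02 if smoker else 0.002)
    population.append({
        "id": person_id,
        "in_hospital": in_hospital,
        "smoker": smoker,
        "has_lung_cancer": has_lung_cancer,
    })

# Controls must not have the disease of interest.
eligible = [p for p in population if not p["has_lung_cancer"]]

# Option 1: simple random sample from the whole source population.
community_controls = rng.sample(eligible, 100)

# Option 2: sample restricted to people who are hospital patients.
hospital_pool = [p for p in eligible if p["in_hospital"]]
hospital_controls = rng.sample(hospital_pool, 100)


def smoking_share(group):
    """Proportion of smokers in a list of people."""
    return sum(p["smoker"] for p in group) / len(group)


print("Smokers among community controls:", smoking_share(community_controls))
print("Smokers among hospital controls: ", smoking_share(hospital_controls))
```

Under these assumptions, the hospital-based controls should show a noticeably higher share of smokers than the community-based controls, which is the pattern described in the example above.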