Often researchers are interested in answering some question about a population, such as:
- What is the mean height of students at a certain school?
- What is the mean household income in a certain city?
- What is the mean square footage of houses in a certain country?
- What proportion of residents in a certain county support a certain law?
The target population is the complete collection of items that researchers are interested in.
Since it’s often too time-consuming and expensive to go around and collect data on every individual in a target population, researchers will instead take a sample from the target population, which is simply a subset of the population.
The list of items from which a sample is obtained is known as the sampling frame. Ideally the sampling frame will be exactly equal to the target population, but this is rarely the case in practice.
Sampling Frame Example
Suppose researchers are interested in estimating the proportion of residents over 18 years old in a certain county that support a certain law.
The target population includes every resident in the city over 18 years old. For simplicity, let’s assume the city contains 100,000 such residents.
Ideally our sampling frame would contain all 100,000 such residents so that we could obtain a sample that is representative of the target population. However, in reality our sampling frame usually doesn’t match our target population perfectly for a list of reasons including:
- Some residents may have moved since the most recent census.
- Some residents may have turned 18 since the most recent census.
- The city may not have complete information about each resident.
- The city may not have means to contact certain residents.
For this reason, our sampling frame (the list of residents over 18 that we’re able to actually obtain information on) likely won’t match our target population perfectly.
Thus, when we go to collect a random sample of residents to survey it’s unlikely that our sample will be perfectly representative of the target population. This is known as sampling frame error.
While it’s typically impossible to get a sampling frame that perfectly matches the target population, researchers often try to make the sampling frame as similar to the target population as possible.
Thus, when they use data from the sample to draw inferences about the target population, they can be reasonably confident that their inferences will hold true.