Researchers are often interested in answering questions about populations, such as:
- What is the average height of a certain species of plant?
- What is the average weight of a certain species of bird?
- What percentage of citizens in a certain city support a certain law?
One way to answer these questions is to go around and collect data on every single individual in the population of interest.
However, this is typically too costly and time-consuming, which is why researchers instead take a sample of the population and use the data from the sample to draw conclusions about the population as a whole.
There are many different methods researchers can potentially use to obtain individuals to be in a sample. These are known as sampling methods.
In this post we share the most commonly used sampling methods in statistics, including the benefits and drawbacks of the various methods.
Probability Sampling Methods
The first class of sampling methods is known as probability sampling methods because every member of a population has a known, nonzero probability of being selected to be in the sample. In some designs, such as a simple random sample, these probabilities are also equal.
Simple random sample
Definition: Every member of a population has an equal chance of being selected to be in the sample. Randomly select members through the use of a random number generator or some means of random selection.
Example: We put the names of every student in a class into a hat and randomly draw out names to get a sample of students.
Benefit: Simple random samples are usually representative of the population we’re interested in since every member has an equal chance of being included in the sample.
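As a rough illustration of the names-in-a-hat example, here is a minimal Python sketch. The roster and the helper name `simple_random_sample` are made up for this example; the only library call is `random.Random.sample`, which draws without replacement.

```python
import random

def simple_random_sample(population, k, seed=None):
    """Draw k distinct members; every member is equally likely to be chosen."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(list(population), k)

# Hypothetical class roster (illustrative names only)
students = ["Ana", "Ben", "Cho", "Dev", "Eli", "Fay", "Gus", "Hal", "Ivy", "Jon"]

sample = simple_random_sample(students, k=4, seed=1)
print(sample)
```

Passing a seed is optional; it is included here only so the same sample can be drawn again when checking results.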
Stratified random sample
Definition: Split a population into groups. Randomly select some members from each group to be in the sample.
Example: Split up all students in a school according to their grade: freshmen, sophomores, juniors, and seniors. Randomly select 50 students from each grade to complete a survey about the school lunches.
Benefit: Stratified random samples ensure that members from each group in the population are included in the sample.
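The school example could be sketched in Python roughly as follows. The school of 800 students, the `stratified_sample` helper, and its parameters are assumptions made for illustration; the idea is simply to group members by stratum and then draw a random sample within each group.

```python
import random

def stratified_sample(members, stratum_of, n_per_stratum, seed=None):
    """Group members by stratum, then randomly draw from every group."""
    rng = random.Random(seed)
    strata = {}
    for member in members:
        strata.setdefault(stratum_of(member), []).append(member)
    # Draw without replacement inside each stratum (capped at the stratum size)
    return {
        name: rng.sample(group, min(n_per_stratum, len(group)))
        for name, group in strata.items()
    }

# Hypothetical school: 800 students spread evenly over four grades
grades = ["freshman", "sophomore", "junior", "senior"]
school = [(f"student_{i}", grades[i % 4]) for i in range(800)]

by_grade = stratified_sample(school, stratum_of=lambda s: s[1], n_per_stratum=50, seed=0)
for grade, group in by_grade.items():
    print(grade, len(group))
```

This sketch takes the same number of students from every grade, as in the example above. Sampling each grade in proportion to its size is another common choice.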
Cluster random sample
Definition: Split a population into clusters. Randomly select some of the clusters and include all members from those clusters in the sample.
Example: A company that gives whale watching tours wants to survey its customers. Out of ten tours they give one day, they randomly select four tours and ask every customer about their experience.
Benefit: Cluster random samples get every member from some of the groups, which is useful when each group is reflective of the population as a whole.
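A minimal Python sketch of the whale watching example is shown below. The tour names, the 12 customers per tour, and the `cluster_sample` helper are hypothetical; the point is that randomness applies to the choice of clusters, after which every member of a chosen cluster is included.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick whole clusters and keep every member of each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # sample cluster names
    members = [m for name in chosen for m in clusters[name]]
    return chosen, members

# Hypothetical day of ten tours with 12 customers each
tours = {
    f"tour_{t}": [f"customer_{t}_{i}" for i in range(12)]
    for t in range(1, 11)
}

chosen_tours, surveyed = cluster_sample(tours, n_clusters=4, seed=0)
print(chosen_tours)
print(len(surveyed))
```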
Systematic random sample
Definition: Put every member of a population into some order. Choose a random starting point and select every nth member to be in the sample.
Example: A teacher puts students in alphabetical order according to their last name, randomly chooses a starting point, and picks every 5th student to be in the sample.
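Benefit: Systematic random samples are usually representative of the population we’re interested in since every member has an equal chance of being included in the sample, provided the ordering does not follow a repeating pattern that lines up with the sampling interval.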
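The teacher example could look something like this in Python. The roster of 30 students and the `systematic_sample` helper are assumptions for illustration. The sketch picks the random starting point among the first `step` positions so that every student has the same chance of being selected.

```python
import random

def systematic_sample(ordered_members, step, seed=None):
    """Pick a random start among the first `step` positions, then every step-th member."""
    rng = random.Random(seed)
    start = rng.randrange(step)  # integer from 0 to step - 1
    return ordered_members[start::step]

# Hypothetical roster of 30 students, already in alphabetical order
roster = sorted(f"student_{i:02d}" for i in range(30))

every_fifth = systematic_sample(roster, step=5, seed=0)
print(every_fifth)
```

With 30 students and a step of 5, any starting point yields a sample of six students.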
Non-probability Sampling Methods
Another class of sampling methods is known as non-probability sampling methods because members are not selected at random: the probability that any given member ends up in the sample is unknown, and some members may have no chance of being selected at all.
This type of sampling method is sometimes used because it’s much cheaper and more convenient compared to probability sampling methods. It’s often used during exploratory analysis when researchers simply want to gain an initial understanding of a population.
However, the samples that result from these sampling methods generally cannot be used to draw reliable inferences about the populations they came from because they typically aren’t representative samples.
Convenience sample
Definition: Choose members of a population that are readily available to be included in the sample.
Example: A researcher stands in front of a library during the day and polls people that happen to walk by.
Drawback: Location and time of day will affect the results. More than likely, the sample will suffer from undercoverage bias since certain people (e.g. those who work during the day) will not be represented as much in the sample.
Voluntary response sample
Definition: A researcher puts out a request for volunteers to be included in a study and members of a population voluntarily decide to be included in the sample or not.
Example: A radio host asks listeners to go online and take a survey on his website.
Drawback: People who voluntarily respond will likely have stronger opinions (positive or negative) than the rest of the population, which makes them an unrepresentative sample. This is known as voluntary response bias, a form of self-selection bias. In addition, certain groups of people are simply less likely to provide responses at all.
Snowball sample
Definition: Researchers recruit initial subjects to be in a study and then ask those initial subjects to recruit additional subjects to be in the study. Using this approach, the sample size “snowballs” bigger and bigger as each additional subject recruits more subjects.
Example: Researchers are conducting a study of individuals with a rare disease, but it’s difficult to find individuals who actually have the disease. However, if they can find just a few initial individuals to be in the study, then they can ask them to recruit further individuals they may know through a private support group or through some other means.
Drawback: Sampling bias is likely to occur. Because initial subjects recruit additional subjects, it’s likely that many of the subjects will share similar traits or characteristics that might be unrepresentative of the larger population under study. Thus, findings from the sample can’t be extrapolated to the population.
Purposive sample
Definition: Researchers recruit the individuals they think will be most useful for the purpose of their study.
Example: Researchers want to know what individuals in a city think about a potential new rock climbing gym being placed in the city square, so they purposely seek out individuals that hang out at other rock climbing gyms around the city.
Drawback: The individuals in the sample are unlikely to be representative of the overall population. Thus, findings from the sample can’t be extrapolated to the population.