A frequency distribution describes how often different values occur in a dataset.
For example, suppose we gather a simple random sample of 400 households in a city and record the number of pets in each household. The following table shows the results:
This table represents a frequency distribution.
A related distribution is the relative frequency distribution, which shows the frequency of each value as a percentage of the total number of observations in the dataset.
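For example, in the previous table we saw that there were 400 total households. To find the relative frequency of each value in the distribution, we simply divide each individual frequency by 400. For instance, the 150 households with exactly one pet have a relative frequency of 150 / 400 = 0.375, or 37.5%.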
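The division step can be sketched in a few lines of Python. This is a minimal illustration, not the article's own calculation: only the total of 400 households and the count of 150 one-pet households come from the example, while the remaining counts and the names `pet_counts` and `relative_frequencies` are assumptions chosen so that the total works out.

```python
# Hypothetical counts: only the total (400) and the one-pet count (150)
# come from the example; the other values are made up for illustration.
pet_counts = {0: 120, 1: 150, 2: 80, 3: 35, 4: 15}


def relative_frequencies(counts):
    """Return each count divided by the total, as a fraction between 0 and 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("cannot compute relative frequencies of an empty dataset")
    return {value: count / total for value, count in counts.items()}


rel_freq = relative_frequencies(pet_counts)

# Print each relative frequency as a percentage.
for pets, share in rel_freq.items():
    print(f"{pets} pet(s): {share:.1%}")
```

The function returns fractions rather than percentages so that later calculations stay simple; multiplying by 100 (or using the `%` format specifier, as above) converts them for display.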
Note that relative frequency distributions have the following properties:
- Each individual relative frequency is between 0% and 100%.
- The individual relative frequencies sum to 100%.
If these conditions are not met, then the relative frequency distribution is not valid.
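One way to check these two conditions programmatically is sketched below. It assumes the relative frequencies are stored as fractions (so 100% corresponds to 1.0) and allows a small tolerance for floating-point rounding. The function name and the sample values are hypothetical; only 0.375 comes from the example.

```python
import math


def is_valid_relative_frequency_distribution(rel_freqs, tol=1e-9):
    """Check both properties for relative frequencies given as fractions (0 to 1)."""
    values = list(rel_freqs)
    # Property 1: every relative frequency lies between 0 and 1 (0% and 100%).
    in_range = all(0.0 <= p <= 1.0 for p in values)
    # Property 2: the relative frequencies sum to 1 (100%), within a tolerance.
    sums_to_one = math.isclose(sum(values), 1.0, abs_tol=tol)
    return in_range and sums_to_one


# Illustrative inputs: the first satisfies both properties, the second sums to more than 1.
valid = is_valid_relative_frequency_distribution([0.30, 0.375, 0.20, 0.0875, 0.0375])
invalid = is_valid_relative_frequency_distribution([0.50, 0.375, 0.20])
```

The tolerance matters in practice: fractions such as 1/3 cannot be represented exactly in floating point, so an exact comparison with 1.0 could reject a distribution that is valid.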
Why Relative Frequency Distributions Are Useful
Relative frequency distributions are useful because they allow us to understand how common a value is in a dataset relative to all other values.
In the previous example we saw that 150 households had just one pet. On its own, though, this count tells us little, because its meaning depends on how many households were sampled.
Knowing that 37.5% of all households in the sample had just one pet is more informative. It tells us that a little more than 1 in 3 households had just one pet, which gives us some perspective on how “common” it is to own just one pet.
Visualizing a Relative Frequency Distribution
The most common way to visualize a relative frequency distribution is to create a relative frequency histogram, which displays the individual data values along the x-axis and uses the height of each bar along the y-axis to represent the relative frequency of that value.
For example, here’s what a relative frequency histogram would look like for the data in our previous example:
The x-axis displays the number of pets in the household and the y-axis displays the relative frequency of households that have that number of pets.
This histogram is a useful way for us to visualize the distribution of relative frequencies.
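As a rough, dependency-free sketch of the same idea, the snippet below prints a text version of a relative frequency histogram. It is rotated relative to the chart described above: the values run down the side and each bar extends horizontally, with its length proportional to the relative frequency. Only the total of 400 and the one-pet count of 150 come from the example; the other counts and the name `text_histogram` are assumptions for illustration.

```python
# Hypothetical counts: only the total (400) and the one-pet count (150)
# come from the example; the rest are illustrative.
pet_counts = {0: 120, 1: 150, 2: 80, 3: 35, 4: 15}


def text_histogram(counts, width=40):
    """Build one line per value; bar length is proportional to relative frequency."""
    total = sum(counts.values())
    lines = []
    for value in sorted(counts):
        share = counts[value] / total
        # A relative frequency of 100% would fill the full width.
        bar = "#" * round(share * width)
        lines.append(f"{value:>2} | {bar} {share:.1%}")
    return lines


chart = text_histogram(pet_counts)
print("\n".join(chart))
```

A plotting library would produce a proper vertical histogram; the text version is used here only so that the sketch runs without extra packages.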