Comparing Distributions

Histograms and boxplots are commonly used to compare different distributions.

Comparing Distributions with Histograms

Example 1

The following two histograms show the distribution of scores on a particular test for two different classes:

Question: The student with the lowest test score belonged to which class?
Answer: Mrs. Burton’s class. One student in her class scored in the 50-59 range, while no student in Mrs. Wood’s class scored that low.

Question: Which class had a higher median test score?
Answer: Mrs. Wood’s class. The center value in Mrs. Wood’s class distribution is greater than the center value in Mrs. Burton’s class distribution.
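The median comparison above can also be done numerically when the bar heights of a histogram are known. The following is a minimal sketch: the bin labels and counts are hypothetical stand-ins (the actual counts from the histograms are not reproduced here), and `median_bin` is an illustrative helper name, not part of any library. It walks the bins from left to right and reports the bin that holds the middle observation.

```python
def median_bin(labels, counts):
    """Return the label of the bin that holds the middle observation.

    For an even number of observations this uses the lower of the two
    middle positions, which is enough to locate the median's bin roughly.
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("counts must contain at least one observation")
    middle = (total + 1) // 2  # 1-based position of the (lower) middle value
    running = 0
    for label, count in zip(labels, counts):
        running += count  # cumulative number of observations so far
        if running >= middle:
            return label

# Hypothetical bin counts, for illustration only.
labels = ["50-59", "60-69", "70-79", "80-89", "90-99"]
class_a = [1, 4, 6, 3, 1]  # a class with scores centered lower
class_b = [0, 2, 4, 6, 3]  # a class with scores centered higher

print(median_bin(labels, class_a))
print(median_bin(labels, class_b))
```

Because a histogram only shows counts per bin, this identifies the bin containing the median rather than the exact median score.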

Example 2

The following two histograms show the distribution of taco sales at Austin’s and Mark’s taco stands over the past 50 days:

Question: Which taco stand experienced more variability in its taco sales?
Answer: Austin’s taco stand experienced more variability in sales since his distribution of sales is more “spread out” than Mark’s.

Question: Which taco stand sold the most tacos in a single day?
Answer: Austin’s taco stand sold the most tacos in a single day. Austin sold eight tacos on three different days and Mark never had a day where he sold eight tacos.
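The visual judgments in this example ("more spread out", "most tacos in a single day") correspond to simple summary statistics. The sketch below uses hypothetical daily counts that only mimic the shape described in the answers (50 days each, eight tacos sold on three days at one stand and never at the other); they are not the values from the histograms. It compares the range, the standard deviation, and the single-day maximum using Python's standard `statistics` module.

```python
from statistics import pstdev

def expand(days_by_sales):
    """Turn {tacos_sold: number_of_days} into one value per day."""
    return [sold for sold, days in sorted(days_by_sales.items())
            for _ in range(days)]

# Hypothetical frequency tables: tacos sold in a day -> number of days.
austin = expand({1: 3, 2: 5, 3: 8, 4: 10, 5: 9, 6: 7, 7: 5, 8: 3})
mark = expand({3: 10, 4: 20, 5: 15, 6: 5})

# Two common measures of spread: range and population standard deviation.
austin_range = max(austin) - min(austin)
mark_range = max(mark) - min(mark)
austin_sd = pstdev(austin)
mark_sd = pstdev(mark)

print("Austin: range", austin_range, "sd", round(austin_sd, 2), "max", max(austin))
print("Mark:   range", mark_range, "sd", round(mark_sd, 2), "max", max(mark))
```

A wider range and a larger standard deviation both point to the more "spread out" distribution, and `max` gives the best single day directly.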

Comparing Distributions with Boxplots

Example 1

The two boxplots below show the distribution of plant heights (in inches) in two different gardens:

Question: In which garden is the median plant height greater?
Answer: The median plant height is greater in Michael’s garden. Recall that the vertical line inside the box indicates the median. The median plant height in Michael’s garden is about 15 inches, compared to about 10 inches in Jan’s garden.

Question: To which garden does the smallest plant belong?
Answer: The smallest plant belongs to Jan’s garden. Recall that the dot farthest to the left on the boxplot indicates the minimum value in a dataset. In this case, the smallest plant in Jan’s garden is about one inch tall, while the smallest plant in Michael’s garden is about four inches tall.

Example 2

The two boxplots below show the distribution of daily traffic tickets issued in two different cities during a certain month:

Question: Which city had the higher median number of tickets issued per day?
Answer: Both cities issued about the same median number of tickets per day. Notice how the vertical line inside the box for both boxplots is located at six.

Question: Which city had more variation in the number of tickets issued per day?
Answer: City A had more variation in the number of tickets issued per day. Notice how the box for city A, whose endpoints are located at the lower quartile (Q1) and upper quartile (Q3), is much wider than the box for city B. The width of the box is the interquartile range (Q3 minus Q1), so a wider box indicates more variation in the number of tickets issued per day in city A.
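The quantities read off a boxplot in this example (the median line and the box from Q1 to Q3) can be computed directly. The following is a rough sketch with hypothetical ticket counts, chosen only so that both cities share a median of six while one is more spread out; they are not the data behind the boxplots, and `box_summary` is an illustrative helper name. Note that `statistics.quantiles` uses one of several common quartile conventions, so other tools may report slightly different Q1 and Q3 values.

```python
from statistics import median, quantiles

def box_summary(values):
    """Return the median, quartiles, and interquartile range of values."""
    q1, _, q3 = quantiles(values, n=4)  # three cut points: Q1, Q2, Q3
    return {"median": median(values), "q1": q1, "q3": q3, "iqr": q3 - q1}

# Hypothetical daily ticket counts, for illustration only.
city_a = [1, 2, 3, 4, 5, 6, 6, 7, 8, 9, 10, 11]
city_b = [4, 5, 5, 6, 6, 6, 6, 6, 7, 7, 8]

summary_a = box_summary(city_a)
summary_b = box_summary(city_b)

print("City A:", summary_a)
print("City B:", summary_b)
```

Equal medians with a larger interquartile range for one city is the numerical counterpart of two boxes centered at the same line where one box is wider.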