Skewness is a way to describe the asymmetry of a distribution, that is, whether its values trail off further on one side than on the other.
A distribution is left skewed if it has a “tail” on the left side of the distribution.
A distribution is right skewed if it has a “tail” on the right side of the distribution.
And a distribution has no skew if it’s symmetrical on both sides.
Note that left skewed distributions are sometimes called “negatively skewed” distributions and right skewed distributions are sometimes called “positively skewed” distributions, after the sign of the skewness statistic.
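The sign convention can be checked numerically. The following is a minimal sketch using the moment-based skewness coefficient (the mean cubed deviation divided by the cubed population standard deviation). The function name `moment_skewness` and the three small datasets are made up for illustration and do not come from any real study.

```python
def moment_skewness(values):
    """Moment-based skewness: m3 / m2**1.5, using population moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # population variance
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    if m2 == 0:
        return 0.0  # all values identical, so there is no asymmetry
    return m3 / m2 ** 1.5


# Illustrative (made-up) datasets.
left_tailed = [1, 6, 7, 8, 8, 9, 9, 9, 10, 10]   # long tail toward small values
right_tailed = [1, 1, 2, 2, 2, 3, 3, 4, 5, 10]   # mirror image of left_tailed
symmetric = [1, 2, 3, 4, 5, 6, 7, 8, 9]

for name, data in [("left", left_tailed),
                   ("right", right_tailed),
                   ("symmetric", symmetric)]:
    print(name, round(moment_skewness(data), 3))
```

The left-tailed data gives a negative coefficient, the right-tailed data a positive one, and the symmetric data gives zero. Statistical packages often apply a small-sample correction to this formula, so their output may differ slightly.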
Properties of Skewed Distributions
The mean, median, and mode typically fall in a predictable order that depends on the direction of the skew.
Left Skewed Distribution: Mean < Median < Mode
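In a left skewed distribution, the mean is typically less than the median, because the few small values in the left tail pull the mean down more than they pull the median.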
Right Skewed Distribution: Mode < Median < Mean
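In a right skewed distribution, the mean is typically greater than the median, because the few large values in the right tail pull the mean up more than they pull the median.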
No Skew: Mean = Median = Mode
In a symmetrical distribution with a single peak, the mean, median, and mode are all equal.
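These orderings can be verified on small datasets with Python's standard `statistics` module. This is only a sketch: the helper name `center_measures` and the datasets are hypothetical, chosen so that each has a single mode.

```python
import statistics


def center_measures(values):
    """Return (mean, median, mode) for a dataset with a single mode."""
    return (statistics.mean(values),
            statistics.median(values),
            statistics.mode(values))


# Illustrative (made-up) datasets.
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 8, 20]
left_skewed = [1, 13, 16, 17, 18, 18, 19, 19, 19, 20]  # mirror of right_skewed
symmetric = [1, 2, 3, 3, 3, 4, 5]

for name, data in [("right skewed", right_skewed),
                   ("left skewed", left_skewed),
                   ("symmetric", symmetric)]:
    mean, median, mode = center_measures(data)
    print(f"{name}: mean={mean}, median={median}, mode={mode}")
```

For the right skewed data the mode is smallest and the mean is largest; for the left skewed data the order is reversed; for the symmetric data all three coincide. Real datasets do not always follow these orderings exactly, which is why they are described as typical rather than guaranteed.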
Using Box Plots to Visualize Skewness
A box plot is a type of plot that displays the five number summary of a dataset, which includes:
- The minimum value
- The first quartile (the 25th percentile)
- The median value
- The third quartile (the 75th percentile)
- The maximum value
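The five values listed above can be computed with the standard library. This sketch assumes the "inclusive" quartile method of `statistics.quantiles` (linear interpolation between the sample minimum and maximum). Other quartile conventions exist and can give slightly different first and third quartiles. The function name `five_number_summary` and the data are illustrative only.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) using inclusive quartiles."""
    # quantiles() with n=4 returns the three cut points Q1, median, Q3.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


# Illustrative (made-up) dataset.
data = [1, 2, 2, 2, 3, 3, 4, 5, 8, 20]
print(five_number_summary(data))  # (1, 2.0, 3.0, 4.75, 20)
```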
To make a box plot, we draw a box from the first to the third quartile. Then we draw a line across the box at the median. Lastly, we draw “whiskers” from the quartiles to the minimum and maximum values.
Depending on the location of the median within the box and the relative lengths of the whiskers, we can tell whether a distribution is left skewed, right skewed, or symmetrical. The descriptions here assume a vertical box plot, with smaller values at the bottom.
When the median is closer to the bottom of the box and the whisker is shorter on the lower end of the box, the distribution is right skewed.
When the median is closer to the top of the box and the whisker is shorter on the upper end of the box, the distribution is left skewed.
When the median is in the middle of the box and the whiskers are roughly equal on each side, the distribution is symmetrical.
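The three rules above can be turned into a rough check on a five number summary. The sketch below is a heuristic, not a formal test: the function name `boxplot_skew`, the 10% tolerance used to decide what counts as “roughly equal”, and the “unclear” result for cases where the box and the whiskers disagree are all assumptions added for illustration.

```python
def roughly_equal(a, b, tolerance):
    """True if a and b differ by at most `tolerance` times the larger one."""
    return abs(a - b) <= tolerance * max(a, b)


def boxplot_skew(minimum, q1, median, q3, maximum, tolerance=0.1):
    """Classify skew from a five number summary using the box plot rules."""
    lower_box = median - q1        # bottom of the box up to the median
    upper_box = q3 - median        # median up to the top of the box
    lower_whisker = q1 - minimum
    upper_whisker = maximum - q3

    if (roughly_equal(lower_box, upper_box, tolerance)
            and roughly_equal(lower_whisker, upper_whisker, tolerance)):
        return "symmetrical"
    if lower_box < upper_box and lower_whisker < upper_whisker:
        return "right skewed"      # median near the bottom, short lower whisker
    if lower_box > upper_box and lower_whisker > upper_whisker:
        return "left skewed"       # median near the top, short upper whisker
    return "unclear"               # box and whiskers point in different directions


# Illustrative (made-up) five number summaries.
print(boxplot_skew(1, 2.0, 3.0, 4.75, 20))     # right skewed
print(boxplot_skew(1, 16.25, 18.0, 19.0, 20))  # left skewed
print(boxplot_skew(1, 3, 5, 7, 9))             # symmetrical
```

When the box and the whiskers give conflicting signals, a histogram or a skewness statistic is usually a better guide than the box plot alone.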
Examples of Skewed Distributions
Here are some real-life examples of skewed distributions.
Left-Skewed Distribution: The distribution of age at death.
The distribution of age at death in most populations is left-skewed. Most people live to be between 70 and 80 years old, with progressively fewer people dying at younger ages.
Right-Skewed Distribution: The distribution of household incomes.
The distribution of household incomes in the U.S. is right-skewed, with most households earning between $40k and $80k per year but with a long right tail of households that earn much more.
No Skew: The distribution of male heights.
The height of males is roughly normally distributed and has no skew. For example, the average height of a male in the U.S. is roughly 69.1 inches. The distribution of heights is roughly symmetrical around this average, with about as many males shorter than average as taller.