CDF vs. PDF: What’s the Difference?

This tutorial provides a simple explanation of the difference between a PDF (probability density function) and a CDF (cumulative distribution function) in statistics.

Random Variables

Before we can define a PDF or a CDF, we first need to understand random variables.

A random variable, usually denoted as X, is a variable whose values are numerical outcomes of some random process. There are two types of random variables: discrete and continuous.

Discrete Random Variables

A discrete random variable is one which can take on only a countable number of distinct values, such as 0, 1, 2, 3, 4, 5, …, 100, 1 million, etc. Some examples of discrete random variables include:

  • The number of times a coin lands on tails after being flipped 20 times.
  • The number of times a dice lands on a particular number after being rolled 100 times.

Continuous Random Variables

A continuous random variable is one which can take on any value within some range, so it has infinitely many possible values that cannot be counted one by one. Some examples of continuous random variables include:

  • Height of a person
  • Weight of an animal
  • Time required to run a mile

For example, the height of a person could be 60.2 inches, 65.2344 inches, 70.431222 inches, etc. There are infinitely many possible values for height.

Rule of Thumb: If you can count the number of outcomes, then you are working with a discrete random variable (e.g. counting the number of times a coin lands on heads). But if you can measure the outcome, you are working with a continuous random variable (e.g. measuring height, weight, time, etc.).

Probability Density Functions

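A probability density function (pdf) describes how likely a random variable is to take on each of its possible values. For a discrete random variable, it gives the probability of each exact value (in this case it is more precisely called a probability mass function).

For example, suppose we roll a dice one time. If we let x denote the number that the dice lands on, then the probability density function for the outcome can be described as follows: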

P(x < 1) : 0

P(x = 1) : 1/6

P(x = 2) : 1/6

P(x = 3) : 1/6

P(x = 4) : 1/6

P(x = 5) : 1/6

P(x = 6) : 1/6

P(x > 6) : 0

Note that this is an example of a discrete random variable, since x can only take on integer values.
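The table above can be sketched in a few lines of Python. This is only an illustration, assuming a fair six-sided dice; the function name die_pmf is made up for this example.

```python
from fractions import Fraction

def die_pmf(x):
    """P(X = x) for one roll of a fair six-sided dice (hypothetical helper)."""
    # Each of the six faces is equally likely; every other value is impossible.
    return Fraction(1, 6) if x in {1, 2, 3, 4, 5, 6} else Fraction(0)

# Print the probability of each value from 0 to 7.
for value in range(0, 8):
    print(f"P(x = {value}) : {die_pmf(value)}")

# The probabilities of all possible outcomes must add up to 1.
total = sum(die_pmf(v) for v in range(1, 7))
print("Total probability:", total)
```

Exact fractions are used instead of floating-point numbers so that the six probabilities add up to exactly 1.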

For a continuous random variable, we cannot read probabilities directly off the PDF, since the probability that x takes on any exact value is zero. Instead, the PDF gives a density, and the probability that x falls within an interval is the area under the PDF over that interval.

For example, suppose we want to know the probability that a burger from a particular restaurant weighs a quarter-pound (0.25 lbs). Since weight is a continuous variable, it can take on an infinite number of values. For example, a given burger might actually weigh 0.250001 pounds, or 0.24 pounds, or 0.2488 pounds. The probability that a given burger weighs exactly 0.25 pounds is zero.
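To make the burger example concrete, here is a rough sketch in Python. It assumes, purely for illustration, that burger weights follow a normal distribution with a mean of 0.25 lbs and a standard deviation of 0.01 lbs; neither number comes from real data, and interval_probability is a made-up helper that approximates the area under the PDF.

```python
from statistics import NormalDist

# Hypothetical model of burger weight in pounds (assumed, not measured).
weight = NormalDist(mu=0.25, sigma=0.01)

def interval_probability(dist, low, high, steps=10_000):
    """Approximate P(low <= X <= high) as the area under the pdf (midpoint rule)."""
    width = (high - low) / steps
    return sum(dist.pdf(low + (i + 0.5) * width) for i in range(steps)) * width

# The density at 0.25 is larger than 1, so it cannot be a probability.
print("pdf at 0.25:", weight.pdf(0.25))

# As the interval around 0.25 shrinks, its probability shrinks toward zero.
half_widths = (0.01, 0.001, 0.0001)
probabilities = [
    interval_probability(weight, 0.25 - h, 0.25 + h) for h in half_widths
]
for h, p in zip(half_widths, probabilities):
    print(f"P(0.25 - {h} <= X <= 0.25 + {h}) is about {p:.4f}")
```

The narrower the interval around 0.25 lbs, the smaller the probability, which is why the probability of exactly 0.25 lbs is zero.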

Cumulative Distribution Functions

A cumulative distribution function (cdf) tells us the probability that a random variable takes on a value less than or equal to x.

For example, suppose we roll a dice one time. If we let x denote the number that the dice lands on, then the cumulative distribution function for the outcome can be described as follows:

P(x ≤ 0) : 0

P(x ≤ 1) : 1/6

P(x ≤ 2) : 2/6

P(x ≤ 3) : 3/6

P(x ≤ 4) : 4/6

P(x ≤ 5) : 5/6

P(x ≤ 6) : 6/6

P(x ≤ k) for any k > 6 : 1

Notice that the probability that x is less than or equal to 6 is 6/6, which is equal to 1. This is because the dice will land on either 1, 2, 3, 4, 5, or 6 with 100% probability.

This example uses a discrete random variable, but a cumulative distribution function can also be used for a continuous random variable.

Cumulative distribution functions have the following properties:
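The cumulative probabilities in this table are just running totals of the individual probabilities. A minimal sketch in Python, assuming a fair six-sided dice (the names pmf and cdf_table are chosen only for this example):

```python
from fractions import Fraction
from itertools import accumulate

faces = range(1, 7)

# Probability of each individual face of a fair dice.
pmf = {face: Fraction(1, 6) for face in faces}

# Running total of the probabilities gives P(x <= face) for each face.
cdf_table = dict(zip(faces, accumulate(pmf[face] for face in faces)))

for face, cumulative in cdf_table.items():
    print(f"P(x <= {face}) : {cumulative}")
```

Note that Fraction reduces each result to lowest terms, so 2/6 is displayed as 1/3 and 6/6 as 1.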

  • The probability that a random variable takes on a value less than the smallest possible value is zero. For example, the probability that a dice lands on a value less than 1 is zero.
  • The probability that a random variable takes on a value less than or equal to the largest possible value is one. For example, the probability that a dice lands on a value of 1, 2, 3, 4, 5, or 6 is one. It must land on one of those numbers.
  • The cdf is always non-decreasing. That is, the probability that a dice lands on a number less than or equal to 1 is 1/6, the probability that it lands on a number less than or equal to 2 is 2/6, the probability that it lands on a number less than or equal to 3 is 3/6, etc. The cumulative probabilities are always non-decreasing.
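The three properties above can be checked numerically. The sketch below assumes a fair six-sided dice and evaluates a hypothetical die_cdf function on a grid of points from -2 to 9; it is a sanity check on this one example, not a proof.

```python
import math
from fractions import Fraction

def die_cdf(x):
    """P(X <= x) for one roll of a fair six-sided dice (a step function)."""
    # Count how many faces are less than or equal to x, clamped to 0..6.
    faces_at_or_below = min(max(math.floor(x), 0), 6)
    return Fraction(faces_at_or_below, 6)

# Evaluation points from -2.0 to 9.0 in steps of 0.25.
grid = [i / 4 for i in range(-8, 37)]
values = [die_cdf(x) for x in grid]

# Property 1: zero below the smallest possible value.
below_minimum_is_zero = all(die_cdf(x) == 0 for x in grid if x < 1)
# Property 2: one at or above the largest possible value.
at_or_above_maximum_is_one = all(die_cdf(x) == 1 for x in grid if x >= 6)
# Property 3: the cdf never decreases as x increases.
non_decreasing = all(a <= b for a, b in zip(values, values[1:]))

print(below_minimum_is_zero, at_or_above_maximum_is_one, non_decreasing)
```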

The Relationship Between a CDF and a PDF

In technical terms, for a continuous random variable, a probability density function (pdf) is the derivative of the cumulative distribution function (cdf).

Furthermore, the area under the curve of a pdf between negative infinity and x is equal to the value of the cdf at x.
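Both statements can be checked numerically. The sketch below uses the standard normal distribution as an arbitrary example of a continuous random variable; the helper functions and the choice of x = 1.0 are assumptions made only for illustration, and the lower bound of -10 stands in for negative infinity.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal, chosen only as an example

def numerical_derivative(f, x, h=1e-5):
    """Central-difference approximation of the slope of f at x."""
    return (f(x + h) - f(x - h)) / (2 * h)

def area_under_pdf(dist, upper, lower=-10.0, steps=100_000):
    """Midpoint-rule approximation of the area under dist.pdf on [lower, upper]."""
    width = (upper - lower) / steps
    return sum(dist.pdf(lower + (i + 0.5) * width) for i in range(steps)) * width

x = 1.0

# The slope of the cdf at x should match the pdf at x.
slope = numerical_derivative(z.cdf, x)
print("slope of cdf:", slope, " pdf:", z.pdf(x))

# The area under the pdf up to x should match the cdf at x.
area = area_under_pdf(z, x)
print("area under pdf:", area, " cdf:", z.cdf(x))
```

The two pairs of printed numbers should agree to several decimal places, with small differences coming from the numerical approximations.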

For an in-depth explanation of the relationship between a pdf and a cdf, along with a proof that the pdf is the derivative of the cdf, refer to a statistics textbook.
