A probability distribution in statistics is said to have a **memoryless property** if the probability of some future event occurring is not affected by the occurrence of past events.

Only two families of probability distributions have the memoryless property:

- The exponential distribution, which is continuous and defined on the non-negative real numbers.
- The geometric distribution, which is discrete and defined on the non-negative integers.

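Both of these probability distributions are used to model the waiting time until some event occurs: the exponential distribution measures it in continuous time, and the geometric distribution measures it in number of trials.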

For these distributions, knowing how much time has already passed tells us nothing about how much longer we will have to wait for the event.

The following examples help us gain a better intuition of the memoryless property.

**An Intuition of the Memoryless Property**

Consider the following examples:

**Not Memoryless**

It is known that laptops of a certain brand last about 6 years, on average, before they die. Thus, if we know that a particular laptop is 5 years old, then the expected time until it dies is quite short. However, if another laptop is only 1 year old, then the expected time until it dies is quite long.

In this example, knowing the amount of time that has passed during the lifespan of each laptop informs us as to how long the laptop will continue to work until it dies. Thus, this probability distribution would not have a memoryless property.
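The laptop example can be made concrete with a minimal sketch. The text only gives the 6-year average, so the normal shape of the lifetime distribution and its 1.5-year standard deviation are assumptions chosen purely for illustration, and the function name is hypothetical. The point is that the chance of surviving one more year changes with the laptop's age, which a memoryless distribution would not allow.

```python
from statistics import NormalDist

# Hypothetical lifetime model: Normal(mean = 6 years, sd = 1.5 years).
# Only the 6-year mean comes from the example; the rest is assumed.
lifetime = NormalDist(mu=6.0, sigma=1.5)


def survives_one_more_year(age: float) -> float:
    """Pr(X > age + 1 | X > age) under the assumed lifetime model."""
    return (1 - lifetime.cdf(age + 1)) / (1 - lifetime.cdf(age))


for age in (1, 3, 5):
    prob = survives_one_more_year(age)
    print(f"age {age}: Pr(survives one more year) = {prob:.3f}")
```

Under this assumed model the printed probability falls as the age grows, so the time already elapsed carries information about the future.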

**Memoryless**

Suppose Jessica owns a convenience store. She wants to know how long she will have to wait until the next customer enters the store.

In this example, knowing when the last customer entered the store does not help predict when the next customer will enter, assuming customers arrive independently of one another at a constant average rate.

Thus, this probability distribution would have a memoryless property. In other words, the probability of some future event occurring is not affected by the occurrence of past events.

**The Memoryless Property: A Formal Definition**

In formal statistical terms, a random variable *X* is said to follow a probability distribution with a memoryless property if for any *a* and *b* in {0, 1, 2, …} it’s true that:

Pr(X > *a* + *b* | X > *a*) = Pr(X > *b*)

For a continuous random variable, such as an exponentially distributed waiting time, the same condition must hold for all non-negative real numbers *a* and *b*.

For example, suppose we have some probability distribution with a memoryless property and we let *X* be the number of trials until the first success. If *a* = 30 and *b* = 10 then we would say:

- Pr(X > *a* + *b* | X > *a*) = Pr(X > *b*)
- Pr(X > 30 + 10 | X > 30) = Pr(X > 10)
- Pr(X > 40 | X > 30) = Pr(X > 10)

In other words, if we’ve had 30 trials without a success, then the probability that the first success comes after trial #40 is the same as the probability of starting from zero and needing more than 10 trials to experience a success.

Because this probability distribution has a memoryless property, it means that knowing how many failures we’ve had up to a certain point still does not inform us as to the likelihood of failure in the future.
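The identity can be checked numerically with a short sketch. It assumes *X* is geometric and counts trials up to and including the first success, so that Pr(X > n) = (1 - p)^n, and it conditions on X > *a* (no success in the first *a* trials). The success probability p = 0.05 and the function names are hypothetical, chosen only for illustration.

```python
def geometric_tail(n: int, p: float) -> float:
    """Pr(X > n): the first n trials are all failures."""
    return (1 - p) ** n


def conditional_tail(a: int, b: int, p: float) -> float:
    """Pr(X > a + b | X > a) = Pr(X > a + b) / Pr(X > a)."""
    return geometric_tail(a + b, p) / geometric_tail(a, p)


p = 0.05       # hypothetical success probability per trial
a, b = 30, 10  # values from the example above

print(f"Pr(X > {a + b} | X > {a}) = {conditional_tail(a, b, p):.6f}")
print(f"Pr(X > {b})          = {geometric_tail(b, p):.6f}")
```

Both lines print the same value for any choice of *a*, because the factor (1 - p)^a cancels in the ratio.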

**The Memoryless Property: An Example**

Suppose that an average of 30 customers per hour enter a store and the time between arrivals is exponentially distributed. On average 2 minutes elapse between successive visits.

Suppose that 10 minutes have passed since the last customer arrived. Since this is an unusually long amount of time, it would seem more likely for a customer to arrive within the next minute.

However, because the exponential distribution has a memoryless property, this turns out not to be the case. The time spent waiting on the next customer to arrive is not dependent on how long it has been since the last customer arrived.

We can verify this by using the cumulative distribution function (CDF) of the exponential distribution:

**CDF:** Pr(X ≤ x) = 1 – e^{-λx}

where λ is the arrival rate, calculated as 1 / (average time between arrivals). In our example, λ = 1/2 = 0.5 arrivals per minute.
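The general identity follows directly from this CDF. As a short derivation sketch: the tail probability is Pr(X > x) = 1 – CDF = e^{-λx}, and the event "X > a + b and X > a" is the same as "X > a + b". For a continuous variable, conditioning on X > a and on X ≥ a gives the same result.

```latex
\begin{aligned}
\Pr(X > a + b \mid X > a)
  &= \frac{\Pr(X > a + b)}{\Pr(X > a)} \\
  &= \frac{e^{-\lambda (a + b)}}{e^{-\lambda a}} \\
  &= e^{-\lambda b} \\
  &= \Pr(X > b)
\end{aligned}
```

The elapsed time *a* cancels out, which is exactly what the memoryless property states.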

If we let *a* = 10 and *b* = 1, then we have:

- Pr(X > *a* + *b* | X > *a*) = Pr(X > *b*)
- Pr(X > 10 + 1 | X > 10) = Pr(X > 1) = 1 – (1 – e^{-(0.5)(1)}) = 0.6065

Regardless of the amount of time that has passed since the arrival of the last customer, the probability that more than one minute will pass until the next arrival is **0.6065**.
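The calculation can be reproduced with a minimal sketch that uses only the standard library. The rate of 0.5 arrivals per minute comes from the example; the function names, the extra values of *a*, the random seed, and the sample size are assumptions made for illustration. The first part evaluates the conditional probability exactly, and the second part checks it against simulated waiting times.

```python
import math
import random

RATE = 0.5  # lambda: arrivals per minute (1 / 2 minutes between arrivals)


def exp_tail(x: float, rate: float = RATE) -> float:
    """Pr(X > x) = 1 - CDF(x) = exp(-rate * x)."""
    return math.exp(-rate * x)


def conditional_tail(a: float, b: float, rate: float = RATE) -> float:
    """Pr(X > a + b | X > a), computed from the tail probabilities."""
    return exp_tail(a + b, rate) / exp_tail(a, rate)


# Exact check: the result does not depend on the elapsed time a.
for a in (0, 1, 10, 30):
    print(f"a = {a:>2}: Pr(X > a + 1 | X > a) = {conditional_tail(a, 1):.4f}")

# Monte Carlo check with a fixed seed (sample size chosen arbitrarily).
rng = random.Random(0)
waits = [rng.expovariate(RATE) for _ in range(1_000_000)]
long_waits = [w for w in waits if w > 10]  # waits that already exceed 10 min
simulated = sum(w > 11 for w in long_waits) / len(long_waits)
print(f"simulated Pr(X > 11 | X > 10) = {simulated:.3f}")
```

Each exact line prints 0.6065, and the simulated estimate should land close to that value, up to sampling noise.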

When tossing a coin, the probability of getting heads or tails is also not affected by the occurrence of past events, and the outcome follows a discrete uniform distribution. So, why doesn’t the uniform distribution have a memoryless property?