How to Use the ecdf() Function in R


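You can use the ecdf() function in R to calculate the empirical cumulative distribution function (ECDF) of a dataset, and the plot() function to visualize it. For any value x, the ECDF gives the proportion of observations in the data that are less than or equal to x.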

Here is the most common way to use this function:

#calculate empirical cumulative distribution function of data
p = ecdf(data)

#plot empirical cumulative distribution function
plot(p)
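
Note that ecdf() returns a function rather than a vector of numbers, so the result can be called on any value. Here is a minimal sketch using a small made-up vector (the values and the names x and Fn are purely illustrative):

```r
#small illustrative vector (values chosen by hand)
x = c(1, 2, 2, 3, 5)

#calculate empirical cumulative distribution function
Fn = ecdf(x)

#proportion of values less than or equal to 2 (3 out of 5 values)
Fn(2)

[1] 0.6

#the ECDF is 0 below the smallest value and 1 at or above the largest value
Fn(0)

[1] 0

Fn(5)

[1] 1
```

Since three of the five values are less than or equal to 2, Fn(2) returns 3/5 = 0.6.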

The following example shows how to use this function in practice.

Example: How to Use the ecdf() Function in R

For this example, let’s create a vector of 1,000 random values that follow a standard normal distribution:

#make this example reproducible
set.seed(1)

#create vector of 1,000 random values that follow standard normal distribution
data = rnorm(1000)

#view first six values in vector
head(data)

[1] -0.6264538  0.1836433 -0.8356286  1.5952808  0.3295078 -0.8204684

We can use the ecdf function to calculate the empirical cumulative distribution function of this dataset and then use the plot function to visualize it:

#calculate empirical cumulative distribution function of data
p = ecdf(data)

#plot empirical cumulative distribution function
plot(p)
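
Beyond plotting, the object returned by ecdf() can also be evaluated directly. The following is a rough sketch of a few things you might try with the same simulated data; the evaluation points -1, 0 and 1 are arbitrary choices for illustration, and the exact output is not shown here:

```r
#make this example reproducible
set.seed(1)

#create the same vector of 1,000 standard normal values
data = rnorm(1000)

#calculate empirical cumulative distribution function of data
p = ecdf(data)

#proportion of observations less than or equal to 0
p(0)

#proportion of observations less than or equal to -1, 0 and 1
p(c(-1, 0, 1))

#empirical median (value below which about 50% of observations fall)
quantile(p, 0.5)
```

Because the data come from a standard normal distribution, p(0) should be close to 0.5 and the empirical median should be close to 0.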

Note that you can also use the xlab, ylab and main arguments within the plot function to add an x-axis label, y-axis label and title to the plot, respectively:

#calculate empirical cumulative distribution function of data
p = ecdf(data)

#plot empirical cumulative distribution function with axis labels and title
plot(p, xlab='x', ylab='CDF', main='CDF of Data') 

[Figure: plot of the empirical cumulative distribution function created with ecdf() in R]

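The x-axis displays the values from the dataset.

The y-axis displays the value of the empirical cumulative distribution function, i.e. the proportion of observations in the dataset that are less than or equal to each value on the x-axis.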
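
Since the data in this example were drawn from a standard normal distribution, one way to sanity-check the result is to overlay the theoretical normal CDF from pnorm() on the plot. This is only a sketch under that assumption; the color, line width, plot title and the name max_diff are illustrative choices:

```r
#make this example reproducible
set.seed(1)

#create the same vector of 1,000 standard normal values
data = rnorm(1000)

#calculate empirical cumulative distribution function of data
p = ecdf(data)

#plot empirical cumulative distribution function
plot(p, xlab='x', ylab='CDF', main='Empirical vs. Theoretical CDF')

#overlay theoretical standard normal CDF in red
curve(pnorm(x), add=TRUE, col='red', lwd=2)

#largest gap between the two curves at the observed data points
max_diff = max(abs(p(data) - pnorm(data)))
max_diff
```

With 1,000 observations the two curves should lie nearly on top of each other, so max_diff should be small.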

Related: The Difference Between a CDF vs. PDF in Statistics

Additional Resources

The following tutorials explain how to perform other common tasks in R:

How to Plot a Normal Distribution in R
A Guide to dnorm, pnorm, qnorm, and rnorm in R
How to Perform a Shapiro-Wilk Test for Normality in R
