You can use the fitdistr() function from the MASS package in R to estimate the parameters of a distribution by maximizing the likelihood function.
This function uses the following basic syntax:
fitdistr(x, densfun, ...)
where:
- x: A numeric vector of observed values to fit the distribution to
- densfun: The distribution to estimate the parameters for, given as a character string (a function that returns a density is also accepted)
Note that the densfun argument accepts the following distribution names: "beta", "cauchy", "chi-squared", "exponential", "gamma", "geometric", "log-normal", "lognormal", "logistic", "negative binomial", "normal", "Poisson", "t" and "weibull".
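As a minimal sketch of how the distribution name selects the model, the following fits an exponential distribution to simulated data and reads the results from the fitted object. The variable names x and fit and the simulated rate of 0.5 are assumptions chosen for illustration only.

```r
library(MASS)

# simulated data for illustration (assumption: true rate = 0.5)
set.seed(1)
x <- rexp(500, rate = 0.5)

# fit an exponential distribution by maximum likelihood
fit <- fitdistr(x, "exponential")

# the fitted object stores the estimates and related quantities
fit$estimate   # named vector holding the estimated rate
fit$sd         # estimated standard error of the rate
fit$loglik     # log-likelihood evaluated at the estimate
```

For the exponential distribution the maximum likelihood estimate of the rate is 1 divided by the sample mean, so fit$estimate should match 1/mean(x).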
The following example shows how to use the fitdistr() function in practice.
Example: How to Use fitdistr() Function to Fit Distributions in R
Suppose we use the rnorm() function in R to generate a vector of 200 values that follow a normal distribution:
#make this example reproducible
set.seed(1)

#generate sample of 200 observations that follows normal dist with mean=10 and sd=3
data <- rnorm(200, mean=10, sd=3)

#view first 6 observations in sample
head(data)

[1]  8.120639 10.550930  7.493114 14.785842 10.988523  7.538595
We can use the hist() function to create a histogram to visualize the distribution of data values:
hist(data, col='steelblue')
We can see that the data does indeed look normally distributed.
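A histogram is only a rough visual check. As a complementary sketch that is not part of the original example, a normal Q-Q plot from base R's qqnorm() and qqline() functions can make departures from normality easier to spot. The data are regenerated here so the snippet runs on its own.

```r
# regenerate the same sample so this snippet is self-contained
set.seed(1)
data <- rnorm(200, mean = 10, sd = 3)

# normal Q-Q plot: points close to the reference line suggest normality
qq <- qqnorm(data, main = "Normal Q-Q Plot of data")
qqline(data, col = "red")
```

If the points fall close to the red reference line, a normal distribution is a reasonable choice to pass to fitdistr().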
We can then use the fitdistr() function to estimate the parameters of this distribution:
library(MASS)
#estimate parameters of distribution
fitdistr(data, "normal")
      mean          sd
  10.1066189    2.7803148
 ( 0.1965979) ( 0.1390157)
The fitdistr() function estimates that the vector of values follows a normal distribution with a mean of 10.1066189 and a standard deviation of 2.7803148. The values in parentheses are the estimated standard errors of these two estimates.
These values shouldn’t be surprising since we generated the data using the rnorm() function with a mean value of 10 and standard deviation of 3.
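For the normal distribution the maximum likelihood estimates have a closed form, so the output can be checked by hand. The sketch below recomputes both estimates and builds approximate 95% confidence intervals from the reported standard errors. The names fit, mle_mean, mle_sd, lower and upper are assumptions introduced for illustration.

```r
library(MASS)

# regenerate the sample and refit so this snippet is self-contained
set.seed(1)
data <- rnorm(200, mean = 10, sd = 3)
fit <- fitdistr(data, "normal")

# closed-form maximum likelihood estimates for a normal distribution
mle_mean <- mean(data)
mle_sd   <- sqrt(mean((data - mle_mean)^2))  # divides by n, not n - 1

# compare with the fitdistr() estimates
fit$estimate
c(mean = mle_mean, sd = mle_sd)

# approximate 95% confidence intervals: estimate +/- 1.96 standard errors
lower <- fit$estimate - qnorm(0.975) * fit$sd
upper <- fit$estimate + qnorm(0.975) * fit$sd
cbind(lower, upper)
```

Because the maximum likelihood estimate of the standard deviation divides by n rather than n - 1, it is slightly smaller than the value returned by sd(data).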