Often you may want to calculate descriptive statistics for each column in a data frame in R.

One of the easiest ways to do so is to use the **describe()** function from the **psych** package.

The **describe()** function uses the following syntax:

**describe(x, na.rm = TRUE, interp = FALSE, skew = TRUE, ranges = TRUE, …)**

where:

- **x**: The data frame or matrix to describe
- **na.rm**: Whether NA values should be removed when calculating statistics
- **interp**: Whether the median should be standard or interpolated
- **skew**: Whether the skewness and kurtosis should be calculated
- **ranges**: Whether the range should be calculated
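As a rough illustration of these arguments, the sketch below turns off the skew and range calculations on a small made-up data frame. The names `scores`, `x`, and `y` are hypothetical and not part of the original example. The `requireNamespace()` guard simply skips the call when the psych package is not installed.

```r
# Hypothetical toy data (not from the article); x contains one missing value
scores <- data.frame(x = c(1, 2, 3, 4, NA),
                     y = c(10, 20, 30, 40, 50))

# Only run if psych is available
if (requireNamespace("psych", quietly = TRUE)) {
  # na.rm = TRUE (the default) ignores the NA in x;
  # skew = FALSE and ranges = FALSE skip the skew/kurtosis and range statistics
  res <- psych::describe(scores, skew = FALSE, ranges = FALSE)
  print(res)
}
```

With these settings the output should be noticeably narrower than the default, which can be convenient for data frames with many columns.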

The following example shows how to use the **describe()** function in practice to calculate descriptive statistics for each column in a data frame in R.

**Note**: Before using the **describe()** function, you may need to first install the **psych** package. You can use the following syntax to do so:

**install.packages('psych')**

Once the **psych** package has been installed, you can proceed to use the **describe()** function.

**Example: How to Use the describe() Function in R**

Suppose that we create the following data frame in R that contains information about various basketball players:

**#create data frame
df <- data.frame(team=c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'),
                 points=c(99, 68, 86, 88, 95, 74, 78, 93),
                 assists=c(22, 28, 31, 35, 34, 45, 28, 31),
                 rebounds=c(30, 28, 24, 24, 30, 36, 30, 29))
#view data frame
df
  team points assists rebounds
1    A     99      22       30
2    A     68      28       28
3    A     86      31       24
4    A     88      35       24
5    B     95      34       30
6    B     74      45       36
7    B     78      28       30
8    B     93      31       29
**

The data frame contains the following information about eight different basketball players:

- **team**: The team they are on
- **points**: Their total points scored in the season
- **assists**: Their total assists in the season
- **rebounds**: Their total rebounds in the season

Suppose that we would like to calculate descriptive statistics for each of these variables at once, including the mean, median, range, etc.

We can use the following syntax with the **describe()** function to do so:

**library(psych)
#calculate descriptive statistics for each variable in data frame
describe(df)
         vars n  mean    sd median trimmed   mad min max range  skew kurtosis
team*       1 8  1.50  0.53    1.5    1.50  0.74   1   2     1  0.00    -2.23
points      2 8 85.12 10.88   87.0   85.12 12.60  68  99    31 -0.25    -1.62
assists     3 8 31.75  6.71   31.0   31.75  4.45  22  45    23  0.55    -0.51
rebounds    4 8 28.88  3.83   29.5   28.88  1.48  24  36    12  0.30    -0.85
           se
team*    0.19
points   3.85
assists  2.37
rebounds 1.36
**

The **describe()** function returns a variety of descriptive statistics for each variable.

**Note**: By default, the **describe()** function attempts to calculate descriptive statistics for all variables, even ones that are not numeric. Non-numeric variables are converted to numeric codes first and are marked with an asterisk after the variable name in the output. In this particular example the **team** column is a character variable, so it doesn’t make sense to interpret the values in the **team** row of the output.
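One way to avoid the uninterpretable **team** row is to pass only the numeric columns to **describe()**. The following is a minimal sketch using base R subsetting. The name `numeric_df` is introduced here for illustration, and the **describe()** call is skipped if the psych package is not installed.

```r
# Same example data frame as in the text
df <- data.frame(team=c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'),
                 points=c(99, 68, 86, 88, 95, 74, 78, 93),
                 assists=c(22, 28, 31, 35, 34, 45, 28, 31),
                 rebounds=c(30, 28, 24, 24, 30, 36, 30, 29))

# Keep only the numeric columns (this drops the character column 'team')
numeric_df <- df[sapply(df, is.numeric)]
names(numeric_df)

# Describe only the numeric columns, if psych is available
if (requireNamespace("psych", quietly = TRUE)) {
  print(psych::describe(numeric_df))
}
```

Subsetting beforehand keeps the default behavior of **describe()** unchanged and makes it explicit which columns are being summarized.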

Here is how to interpret each value in the output:

- **vars**: The column number of the variable
- **n**: The number of valid (non-missing) observations
- **mean**: The mean value
- **sd**: The standard deviation of values
- **median**: The median value
- **trimmed**: The trimmed mean (by default, 10% trimmed from top and bottom)
- **mad**: The median absolute deviation of values (from the median)
- **min**: The minimum value
- **max**: The maximum value
- **range**: The range of values (max - min)
- **skew**: The skewness of values
- **kurtosis**: The kurtosis of values
- **se**: The standard error of the mean
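To make these definitions concrete, the sketch below recomputes most of the **points** row using only base R functions. It assumes a 10% trim for the trimmed mean and relies on the default scaling constant in base R's `mad()`; both are consistent with the output shown above. Skewness and kurtosis are left out because base R has no built-in functions for them.

```r
# The points column from the example data frame
points <- c(99, 68, 86, 88, 95, 74, 78, 93)
n <- length(points)

# Recompute the statistics with base R
stats <- c(
  n       = n,
  mean    = mean(points),
  sd      = sd(points),
  median  = median(points),
  trimmed = mean(points, trim = 0.1),  # 10% trimmed mean
  mad     = mad(points),               # median absolute deviation, scaled by 1.4826
  min     = min(points),
  max     = max(points),
  range   = max(points) - min(points),
  se      = sd(points) / sqrt(n)       # standard error of the mean
)

# Round to two decimals for comparison with the describe() output
round(stats, 2)
```

After rounding, these values should agree with the **points** row in the **describe()** output. With only eight observations, a 10% trim removes no values, which is why the trimmed mean equals the ordinary mean here.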

By using the **describe()** function we are able to gain a strong understanding of the distribution of values for each variable in our data frame.
