You can use the following syntax to calculate summary statistics for all numeric variables in a data frame in R using functions from the **dplyr** and **tidyr** packages:

    library(dplyr)
    library(tidyr)

    df %>%
      summarise(across(where(is.numeric),
                       .fns = list(min = min,
                                   median = median,
                                   mean = mean,
                                   stdev = sd,
                                   q25 = ~quantile(., 0.25),
                                   q75 = ~quantile(., 0.75),
                                   max = max))) %>%
      pivot_longer(everything(), names_sep='_', names_to=c('variable', '.value'))

The **summarise()** function comes from the **dplyr** package and is used to calculate summary statistics for variables.

The **pivot_longer()** function comes from the **tidyr** package and is used to format the output to make it easier to read.
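To see what **pivot_longer()** is reshaping, it can help to look at the intermediate result. The following is a minimal sketch, assuming a small hypothetical data frame named `toy` (not part of the example in this tutorial) and only two statistics:

```r
library(dplyr)
library(tidyr)

# hypothetical two-column data frame, used only for illustration
toy <- data.frame(x = c(1, 2, 3), y = c(10, 20, 60))

# step 1: summarise() alone returns a single wide row,
# with one column per <variable>_<statistic> combination
wide <- toy %>%
  summarise(across(where(is.numeric), list(min = min, max = max)))

names(wide)
# "x_min" "x_max" "y_min" "y_max"

# step 2: pivot_longer() splits each column name at '_' into
# a variable name and a statistic, giving one row per variable
long <- wide %>%
  pivot_longer(everything(), names_sep = '_', names_to = c('variable', '.value'))

long
```

The special `'.value'` entry in `names_to` tells **pivot_longer()** to turn the second piece of each name (here `min` and `max`) into output columns rather than into values of a single column.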

This particular syntax calculates the following summary statistics for each numeric variable in a data frame:

- Minimum value
- Median value
- Mean value
- Standard deviation
- 25th percentile
- 75th percentile
- Maximum value
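None of these functions drop missing values by default, so a single `NA` in a column makes most of the statistics `NA`, and **quantile()** raises an error. The following is a sketch of one way to handle this, assuming you want to ignore missing values; the data frame `toy` is hypothetical and only three of the statistics are shown:

```r
library(dplyr)

# hypothetical data frame with one missing value
toy <- data.frame(x = c(1, 2, NA, 4))

# pass na.rm = TRUE to each function through a lambda
res <- toy %>%
  summarise(across(where(is.numeric),
                   list(min  = ~min(.x, na.rm = TRUE),
                        mean = ~mean(.x, na.rm = TRUE),
                        q25  = ~quantile(.x, 0.25, na.rm = TRUE))))

res
```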

The following example shows how to use this syntax in practice.

**Example: Calculate Summary Statistics in R Using dplyr**

Suppose we have the following data frame in R that contains information about various basketball players:

    #create data frame
    df <- data.frame(team=c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'),
                     points=c(12, 15, 19, 14, 24, 25, 39, 34),
                     assists=c(6, 8, 8, 9, 12, 6, 8, 10),
                     rebounds=c(9, 9, 8, 10, 8, 4, 3, 3))

    #view data frame
    df

      team points assists rebounds
    1    A     12       6        9
    2    A     15       8        9
    3    A     19       8        8
    4    A     14       9       10
    5    B     24      12        8
    6    B     25       6        4
    7    B     39       8        3
    8    B     34      10        3

We can use the following syntax to calculate summary statistics for each numeric variable in the data frame:

    library(dplyr)
    library(tidyr)

    #calculate summary statistics for each numeric variable in data frame
    df %>%
      summarise(across(where(is.numeric),
                       .fns = list(min = min,
                                   median = median,
                                   mean = mean,
                                   stdev = sd,
                                   q25 = ~quantile(., 0.25),
                                   q75 = ~quantile(., 0.75),
                                   max = max))) %>%
      pivot_longer(everything(), names_sep='_', names_to=c('variable', '.value'))

    # A tibble: 3 x 8
      variable   min median  mean stdev   q25   q75   max
    1 points      12   21.5 22.8   9.74 14.8  27.2     39
    2 assists      6    8    8.38  2.00  7.5   9.25    12
    3 rebounds     3    8    6.75  2.92  3.75  9       10

From the output we can see:

- The minimum value in the points column is **12**.
- The median value in the points column is **21.5**.
- The mean value in the points column is **22.8**.

And so on.
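As a quick sanity check, the values for the points column can be reproduced with base R functions. This sketch copies the points values from the example data frame into a standalone vector:

```r
# points column from the example data frame
points <- c(12, 15, 19, 14, 24, 25, 39, 34)

min(points)              # 12
median(points)           # 21.5
mean(points)             # 22.75 (the tibble prints this rounded, as 22.8)
quantile(points, 0.25)   # 14.75 (printed as 14.8 in the tibble)
max(points)              # 39
```

The small differences come only from how a tibble rounds values when printing; the underlying values are unchanged.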

**Note**: In this example, we utilized the dplyr **across()** function. You can find the complete documentation for this function in the official dplyr reference.
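One caveat when relying on the names that **across()** generates: by default it joins the column name and the function name with `_`, so `names_sep='_'` assumes that none of the original column names contain an underscore. The following is a sketch of one possible workaround, assuming a hypothetical column named `total_points`; it uses the `.names` argument of **across()** to pick a double underscore as the separator:

```r
library(dplyr)
library(tidyr)

# hypothetical data frame whose column name already contains '_'
toy <- data.frame(total_points = c(12, 15, 19))

res <- toy %>%
  summarise(across(where(is.numeric),
                   list(min = min, max = max),
                   .names = "{.col}__{.fn}")) %>%   # produces total_points__min, total_points__max
  pivot_longer(everything(),
               names_sep = '__',                    # split only on the double underscore
               names_to = c('variable', '.value'))

res
```

Any separator that does not occur in the column names would work equally well; the double underscore is just an illustrative choice.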

**Additional Resources**

The following tutorials explain how to perform other common functions using dplyr:

How to Summarise Data But Keep All Columns Using dplyr

How to Summarise Multiple Columns Using dplyr

How to Calculate Standard Deviation Using dplyr