In Excel, **pivot tables** offer an easy way to group and summarize data.

For example, if we have the following dataset in Excel then we can use a pivot table to quickly summarize the total sales by region:

This tells us:

- Region A had 51 total sales
- Region B had 85 total sales
- Region C had 140 total sales

Or we could summarize by another metric such as the average sales by region:

It turns out that we can quickly create similar pivot tables in R by using the **group_by()** and **summarize()** functions from the dplyr package.

This tutorial provides several examples of how to do so.

**Example: Create Pivot Tables in R**

First, let’s create the same dataset in R that we used in the previous examples from Excel:

#create data frame df <- data.frame(region=c('A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C', 'C'), device=c('X', 'X', 'Y', 'X', 'Y', 'Y', 'X', 'X', 'Y', 'Y'), sales=c(12, 18, 21, 22, 34, 29, 38, 36, 34, 32)) #view data frame df region device sales 1 A X 12 2 A X 18 3 A Y 21 4 B X 22 5 B Y 34 6 B Y 29 7 C X 38 8 C X 36 9 C Y 34 10 C Y 32

Next, let’s load the dplyr package and use the **group_by()** and **summarize()** functions to group by region and find the sum of sales by region:

library(dplyr) #find sum of sales by region df %>% group_by(region) %>% summarize(sum_sales = sum(sales)) # A tibble: 3 x 2 region sum_sales 1 A 51 2 B 85 3 C 140

We can see that these numbers match the numbers shown in the introductory Excel example.

We can also calculate the average sales by region:

#find average sales by region df %>% group_by(region) %>% summarize(mean_sales = mean(sales)) # A tibble: 3 x 2 region mean_sales 1 A 17 2 B 28.3 3 C 35

Once again, these numbers match the numbers shown in the Excel example from earlier.

Note that we can also group by multiple variables. For example, we could find the sum of sales grouped by region *and* device type:

#find sum of sales by region and device type df %>% group_by(region, device) %>% summarize(sum_sales = sum(sales)) # A tibble: 6 x 3 # Groups: region [3] region device sum_sales 1 A X 30 2 A Y 21 3 B X 22 4 B Y 63 5 C X 74 6 C Y 66

**Additional Resources**

How to Perform a VLOOKUP (Similar to Excel) in R

The Complete Guide: How to Group & Summarize Data in R

# Dear Zach: You can aggregate with R base.

# Sum of sales by region:

aggregate(sales~region, df, sum)

# Mean of sales by region:

aggregate(sales~region, df, mean)

# Sum of sales by region and device:

aggregate(sales~region+device, df, sum)