Often you may want to calculate the sum by group in R. There are three methods you can use to do so:

**Method 1: Use base R.**

aggregate(df$col_to_aggregate, list(df$col_to_group_by), FUN=sum)

**Method 2: Use the dplyr package.**

library(dplyr)
df %>%
  group_by(col_to_group_by) %>%
  summarise(Freq = sum(col_to_aggregate))

**Method 3: Use the data.table package.**

library(data.table)
dt[ ,list(sum=sum(col_to_aggregate)), by=col_to_group_by]

The following examples show how to use each of these methods in practice.

**Method 1: Calculate Sum by Group Using Base R**

The following code shows how to use the **aggregate()** function from base R to calculate the sum of the points scored by team in the following data frame:

#create data frame
df <- data.frame(team=c('a', 'a', 'b', 'b', 'b', 'c', 'c'),
                 pts=c(5, 8, 14, 18, 5, 7, 7),
                 rebs=c(8, 8, 9, 3, 8, 7, 4))

#view data frame
df

  team pts rebs
1    a   5    8
2    a   8    8
3    b  14    9
4    b  18    3
5    b   5    8
6    c   7    7
7    c   7    4

#find sum of points scored by team
aggregate(df$pts, list(df$team), FUN=sum)

  Group.1  x
1       a 13
2       b 37
3       c 14
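As a variation on the example above, **aggregate()** also accepts a formula, which keeps the original column names in the result instead of Group.1 and x. The following is a minimal sketch that assumes the same data frame; the object names pts_by_team and totals_by_team are only illustrative:

```r
#create data frame (same data as above)
df <- data.frame(team=c('a', 'a', 'b', 'b', 'b', 'c', 'c'),
                 pts=c(5, 8, 14, 18, 5, 7, 7),
                 rebs=c(8, 8, 9, 3, 8, 7, 4))

#find sum of points by team using the formula interface
pts_by_team <- aggregate(pts ~ team, data=df, FUN=sum)
pts_by_team

#find sum of points and rebounds by team in a single call
totals_by_team <- aggregate(cbind(pts, rebs) ~ team, data=df, FUN=sum)
totals_by_team
```

Keep in mind that the formula interface drops rows with missing values by default (na.action = na.omit), so results may differ from the non-formula call if the data contain NA values.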

**Method 2: Calculate Sum by Group Using dplyr**

The following code shows how to use the **group_by()** and **summarise()** functions from the **dplyr** package to calculate the sum of points scored by team in the following data frame:

library(dplyr)

#create data frame
df <- data.frame(team=c('a', 'a', 'b', 'b', 'b', 'c', 'c'),
                 pts=c(5, 8, 14, 18, 5, 7, 7),
                 rebs=c(8, 8, 9, 3, 8, 7, 4))

#find sum of points scored by team
df %>%
  group_by(team) %>%
  summarise(Freq = sum(pts))

# A tibble: 3 x 2
  team   Freq
  <chr> <dbl>
1 a        13
2 b        37
3 c        14
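If more than one column needs to be summed by group, one option is to use **across()** inside **summarise()**. The sketch below assumes dplyr 1.0.0 or later (the release that introduced across()) and reuses the same data frame; the object name totals is only illustrative:

```r
library(dplyr)

#create data frame (same data as above)
df <- data.frame(team=c('a', 'a', 'b', 'b', 'b', 'c', 'c'),
                 pts=c(5, 8, 14, 18, 5, 7, 7),
                 rebs=c(8, 8, 9, 3, 8, 7, 4))

#find sum of points and rebounds by team
totals <- df %>%
  group_by(team) %>%
  summarise(across(c(pts, rebs), sum))

totals
```

Each summed column keeps its original name here, so the result has the columns team, pts, and rebs.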

**Method 3: Calculate Sum by Group Using data.table**

The following code shows how to use the **data.table** package to calculate the sum of points scored by team in the following data frame:

library(data.table)

#create data frame
df <- data.frame(team=c('a', 'a', 'b', 'b', 'b', 'c', 'c'),
                 pts=c(5, 8, 14, 18, 5, 7, 7),
                 rebs=c(8, 8, 9, 3, 8, 7, 4))

#convert data frame to data table
setDT(df)

#find sum of points scored by team
df[ ,list(sum=sum(pts)), by=team]

   team sum
1:    a  13
2:    b  37
3:    c  14
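Two variations on the data.table syntax may also be useful: **.()** is shorthand for **list()** inside a data.table, and **.SD** with **.SDcols** can be used to sum several columns at once. This is a minimal sketch that builds the table directly with data.table() rather than setDT(); the object names dt and totals are only illustrative:

```r
library(data.table)

#create data table directly (same data as above)
dt <- data.table(team=c('a', 'a', 'b', 'b', 'b', 'c', 'c'),
                 pts=c(5, 8, 14, 18, 5, 7, 7),
                 rebs=c(8, 8, 9, 3, 8, 7, 4))

#.() is shorthand for list()
dt[ , .(sum=sum(pts)), by=team]

#find sum of every column listed in .SDcols by team
totals <- dt[ , lapply(.SD, sum), by=team, .SDcols=c("pts", "rebs")]
totals
```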

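Notice that all three methods return the same sums for each team. Only the column names and the class of the returned object (data frame, tibble, or data table) differ.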

**Note:** If you have an extremely large dataset, the data.table method will typically be the fastest of the three methods listed here.

**Additional Resources**

How to Calculate the Mean by Group in R

How to Calculate Quantiles by Group in R

Hi,

Firstly, thanks for your work; it’s really helpful!

I have a question regarding this topic and would like to apply it to my data. (see below)

The problem is that I want to sum (AREA) by group (SPECIE), but those sums have to stay within the boundaries of the WMU.

So in short I want the sums of the AREA for each SPECIE within the WMU number. Not for the dataset as a whole.

I’ve been trying different ways of doing this but to no avail. I hope you will be able to help me.

Kind Regards

Geoffrey

WMU  SPECIE         AREA
104  kamsalamander  125.69
104  rugstreeppad   10.7
104  rugstreeppad   75.36
104  roerdomp       27.10
105  wespendief     702.88
105  reiger         14.00
105  ijsvogel       45.25
105  wespendief     77.52
105  kamsalamander  125.69
…    …              …
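For a question like the one above, where the sums have to stay within a second grouping variable, one possible approach is to list both columns as grouping variables. The sketch below uses base R and only the rows shown in the comment (the elided rows are not filled in); the object names wmu_df and area_sums are hypothetical:

```r
#create data frame from the rows shown in the comment
wmu_df <- data.frame(
  WMU=c(104, 104, 104, 104, 105, 105, 105, 105, 105),
  SPECIE=c('kamsalamander', 'rugstreeppad', 'rugstreeppad', 'roerdomp',
           'wespendief', 'reiger', 'ijsvogel', 'wespendief', 'kamsalamander'),
  AREA=c(125.69, 10.7, 75.36, 27.10, 702.88, 14.00, 45.25, 77.52, 125.69))

#find sum of AREA for each SPECIE within each WMU
area_sums <- aggregate(AREA ~ WMU + SPECIE, data=wmu_df, FUN=sum)

#sort by WMU, then SPECIE, for easier reading
area_sums <- area_sums[order(area_sums$WMU, area_sums$SPECIE), ]
area_sums
```

With dplyr, the equivalent would be to pass both columns to group_by(), as in group_by(WMU, SPECIE), before calling summarise().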

Sometimes it’s the straightforward examples of simple stuff that are the most lifesaving when you feel like an idiot for spending an hour trying to figure out what should take 2 seconds to do. Thanks so much.