How to Calculate the Sum by Group in R (With Examples)


Often you may want to calculate the sum by group in R. There are three methods you can use to do so:

Method 1: Use base R.

aggregate(df$col_to_aggregate, list(df$col_to_group_by), FUN=sum) 

Method 2: Use the dplyr package.

library(dplyr)

df %>%
  group_by(col_to_group_by) %>%
  summarise(Freq = sum(col_to_aggregate))

Method 3: Use the data.table package.

library(data.table)

dt[ ,list(sum=sum(col_to_aggregate)), by=col_to_group_by]

The following examples show how to use each of these methods in practice.

Method 1: Calculate Sum by Group Using Base R

The following code shows how to use the aggregate() function from base R to calculate the sum of the points scored by team in the following data frame:

#create data frame
df <- data.frame(team=c('a', 'a', 'b', 'b', 'b', 'c', 'c'),
                 pts=c(5, 8, 14, 18, 5, 7, 7),
                 rebs=c(8, 8, 9, 3, 8, 7, 4))

#view data frame
df

  team pts rebs
1    a   5    8
2    a   8    8
3    b  14    9
4    b  18    3
5    b   5    8
6    c   7    7
7    c   7    4

#find sum of points scored by team
aggregate(df$pts, list(df$team), FUN=sum)

  Group.1  x
1       a 13
2       b 37
3       c 14
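
The Group.1 and x column names in the output above are defaults that aggregate() assigns when it is given unnamed vectors. As a minimal sketch, assuming the same example data frame, the formula interface of aggregate() keeps the original column names instead (the name team_sums is only illustrative):

```r
#recreate the example data frame
df <- data.frame(team=c('a', 'a', 'b', 'b', 'b', 'c', 'c'),
                 pts=c(5, 8, 14, 18, 5, 7, 7),
                 rebs=c(8, 8, 9, 3, 8, 7, 4))

#find sum of points scored by team using the formula interface
team_sums <- aggregate(pts ~ team, data=df, FUN=sum)

#view result (columns are named team and pts)
team_sums
```

Note that the formula interface drops rows with missing values by default, which may or may not be what you want for your own data.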

Method 2: Calculate Sum by Group Using dplyr

The following code shows how to use the group_by() and summarise() functions from the dplyr package to calculate the sum of points scored by team in the following data frame:

library(dplyr) 

#create data frame
df <- data.frame(team=c('a', 'a', 'b', 'b', 'b', 'c', 'c'),
                 pts=c(5, 8, 14, 18, 5, 7, 7),
                 rebs=c(8, 8, 9, 3, 8, 7, 4))

#find sum of points scored by team 
df %>%
  group_by(team) %>%
  summarise(Freq = sum(pts))

# A tibble: 3 x 2
  team   Freq
  <chr> <dbl>
1 a        13
2 b        37
3 c        14  
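
If you need the sum of more than one column, one possible approach is to combine summarise() with across(). The following is a sketch that assumes dplyr version 1.0.0 or later and reuses the example data frame (the name team_totals is only illustrative):

```r
library(dplyr)

#recreate the example data frame
df <- data.frame(team=c('a', 'a', 'b', 'b', 'b', 'c', 'c'),
                 pts=c(5, 8, 14, 18, 5, 7, 7),
                 rebs=c(8, 8, 9, 3, 8, 7, 4))

#find sum of points and rebounds by team
team_totals <- df %>%
  group_by(team) %>%
  summarise(across(c(pts, rebs), sum))

#view result (one row per team, with summed pts and rebs columns)
team_totals
```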

Method 3: Calculate Sum by Group Using data.table

The following code shows how to use the data.table package to calculate the sum of points scored by team in the following data frame:

library(data.table) 

#create data frame
df <- data.frame(team=c('a', 'a', 'b', 'b', 'b', 'c', 'c'),
                 pts=c(5, 8, 14, 18, 5, 7, 7),
                 rebs=c(8, 8, 9, 3, 8, 7, 4))

#convert data frame to data table 
setDT(df)

#find sum of points scored by team 
df[ ,list(sum=sum(pts)), by=team]

   team sum
1:    a  13
2:    b  37
3:    c  14
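
The same pattern can be extended to several columns at once. As a sketch, assuming the same example data, the special symbol .SD together with the .SDcols argument applies sum() to each listed column within every group (the names dt and team_totals are only illustrative):

```r
library(data.table)

#create data table directly instead of converting a data frame
dt <- data.table(team=c('a', 'a', 'b', 'b', 'b', 'c', 'c'),
                 pts=c(5, 8, 14, 18, 5, 7, 7),
                 rebs=c(8, 8, 9, 3, 8, 7, 4))

#find sum of points and rebounds by team
team_totals <- dt[, lapply(.SD, sum), by=team, .SDcols=c('pts', 'rebs')]

#view result (one row per team, with summed pts and rebs columns)
team_totals
```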

Notice that all three methods return the same sums, although the column names and the class of the returned object differ between methods.
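
One way to check that the sums agree is to extract the summed column from each result and compare them with all.equal(). This is only a sketch using the example data; the variable names are illustrative, and as.data.table() is used here so that the original data frame is not modified.

```r
library(dplyr)
library(data.table)

#recreate the example data frame
df <- data.frame(team=c('a', 'a', 'b', 'b', 'b', 'c', 'c'),
                 pts=c(5, 8, 14, 18, 5, 7, 7),
                 rebs=c(8, 8, 9, 3, 8, 7, 4))

#calculate sum of points by team with each method
base_out  <- aggregate(df$pts, list(df$team), FUN=sum)
dplyr_out <- df %>% group_by(team) %>% summarise(Freq = sum(pts))
dt_out    <- as.data.table(df)[, list(sum=sum(pts)), by=team]

#compare the summed columns
all.equal(base_out$x, dplyr_out$Freq)
all.equal(base_out$x, dt_out$sum)
```

Keep in mind that data.table returns groups in order of first appearance, while aggregate() and dplyr sort them, so a comparison like this assumes the data are already ordered by group, as they are here.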

Note: If you have an extremely large dataset, the data.table method is typically the fastest of the three methods listed here.

Additional Resources

How to Calculate the Mean by Group in R
How to Calculate Quantiles by Group in R

2 Replies to “How to Calculate the Sum by Group in R (With Examples)”

  1. Hi,

    Firstly, thanks for your work it’s really helpful!

    I have a question regarding this topic and would like to apply it to my data. (see below)
    The problem is that I want to sum (AREA) by group (SPECIE), but those sums have to stay within the boundaries of the WMU.
    So in short I want the sums of the AREA for each SPECIE within the WMU number. Not for the dataset as a whole.

    I’ve been trying different ways of doing this but to no avail. I hope you will be able to help me.

    Kind Regards
    Geoffrey

    WMU SPECIE AREA
    104 kamsalamander 125.69
    104 rugstreeppad 10.7
    104 rugstreeppad 75.36
    104 roerdomp 27.10
    105 wespendief 702.88
    105 reiger 14.00
    105 ijsvogel 45.25
    105 wespendief 77.52
    105 kamsalamander 125.69
    … … …

  2. Sometimes it’s the straightforward examples of simple stuff that are the most lifesaving when you feel like an idiot for spending an hour trying to figure out what should take 2 seconds to do. Thanks so much.
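
For the question in the first reply above (summing AREA for each SPECIE within each WMU), one possible approach is to group by both columns at once. The following is a minimal sketch in base R that uses only the nine rows shown in that comment; the elided rows are not included, and the name area_sums is only illustrative.

```r
#recreate the nine rows shown in the comment (elided rows are not included)
df <- data.frame(WMU=c(104, 104, 104, 104, 105, 105, 105, 105, 105),
                 SPECIE=c('kamsalamander', 'rugstreeppad', 'rugstreeppad',
                          'roerdomp', 'wespendief', 'reiger', 'ijsvogel',
                          'wespendief', 'kamsalamander'),
                 AREA=c(125.69, 10.7, 75.36, 27.10, 702.88, 14.00, 45.25,
                        77.52, 125.69))

#find sum of AREA for each SPECIE within each WMU
area_sums <- aggregate(AREA ~ WMU + SPECIE, data=df, FUN=sum)

#sort so that rows from the same WMU appear together
area_sums <- area_sums[order(area_sums$WMU, area_sums$SPECIE), ]

#view result
area_sums
```

The dplyr and data.table methods should work the same way if both columns are passed as grouping variables, for example group_by(WMU, SPECIE) in dplyr.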
