How to Sum Columns Based on a Condition in R


You can use the following basic syntax to sum a column based on a condition in R:

#sum values in column 3 where col1 is equal to 'A'
sum(df[which(df$col1=='A'), 3])

The examples below apply this syntax to the following data frame:

#create data frame
df <- data.frame(conference = c('East', 'East', 'East', 'West', 'West', 'East'),
                 team = c('A', 'A', 'A', 'B', 'B', 'C'),
                 points = c(11, 8, 10, 6, 6, 5),
                 rebounds = c(7, 7, 6, 9, 12, 8))

#view data frame
df

  conference team points rebounds
1       East    A     11        7
2       East    A      8        7
3       East    A     10        6
4       West    B      6        9
5       West    B      6       12
6       East    C      5        8

Example 1: Sum One Column Based on One Condition

The following code shows how to find the sum of the points column for the rows where team is equal to ‘A’:

#sum values in column 3 (points column) where team is equal to 'A'
sum(df[which(df$team=='A'), 3])

[1] 29

The following code shows how to find the sum of the rebounds column for the rows where points is greater than 9:

#sum values in column 4 (rebounds column) where points is greater than 9
sum(df[which(df$points > 9), 4])

[1] 13
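The examples above pick the column by its position (3 or 4). As a minimal sketch of an alternative, the same sums can be written with column names, which keeps working if the column order changes. The variable names points_A and rebounds_gt9 are only illustrative, and na.rm = TRUE is an optional addition that has no effect on this data, which contains no missing values.

```r
# recreate the data frame used in this tutorial
df <- data.frame(conference = c('East', 'East', 'East', 'West', 'West', 'East'),
                 team = c('A', 'A', 'A', 'B', 'B', 'C'),
                 points = c(11, 8, 10, 6, 6, 5),
                 rebounds = c(7, 7, 6, 9, 12, 8))

# sum the points column (selected by name) where team is equal to 'A'
points_A <- sum(df[which(df$team == 'A'), 'points'])

# sum the rebounds column where points is greater than 9,
# indexing the column vector directly with a logical condition
rebounds_gt9 <- sum(df$rebounds[df$points > 9], na.rm = TRUE)

points_A
rebounds_gt9
```

Both results should match the position-based versions above (29 and 13).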

Example 2: Sum One Column Based on Multiple Conditions

The following code shows how to find the sum of the points column for the rows where team is equal to ‘A’ and conference is equal to ‘East’:

#sum values in column 3 (points column) where team is 'A' and conference is 'East'
sum(df[which(df$team=='A' & df$conference=='East'), 3])

[1] 29

Note that the & operator performs an element-wise “and” in R, so a row is included only when both conditions are TRUE.
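When several conditions are combined, repeating df$ for every column can get hard to read. One possible way to shorten it, sketched here with base R's with() and subset() functions, is shown below. The variable names east_A and east_A_subset are only illustrative.

```r
# recreate the data frame used in this tutorial
df <- data.frame(conference = c('East', 'East', 'East', 'West', 'West', 'East'),
                 team = c('A', 'A', 'A', 'B', 'B', 'C'),
                 points = c(11, 8, 10, 6, 6, 5),
                 rebounds = c(7, 7, 6, 9, 12, 8))

# with() lets the column names be used directly inside the expression
east_A <- with(df, sum(points[team == 'A' & conference == 'East']))

# subset() filters the rows first, then the points column is summed
east_A_subset <- sum(subset(df, team == 'A' & conference == 'East')$points)

east_A
east_A_subset
```

Both expressions should return the same value as the which() version in this example (29).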

Example 3: Sum One Column Based on One of Several Conditions

The following code shows how to find the sum of the points column for the rows where team is equal to ‘A’ or ‘C’:

#sum values in column 3 (points column) where team is 'A' or 'C'
sum(df[which(df$team == 'A' | df$team =='C'), 3])

[1] 34

Note that the | operator performs an element-wise “or” in R, so a row is included when at least one of the conditions is TRUE.
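When the list of allowed values grows, chaining many | comparisons becomes unwieldy. A minimal sketch of an alternative uses the %in% operator, which tests whether each value in the column appears in a vector of allowed values. The variable name points_AC is only illustrative.

```r
# recreate the data frame used in this tutorial
df <- data.frame(conference = c('East', 'East', 'East', 'West', 'West', 'East'),
                 team = c('A', 'A', 'A', 'B', 'B', 'C'),
                 points = c(11, 8, 10, 6, 6, 5),
                 rebounds = c(7, 7, 6, 9, 12, 8))

# sum the points column where team is any of the listed values
points_AC <- sum(df[df$team %in% c('A', 'C'), 'points'])

points_AC
```

This should give the same result as the | version above (34).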

Additional Resources

The following tutorials explain how to perform other common tasks in R:

How to Sum Specific Columns in R
How to Sum Specific Rows in R
How to Calculate Sum by Group in R

One Reply to “How to Sum Columns Based on a Condition in R”

  1. Hi Zach
    Let’s say I have this dataset
    df1=data.frame(team = c('A', 'A', 'A', 'B', 'B', 'C', 'C', 'F', 'G'),
    points = c(11, 8, 10, 6, 6, 5,12,4,17))
    and this
    dfsum <- data.frame(team = c('A', 'B', 'C', 'D', 'E', 'F'), points=c(0,0,0,0,0,0))

    I want to compute a conditional sum in the points column of dfsum, so that it sums the points from df1 where the team in dfsum equals the team in df1. I know I can group df1 by team and then merge, but I would like to do it SUMIF-style.
    I tried a nested for loop, but it does not work…

    nrow1=nrow(df1)
    nrow2=nrow(dfsum)

    for(i in 1:nrow2){
      for(j in 1:nrow1){
        if (dfsum[i,1]==df1[j,1]){
          dfsum[i,2]=cumsum(df1[j,2])
        }
      }
    }
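One possible way to get the SUMIF-style result described in this comment is sketched below; it is an illustration, not the article author's answer. In the loop above, cumsum() applied to a single value returns just that value, so each match overwrites the previous one instead of adding to it. The sketch instead applies the conditional sum from this tutorial once per team in dfsum using vapply(). Teams that do not appear in df1 get 0, because the sum of an empty vector is 0.

```r
# data frames from the comment, with straight quotes
df1 <- data.frame(team = c('A', 'A', 'A', 'B', 'B', 'C', 'C', 'F', 'G'),
                  points = c(11, 8, 10, 6, 6, 5, 12, 4, 17))
dfsum <- data.frame(team = c('A', 'B', 'C', 'D', 'E', 'F'),
                    points = c(0, 0, 0, 0, 0, 0))

# for each team in dfsum, sum the points of the matching rows in df1
dfsum$points <- vapply(as.character(dfsum$team),
                       function(tm) sum(df1$points[df1$team == tm]),
                       numeric(1),
                       USE.NAMES = FALSE)

dfsum
```

If the loop form is preferred, replacing the assignment with dfsum[i,2] = dfsum[i,2] + df1[j,2] should accumulate the matches instead of overwriting them.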
