How to Perform Row-wise Operations in dplyr


Often you may want to perform various operations row-wise on a data frame in R.

The easiest way to do so is by using the rowwise() function from the dplyr package in R.

This function groups a data frame by row, so that functions used inside later dplyr verbs such as mutate() are evaluated once per row instead of once per whole column. The following example shows how to use this function in practice.

Note: Before using the rowwise() function you may need to first use the following syntax to install the dplyr package:

install.packages('dplyr')

Once the dplyr package is installed, load it with library(dplyr) and the rowwise() function will be available for performing row-wise operations on a data frame.

Example: How to Perform Operations Row-Wise in dplyr

Suppose we create the following data frame in R that contains information about various basketball players:

#create data frame
df <- data.frame(team=c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'),
                 points=c(99, 68, 86, 88, 95, 74, 78, 93),
                 assists=c(22, 28, 31, 35, 34, 45, 28, 31),
                 rebounds=c(30, 28, 24, 24, 30, 36, 30, 29))

#view data frame
df

  team points assists rebounds
1    A     99      22       30
2    A     68      28       28
3    A     86      31       24
4    A     88      35       24
5    B     95      34       30
6    B     74      45       36
7    B     78      28       30
8    B     93      31       29

The data frame contains the following columns:

  • team: The team that the player belongs to
  • points: The total points scored by the player during the season
  • assists: The total assists made by the player during the season
  • rebounds: The total rebounds made by the player during the season

Suppose that we would like to use the mutate() function to add a new column that calculates the mean of all values across the points, assists and rebounds columns.

We could use the following syntax to do so:

library(dplyr)

#calculate mean value across points, assists and rebounds
df %>% mutate(mean_val = mean(c(points, assists, rebounds)))

  team points assists rebounds mean_val
1    A     99      22       30 48.58333
2    A     68      28       28 48.58333
3    A     86      31       24 48.58333
4    A     88      35       24 48.58333
5    B     95      34       30 48.58333
6    B     74      45       36 48.58333
7    B     78      28       30 48.58333
8    B     93      31       29 48.58333

Notice that a new column has been added named mean_val.

Because mutate() operates on whole columns by default, every row in this new column contains the same value: the mean of all 24 values in the points, assists and rebounds columns combined.
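As a quick check on where 48.58333 comes from, the following base R sketch pools the three columns by hand and takes a single mean. The variable names pooled and grand_mean are chosen here for illustration and are not part of the original example.

```r
# Recreate the example data frame from above
df <- data.frame(team=c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'),
                 points=c(99, 68, 86, 88, 95, 74, 78, 93),
                 assists=c(22, 28, 31, 35, 34, 45, 28, 31),
                 rebounds=c(30, 28, 24, 24, 30, 36, 30, 29))

# Without rowwise(), each column name refers to the entire column,
# so c() pools all 24 values before mean() is applied
pooled <- c(df$points, df$assists, df$rebounds)

#count pooled values
length(pooled)

#calculate one mean over all pooled values
grand_mean <- mean(pooled)
grand_mean
```

The pooled vector has 24 elements (8 rows times 3 columns), and its mean matches the value repeated in every row of mean_val.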

However, suppose that we would instead like to calculate the mean value across the points, assists and rebounds columns for each individual row.

We could use the rowwise() function with the following syntax to do so:

library(dplyr)

#calculate mean value across points, assists and rebounds rowwise
df %>% rowwise() %>% mutate(mean_val = mean(c(points, assists, rebounds)))

  team  points assists rebounds mean_val
1 A         99      22       30     50.3
2 A         68      28       28     41.3
3 A         86      31       24     47  
4 A         88      35       24     49  
5 B         95      34       30     53  
6 B         74      45       36     51.7
7 B         78      28       30     45.3
8 B         93      31       29     51  

Notice that the mutate() function creates a new column named mean_val once again, but this time the values in the column represent the mean values across the points, assists and rebounds columns for each individual row.

For example, the mean value across the three specific columns for the first row is:

  • Mean of First Row: (99 + 22 + 30) / 3 ≈ 50.3

The mean value of every individual row in the data frame is calculated in a similar manner.
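One way to cross-check the per-row arithmetic is with the rowMeans() function from base R, which computes one mean per row of the selected columns without using dplyr at all. This is a sketch of an alternative check, not the approach used in the example above, and the name row_means is chosen here for illustration.

```r
# Recreate the example data frame from above
df <- data.frame(team=c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'),
                 points=c(99, 68, 86, 88, 95, 74, 78, 93),
                 assists=c(22, 28, 31, 35, 34, 45, 28, 31),
                 rebounds=c(30, 28, 24, 24, 30, 36, 30, 29))

#calculate mean of the three numeric columns for each row
row_means <- rowMeans(df[, c('points', 'assists', 'rebounds')])

#view row means rounded to one decimal place
round(row_means, 1)
```

For a simple mean or sum, rowMeans() and rowSums() are vectorized and may be faster on large data frames, while rowwise() is more flexible because it works with any function.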

Note: You can find the complete documentation for the rowwise() function in the official dplyr package reference.
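Two related dplyr functions are often used together with rowwise(): c_across(), which selects a range of columns inside a row-wise operation, and ungroup(), which removes the row-wise grouping afterwards. The sketch below assumes dplyr 1.0.0 or later (the release that introduced c_across()), and the name result is chosen here for illustration.

```r
library(dplyr)

# Recreate the example data frame from above
df <- data.frame(team=c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'),
                 points=c(99, 68, 86, 88, 95, 74, 78, 93),
                 assists=c(22, 28, 31, 35, 34, 45, 28, 31),
                 rebounds=c(30, 28, 24, 24, 30, 36, 30, 29))

#calculate row-wise mean using a column range, then drop the row-wise grouping
result <- df %>%
  rowwise() %>%
  mutate(mean_val = mean(c_across(points:rebounds))) %>%
  ungroup()

#view result
result
```

Calling ungroup() at the end is a common precaution: a data frame that is still row-wise would keep evaluating later mutate() or summarise() calls one row at a time, which is usually not what is wanted once the row-wise step is done.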

Additional Resources

The following tutorials explain how to perform other common tasks in R:

How to Use slice_max() in dplyr
How to Rename Columns Using dplyr
How to Add Row to Data Frame Using dplyr
How to Use the pull() Function in dplyr
