Often you may want to group the rows of a data.table in R based on multiple columns.

You can use the following basic syntax to do so:

dt[, mean_points:=mean(points), by=c('team', 'position')]

This particular example will group the rows of the data.table named **dt** by the **team** and **position** columns and then create a new column named **mean_points** that calculates the mean value of the **points** column.

Note that we chose to use the **mean()** function to calculate the mean value of one column, grouped by two other columns, but you can use whatever function you’d like when summarizing your own data.table in R.

The following example shows how to group a data.table by multiple columns in R in practice.

**Example: How to Group data.table by Multiple Columns in R**

Suppose we create the following data.table in R that contains information about various basketball players:

library(data.table) #create data table dt <- data.table(team=c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'), position=c('G', 'G', 'F', 'F', 'G', 'G', 'F', 'F'), points=c(99, 68, 86, 88, 95, 74, 78, 93), assists=c(22, 28, 45, 35, 34, 45, 28, 31)) #view data table dt team position points assists 1: A G 99 22 2: A G 68 28 3: A F 86 45 4: A F 88 35 5: B G 95 34 6: B G 74 45 7: B F 78 28 8: B F 93 31

The data.table contains the following columns:

**team**: The team name a player belongs to**position**: The position of the player**points**: The total points scored by the player**assists**: The total assists made by the player

Suppose that we would like to calculate the mean value of the **points** column, grouped by the values in the **team** and **position** columns.

We can use the following syntax to do so:

library(data.table) #calculate mean of points column, grouped by team and position column dt[, mean_points:=mean(points), by=c('team', 'position')] #view updated data table dt team position points assists mean_points 1: A G 99 22 83.5 2: A G 68 28 83.5 3: A F 86 45 87.0 4: A F 88 35 87.0 5: B G 95 34 84.5 6: B G 74 45 84.5 7: B F 78 28 85.5 8: B F 93 31 85.5

The new column named **mean_points** contains the mean value of the **points** column, grouped by the values in the **team** and **position** columns.

For example, we can see:

- The mean points scored by players on team A in position G is
**83.5**. - The mean points scored by players on team A in position F is
**87.0**. - The mean points scored by players on team B in position G is
**84.5**. - The mean points scored by players on team B in position F is
**85.5**.

Note that you can replace the **mean()** function with any function that you would like to calculate a different descriptive statistic.

For example, we could use the following syntax to instead calculate the max of the **points** column, grouped by the **team** and **position** columns:

library(data.table) #calculate max of points column, grouped by team and position column dt[, max_points:=max(points), by=c('team', 'position')] #view updated data table dt team position points assists max_points 1: A G 99 22 99 2: A G 68 28 99 3: A F 86 45 88 4: A F 88 35 88 5: B G 95 34 95 6: B G 74 45 95 7: B F 78 28 93 8: B F 93 31 93

The new column named **max_points** contains the max value of the **points** column, grouped by the values in the **team** and **position** columns.

**Additional Resource**

The following tutorials explain how to perform other common tasks in R:

How to Filter a data.table in R

How to Sort a data.table in R

How to Remove Empty Rows from Data Frame in R