You can use one of the following methods to calculate the standard deviation by group in R:

**Method 1: Use base R**

**aggregate(df$col_to_aggregate, list(df$col_to_group_by), FUN=sd)**

**Method 2: Use dplyr**

**library(dplyr)
df %>%
group_by(col_to_group_by) %>%
summarise_at(vars(col_to_aggregate), list(name=sd))
**

**Method 3: Use data.table**

**library(data.table)
setDT(df)
df[ ,list(sd=sd(col_to_aggregate)), by=col_to_group_by]
**

The examples below show how to use each of these methods in practice with this data frame in R:

**#create data frame
df <- data.frame(team=rep(c('A', 'B', 'C'), each=6),
                 points=c(8, 10, 12, 12, 14, 15, 10, 11, 12,
                          18, 22, 24, 3, 5, 5, 6, 7, 9))
#view data frame
df
team points
1 A 8
2 A 10
3 A 12
4 A 12
5 A 14
6 A 15
7 B 10
8 B 11
9 B 12
10 B 18
11 B 22
12 B 24
13 C 3
14 C 5
15 C 5
16 C 6
17 C 7
18 C 9**

**Method 1: Calculate Standard Deviation by Group Using Base R**

The following code shows how to use the **aggregate()** function from base R to calculate the standard deviation of points scored by team:

**#calculate standard deviation of points by team
aggregate(df$points, list(df$team), FUN=sd)
  Group.1        x
1       A 2.562551
2       B 6.013873
3       C 2.041241**
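The default column names in that output (`Group.1` and `x`) are not very descriptive. As a minimal sketch of two base R alternatives, the formula interface of `aggregate()` keeps the original column names, and `tapply()` returns a named vector instead of a data frame. The object names `sd_by_team` and `sd_vec` are chosen here for illustration only.

```r
# Same example data as above
df <- data.frame(team = rep(c('A', 'B', 'C'), each = 6),
                 points = c(8, 10, 12, 12, 14, 15, 10, 11, 12,
                            18, 22, 24, 3, 5, 5, 6, 7, 9))

# Formula interface: the result has columns named "team" and "points"
sd_by_team <- aggregate(points ~ team, data = df, FUN = sd)
sd_by_team

# tapply() returns one value per group, named by group
sd_vec <- tapply(df$points, df$team, sd)
sd_vec
```

If the data contain missing values, `sd()` returns `NA` for the affected group unless `na.rm = TRUE` is passed to it.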

**Method 2: Calculate Standard Deviation by Group Using dplyr**

The following code shows how to use the **group_by()** and **summarise_at()** functions from the **dplyr** package to calculate the standard deviation of points scored by team:

**library(dplyr)
#calculate standard deviation of points scored by team
df %>%
  group_by(team) %>%
  summarise_at(vars(points), list(name=sd))
# A tibble: 3 x 2
  team   name
1 A      2.56
2 B      6.01
3 C      2.04**
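In dplyr 1.0.0 and later, the package documentation marks `summarise_at()` as superseded in favor of `summarise()` combined with `across()`. The sketch below shows what the equivalent calls might look like, assuming a reasonably recent dplyr version is installed. The output column name `sd_points` is an illustrative choice, not part of the original example.

```r
library(dplyr)

# Same example data as above
df <- data.frame(team = rep(c('A', 'B', 'C'), each = 6),
                 points = c(8, 10, 12, 12, 14, 15, 10, 11, 12,
                            18, 22, 24, 3, 5, 5, 6, 7, 9))

# Plain summarise(): name the output column directly
sd_simple <- df %>%
  group_by(team) %>%
  summarise(sd_points = sd(points))

# across(): convenient when several columns need the same summary
sd_across <- df %>%
  group_by(team) %>%
  summarise(across(points, sd, .names = "sd_{.col}"))

sd_simple
```

Both calls should produce one row per team with a single `sd_points` column. The `across()` form extends naturally to something like `across(c(points, assists), sd)` if the data frame had more numeric columns.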

**Method 3: Calculate Standard Deviation by Group Using data.table**

The following code shows how to calculate the standard deviation of points scored by team using functions from the **data.table** package:

**library(data.table)
#convert data frame to data table
setDT(df)
#calculate standard deviation of points scored by team
df[ ,list(sd=sd(points)), by=team]
   team       sd
1:    A 2.562551
2:    B 6.013873
3:    C 2.041241**
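Note that `setDT()` converts `df` to a data.table by reference, so `df` itself is changed. If the original data frame should stay untouched, one possible variation is to make a copy with `as.data.table()`. The names `dt` and `sd_dt` below are illustrative.

```r
library(data.table)

# Same example data as above
df <- data.frame(team = rep(c('A', 'B', 'C'), each = 6),
                 points = c(8, 10, 12, 12, 14, 15, 10, 11, 12,
                            18, 22, 24, 3, 5, 5, 6, 7, 9))

# as.data.table() returns a copy; df remains a plain data frame
dt <- as.data.table(df)

# .() is data.table shorthand for list()
sd_dt <- dt[, .(sd_points = sd(points)), by = team]
sd_dt
```

The copy costs extra memory, so for very large data the by-reference `setDT()` approach may still be the better fit.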

Notice that all three methods return the same results.
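All three methods call the same `sd()` function, which computes the sample standard deviation (dividing by n - 1 rather than n). As a quick sanity check, the value for team A can be reproduced by hand. The variable names below are illustrative.

```r
# Points scored by team A in the example data
points_a <- c(8, 10, 12, 12, 14, 15)
n <- length(points_a)

# Sample standard deviation: sqrt of squared deviations summed, divided by n - 1
manual_sd <- sqrt(sum((points_a - mean(points_a))^2) / (n - 1))
manual_sd

# Compare with the built-in function
all.equal(manual_sd, sd(points_a))
```

The manual result is roughly 2.562551, which matches the value reported for team A by each method.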

**Note**: If you’re working with a very large data frame, the **dplyr** or **data.table** approach is usually the better choice, since both packages typically run grouped calculations considerably faster than base R’s **aggregate()**.
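To see how the three approaches compare on your own machine, a rough timing sketch along these lines may help. The data here are simulated, and the row count, group count, and all object names are arbitrary assumptions for illustration. Actual timings depend on hardware, package versions, and the shape of the data.

```r
library(dplyr)
library(data.table)

# Simulated data: 1 million rows spread over 1,000 groups (arbitrary sizes)
set.seed(42)
n_rows <- 1e6
n_groups <- 1000
big_df <- data.frame(group = sample(n_groups, n_rows, replace = TRUE),
                     value = rnorm(n_rows))
big_dt <- as.data.table(big_df)

# Time each approach once with system.time()
time_base <- system.time(
  res_base <- aggregate(value ~ group, data = big_df, FUN = sd)
)
time_dplyr <- system.time(
  res_dplyr <- big_df %>% group_by(group) %>% summarise(sd_value = sd(value))
)
time_dt <- system.time(
  res_dt <- big_dt[, list(sd_value = sd(value)), by = group]
)

# Elapsed seconds for each method
c(base = time_base[["elapsed"]],
  dplyr = time_dplyr[["elapsed"]],
  data.table = time_dt[["elapsed"]])
```

A single `system.time()` call is a coarse measurement. For more reliable comparisons, repeat each call several times and compare the typical elapsed time.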

