This tutorial explains how to calculate the standard deviation in R, including an explanation of the formula used as well as several examples.

**What is Standard Deviation?**

The **standard deviation** is a common way to measure how “spread out” values are in a dataset. The formula to find the standard deviation of a sample is:

$$\sqrt{\frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n - 1}}$$

where **Σ** is a symbol that means “sum”, **x_i** is the i-th value in the dataset, **μ** is the mean value of the dataset, and **n** is the sample size.
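To make the formula concrete, here is a small worked example on a hypothetical three-value sample (2, 4, 6). These values are chosen purely for illustration and do not come from the examples in this tutorial, and we write s for the sample standard deviation:

```latex
\begin{aligned}
\mu &= \frac{2 + 4 + 6}{3} = 4 \\
\sum_{i=1}^{3} (x_i - \mu)^2 &= (2 - 4)^2 + (4 - 4)^2 + (6 - 4)^2 = 8 \\
s &= \sqrt{\frac{8}{3 - 1}} = \sqrt{4} = 2
\end{aligned}
```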

**How to Calculate Standard Deviation in R**

We can use the built-in **sd()** function to easily calculate the standard deviation of a sample in R.

For example, the following code illustrates how to find the sample standard deviation of a dataset:

    #create dataset
    data <- c(1, 3, 4, 6, 11, 14, 17, 20, 22, 23)

    #find standard deviation
    sd(data)

    #[1] 8.279157

Note that the standard deviation is equivalent to the square root of the variance:

    sqrt(var(data))

    #[1] 8.279157

We could also write our own custom function to find the sample standard deviation:

    #create custom function to find standard deviation
    find_sd <- function(x) {
      sqrt(sum((x - mean(x))^2) / (length(x) - 1))
    }

    #find standard deviation
    find_sd(data)

    #[1] 8.279157
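To see what the formula does at each stage, the calculation can also be split into separate steps. The following is a minimal sketch that reuses the ten values from the example above; the intermediate variable names (`deviations`, `sum_sq`, `sample_var`, `manual_sd`) are our own and are used only for illustration:

```r
# Same ten values used in the earlier example
x <- c(1, 3, 4, 6, 11, 14, 17, 20, 22, 23)

n <- length(x)                  # sample size (10)
deviations <- x - mean(x)       # distance of each value from the mean (12.1)
sum_sq <- sum(deviations^2)     # sum of the squared deviations
sample_var <- sum_sq / (n - 1)  # divide by n - 1 to get the sample variance
manual_sd <- sqrt(sample_var)   # the square root is the standard deviation

manual_sd
#[1] 8.279157

# check that the step-by-step result agrees with the built-in function
all.equal(manual_sd, sd(x))
#[1] TRUE
```

Each line corresponds to one piece of the formula, which can make it easier to check intermediate values when computing a standard deviation by hand.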

Also note that if the dataset contains missing values, we must specify **na.rm = TRUE** to calculate the sample standard deviation; otherwise **sd()** returns NA:

    #create vector of values with NA
    data_NA <- c(1, NA, 4, 6, NA, 14, 17, 20, 22, 23)

    #attempt to find standard deviation
    sd(data_NA)

    #[1] NA

    #find standard deviation by excluding missing values
    sd(data_NA, na.rm = TRUE)

    #[1] 8.61788
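It can help to see what excluding the missing values amounts to. The sketch below, which reuses the vector from the example above under variable names of our own choosing, counts the missing values and shows that **na.rm = TRUE** gives the same result as dropping the NA entries by hand before calling **sd()**:

```r
# Same vector with two missing values as in the example above
x_na <- c(1, NA, 4, 6, NA, 14, 17, 20, 22, 23)

# count how many values are missing
sum(is.na(x_na))
#[1] 2

# keep only the non-missing values
x_complete <- x_na[!is.na(x_na)]
length(x_complete)
#[1] 8

# na.rm = TRUE matches the result of removing the NAs manually
all.equal(sd(x_na, na.rm = TRUE), sd(x_complete))
#[1] TRUE
```

Note that the standard deviation is then based on the 8 remaining values, so the sample size **n** in the formula is 8 rather than 10.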

**How to Calculate Several Standard Deviations in R At Once**

In the previous examples, we showed how to find the standard deviation for a single vector of values. However, we can also use the **sd()** function to find the standard deviation of one or more variables in a dataset.

For example, consider the built-in R dataset **mtcars**:

    #view first six lines of mtcars dataset
    head(mtcars)

    #                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
    #Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
    #Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
    #Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
    #Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
    #Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
    #Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

To find the standard deviation of the variable *mpg*, we can use the following code:

    #find standard deviation of mpg
    sd(mtcars$mpg)

    #[1] 6.026948

We can also find the standard deviation of several variables at once by using the **apply()** function. For example, the following code illustrates how to find the standard deviation of the variables *mpg*, *cyl*, and *wt* all at once:

    #find standard deviation of mpg, cyl, and wt
    apply(mtcars[ , c('mpg', 'cyl', 'wt')], 2, sd)

    #      mpg       cyl        wt 
    #6.0269481 1.7859216 0.9784574

And we can find the standard deviation of every single variable in the dataset by using the following code:

    #find standard deviation of all variables
    apply(mtcars, 2, sd)

    #        mpg         cyl        disp          hp        drat          wt 
    #  6.0269481   1.7859216 123.9386938  68.5628685   0.5346787   0.9784574 
    #       qsec          vs          am        gear        carb 
    #  1.7869432   0.5040161   0.4989909   0.7378041   1.6152000
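As an alternative, base R's **sapply()** function can loop over the columns of a data frame directly. **apply()** first converts the data frame to a matrix, which is fine for **mtcars** because every column is numeric. The sketch below is one possible variation rather than the only way to do this; the names `cols`, `sd_sapply`, and `sd_apply` are our own:

```r
# columns of interest in the built-in mtcars dataset
cols <- c("mpg", "cyl", "wt")

# sapply() calls sd() on each column of the data frame
sd_sapply <- sapply(mtcars[, cols], sd)

# apply() with margin 2 calls sd() on each column of the converted matrix
sd_apply <- apply(mtcars[, cols], 2, sd)

# both approaches return the same named vector of standard deviations
all.equal(sd_sapply, sd_apply)
#[1] TRUE
```

For a data frame that mixes numeric and non-numeric columns, one option is to select the numeric columns first, for example with `mtcars[sapply(mtcars, is.numeric)]`, before computing the standard deviations.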