How to Easily Find the Standard Deviation of Specific Columns in R

How to find the standard deviation of columns in R

Suppose we have the following data frame in R:

#create a data frame with three columns and five rows
data <- data.frame(a = c(1, 2, 3, 4, 5),
                   b = c(6, 7, 8, 9, 10),
                   c = c(11, 12, 13, 14, 15))
data

#  a  b  c
#1 1  6 11
#2 2  7 12
#3 3  8 13
#4 4  9 14
#5 5 10 15

To find the standard deviation of each column in this data frame, we can use the apply() function:

#find standard deviation of each column
apply(data, 2, sd)

#       a        b        c 
#1.581139 1.581139 1.581139 

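This returns a named numeric vector containing the standard deviation of each of the three columns. The values are identical here because each column is the previous one shifted by 5, and shifting data by a constant does not change its spread.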
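As a quick check on what sd() computes, the sketch below reproduces the value for column a by hand. It relies on the fact that sd() returns the sample standard deviation, which divides by n - 1 rather than n. The variable names a, n, and manual_sd are chosen only for this illustration.

```r
# First column of the example data frame
a <- c(1, 2, 3, 4, 5)

# Sample standard deviation by hand: divide by n - 1, not n
n <- length(a)
manual_sd <- sqrt(sum((a - mean(a))^2) / (n - 1))

manual_sd
#[1] 1.581139

# Built-in function for comparison
sd(a)
#[1] 1.581139
```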

This single line of code uses the built-in R function apply(), which applies a function to the rows or columns of a matrix or data frame.

The basic syntax for the apply() function is as follows:

apply(X, MARGIN, FUN)

  • X is the matrix or data frame (a data frame is converted to a matrix first)
  • MARGIN indicates which dimension to apply the function over (1 = rows, 2 = columns)
  • FUN is the function you want to apply (e.g. min, max, sum, mean, sd)

In the call apply(data, 2, sd), X = data, MARGIN = 2 (for columns), and FUN = sd (for standard deviation), so it returns the standard deviation of each column in our data frame.
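To find the standard deviation of only specific columns, one option is to subset the data frame before applying the function. The sketch below selects columns a and c by name. It also shows sapply() as an alternative that works on the data frame's columns directly, and a way to restrict the calculation to numeric columns. The names sd_apply, sd_sapply, and num_cols are assumptions made for this example.

```r
# Recreate the example data frame
data <- data.frame(a = c(1, 2, 3, 4, 5),
                   b = c(6, 7, 8, 9, 10),
                   c = c(11, 12, 13, 14, 15))

# Standard deviation of columns a and c only, selected by name
sd_apply <- apply(data[, c("a", "c")], 2, sd)
sd_apply
#       a        c 
#1.581139 1.581139 

# sapply() over the selected columns gives the same result
sd_sapply <- sapply(data[c("a", "c")], sd)

# Standard deviation of a single column
sd(data$b)
#[1] 1.581139

# Restrict the calculation to numeric columns only
num_cols <- sapply(data, is.numeric)
sapply(data[num_cols], sd)
```

Selecting columns by name rather than by position keeps the code working if the column order changes.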
