How to Calculate the Mean of a Column in R (With Examples)


You can use one of the following methods to calculate the mean of a column in R:

#calculate mean using column name
mean(df$my_column)

#calculate mean using column name (ignore missing values)
mean(df$my_column, na.rm=TRUE)

#calculate mean using column position
mean(df[, 1])

#calculate mean of all numeric columns
colMeans(df[sapply(df, is.numeric)])

The examples below show how to use each method with this data frame in R:

#create data frame
df <- data.frame(team=c('A', 'A', 'A', 'B', 'B', 'B'),
                 points=c(99, 90, 93, 86, 88, 82),
                 assists=c(33, 28, 31, 39, NA, 30))

#view data frame
df

  team points assists
1    A     99      33
2    A     90      28
3    A     93      31
4    B     86      39
5    B     88      NA
6    B     82      30

Example 1: Calculate Mean Using Column Name

The following code shows how to calculate the mean of the ‘points’ column using the column name:

#calculate mean of 'points' column
mean(df$points)

[1] 89.66667

The mean value in the ‘points’ column is 89.66667.
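When the column name is stored in a variable rather than typed literally, the $ operator cannot be used with it directly. The sketch below shows one way to handle that case with double brackets; the variable name col_name is purely illustrative and not part of the original example.

```r
#recreate the example data frame
df <- data.frame(team=c('A', 'A', 'A', 'B', 'B', 'B'),
                 points=c(99, 90, 93, 86, 88, 82),
                 assists=c(33, 28, 31, 39, NA, 30))

#store the column name in a variable (hypothetical name, for illustration)
col_name <- 'points'

#extract the column by name with [[ ]] and calculate its mean
mean_by_name <- mean(df[[col_name]])
mean_by_name
```

This should give the same result as mean(df$points), since both expressions extract the same numeric vector.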

Example 2: Calculate Mean Using Column Name (Ignore Missing Values)

If we attempt to calculate the mean of a column that has missing values, mean() returns NA by default:

#attempt to calculate mean of 'assists' column
mean(df$assists)

[1] NA

We must use na.rm=TRUE to ignore missing values when calculating the column mean:

#calculate mean of 'assists' column and ignore missing values
mean(df$assists, na.rm=TRUE)

[1] 32.2

The mean value in the ‘assists’ column is 32.2.
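To make the effect of na.rm=TRUE concrete, the sketch below recomputes the same mean by hand: the missing value is dropped from both the sum and the count, so the total is divided by 5 rather than 6. The names n_valid and manual_mean are illustrative only.

```r
#recreate the 'assists' column from the example data frame
assists <- c(33, 28, 31, 39, NA, 30)

#count the non-missing values
n_valid <- sum(!is.na(assists))

#sum the non-missing values and divide by their count
manual_mean <- sum(assists, na.rm=TRUE) / n_valid

#compare the manual result with the built-in calculation
manual_mean
mean(assists, na.rm=TRUE)
```

Both expressions should return the same value, which confirms that the NA is excluded from the calculation rather than being treated as zero.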

Example 3: Calculate Mean Using Column Position

The following code shows how to calculate the mean of the column in index position 2:

#calculate mean of column in index position 2
mean(df[, 2])

[1] 89.66667

The mean value of the column in index position 2 (the ‘points’ column) is 89.66667.
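Position-based indexing depends on the order of the columns, so it can be worth confirming which column sits at a given position before averaging it. As a minimal sketch, the code below looks up the name at position 2 and uses df[[2]] as an alternative way to extract the column. Double brackets return a plain vector, including for tibbles, where df[, 2] returns a one-column tibble instead. The name points_vec is illustrative only.

```r
#recreate the example data frame
df <- data.frame(team=c('A', 'A', 'A', 'B', 'B', 'B'),
                 points=c(99, 90, 93, 86, 88, 82),
                 assists=c(33, 28, 31, 39, NA, 30))

#confirm which column is in index position 2
names(df)[2]

#extract the column in position 2 as a plain vector
points_vec <- df[[2]]

#check that the column is numeric before calculating its mean
is.numeric(points_vec)
mean(points_vec)
```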

Example 4: Calculate Mean of All Numeric Columns

The following code shows how to calculate the mean of all numeric columns in the data frame:

#calculate mean of all numeric columns
colMeans(df[sapply(df, is.numeric)], na.rm=TRUE)

  points  assists 
89.66667 32.20000

The output displays the mean value of each numeric column in the data frame.
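The na.rm=TRUE argument matters here as well. The sketch below shows what colMeans() returns without it and, as one possible alternative, applies mean() to each numeric column with sapply(). The names num_df and col_means are illustrative only.

```r
#recreate the example data frame
df <- data.frame(team=c('A', 'A', 'A', 'B', 'B', 'B'),
                 points=c(99, 90, 93, 86, 88, 82),
                 assists=c(33, 28, 31, 39, NA, 30))

#keep only the numeric columns
num_df <- df[sapply(df, is.numeric)]

#without na.rm=TRUE, a column that contains NA has a mean of NA
colMeans(num_df)

#alternative: apply mean() to each numeric column
col_means <- sapply(num_df, mean, na.rm=TRUE)
col_means
```

In the first call the 'assists' entry should be NA, while the sapply() version should match the colMeans() output with na.rm=TRUE.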

Additional Resources

The following tutorials explain how to calculate other mean values in R:

How to Calculate a Trimmed Mean in R
How to Calculate Geometric Mean in R
How to Calculate a Weighted Mean in R
