R: How to Find Columns with All Missing Values


You can use either of the following methods to find the columns of a data frame in R in which every value is missing:

Method 1: Use Base R

#check if each column has all missing values
all_miss <- apply(df, 2, function(x) all(is.na(x)))

#display columns with all missing values
names(all_miss)[all_miss]

Method 2: Use purrr Package

library(purrr)

#display columns with all missing values
df %>% keep(~all(is.na(.x))) %>% names

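Both methods produce the same result. The purrr approach tends to be quicker for extremely large data frames, in part because keep() checks each column directly while apply() first converts the entire data frame to a matrix.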

The examples below show how to use each method with this data frame in R:

#create data frame
df <- data.frame(points=c(21, 15, 10, 4, 4, 9, 12, 10),
                 assists=c(NA, NA, NA, NA, NA, NA, NA, NA),
                 rebounds=c(8, 12, 14, 10, 7, 9, 8, 5),
                 steals=c(NA, NA, NA, NA, NA, NA, NA, NA))

#view data frame
df

  points assists rebounds steals
1     21      NA        8     NA
2     15      NA       12     NA
3     10      NA       14     NA
4      4      NA       10     NA
5      4      NA        7     NA
6      9      NA        9     NA
7     12      NA        8     NA
8     10      NA        5     NA

Example 1: Find Columns with All Missing Values Using Base R

The following code shows how to find the columns in the data frame with all missing values:

#check if each column has all missing values
all_miss <- apply(df, 2, function(x) all(is.na(x)))

#display columns with all missing values
names(all_miss)[all_miss]

[1] "assists" "steals" 

From the output we can see that the assists and steals columns have all missing values.
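
The same check can also be written in base R without apply(). The sketch below is an alternative, not part of the method shown above: it counts the missing values in each column with colSums() and compares each count to the number of rows, then repeats the check with sapply(), which loops over the columns directly. The names na_counts, all_miss_counts, and all_miss_sapply are illustrative only.

```r
#recreate the example data frame
df <- data.frame(points=c(21, 15, 10, 4, 4, 9, 12, 10),
                 assists=c(NA, NA, NA, NA, NA, NA, NA, NA),
                 rebounds=c(8, 12, 14, 10, 7, 9, 8, 5),
                 steals=c(NA, NA, NA, NA, NA, NA, NA, NA))

#count missing values in each column
na_counts <- colSums(is.na(df))

#a column is entirely missing when its NA count equals the number of rows
all_miss_counts <- names(df)[na_counts == nrow(df)]
all_miss_counts

#equivalent check that loops over the columns with sapply()
all_miss_sapply <- names(df)[sapply(df, function(x) all(is.na(x)))]
all_miss_sapply
```

For this data frame, both vectors contain the assists and steals columns.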

Example 2: Find Columns with All Missing Values Using purrr Package

The following code shows how to find the columns in the data frame with all missing values by using functions from the purrr package:

library(purrr)

#display columns with all missing values
df %>% keep(~all(is.na(.x))) %>% names

[1] "assists" "steals" 

From the output we can see that the assists and steals columns have all missing values.

This matches the output from the base R method.
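
If loading purrr is not an option, base R's Filter() function can play a similar role to keep() here, since it keeps only the list elements (in this case, columns) for which a function returns TRUE. The sketch below is an assumption-level alternative rather than part of the original methods. It also uses identical() to confirm that the result agrees with the apply() approach. The names base_result and filter_result are illustrative only.

```r
#recreate the example data frame
df <- data.frame(points=c(21, 15, 10, 4, 4, 9, 12, 10),
                 assists=c(NA, NA, NA, NA, NA, NA, NA, NA),
                 rebounds=c(8, 12, 14, 10, 7, 9, 8, 5),
                 steals=c(NA, NA, NA, NA, NA, NA, NA, NA))

#columns with all missing values, found with apply()
base_result <- names(which(apply(df, 2, function(x) all(is.na(x)))))

#columns with all missing values, found with Filter()
filter_result <- names(Filter(function(x) all(is.na(x)), df))
filter_result

#check that both approaches return the same column names
identical(base_result, filter_result)
```

Filter() returns a data frame containing only the matching columns, so wrapping it in names() gives the column names in their original order.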

Additional Resources

The following tutorials explain how to perform other common operations with missing values in R:

How to Impute Missing Values in R
How to Replace NAs with Strings in R
How to Replace NAs with Zero in dplyr
