How to Remove Rows with Some or All NAs in R


Often you may want to remove rows with all or some NAs (missing values) in a data frame in R.

This tutorial explains how to remove these rows using base R and the tidyr package. Each example below uses the following data frame:

#create data frame with some missing values
df <- data.frame(points = c(12, NA, 19, 22, 32),
                 assists = c(4, NA, 3, NA, 5),
                 rebounds = c(5, NA, 7, 12, NA))

#view data frame
df

  points assists rebounds
1     12       4        5
2     NA      NA       NA
3     19       3        7
4     22      NA       12
5     32       5       NA

Remove NAs Using Base R

The following code shows how to use complete.cases() to remove all rows in a data frame that have a missing value in any column:

#remove all rows with a missing value in any column
df[complete.cases(df), ]

  points assists rebounds
1     12       4        5
3     19       3        7
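
As a possible alternative, base R also provides na.omit(), which drops every row that contains at least one NA. The sketch below rebuilds the example data frame so that it runs on its own; the name df_complete is only illustrative.

```r
# Same example data frame as above
df <- data.frame(points = c(12, NA, 19, 22, 32),
                 assists = c(4, NA, 3, NA, 5),
                 rebounds = c(5, NA, 7, 12, NA))

# na.omit() keeps only the rows with no missing values in any column
df_complete <- na.omit(df)

# view result (rows 1 and 3 remain)
df_complete
```

Unlike the complete.cases() approach, na.omit() also records which rows were dropped in an "na.action" attribute on the result.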

The following code shows how to use complete.cases() to remove all rows in a data frame that have a missing value in specific columns:

#remove all rows with a missing value in the third column
df[complete.cases(df[ , 3]),]

  points assists rebounds
1     12       4        5
3     19       3        7
4     22      NA       12

#remove all rows with a missing value in either the first or third column
df[complete.cases(df[ , c(1,3)]),]

  points assists rebounds
1     12       4        5
3     19       3        7
4     22      NA       12
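
The examples above remove rows with a missing value in some columns. To remove only the rows in which every value is missing, one option is to count the NAs in each row with rowSums(is.na()) and keep the rows whose count is less than the number of columns. This is a minimal sketch; the names na_per_row and df_not_all_na are only illustrative.

```r
# Same example data frame as above
df <- data.frame(points = c(12, NA, 19, 22, 32),
                 assists = c(4, NA, 3, NA, 5),
                 rebounds = c(5, NA, 7, 12, NA))

# count the missing values in each row
na_per_row <- rowSums(is.na(df))

# keep rows where at least one column is not NA
df_not_all_na <- df[na_per_row < ncol(df), ]

# view result
df_not_all_na

#   points assists rebounds
# 1     12       4        5
# 3     19       3        7
# 4     22      NA       12
# 5     32       5       NA
```

Only row 2, where all three columns are NA, is removed; rows 4 and 5 are kept because each has just one missing value.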

Remove NAs Using tidyr

The following code shows how to use drop_na() from the tidyr package to remove all rows in a data frame that have a missing value in any column:

#load tidyr package
library(tidyr)

#remove all rows with a missing value in any column
df %>% drop_na()

  points assists rebounds
1     12       4        5
3     19       3        7

The following code shows how to use drop_na() from the tidyr package to remove all rows in a data frame that have a missing value in specific columns:

#load tidyr package
library(tidyr)

#remove all rows with a missing value in the rebounds column (the third column)
df %>% drop_na(rebounds)

  points assists rebounds
1     12       4        5
3     19       3        7
4     22      NA       12
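
drop_na() also accepts more than one column name. The sketch below, which assumes tidyr is installed, removes rows with a missing value in either the points or the rebounds column, mirroring the base R example that used the first and third columns. The name df_sub is only illustrative.

```r
#load tidyr package
library(tidyr)

# Same example data frame as above
df <- data.frame(points = c(12, NA, 19, 22, 32),
                 assists = c(4, NA, 3, NA, 5),
                 rebounds = c(5, NA, 7, 12, NA))

# remove all rows with a missing value in either points or rebounds
df_sub <- drop_na(df, points, rebounds)

# view result (three rows remain; assists may still contain NA)
df_sub
```

Missing values in columns that are not listed, such as assists here, do not cause a row to be dropped.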

