How to Fix in R: invalid factor level, NA generated


One warning message you may encounter when using R is:

Warning message:
In `[<-.factor`(`*tmp*`, iseq, value = "C") :
  invalid factor level, NA generated

This warning occurs when you attempt to assign a value to a factor variable in R that is not one of its defined levels.

The following example shows how to address this warning in practice.

How to Reproduce the Warning

Suppose we have the following data frame in R:

#create data frame
df <- data.frame(team=factor(c('A', 'A', 'B', 'B', 'B')),
                 points=c(99, 90, 86, 88, 95))

#view data frame
df

  team points
1    A     99
2    A     90
3    B     86
4    B     88
5    B     95

#view structure of data frame
str(df)

'data.frame':	5 obs. of  2 variables:
 $ team  : Factor w/ 2 levels "A","B": 1 1 2 2 2
 $ points: num  99 90 86 88 95

We can see that the team variable is a factor with two levels: “A” and “B”.

Now suppose we attempt to add a new row to the end of the data frame using a value of “C” for team:

#add new row to end of data frame (list() keeps each value's type)
df[nrow(df) + 1,] <- list('C', 100)

Warning message:
In `[<-.factor`(`*tmp*`, iseq, value = "C") :
  invalid factor level, NA generated

We receive a warning message because the value “C” does not already exist as a factor level for the team variable.
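Before assigning a value, we can check whether it is already a defined level. The following is a minimal sketch that rebuilds the example data frame; the names team_levels, new_team and is_valid_level are chosen here for illustration only.

```r
# recreate the example data frame
df <- data.frame(team=factor(c('A', 'A', 'B', 'B', 'B')),
                 points=c(99, 90, 86, 88, 95))

# list the levels currently defined for team
team_levels <- levels(df$team)
team_levels
# [1] "A" "B"

# check whether the new value is one of those levels
new_team <- 'C'
is_valid_level <- new_team %in% team_levels
is_valid_level
# [1] FALSE
```

A result of FALSE tells us that assigning “C” to team would trigger the warning.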

Note that this is only a warning, not an error. R still adds the new row to the end of the data frame, but the team value in that row is NA instead of “C”:

#view updated data frame
df

  team points
1    A     99
2    A     90
3    B     86
4    B     88
5    B     95
6 <NA>    100
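Because the row is still added, the missing value is easy to overlook. One way to find it afterwards is to count and locate the NA values in team. This is a sketch only: it repeats the failed assignment with the warning deliberately suppressed, and na_count and na_rows are illustrative names.

```r
# recreate the example data frame
df <- data.frame(team=factor(c('A', 'A', 'B', 'B', 'B')),
                 points=c(99, 90, 86, 88, 95))

# repeat the failed assignment (warning suppressed on purpose for this demo)
suppressWarnings(df[nrow(df) + 1,] <- list('C', 100))

# count the missing values in team
na_count <- sum(is.na(df$team))
na_count

# find the row numbers where team is NA
na_rows <- which(is.na(df$team))
na_rows
```

Here na_count is 1 and na_rows is 6, the row we just tried to add.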

How to Avoid the Warning

To avoid the invalid factor level warning, we can first convert the factor variable to a character variable, add the new row, and then convert the variable back to a factor:

#convert team variable to character
df$team <- as.character(df$team)

#add new row to end of data frame (list() keeps points numeric)
df[nrow(df) + 1,] <- list('C', 100)

#convert team variable back to factor
df$team <- as.factor(df$team)

#view updated data frame
df

  team points
1    A     99
2    A     90
3    B     86
4    B     88
5    B     95
6    C    100

Notice that we’re able to successfully add a new row to the end of the data frame and we avoid a warning message.

We can also check that the value “C” has been added as a factor level to the team variable and that points is still numeric:

#view structure of updated data frame
str(df)

'data.frame':	6 obs. of  2 variables:
 $ team  : Factor w/ 3 levels "A","B","C": 1 1 2 2 2 3
 $ points: num  99 90 86 88 95 100
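An alternative that skips the round trip through character is to add the new level to the factor before adding the row. The sketch below is not part of the example above; it assumes we start again from the original data frame and pass the new row as a list() so that each column keeps its own type.

```r
# recreate the example data frame
df <- data.frame(team=factor(c('A', 'A', 'B', 'B', 'B')),
                 points=c(99, 90, 86, 88, 95))

# add 'C' to the existing levels of team
levels(df$team) <- c(levels(df$team), 'C')

# add new row; 'C' is now a valid level, so no warning is raised
df[nrow(df) + 1,] <- list('C', 100)

# view structure of updated data frame
str(df)
```

With this approach team stays a factor throughout, which can matter if the order of the existing levels was set deliberately.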

Additional Resources

The following tutorials explain how to fix other common errors in R:

How to Fix in R: Arguments imply differing number of rows
How to Fix in R: error in select unused arguments
How to Fix in R: replacement has length zero
