How to Use the unnest() Function in R


Often you may want to expand a list-column in R containing data frames into rows and columns.

The easiest way to do so is by using the unnest() function from the tidyr package in R.

This function uses the following basic syntax:

unnest(data, cols, ..., keep_empty = FALSE)

where:

  • data: The name of a data frame
  • cols: List-columns to unnest
  • keep_empty: By default, you get one row of output for each element of the list that you are unnesting. This means that if there’s a size-0 element, then that entire row will be dropped from the output. If you want to preserve all rows, use keep_empty = TRUE to replace size-0 elements with a single row of missing values.
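
The effect of keep_empty is easiest to see side by side. The following is a minimal sketch, assuming a hypothetical tibble named df_empty (not part of the example below) whose second list element is a 0-row tibble:

```r
library(tidyr)

# Hypothetical data: the element for id = 2 is an empty (0-row) tibble
df_empty <- tibble(
  id = 1:3,
  data = list(
    tibble(x = 1:2),
    tibble(x = integer()),
    tibble(x = 3L)
  )
)

# Default (keep_empty = FALSE): the row for id = 2 is dropped
dropped <- unnest(df_empty, data)

# keep_empty = TRUE: the row for id = 2 is kept, with x set to NA
kept <- unnest(df_empty, data, keep_empty = TRUE)

nrow(dropped)
nrow(kept)
```

With the default, id = 2 disappears from the result entirely, so keep_empty = TRUE is the safer choice when every original row must remain traceable in the output.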

The following example shows how to use this function in practice.

Note: You may first need to install the tidyr package in R before you can use the unnest() function. You can use the following syntax to do so:

install.packages('tidyr')

Once the tidyr package is installed, load it with library(tidyr) to make the unnest() function available.
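
If you run the same script repeatedly, you may prefer to install the package only when it is missing. One possible sketch, using base R's requireNamespace() to check for the package first:

```r
# Install tidyr only if it is not already available, then load it
if (!requireNamespace("tidyr", quietly = TRUE)) {
  install.packages("tidyr")
}
library(tidyr)
```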

Example: How to Use the unnest() Function in R

Suppose we create the following data frame with two columns, in which the second column is a list-column and each of its values is itself a tibble:

library(tidyr)

#create data frame (tibble() is made available by loading tidyr)
df <- tibble(
  col1 = 1:3,
  col2 = list(
    tibble(a = 10, b = 12),
    tibble(a = 1, b = 2),
    tibble(a = 1:3, b = 3:1, c = 4)
  )
)

#view data frame
df

# A tibble: 3 × 2
   col1 col2
  <int> <list>
1     1 <tibble [1 × 2]>
2     2 <tibble [1 × 2]>
3     3 <tibble [3 × 3]>

Notice that when we view the data frame, each value in the column named col2 is simply shown as a tibble with a particular size.

Suppose instead that we would like to view all of the values within each tibble in this column.

For example, suppose we would like to view the actual values that we specified such as a = 1:3, b = 3:1, c = 4, etc.

We can use the unnest() function with the following syntax to do so:

library(tidyr)

#unnest col2 in the data frame
df %>% unnest(col2)

# A tibble: 5 × 4
   col1     a     b     c
  <int> <dbl> <dbl> <dbl>
1     1    10    12    NA
2     2     1     2    NA
3     3     1     3     4
4     3     2     2     4
5     3     3     1     4

Notice that the output now shows all of the values inside the nested tibbles, which we were previously unable to see in the column named col2.

The new tibble contains columns a, b and c, which are the column names that we gave to the nested tibbles when we first created the data frame. These columns replace the previous column col2.

Because the first two nested tibbles have no column c, the rows that come from them show NA in that column.
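
If you would rather keep a record of which list-column the new columns came from, the names_sep argument of unnest() prefixes the inner names with the outer column name. The following sketch rebuilds the same data frame so that it runs on its own; the name prefixed is only illustrative:

```r
library(tidyr)

# Same data frame as in the example above
df <- tibble(
  col1 = 1:3,
  col2 = list(
    tibble(a = 10, b = 12),
    tibble(a = 1, b = 2),
    tibble(a = 1:3, b = 3:1, c = 4)
  )
)

# Prefix the unnested columns with the list-column name and "_"
prefixed <- unnest(df, col2, names_sep = "_")

# Column names become col1, col2_a, col2_b, col2_c
names(prefixed)
```

This can help avoid name clashes when the nested tibbles contain a column with the same name as one in the outer data frame.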

In this example we used the unnest() function to unnest one list-column, but you can use similar syntax to unnest multiple columns. Simply wrap the column names in c() inside the unnest() function, for example unnest(c(col2, col3)), where col3 stands for a hypothetical second list-column.
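
As a rough illustration of unnesting several list-columns at once, the sketch below uses a hypothetical tibble df2 with two list-columns, x and y, whose elements have matching sizes in each row:

```r
library(tidyr)

# Hypothetical data: x and y are list-columns with matching element sizes per row
df2 <- tibble(
  id = 1:2,
  x = list(1:2, 3L),
  y = list(c("a", "b"), "c")
)

# Wrap the column names in c() to unnest both columns in parallel
result <- unnest(df2, c(x, y))

result
```

Unnesting the columns together in one call keeps the elements of x and y paired row by row, whereas unnesting them in two separate calls would produce every combination of their values within each id.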

Note: You can find the complete documentation for the unnest() function in the official tidyr package reference.

Additional Resources

The following tutorials explain how to perform other common tasks in R:

How to Use slice_max() in dplyr
How to Rename Columns Using dplyr
How to Add Row to Data Frame Using dplyr
How to Use the pull() Function in dplyr
