How to Use the setDT() Function in R


Often you may want to convert a data frame to a data table in R.

The biggest advantage of using a data table is that operations on large datasets typically run much faster than the equivalent operations on data frames from base R.

This is especially important when working with datasets that have tens or hundreds of thousands of rows, or more.

The easiest way to convert a data frame to a data table is to use the setDT() function from the data.table package in R, which converts the object by reference rather than creating a copy.

The setDT() function uses the following syntax:

setDT(x, keep.rownames=FALSE, key=NULL, check.names=FALSE)

where:

  • x: Name of the data frame to convert to a data table
  • keep.rownames: Whether to keep the row names from the data frame in a new column (named rn by default)
  • key: Character vector of one or more column names to pass to setkeyv
  • check.names: Whether to check that column names are syntactically valid and unique before converting the data frame to a data table, as check.names does in data.frame()
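
As a quick illustration of the optional arguments, here is a minimal sketch. The data frame scores, its values, and its row names are made up for this illustration and are not part of the example that follows:

```r
library(data.table)

#create a hypothetical data frame with meaningful row names
scores <- data.frame(team=c('B', 'A', 'C'),
                     points=c(15, 22, 30),
                     row.names=c('p1', 'p2', 'p3'))

#keep the row names in a new column and set a key on the team column
setDT(scores, keep.rownames=TRUE, key='team')

#view column names (the row names are stored in a column named rn)
names(scores)

#view the key of the data table
key(scores)

#view data table (setting a key sorts the rows by team)
scores
```

Setting a key sorts the rows by the key column, so the rows of scores should appear in the order A, B, C after the conversion.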

The following example shows how to use the setDT() function in practice in R.

Example: How to Use the setDT() Function in R

Suppose that we create the following data frame in R named df that contains information about various basketball players:

#create data frame
df <- data.frame(team=c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'),
                 points=c(22, 39, 24, 18, 15, 10, 28, 23),
                 assists=c(3, 8, 8, 6, 10, 14, 8, 17))

#view data frame
df

  team points assists
1    A     22       3
2    A     39       8
3    A     24       8
4    A     18       6
5    B     15      10
6    B     10      14
7    B     28       8
8    B     23      17

We can use the class() function to check the class of the object named df:

#check class of df
class(df)

[1] "data.frame"

This returns data.frame, which tells us that the object df is currently a data frame.

If we would like to convert this data frame to a data table then we can use the setDT() function as follows:

library(data.table)

#convert df to data table
setDT(df)

#check class of df
class(df)

[1] "data.table" "data.frame"

Notice that when we use the class() function again, df now has both the data.table and data.frame classes. A data table is still a data frame, so functions that expect a data frame continue to work with it.

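Note: The setDT() function modifies df in place and returns the result invisibly, so nothing is printed to the console and there is no need to assign the result back to df.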
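
To make the in-place behavior concrete, the following sketch contrasts setDT() with as.data.table(), which returns a new object instead. The object names df1 and dt_copy are chosen only for this illustration:

```r
library(data.table)

#create a small made-up data frame
df1 <- data.frame(x=1:3)

#as.data.table() returns a new data table and leaves df1 unchanged
dt_copy <- as.data.table(df1)
class(dt_copy)
class(df1)

#setDT() changes df1 itself and returns its result invisibly
res <- withVisible(setDT(df1))
res$visible
class(df1)

#setDF() converts df1 back to a data frame, also in place
setDF(df1)
class(df1)
```

The as.data.table() route is a reasonable choice when the original data frame should stay untouched; setDT() avoids the extra copy, which matters more as the data grows.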

Once we have converted our data frame to a data table, we can then use syntax that is specific to the data.table package to perform operations on the data table.

For example, we can use the following syntax to filter the rows of the data table to only show the rows where the team column is equal to A:

#filter rows where team is 'A'
df[team == 'A', ]

   team points assists
1:    A     22       3
2:    A     39       8
3:    A     24       8
4:    A     18       6

Notice that the filtered data table only contains rows where the value in the team column is equal to A.

Filtering is just one of the operations that data.table syntax makes available once the data frame has been converted to a data table. Grouped summaries and adding columns by reference use the same df[i, j, by] form.
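
As a sketch of two other common operations, the code below recreates the data frame from this example, computes the mean points per team, and adds a new column by reference. The names avg, mean_points, and total are chosen here for illustration:

```r
library(data.table)

#recreate the data frame from this tutorial and convert it to a data table
df <- data.frame(team=c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'),
                 points=c(22, 39, 24, 18, 15, 10, 28, 23),
                 assists=c(3, 8, 8, 6, 10, 14, 8, 17))
setDT(df)

#calculate mean points by team (25.75 for team A, 19 for team B)
avg <- df[, .(mean_points=mean(points)), by=team]
avg

#add a new column that sums points and assists (modifies df in place)
df[, total := points + assists]
df
```

Because := works by reference, the new total column is added to df without assigning the result back to df.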

Additional Resources

The following tutorials explain how to perform other common tasks in R:

How to Use dcast Function from data.table in R
How to Filter a data.table in R
How to Use rbindlist in R to Make One Data Table from Many
