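A data frame in R can be stored in a wide or a long format. In a wide format, repeated measurements of the same variable sit in separate columns; in a long format, they are stacked in a single column, with a second column identifying which measurement each row holds.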
Depending on your goal, you may want the data frame to be in one of these specific formats.
The easiest way to reshape data between these formats is to use the following two functions from the tidyr package in R:
- pivot_longer(): Reshapes a data frame from wide to long format.
- pivot_wider(): Reshapes a data frame from long to wide format.
The following examples show how to use each function in practice.
Example 1: Reshape Data from Wide to Long
Suppose we have the following data frame in R that is currently in a wide format:
#create data frame
df <- data.frame(player=c('A', 'B', 'C', 'D'),
                 year1=c(12, 15, 19, 19),
                 year2=c(22, 29, 18, 12))

#view data frame
df

  player year1 year2
1      A    12    22
2      B    15    29
3      C    19    18
4      D    19    12
We can use the pivot_longer() function to pivot this data frame into a long format:
library(tidyr)

#pivot the data frame into a long format
df %>% pivot_longer(cols=c('year1', 'year2'),
                    names_to='year',
                    values_to='points')

# A tibble: 8 x 3
  player year  points
1 A      year1     12
2 A      year2     22
3 B      year1     15
4 B      year2     29
5 C      year1     19
6 C      year2     18
7 D      year1     19
8 D      year2     12
Notice that the column names year1 and year2 are now used as values in a new column called “year” and the values from these original columns are placed into one new column called “points.”
The final result is a long data frame.
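In the long result above, the year column holds the strings "year1" and "year2". If a plain integer year is preferred, one possible approach is sketched below. It assumes a tidyr version that supports the names_prefix and names_transform arguments of pivot_longer(); the object name long is only illustrative.

```r
library(tidyr)

# same wide data frame as in Example 1
df <- data.frame(player = c('A', 'B', 'C', 'D'),
                 year1 = c(12, 15, 19, 19),
                 year2 = c(22, 29, 18, 12))

# strip the 'year' prefix from the old column names,
# then convert the remaining text ('1', '2') to integers
long <- pivot_longer(df,
                     cols = c(year1, year2),
                     names_to = 'year',
                     names_prefix = 'year',
                     names_transform = list(year = as.integer),
                     values_to = 'points')

long
```

The year column should then be stored as integers rather than as text, which makes it easier to sort or filter by year.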
Note: The complete documentation for the pivot_longer() function is available in the tidyr package reference.
Example 2: Reshape Data from Long to Wide
Suppose we have the following data frame in R that is currently in a long format:
#create data frame
df <- data.frame(player=rep(c('A', 'B'), each=4),
                 year=rep(c(1, 1, 2, 2), times=2),
                 stat=rep(c('points', 'assists'), times=4),
                 amount=c(14, 6, 18, 7, 22, 9, 38, 4))

#view data frame
df

  player year    stat amount
1      A    1  points     14
2      A    1 assists      6
3      A    2  points     18
4      A    2 assists      7
5      B    1  points     22
6      B    1 assists      9
7      B    2  points     38
8      B    2 assists      4
We can use the pivot_wider() function to pivot this data frame into a wide format:
library(tidyr)

#pivot the data frame into a wide format
df %>% pivot_wider(names_from = stat, values_from = amount)

# A tibble: 4 x 4
  player  year points assists
1 A          1     14       6
2 A          2     18       7
3 B          1     22       9
4 B          2     38       4
Notice that the values from the stat column are now used as column names and the values from the amount column are used as cell values in these new columns.
The final result is a wide data frame.
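Because pivot_longer() and pivot_wider() perform opposite reshapes, one way to check a reshape is to pivot the wide result back to long and compare it with the original. The following is a minimal sketch under that assumption; the object names wide and back are only illustrative.

```r
library(tidyr)

# same long data frame as in Example 2
df <- data.frame(player = rep(c('A', 'B'), each = 4),
                 year = rep(c(1, 1, 2, 2), times = 2),
                 stat = rep(c('points', 'assists'), times = 4),
                 amount = c(14, 6, 18, 7, 22, 9, 38, 4))

# long to wide
wide <- pivot_wider(df, names_from = stat, values_from = amount)

# wide back to long, restoring the original column names
back <- pivot_longer(wide,
                     cols = c(points, assists),
                     names_to = 'stat',
                     values_to = 'amount')

# compare the round-tripped data with the original data frame
all.equal(as.data.frame(back), df)
```

The as.data.frame() call is there because both pivot functions return a tibble, while the original df is a base data frame.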
Note: The complete documentation for the pivot_wider() function is available in the tidyr package reference.
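When a long data frame is missing some player/year/stat combination, pivot_wider() leaves the corresponding cell as NA. As a hedged sketch, the values_fill argument can supply a default value instead. The df_missing data frame below is a hypothetical variation of the Example 2 data with its last row dropped, and filling with 0 is an assumption that only makes sense if a missing row truly means zero.

```r
library(tidyr)

# Example 2 data with the last row (player B, year 2, assists) removed
df_missing <- data.frame(player = rep(c('A', 'B'), each = 4),
                         year = rep(c(1, 1, 2, 2), times = 2),
                         stat = rep(c('points', 'assists'), times = 4),
                         amount = c(14, 6, 18, 7, 22, 9, 38, 4))[-8, ]

# fill cells that have no matching row with 0 instead of NA
wide_filled <- pivot_wider(df_missing,
                           names_from = stat,
                           values_from = amount,
                           values_fill = 0)

wide_filled
```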