There are two common ways to create a new data frame from an existing data frame in R:
Method 1: Select Column Names from Existing Data Frame
new_df <- df[c('var1', 'var3', 'var4')]
Method 2: Select & Rename Column Names from Existing Data Frame
new_df <- data.frame('new_var1' = df$var1, 'new_var2' = df$var2, 'new_var3' = df$var3)
The following examples show how to use each method with this data frame:
#create data frame
df <- data.frame(team=c('A', 'A', 'A', 'B', 'B', 'B'),
                 points=c(19, 14, 14, 29, 25, 30),
                 assists=c(4, 5, 5, 4, 12, 10),
                 rebounds=c(9, 7, 7, 6, 10, 11))
#view data frame
df
  team points assists rebounds
1    A     19       4        9
2    A     14       5        7
3    A     14       5        7
4    B     29       4        6
5    B     25      12       10
6    B     30      10       11
Example 1: Select Column Names from Existing Data Frame
The following code shows how to create a new data frame by selecting several columns by name from an existing data frame:
#define new data frame
new_df <- df[c('team', 'assists', 'points')]
#view new data frame
new_df
  team assists points
1    A       4     19
2    A       5     14
3    A       5     14
4    B       4     29
5    B      12     25
6    B      10     30
The new data frame contains three columns (team, assists, points) from the existing data frame.
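Note that the columns appear in the order given in the vector of names, not in their original order in df. The following minimal sketch rebuilds the same example data frame so that it runs on its own, then checks the column order and confirms that the new data frame is an independent copy:

```r
#recreate the example data frame
df <- data.frame(team=c('A', 'A', 'A', 'B', 'B', 'B'),
                 points=c(19, 14, 14, 29, 25, 30),
                 assists=c(4, 5, 5, 4, 12, 10),
                 rebounds=c(9, 7, 7, 6, 10, 11))

#select three columns by name
new_df <- df[c('team', 'assists', 'points')]

#column names follow the order given in the vector
names(new_df)

#modify the new data frame only
new_df$points <- new_df$points * 2

#the original data frame is left unchanged
df$points
```

Here names(new_df) lists team, assists, points in that order, and df$points still holds the original values after new_df is modified.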
Example 2: Select & Rename Column Names from Existing Data Frame
The following code shows how to create a new data frame by selecting and renaming several columns from an existing data frame:
#define new data frame
new_df <- data.frame('team_name' = df$team,
                     'total_assists' = df$assists,
                     'total_points' = df$points)
#view new data frame
new_df
  team_name total_assists total_points
1         A             4           19
2         A             5           14
3         A             5           14
4         B             4           29
5         B            12           25
6         B            10           30
The new data frame contains the same three columns (team, assists, points) from the existing data frame, but they are now named team_name, total_assists, and total_points.
This approach is particularly useful if you know ahead of time that you want to rename the columns in the new data frame.
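If you would rather select the columns first and rename them afterwards, one possible alternative is to wrap the subset in base R's setNames() function. The sketch below is not part of the two methods above. It assumes the same example data frame, and the name alt_df is used only for illustration:

```r
#recreate the example data frame
df <- data.frame(team=c('A', 'A', 'A', 'B', 'B', 'B'),
                 points=c(19, 14, 14, 29, 25, 30),
                 assists=c(4, 5, 5, 4, 12, 10),
                 rebounds=c(9, 7, 7, 6, 10, 11))

#approach from Example 2: build the new data frame column by column
new_df <- data.frame('team_name' = df$team,
                     'total_assists' = df$assists,
                     'total_points' = df$points)

#alternative: select columns by name, then assign new names in the same order
alt_df <- setNames(df[c('team', 'assists', 'points')],
                   c('team_name', 'total_assists', 'total_points'))

#view alternative data frame
alt_df

#compare the two results
all.equal(new_df, alt_df)
```

With setNames(), the new names are matched to the selected columns by position, so the two vectors must list the columns in the same order.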
Additional Resources
The following tutorials explain how to perform other common tasks in R:
How to Append Rows to a Data Frame in R
How to Keep Certain Columns in R
How to Select Only Numeric Columns in R