How to Insert a Column Into a Pandas DataFrame


You will often need to insert a new column at a specific position in a pandas DataFrame. The DataFrame.insert() method does this, using the following syntax:

insert(loc, column, value, allow_duplicates=False)

where:

  • loc: Integer position at which to insert the column. The first column is 0, and loc must be between 0 and len(df.columns).
  • column: Name to give to the new column.
  • value: A scalar, Series, or array-like of values for the new column.
  • allow_duplicates: Whether to allow the new column name to match an existing column name. Default is False.
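
Two behaviors implied by these parameters are easy to miss: insert() modifies the DataFrame in place and returns None, and it raises a ValueError when the column name already exists unless allow_duplicates=True. The short sketch below illustrates both. The 'team' column and the variable names are made up for this illustration.

```python
import pandas as pd

#create a small DataFrame
df = pd.DataFrame({'points': [25, 12, 15]})

#insert() changes df in place and returns None; a scalar value fills every row
result = df.insert(loc=0, column='team', value='A')

#inserting an existing column name fails by default
try:
    df.insert(loc=1, column='points', value=[1, 2, 3])
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True

#the same call succeeds when duplicates are explicitly allowed
df.insert(loc=1, column='points', value=[1, 2, 3], allow_duplicates=True)
columns_after = list(df.columns)
```

Because the method returns None, writing df = df.insert(...) would overwrite the DataFrame with None, so call it as a standalone statement.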

This tutorial shows several examples of how to use this function in practice.

Example 1: Insert New Column as First Column

The following code shows how to insert a new column as the first column of an existing DataFrame:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'points': [25, 12, 15, 14, 19],
                   'assists': [5, 7, 7, 9, 12],
                   'rebounds': [11, 8, 10, 6, 6]})

#view DataFrame
df
        points	assists	rebounds
0	25	5	11
1	12	7	8
2	15	7	10
3	14	9	6
4	19	12	6

#insert new column 'player' as first column
player_vals = ['A', 'B', 'C', 'D', 'E']
df.insert(loc=0, column='player', value=player_vals)
df

        player	points	assists	rebounds
0	A	25	5	11
1	B	12	7	8
2	C	15	7	10
3	D	14	9	6
4	E	19	12	6

Example 2: Insert New Column as a Middle Column

The following code shows how to insert a new column as the third column of an existing DataFrame:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'points': [25, 12, 15, 14, 19],
                   'assists': [5, 7, 7, 9, 12],
                   'rebounds': [11, 8, 10, 6, 6]})

#insert new column 'player' as third column
player_vals = ['A', 'B', 'C', 'D', 'E']
df.insert(loc=2, column='player', value=player_vals)
df

        points	assists	player	rebounds
0	25	5	A	11
1	12	7	B	8
2	15	7	C	10
3	14	9	D	6
4	19	12	E	6
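
If you would rather not hard-code the position, one possible approach is to look up the position of an existing column by name and insert relative to it. The sketch below assumes the column names are unique, so that df.columns.get_loc() returns a single integer; the variable name position is only illustrative.

```python
import pandas as pd

#create DataFrame
df = pd.DataFrame({'points': [25, 12, 15, 14, 19],
                   'assists': [5, 7, 7, 9, 12],
                   'rebounds': [11, 8, 10, 6, 6]})

#find the position of 'assists' and insert 'player' directly after it
position = df.columns.get_loc('assists') + 1
player_vals = ['A', 'B', 'C', 'D', 'E']
df.insert(loc=position, column='player', value=player_vals)
```

This places 'player' between 'assists' and 'rebounds', the same result as using loc=2 above, and it keeps working if other columns are later added in front of 'assists'.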

Example 3: Insert New Column as Last Column

The following code shows how to insert a new column as the last column of an existing DataFrame:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'points': [25, 12, 15, 14, 19],
                   'assists': [5, 7, 7, 9, 12],
                   'rebounds': [11, 8, 10, 6, 6]})

#insert new column 'player' as last column
player_vals = ['A', 'B', 'C', 'D', 'E']
df.insert(loc=len(df.columns), column='player', value=player_vals)
df

        points	assists	rebounds	player
0	25	5	11	A
1	12	7	8	B
2	15	7	10	C
3	14	9	6	D
4	19	12	6	E

Note that using len(df.columns) allows you to insert a new column as the last column in any DataFrame, no matter how many columns it has.
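
For comparison, plain column assignment with a new name also appends the column at the end. The sketch below, using a smaller made-up DataFrame, checks that the two approaches produce the same result.

```python
import pandas as pd

#create DataFrame
df = pd.DataFrame({'points': [25, 12, 15],
                   'assists': [5, 7, 7]})
player_vals = ['A', 'B', 'C']

#append with insert() at position len(columns)
appended = df.copy()
appended.insert(loc=len(appended.columns), column='player', value=player_vals)

#append with plain column assignment
assigned = df.copy()
assigned['player'] = player_vals

#both DataFrames have the same columns, values, and dtypes
same_result = appended.equals(assigned)
```

Assignment is shorter when the column only needs to go last; insert() is the one to reach for when the position matters.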

The complete documentation for this method is available in the official pandas API reference under DataFrame.insert.
