How to Combine Two Columns in Pandas (With Examples)


You can use the following syntax to combine two text columns into one in a pandas DataFrame:

df['new_column'] = df['column1'] + df['column2']

If one of the columns isn’t already a string, you can convert it using the astype(str) command:

df['new_column'] = df['column1'].astype(str) + df['column2']

And you can use the following syntax to combine multiple text columns into one:

df['new_column'] = df[['col1', 'col2', 'col3', ...]].agg(' '.join, axis=1) 

The following examples show how to combine text columns in practice.

Example 1: Combine Two Columns

The following code shows how to combine two text columns into one in a pandas DataFrame:

import pandas as pd

#create dataFrame
df = pd.DataFrame({'team': ['Mavs', 'Lakers', 'Spurs', 'Cavs'],
                   'first': ['Dirk', 'Kobe', 'Tim', 'Lebron'],
                   'last': ['Nowitzki', 'Bryant', 'Duncan', 'James'],
                   'points': [26, 31, 22, 29]})

#combine first and last name column into new column, with space in between 
df['full_name'] = df['first'] + ' ' + df['last']

#view resulting dataFrame
df

	team	first	last	 points	full_name
0	Mavs	Dirk	Nowitzki 26	Dirk Nowitzki
1	Lakers	Kobe	Bryant	 31	Kobe Bryant
2	Spurs	Tim	Duncan	 22	Tim Duncan
3	Cavs	Lebron	James	 29	Lebron James

We joined the first and last name column with a space in between, but we could also use a different separator such as a dash:

#combine first and last name column into new column, with dash in between 
df['full_name'] = df['first'] + '-' + df['last']

#view resulting dataFrame
df

	team	first	last	 points	full_name
0	Mavs	Dirk	Nowitzki 26	Dirk-Nowitzki
1	Lakers	Kobe	Bryant	 31	Kobe-Bryant
2	Spurs	Tim	Duncan	 22	Tim-Duncan
3	Cavs	Lebron	James	 29	Lebron-James

Example 2: Convert to Text & Combine Two Columns

The following code shows how to convert one column to text, then join it to another column:

import pandas as pd

#create dataFrame
df = pd.DataFrame({'team': ['Mavs', 'Lakers', 'Spurs', 'Cavs'],
                   'first': ['Dirk', 'Kobe', 'Tim', 'Lebron'],
                   'last': ['Nowitzki', 'Bryant', 'Duncan', 'James'],
                   'points': [26, 31, 22, 29]})

#convert points to text, then join to last name column 
df['name_points'] = df['last'] + df['points'].astype(str)

#view resulting dataFrame
df

        team	first	last	 points	name_points
0	Mavs	Dirk	Nowitzki 26	Nowitzki26
1	Lakers	Kobe	Bryant	 31	Bryant31
2	Spurs	Tim	Duncan	 22	Duncan22
3	Cavs	Lebron	James	 29	James29

Example 3: Combine More Than Two Columns

The following code shows how to join multiple columns into one column:

import pandas as pd

#create dataFrame
df = pd.DataFrame({'team': ['Mavs', 'Lakers', 'Spurs', 'Cavs'],
                   'first': ['Dirk', 'Kobe', 'Tim', 'Lebron'],
                   'last': ['Nowitzki', 'Bryant', 'Duncan', 'James'],
                   'points': [26, 31, 22, 29]})

#join team, first name, and last name into one column
df['team_and_name'] = df[['team', 'first', 'last']].agg(' '.join, axis=1)

#view resulting dataFrame
df

	team	first	last	 points	team_name
0	Mavs	Dirk	Nowitzki 26	Mavs Dirk Nowitzki
1	Lakers	Kobe	Bryant	 31	Lakers Kobe Bryant
2	Spurs	Tim	Duncan	 22	Spurs Tim Duncan
3	Cavs	Lebron	James	 29	Cavs Lebron James

Additional Resources

Pandas: How to Find the Difference Between Two Columns
Pandas: How to Find the Difference Between Two Rows
Pandas: How to Sort Columns by Name

Leave a Reply

Your email address will not be published.