How to Create a Tuple from Two Columns in Pandas


You can use the following basic syntax to combine two columns of a pandas DataFrame into a single column of tuples:

df['new_column'] = list(zip(df.column1, df.column2))

This syntax creates a new column called new_column, in which each row holds a tuple of the corresponding values from column1 and column2.
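The dot notation above (df.column1) only works when a column name is a valid Python identifier and does not clash with an existing DataFrame attribute. As a minimal sketch, assuming a hypothetical DataFrame whose column names contain spaces, bracket notation does the same job for any column name:

```python
import pandas as pd

# hypothetical DataFrame whose column names contain spaces
df = pd.DataFrame({'first col': [1, 2, 3],
                   'second col': ['x', 'y', 'z']})

# bracket notation works for any column name
df['new_column'] = list(zip(df['first col'], df['second col']))

# view the new column as a plain Python list
print(df['new_column'].tolist())
```

The column names first col and second col are made up for illustration and do not appear elsewhere in this tutorial.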

The following example shows how to use this syntax in practice.

Example: Create Tuple from Two Columns in Pandas

Suppose we have the following pandas DataFrame that contains information about various basketball players:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H'],
                   'points': [18, 22, 19, 14, 14, 11, 20, 28],
                   'assists': [5, 7, 7, 9, 12, 9, 9, 4]})

#view DataFrame
print(df)

  team  points  assists
0    A      18        5
1    B      22        7
2    C      19        7
3    D      14        9
4    E      14       12
5    F      11        9
6    G      20        9
7    H      28        4

We can use the following syntax to create a new column called points_assists, in which each row holds a tuple of the values from the points and assists columns:

#create new column that is a tuple of points and assists columns
df['points_assists'] = list(zip(df.points, df.assists))

#view updated DataFrame
print(df)

  team  points  assists points_assists
0    A      18        5        (18, 5)
1    B      22        7        (22, 7)
2    C      19        7        (19, 7)
3    D      14        9        (14, 9)
4    E      14       12       (14, 12)
5    F      11        9        (11, 9)
6    G      20        9        (20, 9)
7    H      28        4        (28, 4)

Each value in the new points_assists column is a tuple formed from the points and assists values in that row.
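To confirm that each value really is a Python tuple, you can inspect a single cell. The following is a small sketch on a shortened, three-row version of the data (the variable names first, points and assists are only for illustration):

```python
import pandas as pd

# shortened version of the example data
df = pd.DataFrame({'points': [18, 22, 19],
                   'assists': [5, 7, 7]})
df['points_assists'] = list(zip(df['points'], df['assists']))

# pull out the value stored in the first row
first = df.loc[0, 'points_assists']
print(type(first))

# a tuple can be unpacked back into its parts
points, assists = first
print(points, assists)

# pandas stores tuples in a column with the generic object dtype
print(df['points_assists'].dtype)
```

Because the column has the object dtype, vectorized numeric operations do not apply to it directly. It is usually best to keep the original numeric columns alongside the tuple column.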

Note that you can also include more than two columns in a tuple if you’d like.

For example, the following code shows how to create a tuple that uses values from all three original columns in the DataFrame:

#create new column that is a tuple of team, points and assists columns
df['all_columns'] = list(zip(df.team, df.points, df.assists))

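#view updated DataFrame (points_assists column left out to keep the output narrow)
print(df[['team', 'points', 'assists', 'all_columns']])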

  team  points  assists  all_columns
0    A      18        5   (A, 18, 5)
1    B      22        7   (B, 22, 7)
2    C      19        7   (C, 19, 7)
3    D      14        9   (D, 14, 9)
4    E      14       12  (E, 14, 12)
5    F      11        9   (F, 11, 9)
6    G      20        9   (G, 20, 9)
7    H      28        4   (H, 28, 4)

You can use this same basic syntax to create a tuple column with as many columns as you’d like.
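When the columns to combine are held in a list, writing each one out by hand becomes tedious. One possible approach, sketched here with a hypothetical helper named add_tuple_column (not a pandas function), unpacks the selected columns into zip():

```python
import pandas as pd

def add_tuple_column(df, columns, new_name):
    """Hypothetical helper: return a copy of df with a column of tuples
    built from the columns listed in `columns`."""
    out = df.copy()
    # zip(*...) pairs up the values of every selected column row by row
    out[new_name] = list(zip(*(out[col] for col in columns)))
    return out

df = pd.DataFrame({'team': ['A', 'B', 'C'],
                   'points': [18, 22, 19],
                   'assists': [5, 7, 7]})

result = add_tuple_column(df, ['team', 'points', 'assists'], 'all_columns')
print(result)
```

The helper works on a copy so that the original DataFrame is left unchanged. Drop the copy() call if you would rather modify the DataFrame in place.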

Additional Resources

The following tutorials explain how to perform other common operations in pandas:

How to Drop Duplicate Rows in Pandas
How to Drop Duplicate Columns in Pandas
How to Count Duplicates in Pandas
