How to Use the itertuples() Function in Pandas


Often you may want to iterate over the rows of a pandas DataFrame as tuples.

The easiest way to do so is by using the itertuples() function, which uses the following syntax:

DataFrame.itertuples(index=True, name='Pandas')

where:

  • index: Whether to return the index as the first element of each tuple (default is True)
  • name: The name to give to the returned namedtuples (default is 'Pandas')

The following example shows how to use the itertuples() function in practice with a pandas DataFrame.

Example: How to Use the itertuples() Function in Pandas

Suppose we create the following pandas DataFrame that contains information about various basketball players, including the team they belong to and their total number of points scored:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'B', 'B', 'C', 'C', 'C', 'D'],
                   'points': [12, 18, 18, 22, 30, 41, 12, 29]})

#view DataFrame
print(df)

  team  points
0    A      12
1    A      18
2    B      18
3    B      22
4    C      30
5    C      41
6    C      12
7    D      29

Suppose that we would like to iterate over each row in the DataFrame using tuples.

We can use the following syntax with the itertuples() function to do so:

#iterate over each row in DataFrame and print values using tuples
for row in df.itertuples():
    print(row)

Pandas(Index=0, team='A', points=12)
Pandas(Index=1, team='A', points=18)
Pandas(Index=2, team='B', points=18)
Pandas(Index=3, team='B', points=22)
Pandas(Index=4, team='C', points=30)
Pandas(Index=5, team='C', points=41)
Pandas(Index=6, team='C', points=12)
Pandas(Index=7, team='D', points=29)

The output displays the index, team name and points scored by each player on each row.

For example, from the output we can see:

  • The first player has an Index value of 0, team value of ‘A’ and points value of 12.
  • The second player has an Index value of 1, team value of ‘A’ and points value of 18.
  • The third player has an Index value of 2, team value of ‘B’ and points value of 18.

And so on.
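
Because each row is a namedtuple, individual values can be read either by attribute name or by position. The following is a minimal sketch that reuses the DataFrame from above; the list names by_name and by_position are only illustrative:

```python
import pandas as pd

# same DataFrame as in the example above
df = pd.DataFrame({'team': ['A', 'A', 'B', 'B', 'C', 'C', 'C', 'D'],
                   'points': [12, 18, 18, 22, 30, 41, 12, 29]})

by_name = []      # values read by attribute name (illustrative variable)
by_position = []  # values read by position (illustrative variable)

for row in df.itertuples():
    # attribute access uses the field names shown in the printed tuples
    by_name.append((row.Index, row.team, row.points))
    # positional access: element 0 is the index when index=True
    by_position.append((row[0], row[1], row[2]))

print(by_name[0])
print(by_position[0])
```

Both approaches read the same values. Attribute access tends to be easier to follow when the column names are valid Python identifiers.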

By default, the itertuples() function returns the index value as the first element of each tuple.

However, there are times when you may not need to show the index value.

In this scenario, you can specify index=False within the itertuples() function as follows:

#iterate over each row in DataFrame and print tuples without the index
for row in df.itertuples(index=False):
    print(row)

Pandas(team='A', points=12)
Pandas(team='A', points=18)
Pandas(team='B', points=18)
Pandas(team='B', points=22)
Pandas(team='C', points=30)
Pandas(team='C', points=41)
Pandas(team='C', points=12)
Pandas(team='D', points=29)

Notice that the index value is no longer shown for each tuple in the output.

Instead, only the values in the team and points columns are shown in each tuple.
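
With index=False, each tuple holds exactly one value per column, so it can be unpacked directly in the for statement. The sketch below is only meant to illustrate unpacking; the totals dictionary is a hypothetical helper, and in practice df.groupby('team')['points'].sum() would usually be a more idiomatic way to compute the same sums:

```python
import pandas as pd

# same DataFrame as in the example above
df = pd.DataFrame({'team': ['A', 'A', 'B', 'B', 'C', 'C', 'C', 'D'],
                   'points': [12, 18, 18, 22, 30, 41, 12, 29]})

totals = {}  # hypothetical helper: running points total per team

# each tuple has two elements (team, points), so it unpacks into two names
for team, points in df.itertuples(index=False):
    totals[team] = totals.get(team, 0) + points

print(totals)
```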

Also notice that each row in the output has a prefix of Pandas, since this is the default name to give to each tuple.

However, we can use the name argument to specify any other name that we would like.

For example, we could use the following syntax to use a prefix of Result in each row instead:

#iterate over each row in DataFrame and print tuples named 'Result'
for row in df.itertuples(name='Result'):
    print(row)

Result(Index=0, team='A', points=12)
Result(Index=1, team='A', points=18)
Result(Index=2, team='B', points=18)
Result(Index=3, team='B', points=22)
Result(Index=4, team='C', points=30)
Result(Index=5, team='C', points=41)
Result(Index=6, team='C', points=12)
Result(Index=7, team='D', points=29)

Notice that “Result” is now shown as the name for each tuple instead.
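
The name argument also accepts None, in which case itertuples() returns regular tuples instead of namedtuples. The following is a small sketch on the same DataFrame; the variable name rows is only illustrative:

```python
import pandas as pd

# same DataFrame as in the example above
df = pd.DataFrame({'team': ['A', 'A', 'B', 'B', 'C', 'C', 'C', 'D'],
                   'points': [12, 18, 18, 22, 30, 41, 12, 29]})

# name=None yields plain tuples with no field names
rows = list(df.itertuples(index=False, name=None))

print(rows[0])
print(type(rows[0]))
```

Plain tuples have no attribute access such as row.team, so their values must be read by position or by unpacking.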

Note: The official pandas documentation describes the itertuples() function in full.

Additional Resources

The following tutorials explain how to perform other common tasks in pandas:

How to Use pct_change() in Pandas
Pandas: How to Use isin() for Multiple Columns
How to Create a Tuple from Two Columns in Pandas
