Pandas: How to Calculate Percentage of Total Within Group


You can use the following syntax to calculate the percentage of a total within groups in pandas:

df['values_var'] / df.groupby('group_var')['values_var'].transform('sum')

The following example shows how to use this syntax in practice.

Example: Calculate Percentage of Total Within Group

Suppose we have the following pandas DataFrame that shows the points scored by basketball players on various teams:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'B'],
                   'points': [12, 29, 34, 14, 10, 11, 7, 36, 34, 22]})

#view DataFrame
print(df)

  team  points
0    A      12
1    A      29
2    A      34
3    A      14
4    A      10
5    B      11
6    B       7
7    B      36
8    B      34
9    B      22

We can use the following syntax to create a new column in the DataFrame that shows the percentage of total points scored, grouped by team:

#calculate percentage of total points scored grouped by team
df['team_percent'] = df['points'] / df.groupby('team')['points'].transform('sum')

#view updated DataFrame
print(df)

  team  points  team_percent
0    A      12      0.121212
1    A      29      0.292929
2    A      34      0.343434
3    A      14      0.141414
4    A      10      0.101010
5    B      11      0.100000
6    B       7      0.063636
7    B      36      0.327273
8    B      34      0.309091
9    B      22      0.200000

The team_percent column shows each player's share of the total points scored by their team, expressed as a proportion between 0 and 1 (multiply by 100 to read it as a percentage).
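If you would rather store the values on a 0 to 100 scale, one possible approach is to multiply the ratio by 100 and round it. The sketch below rebuilds the same DataFrame so it runs on its own; the intermediate variable team_totals and the choice of two decimal places are illustrative assumptions, not part of the original example.

```python
import pandas as pd

#recreate the DataFrame from the example
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'B'],
                   'points': [12, 29, 34, 14, 10, 11, 7, 36, 34, 22]})

#total points for each team, repeated on every row of that team
#(team_totals is a name chosen for this sketch)
team_totals = df.groupby('team')['points'].transform('sum')

#express each player's share on a 0-100 scale, rounded to two decimals
df['team_percent'] = (df['points'] / team_totals * 100).round(2)

#view updated DataFrame
print(df)
```

With this version, the first row would show roughly 12.12 instead of 0.121212.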

For example, players on team A scored a total of 99 points.

Thus, the player in the first row of the DataFrame, who scored 12 points, accounted for 12/99 = 12.12% of the total points for team A.

Similarly, the player in the second row of the DataFrame, who scored 29 points, accounted for 29/99 = 29.29% of the total points for team A.

And so on.
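As a quick sanity check on the arithmetic above, you could confirm the team totals and verify that the shares within each team add up to 1. This is a minimal sketch that rebuilds the example DataFrame; the variable names team_totals and share_sums are assumptions made for illustration.

```python
import pandas as pd

#recreate the DataFrame and the team_percent column from the example
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'B'],
                   'points': [12, 29, 34, 14, 10, 11, 7, 36, 34, 22]})
df['team_percent'] = df['points'] / df.groupby('team')['points'].transform('sum')

#total points scored by each team (one row per team)
team_totals = df.groupby('team')['points'].sum()
print(team_totals)

#sum of the shares within each team; each should be 1, up to floating-point error
share_sums = df.groupby('team')['team_percent'].sum()
print(share_sums)
```

Team A's total is 99 and team B's total is 110, so every value in team_percent is the player's points divided by one of those two numbers.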

Note: You can find the complete documentation for the groupby() function in the official pandas documentation.

Additional Resources

The following tutorials explain how to perform other common operations in pandas:

Pandas: How to Calculate Cumulative Sum by Group
Pandas: How to Count Unique Values by Group
Pandas: How to Calculate Mode by Group
Pandas: How to Calculate Correlation By Group
