How to Calculate Cumulative Percentage in Pandas


You can use the following basic syntax to calculate the cumulative percentage of values in a column of a pandas DataFrame:

#calculate cumulative sum of column
df['cum_sum'] = df['col1'].cumsum()

#calculate cumulative percentage of column (rounded to 2 decimal places)
df['cum_percent'] = round(100*df.cum_sum/df['col1'].sum(),2)

The following example shows how to use this syntax in practice.

Example: Calculate Cumulative Percentage in Pandas

Suppose we have the following pandas DataFrame that shows the number of units a company sells during consecutive years:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'year': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                   'units_sold': [60, 75, 77, 87, 104, 134, 120, 125, 140, 150]})

#view DataFrame
print(df)

   year  units_sold
0     1          60
1     2          75
2     3          77
3     4          87
4     5         104
5     6         134
6     7         120
7     8         125
8     9         140
9    10         150

Next, we can use the following code to add a column that shows the cumulative number of units sold and cumulative percentage of units sold:

#calculate cumulative sum of units sold
df['cum_sum'] = df['units_sold'].cumsum()

#calculate cumulative percentage of units sold
df['cum_percent'] = round(100*df.cum_sum/df['units_sold'].sum(),2)

#view updated DataFrame
print(df)

   year  units_sold  cum_sum  cum_percent
0     1          60       60         5.60
1     2          75      135        12.59
2     3          77      212        19.78
3     4          87      299        27.89
4     5         104      403        37.59
5     6         134      537        50.09
6     7         120      657        61.29
7     8         125      782        72.95
8     9         140      922        86.01
9    10         150     1072       100.00

We interpret the cumulative percentages as follows:

  • 5.60% of all sales were made in year 1.
  • 12.59 of all sales were made in years 1 and 2 combined.
  • 19.78% of all sales were made in years 1, 2, and 3 combined.

And so on.

Note that you can simply change the value in the round() function to change the number of decimal points shown as well.

For example, we could round the cumulative percentage to zero decimal places instead:

#calculate cumulative sum of units sold
df['cum_sum'] = df['units_sold'].cumsum()

#calculate cumulative percentage of units sold
df['cum_percent'] = round(100*df.cum_sum/df['units_sold'].sum(),0)

#view updated DataFrame
print(df)

   year  units_sold  cum_sum  cum_percent
0     1          60       60          6.0
1     2          75      135         13.0
2     3          77      212         20.0
3     4          87      299         28.0
4     5         104      403         38.0
5     6         134      537         50.0
6     7         120      657         61.0
7     8         125      782         73.0
8     9         140      922         86.0
9    10         150     1072        100.0

The cumulative percentages are now rounded to zero decimal places.

Additional Resources

The following tutorials explain how to perform other common operations in Python:

How to Create Frequency Tables in Python
How to Calculate Relative Frequency in Python

Leave a Reply

Your email address will not be published.