How to Use pct_change() in Pandas


Often you may want to calculate the percentage change between one value and another in pandas.

The easiest way to do so is by using the pct_change() function, which uses the following syntax:

DataFrame.pct_change(periods=1, …)

where:

  • periods: Number of periods to shift to calculate percent change

Note that the default value for periods is 1, which means pandas assumes you would like to calculate the percentage change between the current row in the DataFrame and the value in the previous row.

You can change this number to compare each row with a row further back. For example, periods=2 compares each row with the row two positions before it.
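To make the role of periods concrete, here is a minimal sketch showing that pct_change(periods=n) gives the same result as dividing the change from n rows earlier by that earlier value. The series values and the variable names (values, built_in, manual) are made up for illustration only.

```python
import pandas as pd

# Hypothetical values used only for this illustration
values = pd.Series([100, 120, 90, 135])

# Number of rows to look back
n = 2

# Built-in calculation
built_in = values.pct_change(periods=n)

# Equivalent manual calculation: (current - earlier) / earlier
earlier = values.shift(n)
manual = (values - earlier) / earlier

# Show both results side by side
print(pd.concat({'built_in': built_in, 'manual': manual}, axis=1))
```

The first n rows are NaN in both columns because there is no row n positions earlier to compare against.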

The following example shows how to use the pct_change() function in practice with a pandas DataFrame.

Example: How to Use pct_change() in Pandas

Suppose we create the following pandas DataFrame that contains information about total sales and refunds made at a particular retail store:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'period': [1, 2, 3, 4, 5, 6, 7, 8],
                   'sales': [122, 140, 188, 134, 199, 215, 200, 249],
                   'refunds': [10, 22, 24, 20, 14, 18, 10, 12]})

#view DataFrame
print(df)

   period  sales  refunds
0       1    122       10
1       2    140       22
2       3    188       24
3       4    134       20
4       5    199       14
5       6    215       18
6       7    200       10
7       8    249       12

Suppose that we would like to calculate the percentage change in sales between each consecutive period.

We can use the following syntax to do so:

#calculate percent change between each period in 'sales' column
df['sales_change'] = df['sales'].pct_change()

#view updated DataFrame
print(df)

   period  sales  refunds  sales_change
0       1    122       10           NaN
1       2    140       22      0.147541
2       3    188       24      0.342857
3       4    134       20     -0.287234
4       5    199       14      0.485075
5       6    215       18      0.080402
6       7    200       10     -0.069767
7       8    249       12      0.245000

The new column named sales_change contains the percentage change in the sales column between each consecutive period.

For example:

The percentage change in sales between period 1 and 2 is (140-122) / 122 = 14.7%.

The percentage change in sales between period 2 and 3 is (188-140) / 140 = 34.3%.

The percentage change in sales between period 3 and 4 is (134-188) / 188 = -28.7%.

And so on.

Note: The first value in the sales_change column displays a value of NaN because there is no previous period to use to calculate the percentage change.
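If the leading NaN is inconvenient, it can be dropped or replaced. The sketch below shows two possible approaches on a shortened version of the data; the names df_dropped and df_filled are hypothetical, and whether 0 is a sensible replacement depends on your analysis.

```python
import pandas as pd

# Shortened version of the example data
df = pd.DataFrame({'period': [1, 2, 3, 4],
                   'sales': [122, 140, 188, 134]})

# Percent change between consecutive periods (first value is NaN)
df['sales_change'] = df['sales'].pct_change()

# Option 1: drop rows where the percent change is missing
df_dropped = df.dropna(subset=['sales_change'])

# Option 2: replace the missing percent change with 0
df_filled = df.fillna({'sales_change': 0})

print(df_dropped)
print(df_filled)
```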

If you would like the values to be shown as percentages, you can multiply the results by 100:

#calculate percent change between each period in 'sales' column
df['sales_change'] = df['sales'].pct_change() * 100

#view updated DataFrame
print(df)

   period  sales  refunds  sales_change
0       1    122       10           NaN
1       2    140       22     14.754098
2       3    188       24     34.285714
3       4    134       20    -28.723404
4       5    199       14     48.507463
5       6    215       18      8.040201
6       7    200       10     -6.976744
7       8    249       12     24.500000

The values in the sales_change column are now shown in terms of percentages.
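Percentages with six decimal places can be hard to read. As an optional extra step, the sketch below rounds the result to two decimal places with round(); the choice of two decimals is an assumption made for illustration.

```python
import pandas as pd

# Shortened version of the example data
df = pd.DataFrame({'period': [1, 2, 3, 4],
                   'sales': [122, 140, 188, 134]})

# Percent change expressed as a percentage, rounded to 2 decimal places
df['sales_change'] = (df['sales'].pct_change() * 100).round(2)

print(df)
```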

Also note that you could calculate the percentage change between non-consecutive periods if you’d like.

For example, you can specify periods=2 within the pct_change() function to calculate the percentage change in sales between each period and the period two rows before it:

#calculate percent change in 'sales' column relative to 2 periods earlier
df['sales_change'] = df['sales'].pct_change(periods=2)

#view updated DataFrame
print(df)

   period  sales  refunds  sales_change
0       1    122       10           NaN
1       2    140       22           NaN
2       3    188       24      0.540984
3       4    134       20     -0.042857
4       5    199       14      0.058511
5       6    215       18      0.604478
6       7    200       10      0.005025
7       8    249       12      0.158140

The new column named sales_change contains the percentage change in the sales column between a given period and two periods prior.

For example:

The percentage change in sales between period 1 and 3 is (188-122) / 122 = 54.10%.

The percentage change in sales between period 2 and 4 is (134-140) / 140 = -4.29%.

The percentage change in sales between period 3 and 5 is (199-188) / 188 = 5.85%.

And so on.
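The examples above apply pct_change() to a single column. Because the syntax is DataFrame.pct_change(), the same call can also be applied to several columns at once. The sketch below is one way to do that; the _change suffix and the names changes and result are choices made for illustration.

```python
import pandas as pd

# Same data as in the example above
df = pd.DataFrame({'period': [1, 2, 3, 4, 5, 6, 7, 8],
                   'sales': [122, 140, 188, 134, 199, 215, 200, 249],
                   'refunds': [10, 22, 24, 20, 14, 18, 10, 12]})

# Percent change for 'sales' and 'refunds' in one call
changes = df[['sales', 'refunds']].pct_change()

# Rename the new columns so they do not clash with the originals
changes = changes.add_suffix('_change')

# Attach the percent-change columns to the original DataFrame
result = df.join(changes)

print(result)
```

Selecting the columns first keeps the period column out of the calculation, since a percentage change in period numbers would not be meaningful here.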

Note: You can find the complete documentation for the pct_change() function in the official pandas documentation.

Additional Resources

The following tutorials explain how to perform other common tasks in pandas:

How to Use qcut() in Pandas
How to Select Only Numeric Columns in Pandas
How to Convert Categorical Variable to Numeric in Pandas
