You can use the following basic syntax to forward fill values in pandas with the ffill() function based on a condition in another column:
df['sales'] = df.groupby('store')['sales'].ffill()
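This particular example forward fills each NaN value in the sales column with the most recent non-missing sales value from a row with the same value in the store column. The rows do not need to be adjacent, and values are never carried over from one store to another.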
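As a quick illustration of that behavior, here is a minimal sketch on a small hypothetical DataFrame (the values are made up for illustration only). It suggests two things: a NaN is filled from the last non-missing row of the same store even when rows from another store sit in between, and a NaN with no earlier value in its own store stays NaN.

```python
import numpy as np
import pandas as pd

# hypothetical data: stores alternate, and store A starts with a missing value
df = pd.DataFrame({'store': ['A', 'B', 'A', 'B'],
                   'sales': [np.nan, 5.0, 7.0, np.nan]})

# forward fill within each store only
filled = df.groupby('store')['sales'].ffill()

print(filled)
```

In this sketch, the last row (store B) picks up the value 5 from the earlier store B row rather than the 7 from store A directly above it, while the first row remains NaN because store A has no earlier value to carry forward.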
The following example shows how to use this syntax in practice.
Example: Use ffill Based on Condition in Pandas
Suppose we have the following pandas DataFrame that contains information about the total sales made by two different retail stores during four business quarters:
import pandas as pd
import numpy as np

#create DataFrame
df = pd.DataFrame({'store': ['A', 'A', 'B', 'A', 'B', 'A', 'B', 'B'],
                   'quarter': [1, 2, 1, 3, 2, 4, 3, 4],
                   'sales': [12, 22, 30, np.nan, 24, np.nan, np.nan, np.nan]})

#view DataFrame
print(df)

  store  quarter  sales
0     A        1   12.0
1     A        2   22.0
2     B        1   30.0
3     A        3    NaN
4     B        2   24.0
5     A        4    NaN
6     B        3    NaN
7     B        4    NaN
Notice that there are multiple NaN values in the sales column.
Suppose we would like to fill in these NaN values using the previous value in the sales column but we want to make sure that values correspond to the correct store.
We can use the following syntax to do so:
#group by store and forward fill values in sales column
df['sales'] = df.groupby('store')['sales'].ffill()

#view updated DataFrame
print(df)

  store  quarter  sales
0     A        1   12.0
1     A        2   22.0
2     B        1   30.0
3     A        3   22.0
4     B        2   24.0
5     A        4   22.0
6     B        3   24.0
7     B        4   24.0
Notice that the NaN values in the sales column have been replaced by the previous sales value and that the values correspond to the correct store.
For example:
- The NaN value in row index position 3 has been replaced by the value 22, which was the most recent value in the sales column that corresponded to store A.
- The NaN value in row index position 6 has been replaced by the value 24, which was the most recent value in the sales column that corresponded to store B.
And so on.
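To make the role of the groupby() step concrete, the following sketch rebuilds the same DataFrame and compares a plain ffill() with the grouped version side by side. The column names plain_ffill and grouped_ffill are chosen here only for illustration.

```python
import numpy as np
import pandas as pd

# same data as the example above
df = pd.DataFrame({'store': ['A', 'A', 'B', 'A', 'B', 'A', 'B', 'B'],
                   'quarter': [1, 2, 1, 3, 2, 4, 3, 4],
                   'sales': [12, 22, 30, np.nan, 24, np.nan, np.nan, np.nan]})

# plain forward fill: uses the previous row regardless of store
plain = df['sales'].ffill()

# grouped forward fill: uses the previous value from the same store
grouped = df.groupby('store')['sales'].ffill()

# place both results next to the store and quarter columns
comparison = df[['store', 'quarter']].assign(plain_ffill=plain,
                                             grouped_ffill=grouped)
print(comparison)
```

The two approaches differ in rows 3 and 5: a plain ffill() would assign store A the values 30 and 24, which actually belong to store B, while the grouped version assigns 22 in both rows.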
Note: You can find the complete documentation for the ffill() function in the official pandas documentation.
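One option that may be useful here is the limit argument, which caps how many consecutive NaN values are filled within each group. The sketch below rebuilds the example DataFrame and uses limit=1, a value picked purely for illustration.

```python
import numpy as np
import pandas as pd

# same data as the example above
df = pd.DataFrame({'store': ['A', 'A', 'B', 'A', 'B', 'A', 'B', 'B'],
                   'quarter': [1, 2, 1, 3, 2, 4, 3, 4],
                   'sales': [12, 22, 30, np.nan, 24, np.nan, np.nan, np.nan]})

# fill at most one consecutive NaN per store (limit=1 is an illustrative choice)
limited = df.groupby('store')['sales'].ffill(limit=1)

print(limited)
```

With this setting, only the first missing quarter for each store is filled (rows 3 and 6), and the second missing quarter (rows 5 and 7) stays NaN.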
Additional Resources
The following tutorials explain how to perform other common operations in pandas:
How to Count Missing Values in Pandas
How to Drop Rows with NaN Values in Pandas
How to Drop Rows that Contain a Specific Value in Pandas