Often you may want to fill NaN values in a pandas DataFrame with the next valid observation.

The easiest way to do so is by using the **bfill()** function, which uses the following syntax:

**DataFrame.bfill(axis=None, inplace=False, limit=None, limit_area=None, …)**

where:

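- **axis**: The axis along which to fill (0 or 'index' fills within each column, 1 or 'columns' fills within each row)
- **inplace**: Whether to modify the original object instead of returning a new one
- **limit**: The maximum number of consecutive NaN values to fill
- **limit_area**: Restriction on which NaN values to fill: 'inside' fills only NaN values surrounded by valid values, 'outside' fills only NaN values outside the valid values, and None applies no restriction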
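To make the **axis** argument more concrete, here is a minimal sketch that fills the same data along each axis. The `scores` DataFrame and its column names are made up for illustration and are not part of the example used in the rest of this tutorial.

```python
import numpy as np
import pandas as pd

#hypothetical all-numeric DataFrame: one row per player, one column per game
scores = pd.DataFrame({'game1': [10.0, np.nan],
                       'game2': [np.nan, np.nan],
                       'game3': [14.0, 9.0]})

#axis=0: fill each NaN with the next valid value lower in the same column
by_column = scores.bfill(axis=0)

#axis=1: fill each NaN with the next valid value further right in the same row
by_row = scores.bfill(axis=1)

print(by_column)
print(by_row)
```

With **axis=1** every NaN is filled here because each row ends in a valid value. With **axis=0** nothing changes, because no NaN in this data has a valid value below it in its column.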

The following example shows how to use the **bfill()** function in practice with a pandas DataFrame.

**Example: How to Use the bfill() Function in Pandas**

Suppose we create the following pandas DataFrame that contains information about various basketball players:

    import pandas as pd
    import numpy as np

    #create DataFrame
    df = pd.DataFrame({'team': ['A', 'A', 'B', 'B', 'C', 'C', 'C'],
                       'points': [12, np.nan, np.nan, 22, 30, 41, 12],
                       'assists': [8, 10, 11, 11, 7, np.nan, 8]})

    #view DataFrame
    print(df)

      team  points  assists
    0    A    12.0      8.0
    1    A     NaN     10.0
    2    B     NaN     11.0
    3    B    22.0     11.0
    4    C    30.0      7.0
    5    C    41.0      NaN
    6    C    12.0      8.0

Notice that there are several NaN values in the DataFrame.

Suppose that we would like to use the **bfill()** function to fill in the missing values in the DataFrame.

We can use the following syntax to do so:

    #fill in NaN values in each column of DataFrame
    df.bfill()

      team  points  assists
    0    A    12.0      8.0
    1    A    22.0     10.0
    2    B    22.0     11.0
    3    B    22.0     11.0
    4    C    30.0      7.0
    5    C    41.0      8.0
    6    C    12.0      8.0

Notice that each of the NaN values has been filled in with the next available value in its column.
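One case worth keeping in mind is a NaN at the very end of a column, which has no later value to copy. The sketch below uses a made-up Series `s` to show this, and also shows that **bfill()** returns a new object rather than changing the original. Chaining **ffill()** afterwards is only one possible way to handle the leftover value, not something the example above requires.

```python
import numpy as np
import pandas as pd

#hypothetical Series whose last value is missing
s = pd.Series([1.0, np.nan, 3.0, np.nan])

#bfill() returns a new Series and leaves s unchanged
filled = s.bfill()
print(filled)

#the trailing NaN has no later value to copy, so it is still missing
print(filled.isna().tolist())

#one option: follow up with ffill() to fill what bfill() could not reach
both = s.bfill().ffill()
print(both)
```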

Note that we can also fill in NaN values for one specific column if we’d like.

For example, we can use the following syntax to fill in the NaN values in the **points** column only:

    #fill in NaN values in points column only
    df['points'] = df['points'].bfill()

    #view updated DataFrame
    print(df)

      team  points  assists
    0    A    12.0      8.0
    1    A    22.0     10.0
    2    B    22.0     11.0
    3    B    22.0     11.0
    4    C    30.0      7.0
    5    C    41.0      NaN
    6    C    12.0      8.0

Notice that only the NaN values in the **points** column have been filled in while the NaN value in the **assists** column remains untouched.
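The same pattern extends to several columns at once by selecting a list of columns. The sketch below rebuilds the same data under a different name, `stats`, so that it does not interfere with the `df` used in this tutorial. The names `stats` and `cols` are chosen only for illustration.

```python
import numpy as np
import pandas as pd

#same data as above, stored under a separate (hypothetical) name
stats = pd.DataFrame({'team': ['A', 'A', 'B', 'B', 'C', 'C', 'C'],
                      'points': [12, np.nan, np.nan, 22, 30, 41, 12],
                      'assists': [8, 10, 11, 11, 7, np.nan, 8]})

#fill in NaN values in the selected columns only
cols = ['points', 'assists']
stats[cols] = stats[cols].bfill()

#view updated DataFrame
print(stats)
```

This can be handy when a DataFrame has other columns that should be left exactly as they are.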

Also note that we can use the **limit** argument to limit the number of consecutive NaN values that should be filled in.

For example, starting again from the original DataFrame, we can specify **limit=1** to fill in at most one NaN value in each run of consecutive NaN values in the **points** column:

    #fill in NaN values in points column only (limit of 1)
    #assumes df is the original DataFrame, before any values were filled in
    df['points'] = df['points'].bfill(limit=1)

    #view updated DataFrame
    print(df)

      team  points  assists
    0    A    12.0      8.0
    1    A     NaN     10.0
    2    B    22.0     11.0
    3    B    22.0     11.0
    4    C    30.0      7.0
    5    C    41.0      NaN
    6    C    12.0      8.0

Notice that only the NaN value in row 2, the one directly before the valid value 22.0, has been replaced, while the NaN value in row 1 has been left untouched.

In practice, the **limit** argument is useful when the next available value is a reasonable stand-in only for observations close to it, so that long runs of missing values are not filled in entirely.
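To see how **limit** behaves on longer runs of missing values, here is a small sketch on a made-up Series `s`. It compares **limit=1** with **limit=2**. In both cases the filling starts from the NaN closest to the next valid value and works backward.

```python
import numpy as np
import pandas as pd

#hypothetical Series with a run of three NaN values and a run of one
s = pd.Series([np.nan, np.nan, np.nan, 5.0, np.nan, 7.0])

#fill at most one NaN before each valid value
one = s.bfill(limit=1)
print(one)

#fill at most two consecutive NaN values before each valid value
two = s.bfill(limit=2)
print(two)
```

With **limit=1** the first two values stay NaN, and with **limit=2** only the first value stays NaN. The single NaN before 7.0 is filled in both cases.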

**Note**: You can find the complete documentation for the **bfill()** function in the official pandas documentation.
