Pandas: How to Fill NaN Values with Mode


You can use the following syntax to replace NaN values in a column of a pandas DataFrame with the mode value of the column:

df['col1'] = df['col1'].fillna(df['col1'].mode()[0])
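The [0] is needed because mode() returns a Series rather than a single number: a column can have several values tied for most frequent. By default mode() ignores NaN values and returns the tied values in sorted order, so [0] picks the smallest one. A minimal sketch using a small made-up Series (the data here is illustrative only, not part of the example below):

```python
import numpy as np
import pandas as pd

# illustrative Series in which 1 and 3 are tied for most frequent
s = pd.Series([3, 1, 3, 1, 2, np.nan])

# mode() skips NaN and returns every tied value in ascending order
modes = s.mode()
print(modes.tolist())

# [0] selects the first (smallest) of the tied modes
filled = s.fillna(modes[0])
print(filled.tolist())
```

If a different tie-breaking rule suits your data better, you can pick another element of the Series that mode() returns.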

The following example shows how to use this syntax in practice.

Example: Replace Missing Values with Mode in Pandas

Suppose we have the following pandas DataFrame with some missing values:

import numpy as np
import pandas as pd

#create DataFrame with some NaN values
df = pd.DataFrame({'rating': [np.nan, 85, np.nan, 88, 94, 90, 75, 75, 87, 86],
                   'points': [25, np.nan, 14, 16, 27, 20, 12, 15, 14, 19],
                   'assists': [5, 7, 7, np.nan, 5, 7, 6, 9, 9, 7],
                   'rebounds': [11, 8, 10, 6, 6, 9, 6, 10, 10, 7]})

#view DataFrame
df

   rating  points  assists  rebounds
0     NaN    25.0      5.0        11
1    85.0     NaN      7.0         8
2     NaN    14.0      7.0        10
3    88.0    16.0      NaN         6
4    94.0    27.0      5.0         6
5    90.0    20.0      7.0         9
6    75.0    12.0      6.0         6
7    75.0    15.0      9.0        10
8    87.0    14.0      9.0        10
9    86.0    19.0      7.0         7

We can use the fillna() function to fill the NaN values in the rating column with the mode value of the rating column:

#fill NaNs with column mode in 'rating' column
df['rating'] = df['rating'].fillna(df['rating'].mode()[0])

#view updated DataFrame 
df

   rating  points  assists  rebounds
0    75.0    25.0      5.0        11
1    85.0     NaN      7.0         8
2    75.0    14.0      7.0        10
3    88.0    16.0      NaN         6
4    94.0    27.0      5.0         6
5    90.0    20.0      7.0         9
6    75.0    12.0      6.0         6
7    75.0    15.0      9.0        10
8    87.0    14.0      9.0        10
9    86.0    19.0      7.0         7

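The mode of the rating column was 75 (it appears twice, more often than any other value), so each NaN value in the rating column was filled with 75. The NaN values in the points and assists columns are unchanged because only the rating column was filled.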
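The same idea can be extended to every column at once. The sketch below is one possible approach, not the only one: DataFrame.mode() returns a DataFrame of per-column modes, .iloc[0] takes its first row, and fillna() matches that row to the columns by name. The variable name filled is our own choice.

```python
import numpy as np
import pandas as pd

# same DataFrame as in the example above
df = pd.DataFrame({'rating': [np.nan, 85, np.nan, 88, 94, 90, 75, 75, 87, 86],
                   'points': [25, np.nan, 14, 16, 27, 20, 12, 15, 14, 19],
                   'assists': [5, 7, 7, np.nan, 5, 7, 6, 9, 9, 7],
                   'rebounds': [11, 8, 10, 6, 6, 9, 6, 10, 10, 7]})

# first row of df.mode() holds the first (smallest) mode of each column
column_modes = df.mode().iloc[0]

# fill each column's NaNs with that column's own mode
filled = df.fillna(column_modes)
print(filled)
```

Here the rating, points, and assists columns are filled with 75, 14, and 7 respectively. The rebounds column has two tied modes (6 and 10), but it contains no NaN values, so it is left as it was.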

Note: The complete documentation for the fillna() function is available in the official pandas documentation.
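One edge case to keep in mind: if a column contains only NaN values, mode() returns an empty Series, and indexing it with [0] raises an error. A minimal sketch of a guard, where fill_with_mode is a hypothetical helper name of our own and not part of pandas:

```python
import numpy as np
import pandas as pd

def fill_with_mode(series):
    """Hypothetical helper: fill NaNs with the first mode, if there is one."""
    modes = series.mode()
    if modes.empty:
        # no non-missing values, so there is no mode to fill with
        return series.copy()
    return series.fillna(modes.iloc[0])

# a column with no usable values is returned unchanged
all_missing = pd.Series([np.nan, np.nan], dtype='float64')
print(fill_with_mode(all_missing).tolist())

# the rating column from the example is filled with its mode
rating = pd.Series([np.nan, 85, np.nan, 88, 94, 90, 75, 75, 87, 86])
print(fill_with_mode(rating).tolist())
```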

Additional Resources

The following tutorials explain how to perform other common operations in pandas:

How to Count Missing Values in Pandas
How to Drop Rows with NaN Values in Pandas
How to Drop Rows that Contain a Specific Value in Pandas
