The pandas fillna() method fills in missing values in the columns of a pandas DataFrame.

This tutorial provides several examples of how to use this function to fill in missing values for multiple columns of the following pandas DataFrame:

    import pandas as pd
    import numpy as np

    #create DataFrame
    df = pd.DataFrame({'team': ['A', np.nan, 'B', 'B', 'B', 'C', 'C', 'C'],
                       'points': [25, np.nan, 15, np.nan, 19, 23, 25, 29],
                       'assists': [5, 7, 7, 9, 12, 9, np.nan, 4],
                       'rebounds': [11, 8, 10, 6, 6, 5, 9, 12]})

    #view DataFrame
    print(df)

      team  points  assists  rebounds
    0    A    25.0      5.0        11
    1  NaN     NaN      7.0         8
    2    B    15.0      7.0        10
    3    B     NaN      9.0         6
    4    B    19.0     12.0         6
    5    C    23.0      9.0         5
    6    C    25.0      NaN         9
    7    C    29.0      4.0        12
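Before filling anything, it can help to see how many values are missing in each column. The following is a minimal sketch that rebuilds the same DataFrame and counts the missing values with isna() and sum(). The variable name missing_per_column is only illustrative.

```python
import numpy as np
import pandas as pd

# Same DataFrame as in the tutorial
df = pd.DataFrame({'team': ['A', np.nan, 'B', 'B', 'B', 'C', 'C', 'C'],
                   'points': [25, np.nan, 15, np.nan, 19, 23, 25, 29],
                   'assists': [5, 7, 7, 9, 12, 9, np.nan, 4],
                   'rebounds': [11, 8, 10, 6, 6, 5, 9, 12]})

# isna() marks each missing cell as True; sum() counts the True values per column
missing_per_column = df.isna().sum()
print(missing_per_column)
```

For this DataFrame the counts are 1 for team, 2 for points, 1 for assists, and 0 for rebounds.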

**Example 1: Fill in Missing Values of All Columns**

The following code shows how to fill in missing values with a zero for *all* columns in the DataFrame:

    #replace all missing values with zero
    df.fillna(value=0, inplace=True)

    #view DataFrame
    print(df)

      team  points  assists  rebounds
    0    A    25.0      5.0        11
    1    0     0.0      7.0         8
    2    B    15.0      7.0        10
    3    B     0.0      9.0         6
    4    B    19.0     12.0         6
    5    C    23.0      9.0         5
    6    C    25.0      0.0         9
    7    C    29.0      4.0        12

**Example 2: Fill in Missing Values of Multiple Columns**

Starting again from the original DataFrame, the following code shows how to fill in missing values with a zero for just the points and assists columns:

    #replace missing values in points and assists columns with zero
    df[['points', 'assists']] = df[['points', 'assists']].fillna(value=0)

    #view DataFrame
    print(df)

      team  points  assists  rebounds
    0    A    25.0      5.0        11
    1  NaN     0.0      7.0         8
    2    B    15.0      7.0        10
    3    B     0.0      9.0         6
    4    B    19.0     12.0         6
    5    C    23.0      9.0         5
    6    C    25.0      0.0         9
    7    C    29.0      4.0        12
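An alternative to assigning back to a column subset is to pass fillna() a dictionary that maps each chosen column to the fill value. The sketch below assumes the same DataFrame and returns a new DataFrame rather than modifying df. The names cols_to_fill and filled are illustrative.

```python
import numpy as np
import pandas as pd

# Same DataFrame as in the tutorial
df = pd.DataFrame({'team': ['A', np.nan, 'B', 'B', 'B', 'C', 'C', 'C'],
                   'points': [25, np.nan, 15, np.nan, 19, 23, 25, 29],
                   'assists': [5, 7, 7, 9, 12, 9, np.nan, 4],
                   'rebounds': [11, 8, 10, 6, 6, 5, 9, 12]})

# Columns that should have their missing values replaced with zero
cols_to_fill = ['points', 'assists']

# Build {'points': 0, 'assists': 0}; columns not in the dict are left untouched
filled = df.fillna({col: 0 for col in cols_to_fill})
print(filled)
```

Because inplace is not used, df still contains its original missing values, which makes it easy to compare the data before and after the fill.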

**Example 3: Fill in Missing Values of Multiple Columns with Different Values**

Starting again from the original DataFrame, the following code shows how to fill in missing values in three different columns with three different values:

    #replace missing values in three columns with three different values
    df.fillna({'team':'Unknown', 'points': 0, 'assists': 'zero'}, inplace=True)

    #view DataFrame
    print(df)

          team  points assists  rebounds
    0        A    25.0       5        11
    1  Unknown     0.0       7         8
    2        B    15.0       7        10
    3        B     0.0       9         6
    4        B    19.0      12         6
    5        C    23.0       9         5
    6        C    25.0    zero         9
    7        C    29.0       4        12

Notice that each missing value in the three columns was replaced with the value specified for that column.
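One side effect worth checking when a text value such as 'zero' is placed in a numeric column is the column's dtype. The sketch below, which assumes the same DataFrame and uses illustrative variable names, compares a text fill with a numeric fill. The exact dtype reported for the text-filled column may depend on the pandas version, though it is typically object.

```python
import numpy as np
import pandas as pd

# Same DataFrame as in the tutorial
df = pd.DataFrame({'team': ['A', np.nan, 'B', 'B', 'B', 'C', 'C', 'C'],
                   'points': [25, np.nan, 15, np.nan, 19, 23, 25, 29],
                   'assists': [5, 7, 7, 9, 12, 9, np.nan, 4],
                   'rebounds': [11, 8, 10, 6, 6, 5, 9, 12]})

# Fill the numeric assists column with a string (returns a new DataFrame)
text_filled = df.fillna({'assists': 'zero'})

# Fill the same column with a number instead
number_filled = df.fillna({'assists': 0})

# The text-filled column is no longer numeric; the number-filled one stays float64
print(text_filled['assists'].dtype)
print(number_filled['assists'].dtype)
```

If the column will be used in later calculations such as a mean or sum, a numeric fill value is usually the safer choice.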

Just getting into Python. Super helpful. Thank you.

How would you approach filling NAs with mean() or median() values rather than constants?

So, for example, you have 4 columns: in one column you want to replace NAs with mean(), in one with median(), and in one with 0.

One column is roughly normally distributed, so the mean is fine to replace with. A second column is right-skewed, so I want to use the median. For the third column, I just want to see all 3 done at the same time.

Or do you have to fillna one column at a time when imputing central tendency values?

Thanks kindly,
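In reply to the question above: one possible approach is to pass fillna() a dictionary whose values are computed per column, so all three fills happen in a single call. This is only a sketch. The DataFrame, its column names (normal_col, skewed_col, count_col), and its values are hypothetical stand-ins for the columns described in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data standing in for the three columns in the question
data = pd.DataFrame({'normal_col': [10.0, np.nan, 14.0, 12.0, 12.0],
                     'skewed_col': [1.0, 2.0, np.nan, 100.0, 3.0],
                     'count_col': [np.nan, 4.0, 5.0, np.nan, 6.0]})

# Compute one fill value per column; mean() and median() skip NaN by default
fill_values = {'normal_col': data['normal_col'].mean(),
               'skewed_col': data['skewed_col'].median(),
               'count_col': 0}

# A single fillna() call applies each value to its own column
filled = data.fillna(fill_values)
print(filled)
```

So the columns do not have to be filled one at a time: the dictionary form shown in Example 3 accepts computed values just as it accepts constants.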