Pandas: How to Drop Columns with NaN Values


You can use the following methods to drop columns with NaN values from a pandas DataFrame:

Method 1: Drop Columns with Any NaN Values

df = df.dropna(axis=1)

Method 2: Drop Columns with All NaN Values

df = df.dropna(axis=1, how='all')

Method 3: Drop Columns with Fewer Than a Minimum Number of Non-NaN Values

df = df.dropna(axis=1, thresh=2)

The following examples show how to use each method in practice with this pandas DataFrame:

import pandas as pd
import numpy as np

#create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'A', 'B', 'B', 'B'],
                   'position': [np.nan, 'G', 'F', 'F', 'C', 'G'],
                   'points': [11, 28, 10, 26, 6, 25],
                   'rebounds': [np.nan, np.nan, np.nan, np.nan, np.nan, np.nan]})

#view DataFrame
print(df)

  team position  points  rebounds
0    A      NaN      11       NaN
1    A        G      28       NaN
2    A        F      10       NaN
3    B        F      26       NaN
4    B        C       6       NaN
5    B        G      25       NaN

Example 1: Drop Columns with Any NaN Values

The following code shows how to drop columns with any NaN values:

#drop columns with any NaN values
#(store the result in a new variable so df stays intact for the later examples)
df_any = df.dropna(axis=1)

#view updated DataFrame
print(df_any)

  team  points
0    A      11
1    A      28
2    A      10
3    B      26
4    B       6
5    B      25

Notice that the position and rebounds columns were dropped since they both had at least one NaN value.
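Before dropping anything, it can help to check which columns actually contain NaN values. The following is a minimal sketch that rebuilds the same DataFrame and inspects it with isna(); the variable names nan_counts and cols_with_nan are only illustrative:

```python
import numpy as np
import pandas as pd

#recreate the DataFrame used in this tutorial
df = pd.DataFrame({'team': ['A', 'A', 'A', 'B', 'B', 'B'],
                   'position': [np.nan, 'G', 'F', 'F', 'C', 'G'],
                   'points': [11, 28, 10, 26, 6, 25],
                   'rebounds': [np.nan] * 6})

#count NaN values in each column
nan_counts = df.isna().sum()
print(nan_counts)

#list the columns that contain at least one NaN value
cols_with_nan = df.columns[df.isna().any()].tolist()
print(cols_with_nan)
```

The columns listed in cols_with_nan are the ones that dropna(axis=1) removes.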

Example 2: Drop Columns with All NaN Values

The following code shows how to drop columns with all NaN values:

#drop columns with all NaN values
df = df.dropna(axis=1, how='all')

#view updated DataFrame
print(df)

  team position  points
0    A      NaN      11
1    A        G      28
2    A        F      10
3    B        F      26
4    B        C       6
5    B        G      25

Notice that the rebounds column was dropped since it was the only column with all NaN values.
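If you prefer to select the columns to keep rather than drop the ones you don't want, a boolean mask should give the same result. This is a sketch under the assumption that you want to keep every column with at least one non-NaN value; the names has_data, df_mask, and df_dropna are only illustrative:

```python
import numpy as np
import pandas as pd

#recreate the DataFrame used in this tutorial
df = pd.DataFrame({'team': ['A', 'A', 'A', 'B', 'B', 'B'],
                   'position': [np.nan, 'G', 'F', 'F', 'C', 'G'],
                   'points': [11, 28, 10, 26, 6, 25],
                   'rebounds': [np.nan] * 6})

#True for each column that has at least one non-NaN value
has_data = df.notna().any()

#keep only those columns
df_mask = df.loc[:, has_data]

#equivalent call using dropna
df_dropna = df.dropna(axis=1, how='all')

print(list(df_mask.columns))
print(list(df_dropna.columns))
```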

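Example 3: Drop Columns with Fewer Than a Minimum Number of Non-NaN Values

The thresh argument sets the minimum number of non-NaN values a column must have to be kept. The following code shows how to drop columns that have fewer than two non-NaN values:

#drop columns with fewer than two non-NaN values
df = df.dropna(axis=1, thresh=2)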

#view updated DataFrame
print(df)

  team position  points
0    A      NaN      11
1    A        G      28
2    A        F      10
3    B        F      26
4    B        C       6
5    B        G      25

Notice that the rebounds column was dropped since it was the only column with fewer than two non-NaN values. The position column has one NaN value and five non-NaN values, so it was kept.
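Because thresh counts non-NaN values, it does not directly express a rule like "drop columns with at least two NaN values." One way to sketch that rule is to count the NaN values per column and filter on the count. The assists column and the max_nan variable below are hypothetical additions for illustration only; they are not part of the original DataFrame:

```python
import numpy as np
import pandas as pd

#tutorial DataFrame plus a hypothetical 'assists' column with two NaN values
df = pd.DataFrame({'team': ['A', 'A', 'A', 'B', 'B', 'B'],
                   'position': [np.nan, 'G', 'F', 'F', 'C', 'G'],
                   'points': [11, 28, 10, 26, 6, 25],
                   'rebounds': [np.nan] * 6,
                   'assists': [5, np.nan, np.nan, 7, 9, 4]})

#thresh=2 keeps every column with at least two non-NaN values,
#so 'assists' (four non-NaN values) is kept
df_thresh = df.dropna(axis=1, thresh=2)
print(list(df_thresh.columns))

#drop columns with at least max_nan NaN values by filtering on the NaN count
max_nan = 2
df_nan_rule = df.loc[:, df.isna().sum() < max_nan]
print(list(df_nan_rule.columns))

#the same rule written with thresh: require len(df) - max_nan + 1 non-NaN values
df_equiv = df.dropna(axis=1, thresh=len(df) - max_nan + 1)
print(list(df_equiv.columns))
```

In this sketch, thresh=2 keeps the assists column, while the NaN-count rule drops it along with rebounds.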

Note: You can find the complete documentation for the dropna() function in the official pandas documentation.

Additional Resources

The following tutorials explain how to perform other common tasks in pandas:

How to Drop First Column in Pandas
How to Drop Duplicate Columns in Pandas
How to Drop All Columns Except Specific Ones in Pandas
