Pandas: How to Replace Empty Strings with NaN


You can use the following syntax to replace empty strings with NaN values in pandas:

df = df.replace(r'^\s*$', np.nan, regex=True)
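
To see what the pattern matches, here is a minimal sketch on a small Series. The sample values are hypothetical and are not part of the example below. The regex ^\s*$ matches only strings made up entirely of zero or more whitespace characters, so text with surrounding spaces is left alone:

```python
import numpy as np
import pandas as pd

# hypothetical sample values: an empty string, whitespace-only strings, and real text
s = pd.Series(['', ' ', '\t', 'A', ' B '])

# ^\s*$ matches a string that contains nothing but (zero or more) whitespace
cleaned = s.replace(r'^\s*$', np.nan, regex=True)

# True marks the values that were replaced with NaN
print(cleaned.isna().tolist())  # [True, True, True, False, False]
```

Note that ' B ' is kept as-is because it contains a non-whitespace character.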

The following example shows how to use this syntax in practice.

Related: How to Replace NaN Values with String in Pandas

Example: Replace Empty Strings with NaN

Suppose we have the following pandas DataFrame that contains information about various basketball players:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['A', 'B', ' ', 'D', 'E', ' ', 'G', 'H'],
                   'position': [' ', 'G', 'G', 'F', 'F', ' ', 'C', 'C'],
                   'points': [5, 7, 7, 9, 12, 9, 9, 4],
                   'rebounds': [11, 8, 10, 6, 6, 5, 9, 12]})

#view DataFrame
df

  team position  points  rebounds
0    A                5        11
1    B        G       7         8
2             G       7        10
3    D        F       9         6
4    E        F      12         6
5                     9         5
6    G        C       9         9
7    H        C       4        12

Notice that several values in both the team and position columns are blank strings (here, a single space), which the regex below treats the same way as empty strings.

We can use the following syntax to replace these empty strings with NaN values:

import numpy as np

#replace empty values with NaN
df = df.replace(r'^\s*$', np.nan, regex=True)

#view updated DataFrame
df

  team position  points  rebounds
0    A      NaN       5        11
1    B        G       7         8
2  NaN        G       7        10
3    D        F       9         6
4    E        F      12         6
5  NaN      NaN       9         5
6    G        C       9         9
7    H        C       4        12

Notice that each of the empty strings has been replaced with NaN.
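
One way to confirm the result is to count the NaN values in each column after the replacement. The following is a minimal sketch that rebuilds the same DataFrame so it can run on its own; the variable name nan_counts is only illustrative:

```python
import numpy as np
import pandas as pd

# recreate the DataFrame from the example
df = pd.DataFrame({'team': ['A', 'B', ' ', 'D', 'E', ' ', 'G', 'H'],
                   'position': [' ', 'G', 'G', 'F', 'F', ' ', 'C', 'C'],
                   'points': [5, 7, 7, 9, 12, 9, 9, 4],
                   'rebounds': [11, 8, 10, 6, 6, 5, 9, 12]})

# replace empty values with NaN
df = df.replace(r'^\s*$', np.nan, regex=True)

# count NaN values in each column
nan_counts = df.isna().sum()
print(nan_counts)
```

The team and position columns should each report 2 missing values, while the numeric columns report 0.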

Note: The complete documentation for the replace() function is available in the official pandas documentation.
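
The replace() function is also available on a single column, which may be useful when only some columns should be cleaned. The following is a minimal sketch on a smaller, hypothetical DataFrame rather than the one from the example above:

```python
import numpy as np
import pandas as pd

# hypothetical DataFrame with blank strings in two columns
df = pd.DataFrame({'team': ['A', ' ', 'C'],
                   'position': [' ', 'G', 'F']})

# replace whitespace-only strings in the 'team' column only
df['team'] = df['team'].replace(r'^\s*$', np.nan, regex=True)

# 'team' now has one NaN, while 'position' is left untouched
print(df['team'].isna().sum())
print(df['position'].isna().sum())
```

Assigning the result back to df['team'] keeps the change limited to that column.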

Additional Resources

The following tutorials explain how to perform other common tasks in pandas:

How to Impute Missing Values in Pandas
How to Count Missing Values in Pandas
How to Fill NaN Values with Mean in Pandas
