Pandas: How to Strip Whitespace from Columns


You can use the following methods to strip leading and trailing whitespace from string columns in a pandas DataFrame:

Method 1: Strip Whitespace from One Column

df['my_column'] = df['my_column'].str.strip()

Method 2: Strip Whitespace from All String Columns

df = df.apply(lambda x: x.str.strip() if x.dtype == 'object' else x)

The following examples show how to use each method in practice with this pandas DataFrame:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['Mavs', ' Heat', ' Nets ', 'Cavs', 'Hawks', 'Jazz '],
                   'position': ['Point Guard', ' Small Forward', 'Center  ',
                                'Power Forward', ' Point Guard ', 'Center'],
                   'points': [11, 8, 10, 6, 22, 29]})

#view DataFrame
print(df)

     team        position  points
0    Mavs     Point Guard      11
1    Heat   Small Forward       8
2   Nets         Center        10
3    Cavs   Power Forward       6
4   Hawks    Point Guard       22
5   Jazz           Center      29
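Because the printed table right-aligns text, the extra spaces are easy to miss. As a minimal sketch of one way to make them visible, the code below converts the team column to a list (so each value is shown in quotes) and flags the values that change when stripped. The variable names teams and has_padding are illustrative only.

```python
import pandas as pd

#recreate the team column from the DataFrame above
df = pd.DataFrame({'team': ['Mavs', ' Heat', ' Nets ', 'Cavs', 'Hawks', 'Jazz ']})

#show each value in quotes so leading and trailing spaces are visible
teams = df['team'].tolist()
print(teams)
#['Mavs', ' Heat', ' Nets ', 'Cavs', 'Hawks', 'Jazz ']

#flag the values that differ from their stripped version
has_padding = df['team'] != df['team'].str.strip()
print(has_padding.tolist())
#[False, True, True, False, False, True]
```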

Example 1: Strip Whitespace from One Column

The following code shows how to strip leading and trailing whitespace from every string in the position column:

#strip whitespace from position column
df['position'] = df['position'].str.strip()

#view updated DataFrame
print(df)

     team       position  points
0    Mavs    Point Guard      11
1    Heat  Small Forward       8
2   Nets          Center      10
3    Cavs  Power Forward       6
4   Hawks    Point Guard      22
5   Jazz          Center      29

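Notice that the leading and trailing whitespace has been stripped from each string in the position column. Whitespace inside a string, such as the space in Point Guard, is left untouched, and the team column is unchanged.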
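If only one side of each string should be cleaned, pandas also provides str.lstrip() and str.rstrip(). The short sketch below compares the three on a few of the position values. The Series s and the names both, left and right are illustrative.

```python
import pandas as pd

#a few position values with leading and/or trailing spaces
s = pd.Series([' Point Guard ', 'Center  ', ' Small Forward'])

both = s.str.strip().tolist()    #removes leading and trailing whitespace
left = s.str.lstrip().tolist()   #removes leading whitespace only
right = s.str.rstrip().tolist()  #removes trailing whitespace only

print(both)
#['Point Guard', 'Center', 'Small Forward']
print(left)
#['Point Guard ', 'Center  ', 'Small Forward']
print(right)
#[' Point Guard', 'Center', ' Small Forward']
```

In all three cases the space between the words is kept, since these methods only act on the ends of each string.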

Example 2: Strip Whitespace from All String Columns

The following code shows how to strip whitespace from each string in all string columns of the DataFrame:

#strip whitespace from all string columns
df = df.apply(lambda x: x.str.strip() if x.dtype == 'object' else x)

#view updated DataFrame
print(df)

    team       position  points
0   Mavs    Point Guard      11
1   Heat  Small Forward       8
2   Nets         Center      10
3   Cavs  Power Forward       6
4  Hawks    Point Guard      22
5   Jazz         Center      29

Notice that the leading and trailing whitespace has been stripped from both the team and position columns, which are the two string columns in the DataFrame. The points column is numeric, so it is returned unchanged.
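One caveat: the x.dtype == 'object' check only matches columns stored with the object dtype. A column that uses pandas' dedicated string dtype does not compare equal to 'object', so it would be skipped and left unstripped. As a hedged sketch of an alternative, the code below uses pd.api.types.is_string_dtype for the check instead. The helper name strip_string_columns is hypothetical and is not part of pandas.

```python
import pandas as pd
from pandas.api.types import is_string_dtype

def strip_string_columns(df):
    #hypothetical helper: strip each column that pandas identifies as string data
    return df.apply(lambda col: col.str.strip() if is_string_dtype(col) else col)

df = pd.DataFrame({'team': [' Heat', ' Nets ', 'Jazz '],
                   'points': [8, 10, 29]})

#store the team column with the dedicated string dtype instead of object
df['team'] = df['team'].astype('string')

clean = strip_string_columns(df)
print(clean['team'].tolist())
#['Heat', 'Nets', 'Jazz']
print(clean['points'].tolist())
#[8, 10, 29]
```

The exact default dtype for string columns depends on the pandas version in use, so it is worth checking df.dtypes before deciding which test to rely on.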

Additional Resources

The following tutorials explain how to perform other common operations in pandas:

Pandas: How to Select Columns Containing a Specific String
Pandas: How to Filter Rows Based on String Length
How to Create Pandas DataFrame from a String
