You can use the following methods to select columns in a pandas DataFrame based on partial matching:
Method 1: Select Columns Based on One Partial Match
#select columns that contain 'team'
df.loc[:, df.columns.str.contains('team')]
Method 2: Select Columns Based on Multiple Partial Matches
#select columns that contain 'team' or 'rebounds'
df.loc[:, df.columns.str.contains('team|rebounds')]
The following examples show how to use each method with the following pandas DataFrame:
import pandas as pd #create DataFrame df = pd.DataFrame({'team_name': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'], 'team_points': [5, 7, 7, 9, 12, 9, 9, 4], 'assists': [11, 8, 10, 6, 6, 5, 9, 12], 'rebounds': [6, 7, 7, 6, 10, 12, 10, 9]}) #view DataFrame print(df) team_name team_points assists rebounds 0 A 5 11 6 1 A 7 8 7 2 A 7 10 7 3 A 9 6 6 4 B 12 6 10 5 B 9 5 12 6 B 9 9 10 7 B 4 12 9
Example 1: Select Columns Based on One Partial Match
The following code shows how to select all columns in the pandas DataFrame that contain ‘team’ in the column name:
#select columns that contain 'team'
df_team_cols = df.loc[:, df.columns.str.contains('team')]
#view results
print(df_team_cols)
team_name team_points
0 A 5
1 A 7
2 A 7
3 A 9
4 B 12
5 B 9
6 B 9
7 B 4
Notice that both columns that contain ‘team’ in the name are returned.
Example 2: Select Columns Based on Multiple Partial Matches
The following code shows how to select all columns in the pandas DataFrame that contain ‘team’ or ‘rebounds’ in the column name:
#select columns that contain 'team' or 'rebounds'
df_team_rebs = df.loc[:, df.columns.str.contains('team|rebounds')]
#view results
print(df_team_rebs)
team_name team_points rebounds
0 A 5 6
1 A 7 7
2 A 7 7
3 A 9 6
4 B 12 10
5 B 9 12
6 B 9 10
7 B 4 9
All columns that contain either ‘team’ or ‘rebounds’ in the name are returned.
Note: The | operator represents “OR” in pandas.
Feel free to use as many of these operators as you’d like to search for as many partial string matches as you’d like.
Additional Resources
The following tutorials explain how to perform other common operations in pandas:
How to Select Columns by Name in Pandas
How to Select Columns by Index in Pandas
How to Select Columns by Data Type in Pandas