You can use the following basic syntax to select rows in a pandas DataFrame where the value in a column does not start with a specific string:
df[~df.my_column.str.startswith(('this', 'that'))]
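This expression selects all rows in the DataFrame where the value in the column called my_column does not start with either the string ‘this’ or the string ‘that’. Passing a tuple to startswith() checks several prefixes at once, and the ~ operator negates the resulting Boolean mask.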
The following example shows how to use this syntax in practice.
Example: Select Rows that Do Not Start with String in Pandas
Suppose we have the following pandas DataFrame that contains information about sales for various stores:
import pandas as pd

#create DataFrame
df = pd.DataFrame({'store': ['Upper East', 'Upper West', 'Lower East', 'West', 'CTR'],
                   'sales': [150, 224, 250, 198, 177]})

#view DataFrame
print(df)

        store  sales
0  Upper East    150
1  Upper West    224
2  Lower East    250
3        West    198
4         CTR    177
We can use the following syntax to select all rows in the DataFrame where the value in the store column does not start with the strings ‘Upper’ or ‘Lower’:
#select all rows where store does not start with 'Upper' or 'Lower'
df[~df.store.str.startswith(('Upper', 'Lower'))]

   store  sales
3   West    198
4    CTR    177
Notice that the only rows returned are the ones where the store column does not start with ‘Upper’ or ‘Lower.’
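If the column contains missing values, startswith() can return missing entries in the mask, and negating such a mask with ~ may raise an error or give unexpected results. The following is a minimal sketch of one way to handle this, using the na argument of startswith(). The DataFrame df_na and its missing store name are made up for illustration and are not part of the example above:

```python
import pandas as pd

# hypothetical variant of the example data with one missing store name
df_na = pd.DataFrame({'store': ['Upper East', None, 'Lower East', 'West', 'CTR'],
                      'sales': [150, 224, 250, 198, 177]})

# na=False treats missing values as "does not start with the prefix",
# so the mask contains only True/False and can be safely negated with ~
mask = df_na['store'].str.startswith(('Upper', 'Lower'), na=False)

# keep rows whose store does not start with 'Upper' or 'Lower'
result = df_na[~mask]
print(result)
```

With na=False, the row with the missing store name is kept in the result. Passing na=True instead would drop it, so the right choice depends on how you want missing values to be treated.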
If you’d like, you can also define the tuple of strings outside of the startswith() function:
#define tuple of strings
some_strings = ('Upper', 'Lower')
#select all rows where store does not start with strings in tuple
df[~df.store.str.startswith(some_strings)]

   store  sales
3   West    198
4    CTR    177
This produces the same result as the previous method, and it can be easier to read and reuse when the list of prefixes is long.
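Note that startswith() is case-sensitive, so a value such as ‘upper West’ would not match the prefix ‘Upper’. One possible way to ignore case, sketched below, is to lowercase the column before comparing it against lowercase prefixes. The DataFrame df_mixed and the variable prefixes are hypothetical names used only for this illustration:

```python
import pandas as pd

# hypothetical data with inconsistent capitalization in the store column
df_mixed = pd.DataFrame({'store': ['Upper East', 'upper West', 'LOWER East', 'West', 'CTR'],
                         'sales': [150, 224, 250, 198, 177]})

# define lowercase prefixes to exclude
prefixes = ('upper', 'lower')

# lowercase the column first so the comparison ignores case
mask = df_mixed['store'].str.lower().str.startswith(prefixes)

# select rows where store does not start with any prefix, regardless of case
result = df_mixed.loc[~mask]
print(result)
```

Only the comparison uses the lowercased values, so the rows returned keep their original capitalization.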
Note: You can find the complete documentation for the startswith function in the pandas API reference under Series.str.startswith.
Additional Resources
The following tutorials explain how to perform other common tasks in pandas:
Pandas: How to Filter Rows Based on String Length
Pandas: How to Check if Column Contains String
Pandas: How to Concatenate Strings from Using GroupBy