Pandas: How to Use startswith in query() Method


Often you may want to use the startswith() function within the query() method in pandas to filter for rows in a DataFrame where a column starts with a specific string.

You can use the following syntax to do so:

df.query('team.str.startswith("Ma")')

This particular query filters for rows in a pandas DataFrame where the team column starts with the string ‘Ma’.

Note: Be sure to use the syntax str.startswith() instead of just startswith() or else you will receive an error. This is a common mistake to avoid.

The following example shows how to use  this syntax in practice.

Example: How to Use startswith in query() Method in Pandas

Suppose we have the following pandas DataFrame that contains information about various basketball players:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['Mavs', 'Magic', 'Nets', 'Heat', 'Mavs', 'Kings'],
                   'points': [18, 22, 19, 14, 14, 11],
                   'assists': [5, 7, 7, 9, 12, 9],
                   'rebounds': [11, 8, 10, 6, 6, 5]})

#view DataFrame
print(df)

    team  points  assists  rebounds
0   Mavs      18        5        11
1  Magic      22        7         8
2   Nets      19        7        10
3   Heat      14        9         6
4   Mavs      14       12         6
5  Kings      11        9         5

Now suppose that we would like to query for the rows where the value in the team column starts with ‘Ma’.

We can use the following syntax to do so:

#query for rows where team starts with 'Ma'
df.query('team.str.startswith("Ma")')

	team	points	assists	rebounds
0	Mavs	18	5	11
1	Magic	22	7	8
4	Mavs	14	12	6

Notice that the query() function returns all rows where the value in the team column starts with ‘Ma’.

Specifically we can see that this returns the rows where the team column is equal to either Mavs or Magic, both of which start with ‘Ma’ in their team name.

Note that we can use an or operator to search for rows where the value in the team column starts with one of several different patterns.

For example, we can use the following syntax to filter for rows where the value in the team columns starts with either ‘Ma’ or ‘Kin’ at the start of the string:

#query for rows where team starts with 'Ma' or 'Kin'
df.query('team.str.startswith("Ma") or team.str.startswith("Kin")')

team	points	assists	rebounds
0	Mavs	18	5	11
1	Magic	22	7	8
4	Mavs	14	12	6
5	Kings	11	9	5

We can see that this returns four rows where the string in the team column starts with either ‘Ma’ or ‘Kin’.

Specifically, we can see that the team names included in the output have the following names:

  • Magic
  • Mavs
  • Kings

Note that you can combine as many or operators as you would like to search for strings that start with even more patterns.

It’s also worth noting that the str.startswith() function is case-sensitive.

Thus, if we instead searched for ‘ma’ as our starting pattern then this would not return either of the rows that start with Mavs or Magic since these team names don’t match the case of the pattern that we searched for.

Keep this in mind when using the str.startswith() function when searching for string patterns.

Note: You can find the complete documentation for the pandas query() function here.

Additional Resources

The following tutorials explain how to perform other common tasks in pandas:

Pandas: How to Filter Rows Based on String Length
Pandas: How to Drop Rows Based on Condition
Pandas: How to Use “NOT IN” Filter

Featured Posts

Leave a Reply

Your email address will not be published. Required fields are marked *