How to Use the fullmatch() Function in Pandas


Often you may want to check if each string in a pandas Series fully matches a specific regular expression.

The easiest way to do so is by using the fullmatch() function, which performs this exact task.

The fullmatch() function uses the following syntax:

pandas.Series.str.fullmatch(pat, case=True, flags=0, na=None)

where:

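  • pat: The regular expression that the entire string must match
  • case: Whether to be case-sensitive or not (default is True)
  • flags: Flags from the re module to use, such as re.IGNORECASE (default is 0, which means no flags)
  • na: Fill value to use for missing values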
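
As a rough illustration of the flags and na arguments, the following sketch uses a small hypothetical Series (not the DataFrame from the example below) that contains one missing value. It assumes that na=False is the fill value you want for missing entries:

```python
import re
import pandas as pd

#create a hypothetical Series with one missing value
s = pd.Series(['Mavs', None, 'magic'])

#na=False fills the result for the missing entry with False
with_na = s.str.fullmatch(r'Ma.+', na=False)

#flags=re.IGNORECASE is another way to ignore case
ignore_case = s.str.fullmatch(r'Ma.+', flags=re.IGNORECASE, na=False)

print(with_na.tolist())
print(ignore_case.tolist())
```

In this sketch the missing value is reported as False in both results, and the string magic only matches when the re.IGNORECASE flag is passed.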

The following example shows how to use the fullmatch() function in practice with a pandas DataFrame.

Example: How to Use the fullmatch() Function in Pandas

Suppose we create the following pandas DataFrame that contains information about various basketball players, including the team they play for and their average points scored per game:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['Mavs', 'Magic', 'Heat', 'mavs', 'Nets', 'hawks'],
                   'points': [29, 34, 12, 15, 22, 40]})

#view DataFrame
print(df)

    team  points
0   Mavs      29
1  Magic      34
2   Heat      12
3   mavs      15
4   Nets      22
5  hawks      40

Suppose that we would like to check if each string in the team column starts with the string ‘Ma’ and is followed by at least one more character.

Because the fullmatch() function only returns True when the entire string matches the regular expression, we use the pattern Ma.+ in which .+ matches the one or more characters that follow Ma.

We can use the fullmatch() function with the following syntax to do so:

#check if each string in team column starts with 'Ma'
df['team'].str.fullmatch(r'Ma.+')

0     True
1     True
2    False
3    False
4    False
5    False
Name: team, dtype: bool

Notice that we receive the following results as output:

  • The first row returns True because Mavs starts with Ma.
  • The second row returns True because Magic starts with Ma.
  • The third row returns False because Heat does not start with Ma.
  • The fourth row returns False because mavs does not start with Ma.
  • The fifth row returns False because Nets does not start with Ma.
  • The sixth row returns False because hawks does not start with Ma.
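
To see why the pattern needs the trailing .+ the following sketch compares fullmatch() with the related match() and startswith() methods on the same team names. The variable names are only illustrative:

```python
import pandas as pd

#same team names as in the DataFrame above
teams = pd.Series(['Mavs', 'Magic', 'Heat', 'mavs', 'Nets', 'hawks'])

#fullmatch() requires the whole string to match, so 'Ma' alone matches none of these
exact = teams.str.fullmatch(r'Ma')

#match() only requires the pattern to match at the start of the string
prefix = teams.str.match(r'Ma')

#startswith() checks for a literal prefix
literal = teams.str.startswith('Ma')

print(exact.tolist())
print(prefix.tolist())
print(literal.tolist())
```

Here fullmatch(r'Ma') returns False for every row because no team name is exactly Ma, while match() and startswith() return True for Mavs and Magic. If you only need a prefix check, match() or startswith() may be the simpler choice.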

Notice that we didn’t specify a value for the case argument of the fullmatch() function, which means that we used the default setting of using case-sensitive matching.

This explains why the string mavs returned False: its lowercase m does not match the uppercase M that we specified in the regular expression.

Note that we could specify case=False to instead use case-insensitive matching.

We can use the following syntax to do so:

#check if each string in team column starts with 'Ma', ignoring case
df['team'].str.fullmatch(r'Ma.+', case=False)

0     True
1     True
2    False
3     True
4    False
5    False
Name: team, dtype: bool

Notice that the fourth row now returns a value of True because the string mavs does match a starting pattern of Ma once case is ignored.

Note that we used a regular expression to check if strings started with a specific pattern, but you can use the fullmatch() function with any regular expression that you would like.
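
For instance, the following sketch uses a different pattern, chosen here only for illustration, to find team names made up of one uppercase letter followed by exactly three lowercase letters. It then uses the result as a boolean mask to filter the DataFrame:

```python
import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['Mavs', 'Magic', 'Heat', 'mavs', 'Nets', 'hawks'],
                   'points': [29, 34, 12, 15, 22, 40]})

#True only for one uppercase letter followed by exactly three lowercase letters
four_letter = df['team'].str.fullmatch(r'[A-Z][a-z]{3}')

#use the boolean result to keep only the matching rows
filtered = df[four_letter]

print(filtered)
```

Only the rows for Mavs, Heat and Nets are kept, because Magic and hawks have five letters and mavs starts with a lowercase letter.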

Note: You can find the complete documentation for the fullmatch() function in the official pandas documentation.

Additional Resources

The following tutorials explain how to perform other common tasks in pandas:

How to Use the Rolling.apply() Function in Pandas
How to Use the nunique() Function in Pandas
How to Use the get_loc() Function in Pandas
How to Use idxmin() Function in Pandas
