How to Use the findall() Function in Pandas


Often you may want to find all occurrences of a particular string or pattern in each element of a pandas Series.

The easiest way to do so is by using the pandas Series.str.findall() method, which uses the following syntax:

Series.str.findall(pat, flags=0)

where:

  • pat: The pattern (a string or regular expression) to search for
  • flags: Flags from the re module, such as re.IGNORECASE to ignore case. The default of 0 means no flags.
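As a quick illustration of what the method returns, here is a minimal sketch using a small made-up Series (the fruit names are hypothetical and not part of the examples in this article). Each element of the result is a list holding every non-overlapping match found in that string, or an empty list when nothing matches:

```python
import pandas as pd

# hypothetical Series used only for this illustration
fruits = pd.Series(['banana', 'apple', 'cherry'])

# find every occurrence of 'an' in each element
matches = fruits.str.findall('an')

# convert to a plain Python list to inspect the result
print(matches.tolist())
# [['an', 'an'], [], []]
```

Note that 'banana' produces two matches because the pattern occurs twice in that string.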

Here are the most common ways to use this function in practice:

Method 1: Find All Occurrences of String (case-sensitive)

my_series.str.findall('this_pattern')

This particular example will find all occurrences of ‘this_pattern’ in each element of the pandas Series named my_series, matching the exact case.
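Because the pattern is interpreted as a regular expression, characters such as $ or . carry special meaning. The following is a minimal sketch, assuming you want to search for literal text that contains such characters: Python's re.escape() function can escape the search string first. The prices Series here is made up for illustration.

```python
import re
import pandas as pd

# hypothetical Series used only for this illustration
prices = pd.Series(['cost: $5', 'cost: 5', '$5 and $5'])

# escape the literal text so that '$' is not treated as a regex anchor
literal = re.escape('$5')

result = prices.str.findall(literal)
print(result.tolist())
# [['$5'], [], ['$5', '$5']]
```

Without the escaping step, the $ would be read as an end-of-string anchor rather than a dollar sign.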

Method 2: Find All Occurrences of String (not case-sensitive)

import re
my_series.str.findall('this_pattern', flags=re.IGNORECASE) 

This particular example will find all occurrences of ‘this_pattern’ in the pandas series named my_series, regardless of case.

Method 3: Find All Occurrences of Strings that Begin with Value

my_series.str.findall('^this')

This particular example will return a match for each element of the pandas Series named my_series that begins with ‘this’.

Method 4: Find All Occurrences of Strings that End with Value

my_series.str.findall('this$')

This particular example will return a match for each element of the pandas Series named my_series that ends with ‘this’.

The following examples show how to use each method in practice with the following pandas Series:

import pandas as pd

#create pandas Series
my_series = pd.Series(['Mavs', 'MAVS', 'Cavs', 'Magic', 'cavs'])

#view Series
print(my_series)

0     Mavs
1     MAVS
2     Cavs
3    Magic
4     cavs
dtype: object

Example 1: Find All Occurrences of String (case-sensitive)

The following code shows how to find all occurrences of the string ‘Cavs’ in the pandas Series:

my_series.str.findall('Cavs') 

0        []
1        []
2    [Cavs]
3        []
4        []
dtype: object

From the output we can see that the exact string ‘Cavs’ occurs only in index position 2 of the Series.

Example 2: Find All Occurrences of String (not case-sensitive)

The following code shows how to find all occurrences of the string ‘Cavs’ (regardless of case) in the pandas Series:

import re
my_series.str.findall('Cavs', flags=re.IGNORECASE) 

0        []
1        []
2    [Cavs]
3        []
4    [cavs]
dtype: object

From the output we can see that the string ‘Cavs’ (regardless of case) occurs in index positions 2 and 4 of the Series.
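Since the result is a Series of lists, it can be handy to turn it into match counts or use it to filter the original Series. The sketch below is one possible way to do this, using map(len) to count the matches in each list. The variable names found, n_matches and matching are chosen here for illustration.

```python
import re
import pandas as pd

my_series = pd.Series(['Mavs', 'MAVS', 'Cavs', 'Magic', 'cavs'])

# lists of case-insensitive matches for each element
found = my_series.str.findall('Cavs', flags=re.IGNORECASE)

# count the number of matches in each element
n_matches = found.map(len)
print(n_matches.tolist())
# [0, 0, 1, 0, 1]

# keep only the elements with at least one match
matching = my_series[n_matches > 0]
print(matching.tolist())
# ['Cavs', 'cavs']
```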

Example 3: Find All Occurrences of Strings that Begin with Value

The following code shows how to find all occurrences of strings that begin with ‘Ma’ in the pandas Series:

my_series.str.findall('^Ma') 

0    [Ma]
1      []
2      []
3    [Ma]
4      []
dtype: object

From the output we can see that the strings in index positions 0 and 3 start with ‘Ma’ (case-sensitive).

Note: In regex, the ^ symbol anchors the match to the start of the string, so the pattern only matches strings that begin with the characters that follow the ^ symbol.
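The ^ anchor can also be combined with the flags argument. As a small sketch under the same setup, passing re.IGNORECASE makes the "begins with" check ignore case, so ‘MAVS’ is matched as well:

```python
import re
import pandas as pd

my_series = pd.Series(['Mavs', 'MAVS', 'Cavs', 'Magic', 'cavs'])

# match 'Ma' at the start of each string, ignoring case
starts_with_ma = my_series.str.findall('^Ma', flags=re.IGNORECASE)
print(starts_with_ma.tolist())
# [['Ma'], ['MA'], [], ['Ma'], []]
```

Each match is returned with the casing it has in the original string, which is why index position 1 shows ‘MA’ rather than ‘Ma’.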

Example 4: Find All Occurrences of Strings that End with Value

The following code shows how to find all occurrences of strings that end with ‘avs’ in the pandas Series:

my_series.str.findall('avs$') 

0    [avs]
1       []
2    [avs]
3       []
4    [avs]
dtype: object

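From the output we can see that the strings in index positions 0, 2 and 4 all end in ‘avs’ (case-sensitive).

Note: In regex, the $ symbol anchors the match to the end of the string, so the pattern only matches strings that end with the characters that precede the $ symbol.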
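The ^ and $ anchors can be used together to require that the whole string matches the pattern. The sketch below is an illustration only: the character class [MC] (meaning "either M or C") is an assumption added here and does not appear in the methods above.

```python
import pandas as pd

my_series = pd.Series(['Mavs', 'MAVS', 'Cavs', 'Magic', 'cavs'])

# match strings that consist of 'M' or 'C' followed by 'avs' and nothing else
whole_match = my_series.str.findall('^[MC]avs$')
print(whole_match.tolist())
# [['Mavs'], [], ['Cavs'], [], []]
```

Here ‘MAVS’ and ‘cavs’ return empty lists because the match is case-sensitive.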

You can find the complete documentation for the findall() function in the official pandas documentation for Series.str.findall.

