How to Use “NOT IN” Filter in Pandas (With Examples)


You can use the following syntax to perform a “NOT IN” filter in a pandas DataFrame:

df[~df['col_name'].isin(values_list)]

Note that the values in values_list can be either numeric values or character values.

The following examples show how to use this syntax in practice.

Example 1: Perform “NOT IN” Filter with One Column

The following code shows how to filter a pandas DataFrame for rows where a team name is not in a list of names:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'B', 'B', 'B', 'B', 'C', 'C'],
                   'points': [25, 12, 15, 14, 19, 23, 25, 29],
                   'assists': [5, 7, 7, 9, 12, 9, 9, 4],
                   'rebounds': [11, 8, 10, 6, 6, 5, 9, 12]})

#define list of teams we don't want
values_list = ['A', 'B']

#filter for rows where team name is not in list
df[~df['team'].isin(values_list)]

        team	points	assists	rebounds
6	C	25	9	9
7	C	29	4	12

And the following code shows how to filter a pandas DataFrame for rows where the ‘points’ column does not contain certain values:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'B', 'B', 'B', 'B', 'C', 'C'],
                   'points': [25, 12, 15, 14, 19, 23, 25, 29],
                   'assists': [5, 7, 7, 9, 12, 9, 9, 4],
                   'rebounds': [11, 8, 10, 6, 6, 5, 9, 12]})

#define list of values we don't want
values_list = [12, 15, 25]

#filter for rows where team name is not in list
df[~df['team'].isin(values_list)]

	team	points	assists	rebounds
3	B	14	9	6
4	B	19	12	6
5	B	23	9	5
7	C	29	4	12

Example 2: Perform “NOT IN” Filter with Multiple Columns

The following code shows how to filter a pandas DataFrame for rows where certain team names are not in one of several columns:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'star_team': ['A', 'A', 'B', 'B', 'B', 'B', 'C', 'C'],
                   'backup_team': ['B', 'B', 'C', 'C', 'D', 'D', 'D', 'E'],
                   'points': [25, 12, 15, 14, 19, 23, 25, 29],
                   'assists': [5, 7, 7, 9, 12, 9, 9, 4],
                   'rebounds': [11, 8, 10, 6, 6, 5, 9, 12]})

#define list of teams we don't want
values_list = ['C', 'E']

#filter for rows where team name is not in one of several columns
df[~df[['star_team', 'backup_team']].isin(values_list).any(axis=1)] 

        star_team backup_team  points	assists	rebounds
0	A	  B	       25	5	11
1	A	  B	       12	7	8
4	B	  D	       19	12	6
5	B	  D	       23	9	5

Notice that we filtered out every row where teams ‘C’ or ‘E’ appeared in either the ‘star_team’ column or the ‘backup_team’ column.

Additional Resources

The following tutorials explain how to perform other common filtering operations in pandas:

How to Use “Is Not Null” in Pandas
How to Filter a Pandas DataFrame by Column Values
How to Filter Pandas DataFrame Rows by Date
How to Filter a Pandas DataFrame on Multiple Conditions

Leave a Reply

Your email address will not be published. Required fields are marked *