You can use the following basic syntax to compare the values in three columns in pandas:
df['all_matching'] = df.apply(lambda x: x.col1 == x.col2 == x.col3, axis = 1)
This syntax creates a new column called all_matching that contains True for each row in which all three columns hold the same value and False otherwise.
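The lambda relies on Python's chained comparison, where a == b == c is evaluated as (a == b) and (b == c). A minimal sketch in plain Python, using a hypothetical helper named all_three_equal that is not part of pandas:

```python
def all_three_equal(a, b, c):
    # Python evaluates a == b == c as (a == b) and (b == c)
    return a == b == c

# all three values are equal
print(all_three_equal(4, 4, 4))

# the middle value differs, so the first comparison already fails
print(all_three_equal(0, 2, 0))
```

The chain only works on scalar values, which is why the syntax above uses apply() with axis = 1 to evaluate it one row at a time.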
The following example shows how to use this syntax in practice.
Example: Compare Three Columns in Pandas
Suppose we have the following pandas DataFrame with three columns:
import pandas as pd

#create DataFrame
df = pd.DataFrame({'A': [4, 0, 3, 3, 6, 8, 7, 9, 12],
                   'B': [4, 2, 3, 5, 6, 4, 7, 7, 12],
                   'C': [4, 0, 3, 5, 5, 10, 7, 9, 12]})

#view DataFrame
print(df)

    A   B   C
0   4   4   4
1   0   2   0
2   3   3   3
3   3   5   5
4   6   6   5
5   8   4  10
6   7   7   7
7   9   7   9
8  12  12  12
We can use the following code to create a new column called all_matching that returns True if all three columns match in a given row and False if they do not:
#create new column that displays whether or not all column values match
df['all_matching'] = df.apply(lambda x: x.A == x.B == x.C, axis = 1)

#view updated DataFrame
print(df)

    A   B   C  all_matching
0   4   4   4          True
1   0   2   0         False
2   3   3   3          True
3   3   5   5         False
4   6   6   5         False
5   8   4  10         False
6   7   7   7          True
7   9   7   9         False
8  12  12  12          True
The new column called all_matching shows whether or not the values in all three columns match in a given row.
For example:
- All three values match in the first row (4, 4, 4), so True is returned.
- Not every value matches in the second row (A and C are 0 but B is 2), so False is returned.
And so on.
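The same result can also be computed without apply() by comparing whole columns at once, which usually runs faster on large DataFrames. The following is a sketch of that alternative, not the syntax shown above. The column name all_matching_any_n and the list cols are illustrative names chosen here:

```python
import pandas as pd

#create DataFrame
df = pd.DataFrame({'A': [4, 0, 3, 3, 6, 8, 7, 9, 12],
                   'B': [4, 2, 3, 5, 6, 4, 7, 7, 12],
                   'C': [4, 0, 3, 5, 5, 10, 7, 9, 12]})

#compare whole columns at once instead of calling a lambda for each row
df['all_matching'] = (df['A'] == df['B']) & (df['B'] == df['C'])

#generalization: a row matches if every listed column equals the first one
cols = ['A', 'B', 'C']
df['all_matching_any_n'] = df[cols].eq(df[cols[0]], axis=0).all(axis=1)

#view updated DataFrame
print(df)
```

Both new columns should contain the same True and False values as the apply() version. The second form extends to any number of columns by adding names to cols.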
Additional Resources
The following tutorials explain how to perform other common tasks in pandas:
How to Rename Columns in Pandas
How to Add a Column to a Pandas DataFrame
How to Change the Order of Columns in Pandas DataFrame