You can use the following methods to compare dates between two columns in a pandas DataFrame:
Method 1: Add New Column to DataFrame that Shows Date Comparison
df['met_due_date'] = df['comp_date'] < df['due_date']
This particular example adds a new column called met_due_date that returns True or False depending on whether the date in the comp_date column is before the date in the due_date column.
Method 2: Filter DataFrame Based on Date Comparison
df_met_due_date = df[df['comp_date'] < df['due_date']]
This particular example filters the DataFrame to only keep rows where the date in the comp_date column is before the date in the due_date column.
The following examples show how to use each of these methods in practice with the following pandas DataFrame:
import pandas as pd #create DataFrame df = pd.DataFrame({'task': ['A', 'B', 'C', 'D'], 'due_date': ['4-15-2022', '5-19-2022', '6-14-2022', '10-24-2022'], 'comp_date': ['4-14-2022', '5-23-2022', '6-24-2022', '10-7-2022']}) #convert due_date and comp_date columns to datetime format df[['due_date', 'comp_date']] = df[['due_date', 'comp_date']].apply(pd.to_datetime) #view DataFrame print(df) task due_date comp_date 0 A 2022-04-15 2022-04-14 1 B 2022-05-19 2022-05-23 2 C 2022-06-14 2022-06-24 3 D 2022-10-24 2022-10-07
Example 1: Add New Column to DataFrame that Shows Date Comparison
The following code shows how to add a new column called met_due_date that returns True or False depending on whether the date in the comp_date column is before the date in the due_date column.
import pandas as pd #create new column that shows if completion date is before due date df['met_due_date'] = df['comp_date'] < df['due_date'] #view updated DataFrame print(df) task due_date comp_date met_due_date 0 A 2022-04-15 2022-04-14 True 1 B 2022-05-19 2022-05-23 False 2 C 2022-06-14 2022-06-24 False 3 D 2022-10-24 2022-10-07 True
For each row in the DataFrame, the new met_due_date column shows whether the date in the comp_date column is before the date in the due_date column.
For example, we can see that task A had a due date of 4/15/2022 and a completion date of 4/14/2022.
Since the completion date was before the due date, the value in the met_due_date column is True.
Example 2: Filter DataFrame Based on Date Comparison
The following code shows how to filter the DataFrame to only contain rows where the date in the comp_date column is before the date in the due_date column.
import pandas as pd #filter for rows where completion date is before due date df_met_due_date = df[df['comp_date'] < df['due_date']] #view results print(df_met_due_date) task due_date comp_date 0 A 2022-04-15 2022-04-14 3 D 2022-10-24 2022-10-07
The new DataFrame has been filtered to only contain rows where the date in the comp_date column is before the date in the due_date column.
Additional Resources
The following tutorials explain how to perform other common operations in pandas:
How to Create a Date Range in Pandas
How to Convert Timestamp to Datetime in Pandas
How to Calculate a Difference Between Two Dates in Pandas