How to Fix in Pandas: SettingWithCopyWarning


One warning you may encounter when using pandas is:

SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.

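This warning appears when pandas encounters something called chained assignment: two indexing operations chained together with an assignment in a single statement, so that the value may be written to a temporary copy instead of the original DataFrame.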
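To make the idea concrete, here is a minimal sketch that puts a chained assignment next to the equivalent single-step .loc assignment. The small DataFrame and the names mask, caught, after_chained and after_loc are illustrative assumptions and are separate from the example used later in this article. The exact warning that pandas raises depends on the installed version.

```python
import warnings
import pandas as pd

#small illustrative DataFrame (hypothetical data)
df = pd.DataFrame({'A': [25, 12, 15], 'B': [5, 7, 7]})
mask = df['A'] > 20

#chained assignment: df[mask] builds a new object, then ['B'] = 0 writes to that object
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')  #record warnings instead of printing them
    df[mask]['B'] = 0

#the original DataFrame was not modified
after_chained = df['B'].tolist()
print(after_chained)  # [5, 7, 7]

#single-step assignment with .loc writes directly to df
df.loc[mask, 'B'] = 0
after_loc = df['B'].tolist()
print(after_loc)  # [0, 7, 7]

#names of any warnings recorded during the chained assignment
print([w.category.__name__ for w in caught])
```

The chained form leaves df untouched because the boolean selection df[mask] returns a separate object, which is the situation the warning is meant to flag.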

It’s important to note that this is merely a warning, not an error. Your code will still run, but the result may not be what you expect, because the assignment can end up on a copy and leave the original DataFrame unchanged.

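The quickest way to suppress this warning is to disable the check globally with the following bit of code. Note that this hides the message without changing how the assignment behaves: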

pd.options.mode.chained_assignment = None

The following example shows how to address this warning in practice.

How to Reproduce the Warning

Suppose we create the following pandas DataFrame:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'A': [25, 12, 15, 14, 19, 23, 25, 29],
                   'B': [5, 7, 7, 9, 12, 9, 9, 4],
                   'C': [11, 8, 10, 6, 6, 5, 9, 12]})

#view DataFrame
df

	A	B	C
0	25	5	11
1	12	7	8
2	15	7	10
3	14	9	6
4	19	12	6
5	23	9	5
6	25	9	9
7	29	4	12

Now suppose we create a new DataFrame that only contains column ‘A’ from the original DataFrame and we divide each value in column ‘A’ by 2:

#define new DataFrame
df2 = df[['A']]

#divide all values in column 'A' by 2
df2['A'] = df['A'] / 2

/srv/conda/envs/notebook/lib/python3.7/site-packages/ipykernel_launcher.py:2:
SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

We receive the SettingWithCopyWarning message because we set new values for column ‘A’ on a “slice” from the original DataFrame.

However, if we look at the new DataFrame we created, we’ll see that each value was in fact divided by 2:

#view new DataFrame
df2

     A
0 12.5
1 6.0
2 7.5
3 7.0
4 9.5
5 11.5
6 12.5
7 14.5

Although we received a warning message, pandas still did what we thought it would do.

How to Avoid the Warning

To avoid the warning, it’s recommended to use the .loc[row_indexer, col_indexer] syntax suggested in the warning message, as follows:

#define new DataFrame
df2 = df.loc[:, ['A']]

#divide each value in column 'A' by 2
df2['A'] = df['A'] / 2

#view result
df2

     A
0 12.5
1 6.0
2 7.5
3 7.0
4 9.5
5 11.5
6 12.5
7 14.5

The new DataFrame contains all of the values from column ‘A’ in the original DataFrame, divided by two, and no warning message appears.
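Another commonly used option is to take an explicit copy with the DataFrame .copy() method, which makes it clear that the new DataFrame is independent of the original. The sketch below rebuilds the same DataFrame so that it runs on its own. Whether the warning would have appeared without .copy() depends on the pandas version, so treat this as one possible approach rather than the only fix.

```python
import pandas as pd

#create DataFrame
df = pd.DataFrame({'A': [25, 12, 15, 14, 19, 23, 25, 29],
                   'B': [5, 7, 7, 9, 12, 9, 9, 4],
                   'C': [11, 8, 10, 6, 6, 5, 9, 12]})

#define new DataFrame as an explicit, independent copy
df2 = df[['A']].copy()

#divide each value in column 'A' by 2
df2['A'] = df2['A'] / 2

#view result in the new DataFrame
print(df2['A'].tolist())  # [12.5, 6.0, 7.5, 7.0, 9.5, 11.5, 12.5, 14.5]

#the original DataFrame is left untouched
print(df['A'].tolist())  # [25, 12, 15, 14, 19, 23, 25, 29]
```

Because df2 is an explicit copy, changes to it never reach df, and pandas has no reason to question which object the assignment was meant for.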

If we’d like to prevent the warning message from ever showing, we can use the following bit of code:

#prevent SettingWithCopyWarning message from appearing
pd.options.mode.chained_assignment = None
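Setting the option this way silences the warning for every later assignment in the session. If only one statement that you have already checked needs to be silenced, a narrower approach can be sketched with Python's standard warnings module. This is an assumption about what you might prefer, not a feature of the pandas option. It also ignores every warning raised inside the with block, not only SettingWithCopyWarning. The shortened DataFrame here is illustrative.

```python
import warnings
import pandas as pd

#create a shortened version of the DataFrame (illustrative data)
df = pd.DataFrame({'A': [25, 12, 15, 14],
                   'B': [5, 7, 7, 9]})

#define new DataFrame
df2 = df[['A']]

#silence warnings only for this one assignment
with warnings.catch_warnings():
    warnings.simplefilter('ignore')
    df2['A'] = df['A'] / 2

#view result
print(df2['A'].tolist())  # [12.5, 6.0, 7.5, 7.0]
```

Outside the with block, warnings behave as before, so other chained assignments elsewhere in the code are still reported.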

For an in-depth explanation of why chained assignment should be avoided, refer to the online pandas documentation.

Additional Resources

How to Fix: No module named pandas
How to Fix: No module named numpy
How to Fix: columns overlap but no suffix specified
