I intended to drop all rows in a dataframe that I no longer need using the following:
df = df[my_selection]
where my_selection is a series of boolean values.
Later when I tried to add a column as follows:
df['New column'] = pd.Series(data)
I got the well-known "SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead"
Does this mean that df is actually a slice of its former self?
Or why am I being accused of assigning values to a slice?
Demo code:
import pandas as pd
data = {
    'A': pd.Series(range(8)),
    'B': pd.Series(range(8, 0, -1)),
}
df = pd.DataFrame(data)
df
Output:
   A  B
0  0  8
1  1  7
2  2  6
3  3  5
4  4  4
5  5  3
6  6  2
7  7  1
This causes a warning:
my_selection = df['A'] < 4
df = df[my_selection]
df['C'] = pd.Series(range(4))
This does not create a warning:
df = pd.DataFrame(data)
df['C'] = pd.Series(range(8))
Should I be using df.drop?
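For comparison, here is a minimal sketch of two ways the filtering step could be written. The first takes an explicit copy after boolean indexing, which is the approach commonly suggested for this warning. The second uses df.drop with the index labels of the unwanted rows. The names df_full, rows_to_drop and df_dropped are hypothetical and only used for illustration. Whether the drop variant also avoids the warning may depend on the pandas version, so treat this as a sketch rather than a definitive answer.

```python
import pandas as pd

data = {
    'A': pd.Series(range(8)),
    'B': pd.Series(range(8, 0, -1)),
}
df_full = pd.DataFrame(data)  # hypothetical name for the unfiltered frame
my_selection = df_full['A'] < 4

# Variant 1: boolean indexing followed by an explicit copy, so the result
# is an independent DataFrame rather than something derived from df_full.
df = df_full[my_selection].copy()
df['C'] = pd.Series(range(4))

# Variant 2: drop the unwanted rows by their index labels.
# DataFrame.drop returns a new DataFrame unless inplace=True is passed.
rows_to_drop = df_full[~my_selection].index
df_dropped = df_full.drop(rows_to_drop)
df_dropped['C'] = pd.Series(range(4))

print(df)
print(df_dropped)
```

Note that assigning a Series to a column aligns on the index, so pd.Series(range(4)) only lines up here because the remaining rows happen to be labelled 0 to 3.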