I have a dataframe with various columns, and I want to check whether each row satisfies a condition. The condition comes from another CSV file, but here is a simplified example to illustrate my question:
The condition is having a price of 26000 or less.
import numpy as np
import pandas as pd

cars = {'Brand': ['Honda Civic', 'Toyota Corolla', 'Ford Focus', 'Audi A4'],
        'Price': [22000, 25000, 27000, 35000]
        }
mydata = pd.DataFrame(cars, columns=['Brand', 'Price'], index=['Car_1', 'Car_2', 'Car_3', 'Car_4'])
The data looks like this:
print(mydata)
                Brand  Price
Car_1     Honda Civic  22000
Car_2  Toyota Corolla  25000
Car_3      Ford Focus  27000
Car_4         Audi A4  35000
So I created another column filled with np.nan, and in a for loop I check whether each row satisfies the condition. If it does, I set that cell to True.
mydata['condition'] = np.nan
                Brand  Price  condition
Car_1     Honda Civic  22000        NaN
Car_2  Toyota Corolla  25000        NaN
Car_3      Ford Focus  27000        NaN
Car_4         Audi A4  35000        NaN
and my for loop is this:
for i in range(mydata.shape[0]):
    mydata.condition.iloc[i] = None
    if (mydata.Price.iloc[i] <= 26000):
        mydata.condition.iloc[i] = True
Now, mydata looks like this:
                Brand  Price condition
Car_1     Honda Civic  22000      True
Car_2  Toyota Corolla  25000      True
Car_3      Ford Focus  27000      None
Car_4         Audi A4  35000      None
and if I use dropna(), I get the result I want:
filtered_results = mydata.dropna()
                Brand  Price condition
Car_1     Honda Civic  22000      True
Car_2  Toyota Corolla  25000      True
My problem is that I get the following warning:
SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
iloc._setitem_with_indexer(indexer, value)
My question is: what is the proper and efficient way to assign a value to a dataframe cell, so as to avoid the above warning on this line:
mydata.condition.iloc[i] = True
I appreciate your help.
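For reference, here is one possible direction as a minimal sketch. It assumes the real condition can be written as a comparison on a whole column, as in the simplified example above. It replaces the loop with a single boolean mask, so there is no chained assignment of the form mydata.condition.iloc[i] = ... that could trigger the warning. Note one difference from the loop: non-matching rows hold False rather than None, so the filtering is done with the mask instead of dropna().

```python
import pandas as pd

cars = {
    'Brand': ['Honda Civic', 'Toyota Corolla', 'Ford Focus', 'Audi A4'],
    'Price': [22000, 25000, 27000, 35000],
}
mydata = pd.DataFrame(cars, index=['Car_1', 'Car_2', 'Car_3', 'Car_4'])

# Evaluate the condition for every row at once; the result is a boolean Series
# aligned with the index of mydata.
mydata['condition'] = mydata['Price'] <= 26000

# Keep only the rows where the condition holds (this replaces dropna()).
filtered_results = mydata[mydata['condition']]
print(filtered_results)
```

If the helper column is not needed, the filter can be written in one step as mydata[mydata['Price'] <= 26000].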