Error in pandas.loc calculation - Copy of a sliced dataframe

Question

I have two pandas dataframes, one named df1 and another named df2:

import pandas as pd

df1 = pd.DataFrame({'PERCENTAGE': [0.35, 0.1105, 0.0487, 0.98]})

df2 = df1.loc[df1['PERCENTAGE'] > 0.4]

I'm trying to create a new column in df2 using this code:

df2['NEW_COLUMN'] = 1 - df2['PERCENTAGE']

But I'm getting the following warning:

C:\PATH: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  """Entry point for launching an IPython kernel.

Is there a way to solve this without having to build this column directly in df1?

wwnde · Answer 1 · 2020-07-15T22:12:53.337


Let's try df.loc[query, column]:

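# Boolean mask for the rows of interest
m = df1['PERCENTAGE'] > 0.4
# Assign through .loc on df1 itself; rows outside the mask get NaN
df1.loc[m, 'NewColumn'] = 1 - df1.loc[m, 'PERCENTAGE']
# Dropping the NaN rows leaves only the masked rows
df2 = df1.dropna()

Output (df2):

   PERCENTAGE  NewColumn
3        0.98       0.02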
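
For the setup in the question, where df2 has to stay a separate dataframe and df1 should not get the new column, one option is to take an explicit copy when filtering. The message in the question is pandas' SettingWithCopyWarning: df2 was produced by indexing df1, so pandas cannot tell whether the assignment is meant to reach back into df1. A minimal sketch, reusing the names from the question and assuming an independent copy of the filtered rows is acceptable:

```python
import pandas as pd

df1 = pd.DataFrame({'PERCENTAGE': [0.35, 0.1105, 0.0487, 0.98]})

# .copy() makes df2 an independent dataframe, so the assignment
# below is unambiguous and df1 is left untouched.
df2 = df1.loc[df1['PERCENTAGE'] > 0.4].copy()
df2['NEW_COLUMN'] = 1 - df2['PERCENTAGE']

print(df2)
```

With this, df1 keeps only its PERCENTAGE column and df2 holds the single row above 0.4 plus NEW_COLUMN.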

edited Jul 15 '20 at 22:12

answered Jul 15 '20 at 21:37


It doesn't work the way I want because df2 should be a different dataframe, with only observations where percentage is > 0.4. And in that dataframe, create a new column like the one you created. – Caldass_ Jul 15 '20 at 22:05
Thank you for your answer! This works, but I was looking for a more elegant way of doing it, otherwise I would do that column in df1 and then just filter the > 0.4 as df2. – Caldass_ Jul 15 '20 at 23:38

Humbled by the opportunity to answer your question. Keep coding – wwnde Jul 15 '20 at 23:41
Hmm, it's just that I thought you should only upvote when the answer is a solution. I hope my "thank you" didn't bother you. – Caldass_ Jul 16 '20 at 00:21

Keep well and keep coding. Coding is fun – wwnde Jul 16 '20 at 00:27
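
On the request in the comments for something more compact: a possible alternative is to chain the filter with DataFrame.assign, which returns a new dataframe and does not modify the one it is called on. This is only a sketch under the same assumptions as the question (same df1, same 0.4 threshold), not necessarily what either commenter had in mind:

```python
import pandas as pd

df1 = pd.DataFrame({'PERCENTAGE': [0.35, 0.1105, 0.0487, 0.98]})

# Filter first, then add the column on the filtered result.
# assign() builds and returns a new dataframe, so df1 is not changed.
df2 = (
    df1.loc[df1['PERCENTAGE'] > 0.4]
       .assign(NEW_COLUMN=lambda d: 1 - d['PERCENTAGE'])
)

print(df2)
```

The lambda receives the filtered dataframe, so the calculation only runs on the rows that survive the filter.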
