I have a pandas dataframe:

import pandas as pd

df = pd.DataFrame({'AKey': [1, 9999, 1, 1, 9999, 2, 2, 2],
    'AnotherKey': [1, 1, 1, 1, 2, 2, 2, 2]})

I want to assign a new value to every element of a specific column that currently holds a specific value.

Let's say I want to assign the new value 8888 to the elements that have the value 9999. I tried the following:

df[df["AKey"]==9999]["AKey"]=8888

but it returns the following error:

A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead

So I tried to use loc:

df.loc[df["AKey"]==9999]["AKey"]=8888

which returned the same error.

I would appreciate some help and some explanation on the error as I really can't wrap my head around it.

Nick ODell
CAPSLOCK

2 Answers

You can use loc in this way:

df.loc[df["AKey"]==9999, "AKey"] = 8888

This produces the following output:

   AKey  AnotherKey
0     1           1
1  8888           1
2     1           1
3     1           1
4  8888           2
5     2           2
6     2           2
7     2           2

With your original code you are first slicing the dataframe with:

df.loc[df["AKey"]==9999]

Then you assign a value to the AKey column of that slice:

["AKey"]=8888

In other words, you were updating the slice (a temporary copy of the selected rows), not the dataframe itself.
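
A minimal sketch of that point, assuming the df from the question. The warning is suppressed here only so the effect is easy to see, and the names after_chained and after_loc are illustrative:

```python
import warnings

import pandas as pd

df = pd.DataFrame({"AKey": [1, 9999, 1, 1, 9999, 2, 2, 2],
                   "AnotherKey": [1, 1, 1, 1, 2, 2, 2, 2]})

# Chained form: the first [] builds a temporary object, the second [] writes to it.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # hide the copy / chained-assignment warning
    df[df["AKey"] == 9999]["AKey"] = 8888
after_chained = df["AKey"].tolist()  # df itself still holds 9999

# Single loc call: rows and column are selected together, so df is updated.
df.loc[df["AKey"] == 9999, "AKey"] = 8888
after_loc = df["AKey"].tolist()

print(after_chained)
print(after_loc)
```

The first list should still contain 9999 and the second should contain 8888 in the same positions.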

From the pandas documentation:

.loc[] is primarily label based, but may also be used with a boolean array.

Breaking down the code:

df.loc[df["AKey"]==9999, "AKey"]

df["AKey"]==9999 returns a boolean Series (an array of True/False values) identifying the rows, and the string "AKey" identifies the column that receives the new value. Both selections happen in a single indexing operation, with no intermediate slice.
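
The same breakdown can be sketched in code by naming the row selector first. The variable name mask is only illustrative, and the df is the one from the question:

```python
import pandas as pd

df = pd.DataFrame({"AKey": [1, 9999, 1, 1, 9999, 2, 2, 2],
                   "AnotherKey": [1, 1, 1, 1, 2, 2, 2, 2]})

# Row selector: True where AKey equals 9999, False elsewhere.
mask = df["AKey"] == 9999
print(mask.tolist())  # [False, True, False, False, True, False, False, False]

# Row selector and column label go into one loc call, so df is modified in place.
df.loc[mask, "AKey"] = 8888
print(df["AKey"].tolist())  # [1, 8888, 1, 1, 8888, 2, 2, 2]
```

Only the AKey column of the matching rows changes; AnotherKey is left as it was.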

Daniel Labbe

Ok, I found a solution. It works if I pass the column label to loc together with the boolean row mask, in a single indexing call.

df.loc[df["AKey"]==9999, "AKey"]=8888

However, I would still appreciate help with the error I was receiving, as it is not fully clear to me why Python thought that I was slicing instead of indexing.
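
One way to look at the slicing-versus-indexing question is to spell out, roughly, the method calls Python makes for each form. This is only a sketch: tmp, mask, unchanged and changed are illustrative names, and calling the dunder methods directly is done here purely to make the two steps visible.

```python
import warnings

import pandas as pd

df = pd.DataFrame({"AKey": [1, 9999, 1, 1, 9999, 2, 2, 2],
                   "AnotherKey": [1, 1, 1, 1, 2, 2, 2, 2]})
mask = df["AKey"] == 9999

# df.loc[mask]["AKey"] = 8888 is roughly two separate calls:
tmp = df.loc.__getitem__(mask)       # step 1: a read that returns a new DataFrame
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # some pandas versions may warn here
    tmp.__setitem__("AKey", 8888)    # step 2: a write, but only into tmp
unchanged = df["AKey"].tolist()

# df.loc[mask, "AKey"] = 8888 is roughly one call on df's own indexer:
df.loc.__setitem__((mask, "AKey"), 8888)
changed = df["AKey"].tolist()

print(unchanged)
print(changed)
```

In the first form the write lands on the intermediate object produced by the read, so pandas warns that the original may not be updated. In the second form there is no intermediate object at all.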

CAPSLOCK