I have a DataFrame object df, and I would like to modify the job column so that all retired people are 1 and the rest are 0 (as shown here):

df['job'] = df['job'].apply(lambda x: 1 if x == "retired" else 0)

But I get a warning:

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

Why do I get it here, though? From what I have read, it applies to situations where I take a slice of rows and then a column, but here I am just modifying the elements of a single column. Is there a better way to do this?

mikol
  • Does this answer your question? [How to deal with SettingWithCopyWarning in Pandas?](https://stackoverflow.com/questions/20625582/how-to-deal-with-settingwithcopywarning-in-pandas) – ytu Dec 14 '19 at 17:48

3 Answers

Use:

df['job'] = df['job'].eq('retired').astype(int)

or

import numpy as np
df['job'] = np.where(df['job'].eq('retired'), 1, 0)
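
As for why the warning shows up at all: a likely cause, assuming df was itself produced by filtering or slicing another DataFrame earlier in the script, is that pandas cannot tell whether the assignment is meant to reach the parent. In that case swapping apply for eq or np.where would not silence it by itself; taking an explicit copy should. The sketch below uses a hypothetical parent frame raw and a hypothetical age column, neither of which comes from the question.

```python
import pandas as pd

# Hypothetical parent frame; `raw` and `age` are illustrative names only.
raw = pd.DataFrame({
    "job": ["retired", "a", "b", "retired"],
    "age": [70, 30, 45, 68],
})

# Filtering rows yields an object derived from `raw`.
# The explicit .copy() makes `df` independent, so a later column
# assignment is unambiguous and leaves `raw` untouched.
df = raw[raw["age"] > 40].copy()

# Vectorized flag: True/False from eq(), then cast to integers.
df["job"] = df["job"].eq("retired").astype(int)

print(df["job"].tolist())   # flags for the filtered rows
print(raw["job"].tolist())  # parent still holds the original strings
```

If df really is the top-level frame and not derived from anything else, this explanation does not apply and the cause lies elsewhere.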
ansev

So here's an example dataframe:

import pandas as pd
import numpy as np

data = {'job':['retired', 'a', 'b', 'retired']}
df = pd.DataFrame(data)
print(df)

       job
0  retired
1        a
2        b
3  retired

Now, you can make use of numpy's where function:

df['job'] = np.where(df['job']=='retired', 1, 0)
print(df)

   job
0    1
1    0
2    0
3    1
Giorgos Myrianthous

I would not suggest using apply here: it calls the lambda once per element, so on a large DataFrame it can noticeably hurt performance.
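
A rough way to check this on your own machine is to time both variants with timeit. This is only a sketch: the frame size and repeat count are arbitrary assumptions, and the actual numbers will depend on the data and the pandas version.

```python
import timeit

import pandas as pd

# Arbitrary test frame: 100,000 rows built by repeating a small pattern.
df = pd.DataFrame({"job": ["retired", "a", "b", "retired"] * 25_000})


def with_apply():
    # Python-level function call for every element.
    return df["job"].apply(lambda x: 1 if x == "retired" else 0)


def vectorized():
    # Single vectorized comparison, then a cast to integers.
    return df["job"].eq("retired").astype(int)


# Both variants should produce the same flags.
same = bool((with_apply() == vectorized()).all())
print("same result:", same)

# Time each variant; no specific timings are claimed here.
print("apply     :", timeit.timeit(with_apply, number=5))
print("vectorized:", timeit.timeit(vectorized, number=5))
```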

I would prefer using numpy.select or numpy.where.
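
A minimal sketch of both, using the same toy job column as the other answers. np.where covers the one-condition case from the question; np.select is mainly useful once there are several conditions, so the second category ("a" mapped to 2) is a made-up extension purely for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"job": ["retired", "a", "b", "retired"]})

# np.where: one condition, one value if true, one value if false.
flag_where = np.where(df["job"] == "retired", 1, 0)

# np.select: parallel lists of conditions and choices, plus a default
# for rows that match none of them. The "a" -> 2 mapping is hypothetical.
conditions = [df["job"] == "retired", df["job"] == "a"]
choices = [1, 2]
flag_select = np.select(conditions, choices, default=0)

print(flag_where.tolist())
print(flag_select.tolist())
```

Either array can then be assigned back with df['job'] = flag_where.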
