4

I have a dataframe with 3 columns (A, B, C). I want to update column C with the value from column A where column B equals 22. I wrote the update statement below, but it sets C to NaN for the non-matching rows. How should I update the data in the dataframe?

import pandas as pd

df = pd.DataFrame(data=[[10,20,30],[11,21,31],[12,22,32]], columns=['A','B','C'])
df.C = df[df.B==22].A
Grayrigel

4 Answers

3

One of several ways to do this is np.where. It does mean importing NumPy, but pandas already depends on it, and it's pretty handy if you aren't aware of it.

import numpy as np

df['C'] = np.where(df['B']==22, df['A'], df['C'])
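For reference, here is a minimal runnable sketch of the np.where approach on the question's sample data. The intermediate name `result` is only for illustration. Note that np.where returns a plain NumPy array, so the values are written back by position and no index alignment is involved.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(data=[[10, 20, 30], [11, 21, 31], [12, 22, 32]],
                  columns=['A', 'B', 'C'])

# Element by element: take A where B == 22, otherwise keep the current C.
result = np.where(df['B'] == 22, df['A'], df['C'])  # 'result' is an illustrative name

# result is a NumPy array of the same length as df, assigned back by position.
df['C'] = result
print(df['C'].tolist())  # [30, 31, 12]
```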
Chris
3
df.loc[df.B==22, 'C'] = df.loc[df.B==22, 'A']
Arty
3

Let us try mask

df['C'] = df['C'].mask(df.B==22, df.A)
df
    A   B   C
0  10  20  30
1  11  21  31
2  12  22  12
BENY
2

Another alternative is using loc and reindex:

df['C'] = df.loc[df.B==22,'A'].reindex(df.index).fillna(df['C'])
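To make the one-liner above easier to follow, here is a sketch that splits it into its three steps. The names `picked`, `aligned` and `filled` are made up for illustration. One side effect worth knowing: reindex introduces NaN, so the result is float64; the last line casts back to the original dtype, assuming every gap was filled.

```python
import pandas as pd

df = pd.DataFrame(data=[[10, 20, 30], [11, 21, 31], [12, 22, 32]],
                  columns=['A', 'B', 'C'])

# Step 1: values of A for the matching rows only (a Series with index [2]).
picked = df.loc[df.B == 22, 'A']

# Step 2: expand to the full index; rows without a match become NaN (float64).
aligned = picked.reindex(df.index)

# Step 3: fill the gaps with the existing values of C.
filled = aligned.fillna(df['C'])

# Cast back to the original integer dtype, since no NaN remains after fillna.
df['C'] = filled.astype(df['C'].dtype)
print(df['C'].tolist())  # [30, 31, 12]
```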

Ideally you can use np.where for such cases; however, here is why your code does not work:

The expression below

df[df.B==22].A

Returns:

2    12

You will see that the returned Series contains only index 2. When you assign it to df.C (prefer bracket notation, df['C'], over dot notation), pandas aligns the assigned Series with the dataframe's index. The result has no values for the other indexes, so those rows are set to np.nan.
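The alignment behaviour described above can be reproduced with a short sketch on the question's data (the name `subset` is just for illustration):

```python
import pandas as pd

df = pd.DataFrame(data=[[10, 20, 30], [11, 21, 31], [12, 22, 32]],
                  columns=['A', 'B', 'C'])

# The filtered Series keeps only the label of the matching row.
subset = df[df.B == 22].A
print(subset.index.tolist())  # [2]

# Assignment aligns on the index: labels 0 and 1 are missing from subset,
# so those rows of C become NaN and only row 2 receives a value.
df['C'] = subset
print(df['C'].isna().tolist())  # [True, True, False]
```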

Also, using chained indexing when assigning values is highly discouraged, as it can lead to a SettingWithCopyWarning.
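As a rough illustration of the chained-indexing problem, the sketch below writes through a chained expression and then through a single .loc call. The value 99 and the names `after_chained` and `after_loc` are arbitrary choices for the demo, and the warning is silenced only so the output stays readable.

```python
import warnings
import pandas as pd

df = pd.DataFrame(data=[[10, 20, 30], [11, 21, 31], [12, 22, 32]],
                  columns=['A', 'B', 'C'])

# Chained form: df[df.B == 22] builds a temporary copy, and the ['C'] = ...
# assignment writes into that copy, so df itself is left unchanged.
with warnings.catch_warnings():
    warnings.simplefilter('ignore')  # hide the chained-assignment warning for this demo
    df[df.B == 22]['C'] = 99
after_chained = df['C'].tolist()
print(after_chained)  # [30, 31, 32]

# Single .loc call: row selection and column selection happen in one step,
# so the assignment reaches the original dataframe.
df.loc[df.B == 22, 'C'] = df.loc[df.B == 22, 'A']
after_loc = df['C'].tolist()
print(after_loc)  # [30, 31, 12]
```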

anky