Suppose I have the following DataFrame:

In [1]: df
Out[1]:
  apple banana cherry
0     0      3   good
1     1      4    bad
2     2      5   good

This works as expected:

In [2]: df['apple'][df.cherry == 'bad'] = np.nan
In [3]: df
Out[3]:
  apple banana cherry
0     0      3   good
1   NaN      4    bad
2     2      5   good

But this doesn't:

In [2]: df[['apple', 'banana']][df.cherry == 'bad'] = np.nan
In [3]: df
Out[3]:
  apple banana cherry
0     0      3   good
1     1      4    bad
2     2      5   good

Why? How can I set both the 'apple' and 'banana' values in one statement, without having to write out two lines, as in:

In [2]: df['apple'][df.cherry == 'bad'] = np.nan
In [3]: df['banana'][df.cherry == 'bad'] = np.nan
abcd

3 Answers

You should use loc and do this without chaining:

In [11]: df.loc[df.cherry == 'bad', ['apple', 'banana']] = np.nan

In [12]: df
Out[12]: 
   apple  banana cherry
0      0       3   good
1    NaN     NaN    bad
2      2       5   good

See the docs on returning a view versus a copy. If you chain the indexing, the assignment is made to a temporary copy (which is then thrown away); if you do it in a single loc call, pandas knows you want to assign to the original.
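
To make the difference concrete, here is a minimal sketch (not part of the original answer). It assumes the numeric columns are created as floats so that NaN can be stored without a dtype change, and it splits the chained form into two steps with a hypothetical name `tmp` so the temporary object is visible. Depending on the pandas version, the write into `tmp` may emit a warning.

```python
import numpy as np
import pandas as pd

def make_df():
    # Assumption of this sketch: float columns, so NaN fits without upcasting.
    return pd.DataFrame({
        "apple": [0.0, 1.0, 2.0],
        "banana": [3.0, 4.0, 5.0],
        "cherry": ["good", "bad", "good"],
    })

# Chained form, split into its two steps to expose the temporary.
chained = make_df()
tmp = chained[["apple", "banana"]]        # step 1: a new DataFrame, not `chained`
tmp[chained["cherry"] == "bad"] = np.nan  # step 2: writes into `tmp` only

# Single .loc call: rows and columns are selected in one indexing operation.
fixed = make_df()
fixed.loc[fixed["cherry"] == "bad", ["apple", "banana"]] = np.nan

print(chained)  # row 1 still holds its original values
print(fixed)    # row 1 holds NaN in apple and banana
```

Selecting with a list of column names builds a new DataFrame, which is why the second step of the chained form never reaches the original.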

Andy Hayden
  • Is it easy to extend this to the case where np.nan is instead another DataFrame with the shape of `df[['apple', 'banana']][df.cherry == 'bad']`? – dermen May 26 '15 at 22:02
  • @dermen it should "just work", though you may have issues aligning (on the index).... you may have to assign `other_df.values`. Perhaps worth asking as another question if that fails? – Andy Hayden May 26 '15 at 23:41
  • I was having trouble doing this, which led me to this post, and as it turns out, assigning `other_df.values` is the way to go. – dermen May 27 '15 at 01:18
  • Is there a way to do this, but assign NaN to apple and 123 to banana ? – scrollout Sep 09 '21 at 16:31
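
For the two follow-up questions in the comments above, a rough sketch (not from the original answer, and only one possible approach): a list with one value per selected column can be assigned through the same `loc` call, and another frame's values can be assigned positionally by passing its underlying array. The float columns, the hypothetical `other_df`, and the use of `to_numpy()` in place of `.values` are assumptions of this sketch.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "apple": [0.0, 1.0, 2.0],   # floats assumed, so NaN needs no upcast
    "banana": [3.0, 4.0, 5.0],
    "cherry": ["good", "bad", "good"],
})
mask = df["cherry"] == "bad"
cols = ["apple", "banana"]

# One value per selected column: NaN for apple, 123 for banana.
per_column = df.copy()
per_column.loc[mask, cols] = [np.nan, 123]

# Hypothetical replacement frame whose index (0) does not match the target row (1).
other_df = pd.DataFrame({"apple": [10.0], "banana": [20.0]})

# Passing the raw array skips index alignment and assigns by position.
from_frame = df.copy()
from_frame.loc[mask, cols] = other_df.to_numpy()

print(per_column)
print(from_frame)
```

The array must have the same shape as the selection (here one row by two columns); assigning `other_df` itself would instead be aligned on index and column labels.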

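It's because df[['apple', 'banana']][df.cherry == 'bad'] = np.nan assigns to a copy of the DataFrame. Try this (originally written with .ix, which was deprecated in pandas 0.20 and removed in 1.0; .loc is the current equivalent):

df.loc[df.cherry == 'bad', ['apple', 'banana']] = np.nan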
Roman Pekar
  • Don't use the `ix` API anymore; use `loc` or `iloc` instead. See [What is meant by ".ix is deprecated" in Python Pandas?](https://www.quora.com/What-is-meant-by-ix-is-deprecated-in-Python-Pandas) – WY Hsu Oct 20 '18 at 13:36

While this question is broad, the answers seem very specific and not very versatile. This is just to clarify...

import pandas

df = pandas.DataFrame({'Test1': [1, 2, 3, 4, 5], 'Test2': [3, 4, 5, 6, 7], 'Test3': [5, 6, 7, 8, 9]})

   Test1 Test2 Test3
0  1     3     5
1  2     4     6
2  3     5     7
3  4     6     8
4  5     7     9

# When the index or row you want to edit is known
df.loc[3, ['Test1', 'Test2', 'Test3']] = [10, 12, 14]

# When you don't know the index but can find it by looking in a column for a specific value

df.loc[df[df['Test1'] == 4].index[0], ['Test1', 'Test2', 'Test3']] = [10, 12, 14]

   Test1 Test2 Test3
0  1     3     5
1  2     4     6
2  3     5     7
3  10    12    14
4  5     7     9

Both methods allow you to change the values of multiple columns in one line of code.
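
As a runnable sketch of the two cases above (an illustration, not the answer's exact code), the lookup-by-value case can also be written with a boolean mask passed straight to `loc`, which avoids `.index[0]` and so does not fail when no row matches. The names `cols`, `by_label`, and `by_mask` are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    "Test1": [1, 2, 3, 4, 5],
    "Test2": [3, 4, 5, 6, 7],
    "Test3": [5, 6, 7, 8, 9],
})
cols = ["Test1", "Test2", "Test3"]

# Case 1: the row label is known.
by_label = df.copy()
by_label.loc[3, cols] = [10, 12, 14]

# Case 2: find the row by a column value, using a boolean mask.
# If several rows match, each of them receives the same three values.
by_mask = df.copy()
by_mask.loc[by_mask["Test1"] == 4, cols] = [10, 12, 14]

print(by_label)
print(by_mask)
```

Each case starts from a fresh copy because, once row 3 has been overwritten, no row has Test1 equal to 4 any more.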

NL23codes