Please consider this example:

import pandas as pd
df = pd.DataFrame([1,2,3,4,5], columns=["col"])

df[df["col"] == 3]["col"] = 11 # Does not work.
df["col"][df["col"] == 3] = 55 # Does work!

Although the assignments differ in their results, the underlying selections yield the same result:

df[df["col"] == 3]["col"] # Looks the same as
df["col"][df["col"] == 3] # this

Why does one way work while the other does not?

Xiphias
  • You should probably read [this section](http://pandas.pydata.org/pandas-docs/stable/indexing.html#returning-a-view-versus-a-copy) of the docs. – DSM Feb 07 '14 at 17:31
  • @DSM Thank you, I understand the difference and see that the two selections are not the same; this can be confirmed by comparing them with the `is` operator. But is there a simple rule? The documentation looks rather complicated on this point. To what extent does the following statement apply to one selection but not the other? "Whenever an array of labels or a boolean vector are involved in the indexing operation, the result will be a copy." In both versions `df["col"] == 3` is a boolean vector used for selection, but in the former version the condition is applied first; in the latter the Series is selected first. – Xiphias Feb 07 '14 at 17:38
  • @Tobias There is no simple rule. The best I can tell you is that if the entire frame is a single dtype, it should work equally for both types; but that is exactly the reason we don't recommend chained indexing, because if, say, you add a string column, it may suddenly stop working – Jeff Feb 07 '14 at 17:45
  • a related answer: http://stackoverflow.com/questions/21463589/pandas-chained-assignments/21463854#21463854 – Jeff Feb 07 '14 at 17:45
  • @Jeff Thank you for the link. So your advice is to select the column first when a view instead of a copy is desired? – Xiphias Feb 07 '14 at 17:48
  • @Tobias, the document is probably a little confusing, but the example is very clear: `dfb['c'][dfb.a.str.startswith('o')] = 42` works and `dfb[dfb.a.str.startswith('o')]['c'] = 42` does not. Exactly the same as yours. – CT Zhu Feb 07 '14 at 17:49
  • @CT Zhu Thanks, that's the same, indeed. I'm trying to find a general rule like "always select the column first if you want to avoid a copy". – Xiphias Feb 07 '14 at 17:52
  • @Tobias select using a multi-axis indexer, e.g. ``df.loc[df['col'] == 3, 'col']`` (and assign the same way) – Jeff Feb 07 '14 at 17:56
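
For reference, here is a minimal sketch of the multi-axis approach suggested in the comments, applied to the frame from the question. The names `mask`, `after_chained`, and `after_loc` are only illustrative, and the warning filter is an assumption added so the chained line runs quietly regardless of pandas version.

```python
import warnings

import pandas as pd

df = pd.DataFrame([1, 2, 3, 4, 5], columns=["col"])
mask = df["col"] == 3  # boolean vector selecting the row(s) to change

# Chained indexing: df[mask] builds a new object, so the assignment
# lands on that temporary and the original frame is left untouched.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # silence the chained-assignment warning
    df[mask]["col"] = 11
after_chained = df["col"].tolist()

# Single .loc call: rows and column are selected in one indexing
# operation, so the assignment is applied to df itself.
df.loc[mask, "col"] = 11
after_loc = df["col"].tolist()

print(after_chained)
print(after_loc)
```

Because the row condition and the column label go into one indexer, the result does not depend on whether an intermediate selection happens to be a view or a copy.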
