NaNs in Pandas dataframe after columnwise operations with multi-index

Question

I'm looking for some help understanding the results of the code below. Why do I get NaN for df.loc[1, 'c2']? Since I don't get the same problem when there is only one index level, it must have something to do with not specifying the second level of the multi-index in the calculation, but I'm having trouble figuring out the exact cause. Why does it only work when I use .values?

import pandas as pd

df = pd.DataFrame({'i': [1,1,2,2], 'i2':[1,2,1,2], 'a':[10,20,30,40], 'b':[100,100,300,400]})

df = df.set_index('i')

df.loc[1, 'c1'] = df.loc[1, 'a'] / df.loc[1, 'b']                #Works

df = df.reset_index()
df = df.set_index(['i', 'i2'])

df.loc[1, 'c2'] = df.loc[1, 'a'] / df.loc[1, 'b']                #Fails (NaN)

df.loc[1, 'c2'].index.equals(df.loc[1, 'a'].index)               #True
df.loc[1, 'c2'].index.equals(df.loc[1, 'b'].index)               #True

df.loc[1, 'c3'] = df.loc[1, 'a'].values / df.loc[1, 'b'].values  #Works
df.loc[1, 'c4'] = (df.loc[1, 'a'] / df.loc[1, 'b']).values       #Works

Does this answer your question? [Select rows in pandas MultiIndex DataFrame](https://stackoverflow.com/questions/53927460/select-rows-in-pandas-multiindex-dataframe) — Ben.T, Apr 21 '20 at 12:52
I don't think so, unless I'm simply missing the key details. Looking there, it appears that the way I did it should work, though that answer is all about selecting and not assigning which is where my problem is. — modus, Apr 21 '20 at 18:57
My mistake, I understand your question better now. Because it works with `.values`, I would say the problem is index alignment, even though your check shows that the indexes are equal. For example, it seems to work if you select slice(None) for the second level: `df.loc[(1,slice(None)), 'c5'] = df.loc[(1,slice(None)), 'a'] / df.loc[(1,slice(None)), 'b']` — Ben.T, Apr 21 '20 at 20:13

score 0 · Answer 1 · answered Apr 21 '20 at 12:50

I'm not familiar enough with pandas' indexing internals to say why it's working the way that it is. I can confirm that I'm seeing the same behavior.

This is just a hunch, but maybe it's that using the scalar 1 as an index value is a little ambiguous. Using a range/slice seems to fix the issue, so perhaps it helps pandas to resolve some of that ambiguity? Again, it's just a hunch.

df.loc[1:1, 'c1'] = df.loc[1:1, 'a'] / df.loc[1:1, 'b']
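
One way to look into the hunch is to compare the index of each selection. The sketch below rebuilds the frame from the question and prints how many index levels each selection keeps. A scalar key drops the level it matched, while a slice or an explicit slice(None) keeps both levels. That would fit the index alignment explanation suggested in the comments, though I haven't confirmed that this is what pandas does internally during the assignment. The column name c_demo is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'i': [1, 1, 2, 2], 'i2': [1, 2, 1, 2],
    'a': [10, 20, 30, 40], 'b': [100, 100, 300, 400],
}).set_index(['i', 'i2'])

scalar_sel = df.loc[1, 'a']                 # scalar key: level 'i' is dropped
slice_sel = df.loc[1:1, 'a']                # slice key: both levels are kept
tuple_sel = df.loc[(1, slice(None)), 'a']   # explicit second level: both levels are kept

print(scalar_sel.index.nlevels, slice_sel.index.nlevels, tuple_sel.index.nlevels)

# The right-hand side now carries the same two-level index as the target rows.
ratio = slice_sel / df.loc[1:1, 'b']
df.loc[1:1, 'c_demo'] = ratio               # 'c_demo' is a hypothetical column name
print(df)
```

The rows with i == 2 are left as NaN in c_demo because nothing was assigned to them.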
