2

I have a problem with assigning a series like object to a slice of a Pandas dataframe. Maybe I'm not using the Datafarme the way it is intended to, so some enlightment will be greatly appreciated. I've already read the following articles:

pandas: slice a MultiIndex by range of secondary index

Returning a view versus a copy

As far as I understand the way I'm evoking the slice with one .loc call does ensure I'm getting not a copy of the data. Obviously also the original dataframe gets altered, but instead of the expected data I get NaN values. See the appended code snipet.

Do I have to iterate over the desired section of the dataframe for each single value I want to change and use the .set_value(row_idx,col_idx,val) method?

kind regards and thanks in advance

Markus

In [1]: import pandas as pd

In [2]: mindex = pd.MultiIndex.from_product([['one','two'],['first','second']])

In [3]: dfmi = pd.DataFrame([list('abcd'),list('efgh'),list('ijkl'),list('mnop')],
   ...:                     index = mindex, columns=(['X','Y','Z','Q']))

In [4]: print(dfmi)
            X  Y  Z  Q
one first   a  b  c  d
    second  e  f  g  h
two first   i  j  k  l
    second  m  n  o  p

In [5]: dfmi.loc[('two',slice('first','second')),'X']
Out[5]: 
two  first     i
     second    m
Name: X, dtype: object

In [6]: substitute = pd.Series(data=["ab","cd"], index= mindex.levels[1])
   ...: print(substitute)
first     ab
second    cd
dtype: object

In [7]: dfmi.loc[('two',slice('first','second')),'X'] = substitute

In [8]: print(dfmi)
              X  Y  Z  Q
one first     a  b  c  d
    second    e  f  g  h
two first   NaN  j  k  l
    second  NaN  n  o  p
Community
  • 1
  • 1
M. Eppel
  • 53
  • 6

2 Answers2

4

What's happening is that substitute has an index, which determine the location of the values, and dfmi.loc[('two',slice('first','second')),'X'] is also specifying such location.

During the assignment pandas is trying to align both index and since they do not match (they would if substitute was also a multi-index), the result of the alignment are all NA's, which get inserted.

A solution could be to get rid of the index of substitute since the location of where you want to insert the values is already specified in the loc:

dfmi.loc[('two',slice('first','second')),'X'] = substitute.values

or even simpler, insert the values directly:

dfmi.loc[('two',slice('first','second')),'X'] = ["ab","cd"]
Svend
  • 6,352
  • 1
  • 25
  • 38
  • Thanks for the reply! The substitute.values solutions works! However as for your explanation, why my code does not work, I do not understand. If you look into my code the index of substitute is mindex.level[1] which equals an index of ['first', 'second'] just as the index of the sliced view of the dataframe does. It should match? – M. Eppel Sep 15 '16 at 11:17
  • What happens is that `mindex.levels[1]` is a single level index, so it cannot be aligned with a 2 levels index of `dfmi` . What works is defining substitute with a multi-index, like `substitute = pd.Series(data=["ab","cd"], index=pd.MultiIndex.from_tuples([("two", "first"), ("two", "second"), ]))`, => then the assignment as `dfmi.loc[('two',slice('first','second')),'X'] = substitute` works as you expect, though that's getting really convoluted. I would suggest sticking to just assigning data as arrays directly. – Svend Sep 15 '16 at 12:03
  • Thanks for the clarification! Great help! – M. Eppel Sep 16 '16 at 08:29
  • This solution essentially inserts by position. Isn't a downside of this solution that it relies on the assumption that the second-level indexes of `dfmi` are always aligned with the indexes of the `substitute` Series? Still, that seems like the only workable solution, so thanks! – billjoie Nov 19 '20 at 00:16
0

Can you try this:

dfmi.loc['two']['X']=substitute

Venkat R
  • 51
  • 3
  • That works. Thank you! But while it is a solution to the given example it won't help in my actual usecase. I really want to slice the second index. So if the that goes from first to fourth and I want to assign only from first to second. Sorry for not giving an example which is precise in that regard. – M. Eppel Sep 15 '16 at 11:16