I have come across a bug in my code below. If the second element of my categories list is a capital letter, the resulting values of my data frame 'change_1_month_df' are NaN. If I use a lowercase letter, the random numbers are inserted correctly. Any ideas? Thanks.

import pandas as pd
import numpy as np

dates = ['d1','d2']
categories = ['a','b']
sub_categories = ['f','g']
my_index = pd.MultiIndex.from_product([categories,sub_categories])
change_1_month_df = pd.DataFrame(index=my_index,columns=dates)

for a in categories:
    for d in dates:
        print a,d
        if a == 'W':
            None
        else:
            change_1_month_df.ix[a].ix['f'][d] = np.random.randn(1)
            change_1_month_df.ix[a].ix['g'][d] = np.random.randn(1)

change_1_month_df
cs95

1 Answer

Can you try this? Use .loc with a tuple to select on the MultiIndex in a single indexing call:

for a in categories:
    for d in dates:
        if a == 'W':
            pass
        else:
            change_1_month_df.loc[(a, 'f'), d] = np.random.randn(1)[0]
            change_1_month_df.loc[(a, 'g'), d] = np.random.randn(1)[0]

change_1_month_df
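
For reference, here is a minimal self-contained sketch of the same idea, using the capital-letter case from the question. It is an illustration under a few assumptions rather than a definitive explanation: the name `df`, the seeded generator `rng`, and `dtype=float` are introduced here and are not in the original code. It also prints whether the index is sorted, since uppercase letters sort before lowercase ones and `['a', 'B']` therefore gives an unsorted MultiIndex. That may be why chained indexing returned a copy only in that case, but this is a guess, not something confirmed in this thread.

```python
import numpy as np
import pandas as pd

dates = ['d1', 'd2']
categories = ['a', 'B']      # capital second label: the case that gave NaN in the question
sub_categories = ['f', 'g']

my_index = pd.MultiIndex.from_product([categories, sub_categories])
# dtype=float is an assumption made here so the frame starts as all-NaN floats
df = pd.DataFrame(index=my_index, columns=dates, dtype=float)

# 'B' sorts before 'a', so this index is not in sorted order
print(my_index.is_monotonic_increasing)

rng = np.random.default_rng(0)   # seeded only to make the sketch reproducible

for a in categories:
    for s in sub_categories:
        for d in dates:
            # One .loc call with a (level 0, level 1) tuple as the row key,
            # so the assignment is made on df itself, not on an intermediate object.
            df.loc[(a, s), d] = rng.standard_normal()

print(df)
print(df.isna().any().any())     # check whether any NaN is left
```

The point of the design is that the row tuple and the column label go into a single `.loc[...]`, so there is no intermediate selection whose view-or-copy status matters.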
BENY
  • Wen, you are so right. It has to do with chaining .loc or .ix: a copy gets created in memory, so the assignment never reaches the original dataframe. – Scott Boston Nov 01 '17 at 14:28
  • Yes, this is the answer. I was refraining from posting this because I did not know why the original method returns NaNs. – cs95 Nov 01 '17 at 14:29
  • @cᴏʟᴅsᴘᴇᴇᴅ his method works on my side; it just gives the .ix warning. – BENY Nov 01 '17 at 14:31
  • Interesting... if I changed `b` to `B`, it did not work anymore. – cs95 Nov 01 '17 at 14:31
  • @BoscoBaracus note that your method still works, but it is better to use the method the pandas docs recommend: https://pandas.pydata.org/pandas-docs/stable/advanced.html – BENY Nov 01 '17 at 14:32
  • @cᴏʟᴅsᴘᴇᴇᴅ, I did the same test. Is this still due to the copy in memory issue? – MattR Nov 01 '17 at 14:33
  • But could you elaborate on why a capital letter, as opposed to a lowercase letter, makes a difference when chaining with .loc? – BoscoBaracus Nov 01 '17 at 14:33
  • @MattR I'm not familiar enough with pandas internals to answer! – cs95 Nov 01 '17 at 14:34
  • Wouldn't it be simpler to use `if a != 'W':`? – Bharath M Shetty Nov 01 '17 at 14:34
  • @MattR MultiIndex still has a lot of bugs... :-) I spent the whole night yesterday on git... – BENY Nov 01 '17 at 14:36
  • @Wen, gotcha! Interestingly enough, `categories = ['A','B']` (i.e., both capitals) works fine on my end. Even `categories = ['A','b']` works. – MattR Nov 01 '17 at 14:37
  • Since I made that comment, I feel like I have to respond. And I don't know. However, chained indexing is not recommended. [See these docs](https://pandas.pydata.org/pandas-docs/stable/indexing.html#returning-a-view-versus-a-copy) – Scott Boston Nov 01 '17 at 14:37
  • If you guys would like to look into this question https://stackoverflow.com/questions/47047140/pandas-slice-one-multiindex-dataframe-with-multiindex-of-another-when-some-leve/47047512#47047512 – BENY Nov 01 '17 at 14:38