I have a dataframe as below.

df
Out[209]: 
a       b         
User1   2019-07-01                        [The Milky Way]
        2019-07-02                                    NaN
        2019-07-03                                [Taken]
        2019-07-04                                    NaN
        2019-07-05                          [The Lobster]
        2019-07-06                        [Bloody Sunday]
        2019-07-07                  [Lost in Translation]
        2019-07-08                                    NaN
        2019-07-09                                    NaN
        2019-07-10                             [Face Off]
        2019-07-11                  [The Thief of Bagdad]
        2019-07-12                                    NaN
        2019-07-13                              [Charade]
        2019-07-14                             [Scarface]
        2019-07-15    [Anchorman 2: The Legend Continues]

I am trying to select rows through the MultiIndex. When I select on one level, I get the desired output.

df.loc['User1']
Out[211]: 
b
2019-07-01                        [The Milky Way]
2019-07-02                                    NaN
2019-07-03                                [Taken]
2019-07-04                                    NaN
2019-07-05                          [The Lobster]

But when I try to select on both levels, I get the error below.

df.loc['User1','2019-07-13']
IndexingError: Too many indexers

The index looks like this:

MultiIndex(levels=[['User1', 'User2', 'User3', 'User4', 'User5', 'User6', 'User7', 'User8', 'User9', 'User10'], [2019-07-01, 2019-07-02, 2019-07-03, 2019-07-04, 2019-07-05, 2019-07-06, 2019-07-07, 2019-07-08, 2019-07-09, 2019-07-10, 2019-07-11, 2019-07-12, 2019-07-13, 2019-07-14, 2019-07-15, 2019-07-16, 2019-07-17, 2019-07-18, 2019-07-19, 2019-07-20, 2019-07-21, 2019-07-22, 2019-07-23, 2019-07-24, 2019-07-25, 2019-07-26, 2019-07-27, 2019-07-28, 2019-07-29, 2019-07-30, 2019-07-31, 2019-08-01, 2019-08-02, 2019-08-03, 2019-08-04, 2019-08-05, 2019-08-06, 2019-08-07, 2019-08-08, 2019-08-09, 2019-08-10, 2019-08-11, 2019-08-12, 2019-08-13, 2019-08-14, 2019-08-15, 2019-08-16, 2019-08-17, 2019-08-18, 2019-08-19, 2019-08-20]]

How do I get past the error and select on both levels of the MultiIndex?

I have already found a proposed solution in another question, but it gives me the error below.

When I try the axis=0 argument, I get KeyError: ('User1', '2019-07-13')

Sarang Manjrekar

1 Answer

It looks like:

  • User1 and 2019-07-13 are MultiIndex values (with levels named a and b, above respective index columns).
  • The variable named df is actually a Series (your printout has no column name, not even the default name like 0).

Note that df.loc['User1'] selects all elements of df that have User1 at the first index level. That is why your printout lists every value present at the second MultiIndex level, together with the corresponding data.
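As a rough illustration of that first-level selection, here is a minimal sketch. The Series `s` and its values are hypothetical stand-ins for the data in the question (plain strings instead of lists), and it assumes the dates are stored as strings.

```python
import pandas as pd

# Hypothetical sample data modelled on the question's printout.
index = pd.MultiIndex.from_tuples(
    [("User1", "2019-07-12"), ("User1", "2019-07-13"), ("User2", "2019-07-13")],
    names=["a", "b"],
)
s = pd.Series(["none", "Charade", "Scarface"], index=index)

# Selecting on the first level only returns every element for that user.
# The result is indexed by the remaining level "b".
sub = s.loc["User1"]
print(sub)
```

The result has one index level fewer than the original, which matches the shape of the Out[211] printout in the question.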

So, if you want to use loc[...] with a MultiIndex, specifying values at both levels, then:

  • between square brackets should be a tuple,
  • containing MultiIndex values at consecutive levels.

Then, assuming that level b of your MultiIndex holds strings, try:

df.loc[('User1','2019-07-13')]
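The tuple form can be sketched on a small, made-up example. The names `s`, `frame` and the column name `movies` are assumptions for illustration only, and the sketch assumes string dates as stated above.

```python
import pandas as pd

# Hypothetical sample data; level "b" holds plain strings here.
index = pd.MultiIndex.from_tuples(
    [("User1", "2019-07-12"), ("User1", "2019-07-13"), ("User2", "2019-07-13")],
    names=["a", "b"],
)
s = pd.Series(["none", "Charade", "Scarface"], index=index)

# Series: a tuple with one label per index level selects a single element.
value = s.loc[("User1", "2019-07-13")]

# DataFrame: put the tuple in the row position and ":" in the column position,
# so the tuple cannot be read as a (row, column) pair.
frame = s.to_frame(name="movies")
row = frame.loc[("User1", "2019-07-13"), :]

print(value)
print(row["movies"])
```

Python passes `x['a', 'b']` and `x[('a', 'b')]` to the indexer as the same tuple, so on a DataFrame it is the explicit `, :` that marks the tuple as a row key.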

Another possibility is that b holds date values (e.g. a datetime type). In that case you have to create an object of the same type (with the proper date) and use it as the second element of this tuple.
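A minimal sketch of the date-typed case follows. Which type the real index uses is unknown, so both variants below are assumptions: one with pandas timestamps and one with `datetime.date` objects. All data is hypothetical.

```python
import datetime

import pandas as pd

# Variant 1 (assumption): level "b" is a DatetimeIndex of pandas timestamps.
dates = pd.to_datetime(["2019-07-12", "2019-07-13"])
ts_index = pd.MultiIndex.from_product([["User1", "User2"], dates], names=["a", "b"])
s_ts = pd.Series(["w", "x", "y", "z"], index=ts_index)
ts_value = s_ts.loc[("User1", pd.Timestamp("2019-07-13"))]

# Variant 2 (assumption): level "b" holds datetime.date objects.
day = datetime.date(2019, 7, 13)
date_index = pd.MultiIndex.from_product([["User1"], [day]], names=["a", "b"])
s_date = pd.Series(["Charade"], index=date_index)
date_value = s_date.loc[("User1", datetime.date(2019, 7, 13))]

print(ts_value, date_value)
```

The key point in both variants is that the second element of the tuple has the same type as the labels stored in level b.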

To check the type of the df variable, run type(df).
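A few quick checks along these lines can narrow down what the object and its index actually are. This is only a sketch on a hypothetical one-row Series named `obj`; on the real data the same expressions would be run against df.

```python
import pandas as pd

# Hypothetical stand-in for the object from the question.
index = pd.MultiIndex.from_tuples([("User1", "2019-07-13")], names=["a", "b"])
obj = pd.Series(["Charade"], index=index)

# Is it a Series or a DataFrame?
is_series = isinstance(obj, pd.Series)

# Is the index really a MultiIndex, and what are its levels called?
has_multiindex = isinstance(obj.index, pd.MultiIndex)
level_names = list(obj.index.names)

# The dtype of each level shows whether the dates are strings or date values.
level_dtypes = {name: obj.index.get_level_values(name).dtype for name in level_names}

print(type(obj), has_multiindex, level_names, level_dtypes)
```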

Valdi_Bo
  • Running the tuple-based command gives me the error: KeyError: 'the label [2019-07-02] is not in the [columns]'. I am running it on a DataFrame. – Sarang Manjrekar Sep 01 '19 at 06:26
  • Please provide a minimal, complete and verifiable example of your data. Your question leaves too many details unknown. – Valdi_Bo Sep 01 '19 at 08:20
  • It doesn't look like the OP has a MultiIndex. Try inspecting df.index; it will clearly tell whether you have a MultiIndex or not. – Boris Apr 17 '20 at 13:09