
I've started using pandas for some large datasets, and mostly it works really well. I have some questions about the indices, though:

  1. I have a MultiIndex with three levels, say a, b, c. How do I slice along level a? I just want the rows where a is one of 5, 7, 10, 13. As the documentation points out, df.ix[[5, 7, 10, 13]] does not work.

  2. I need to have different indices on a DataFrame. Can I create these indices without associating them with a DataFrame, and use them to give me back the raw ndarray index?

  3. Can I slice a MultiIndex on its own, outside a Series or DataFrame?

Thanks in advance

Wolfgang Kerzendorf

2 Answers


For the first part, you can use boolean indexing using get_level_values:

df[df.index.get_level_values('a').isin([5, 7, 10, 13])]
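A minimal, self-contained sketch of the same boolean-indexing approach, assuming a recent pandas where get_level_values returns an Index. The sample data, the column name value, and the level values are made up for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical data: a three-level MultiIndex named a, b, c.
index = pd.MultiIndex.from_product(
    [[5, 6, 7, 10, 13], ["x", "y"], [0, 1]], names=["a", "b", "c"]
)
df = pd.DataFrame({"value": np.arange(len(index))}, index=index)

# Boolean mask: True wherever level "a" holds one of the wanted values.
mask = df.index.get_level_values("a").isin([5, 7, 10, 13])
selected = df[mask]

print(selected.index.get_level_values("a").unique().tolist())
```

Rows with a = 6 are dropped; every other row is kept in its original order.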

For the other two questions, you can get at the MultiIndex object itself by calling:

df.index

(and this can be inspected/sliced.)
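As a rough sketch of what slicing the index on its own can look like: the MultiIndex below is built without any DataFrame, and the tuples and level names are invented for the example. np.flatnonzero is used here as one possible way to get the matching integer positions back as a plain ndarray:

```python
import numpy as np
import pandas as pd

# Hypothetical index built on its own, with no DataFrame attached.
idx = pd.MultiIndex.from_tuples(
    [(5, "x", 0), (5, "y", 1), (7, "x", 0), (10, "y", 1)],
    names=["a", "b", "c"],
)

# Positional slicing returns another MultiIndex.
first_two = idx[:2]

# A boolean mask can be applied to the index directly.
in_a = idx.get_level_values("a").isin([5, 10])
subset = idx[in_a]

# Integer positions of the matching entries, as a plain ndarray.
positions = np.flatnonzero(in_a)
```

The positions array can then be used with .iloc on any DataFrame or Series that has the same length and ordering.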

Andy Hayden
  • Brilliant! I still have one problem, though. I can slice a MultiIndex, but I give it a raw index and it gives me back a tuple. I want it the other way around, so: myindex[1, 2, 4] – Wolfgang Kerzendorf Dec 21 '12 at 17:08
  • @WolfgangKerzendorf So you want it exported to an array? I think the issue is that behind the scenes pandas stores using indices of .level, and doesn't store this array... I will take another look. Hopefully there is a better way than `np.array(map(np.array, df.index.values))` (!) – Andy Hayden Dec 21 '12 at 17:21
  • So I found that index.get_loc is similar to what I want. It translates from a key to an actual location, but it is not as useful as the .ix notation of a Series. For now I think I will just do my_index = Series(arange(len(df)), index=myselectedindex) – Wolfgang Kerzendorf Dec 21 '12 at 19:42
  • df.index.get_level_values('a') gives back an array which doesn't have a method isin. – Wolfgang Kerzendorf Dec 21 '12 at 20:44
  • @WolfgangKerzendorf what version of pandas are you using? – Andy Hayden Dec 22 '12 at 10:27
  • @WolfgangKerzendorf In `0.10` get_level_values returns an Index rather than an array, which has a `isin` method :) – Andy Hayden Dec 22 '12 at 10:36
  • No, I'm on 0.9.1, but I will upgrade. Thanks. I will update the answer (I realized that I can do that). – Wolfgang Kerzendorf Dec 26 '12 at 06:30
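
The key-to-position lookup discussed in these comments could be sketched roughly as follows. The index contents are hypothetical, and the second half follows the Series-of-positions workaround from the comment rather than any built-in pandas feature:

```python
import numpy as np
import pandas as pd

# Hypothetical two-level index, sorted and without duplicates.
idx = pd.MultiIndex.from_tuples(
    [(1, "x"), (1, "y"), (2, "x"), (4, "y")], names=["a", "b"]
)

# A full key on a unique index maps to a single integer position.
pos = idx.get_loc((2, "x"))

# Workaround from the comment: a Series of positions keyed by the index,
# so label-based selection hands back integer locations.
lookup = pd.Series(np.arange(len(idx)), index=idx)
positions = lookup.loc[1].to_numpy()  # every row whose first level is 1
```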

Edit: This answer applies only to pandas versions lower than 0.10.0:

Okay @hayden had the right idea to start with:

In pandas versions before 0.10.0, the index method get_level_values() returns an array rather than an Index. Arrays have no isin() method, but this works:

from pandas import lib
lib.ismember(df.index.get_level_values('a'), set([5, 7, 10, 13]))
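
If the level values are only available as a plain array, the same membership test can also be done with NumPy alone. This sketch swaps in np.isin (available in reasonably recent NumPy releases) in place of lib.ismember, so it is an alternative rather than the method above, and the sample data is invented:

```python
import numpy as np
import pandas as pd

# Hypothetical data with a two-level MultiIndex named a, b.
index = pd.MultiIndex.from_product([[5, 6, 7], ["x", "y"]], names=["a", "b"])
df = pd.DataFrame({"value": range(len(index))}, index=index)

# Treat the level values as a plain ndarray, as old pandas returned them.
level_a = np.asarray(df.index.get_level_values("a"))

# Element-wise membership test on the array, then boolean indexing.
mask = np.isin(level_a, [5, 7])
selected = df[mask]
```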

That only answers question 1, but I'll post an update if I crack 2 and 3 (half done with @hayden's help).

K.-Michael Aye
Wolfgang Kerzendorf