0

I need to select some data from a given dataframe with multiindex. Now I need to split it by the second index column.

                Data1   Data2   Data3   Data4
first  Second
 A      2010
 B      2010
 C      2010
 A      2011
 B      2011
 C      2011
 D      2011
 E      2011
 A      2012
 B      2012
 C      2012
 A      2013
 B      2013
 C      2013
 E      2013
 A      2014
 B      2014
 C      2014
 A      2015
 B      2015
 C      2015

Now I want to select all the data between 2010 and 2014, how should I sub-index this Dataframe?

Kid
  • 413
  • 4
  • 11
  • Asked here: https://stackoverflow.com/questions/45128523/pandas-multiindex-how-to-select-second-level-when-using-columns – Alex Fish Jul 24 '19 at 03:28

2 Answers2

1

I usually using get_level_values

lvl1=df.index.get_level_values(1)
df=df[(lvl1>2010)&(lvl1<=2014)]
BENY
  • 317,841
  • 20
  • 164
  • 234
1

You may also use query

df.query('2010 <= Second <= 2014')

Out[314]:
              n
first Second
A     2010    1
B     2010    1
C     2010    1
A     2011    1
B     2011    1
C     2011    1
D     2011    1
E     2011    1
A     2012    1
B     2012    1
C     2012    1
A     2013    1
B     2013    1
C     2013    1
E     2013    1
A     2014    1
B     2014    1
C     2014    1
Andy L.
  • 24,909
  • 4
  • 17
  • 29
  • What if the Second is not numerical data(string, datetimeindex obj). This doesn't work – Kid Jul 26 '19 at 02:00
  • @Kid: of course, `query` has its limitation. I gave the answer base on your sample data. If your `Second` is string, `query` still works but the answer would be different. – Andy L. Jul 26 '19 at 02:18
  • yw. Glad I could help :) – Andy L. Jul 26 '19 at 02:22