How to filter rows of Pandas dataframe by checking whether sub-level index value within a list?

Question

I have a sample Pandas dataframe df which has multi_level index:

>>> df
                STK_Name   ROIC   mg_r
STK_ID RPT_Date                       
002410 20111231      ???  0.401  0.956
300204 20111231      ???  0.375  0.881
300295 20111231     ????  2.370  0.867
300288 20111231     ????  1.195  0.861
600106 20111231     ????  1.214  0.857
300113 20111231     ????  0.837  0.852

and stk_list is defined as stk_list = ['600106','300204','300113']

I want to get the rows of df whose value of sub_level index STK_ID is within stk_list . The output is as below:

                STK_Name   ROIC   mg_r
STK_ID RPT_Date                       
300204 20111231      ???  0.375  0.881
600106 20111231     ????  1.214  0.857
300113 20111231     ????  0.837  0.852

Basically, I can achieve the target for this sample data by:

df = df.reset_index() ; df[df.STK_ID.isin(stk_list)]

But I already have columns 'STK_ID' & 'RPT_Date' in my application dataframe, so reset_index() will cause an error. Anyway, I want to directly filter against index instead of columns.

Learn from this : How to filter by sub-level index in Pandas

I try df[df.index.map(lambda x: x[0].isin(stk_list))] , and Pandas 0.8.1 gives AttributeError: 'unicode' object has no attribute 'isin',

My question: How should I filter rows of Pandas dataframe by checking whether sub-level index value within a list without using the reset_index() & set_index() methods?

`df[df.index.map(lambda x: x[0] in stk_list)]` ? – Avaris Nov 18 '12 at 10:54 — Avaris, Nov 18 '12 at 10:54
@hayden: Well, I guess I should :). – Avaris Nov 19 '12 at 11:32 — Avaris, Nov 19 '12 at 11:32

score 11 · Answer 1 · edited Nov 19 '12 at 12:28

11

How about using the level parameter in DataFrame.reindex?

In [14]: df
Out[14]: 
            0         1
a 0  0.007288 -0.840392
  1  0.652740  0.597250
b 0 -1.197735  0.822150
  1 -0.242030 -0.655058

In [15]: stk_list = ['a']

In [16]: df.reindex(stk_list, level=0)
Out[16]: 
            0         1
a 0  0.007288 -0.840392
  1  0.652740  0.597250

edited Nov 19 '12 at 12:28

Andy Hayden

359,921
101
625
535

answered Nov 19 '12 at 04:26

Chang She

16,692
8
40
25

score 11 · Accepted Answer · answered Nov 19 '12 at 11:48

You can try:

df[df.index.map(lambda x: x[0] in stk_list)]

Example:

In : stk_list
Out: ['600106', '300204', '300113']

In : df
Out:
                STK_Name   ROIC   mg_r
STK_ID RPT_Date
002410 20111231      ???  0.401  0.956
300204 20111231      ???  0.375  0.881
300295 20111231     ????  2.370  0.867
300288 20111231     ????  1.195  0.861
600106 20111231     ????  1.214  0.857
300113 20111231     ????  0.837  0.852

In : df[df.index.map(lambda x: x[0] in stk_list)]
Out:
                STK_Name   ROIC   mg_r
STK_ID RPT_Date
300204 20111231      ???  0.375  0.881
600106 20111231     ????  1.214  0.857
300113 20111231     ????  0.837  0.852

score 8 · Answer 3 · answered Jul 13 '15 at 13:00

I'm pretty late to the party, but surely the most readable and intuitive way to do this is using index.levels[n].isin?

It works like this:

>>> stk_list = [600106, 300204, 300113]
>>> df[df.index.levels[0].isin(stk_list)]
                STK_Name   ROIC   mg_r
STK_ID RPT_Date                       
300204 20111231      ???  0.375  0.881
300295 20111231     ????  2.370  0.867
300113 20111231     ????  0.837  0.852

What I like about this approach is that the command can virtually be read like an English sentence.

p.s. in the OP, the stk_list is a list of strings. A bit of list comprehension-fu will deal with that:

df[df.index.levels[0].isin([int(i) for i in stk_list])]

This did not work for me while df[df.index.isin(stk_list, .level=0)] did. — UMY, Oct 18 '16 at 07:04
of course the . in front of level is not supposed to be there. — UMY, Oct 18 '16 at 07:11

score 1 · Answer 4 · answered Apr 23 '16 at 18:00

1

For me it only worked if I take the zero out of the x as follows:

a[a.index.map(lambda x: x in b)]

answered Apr 23 '16 at 18:00

tsando

4,557
2
33
35

Is this a versioning thing? – Joshua Cook May 12 '16 at 14:52

Shoresh · Answer 5 · 2017-02-14T19:40:17.757

0

get_level_values:

df[df.index.get_level_values(level = 0).isin(stk_list)]

edited Feb 14 '17 at 19:40

answered Jan 25 '17 at 20:58

Shoresh

2,693
2
16
9

How to filter rows of Pandas dataframe by checking whether sub-level index value within a list?

5 Answers5