I have the following dataframe:

g = pd.DataFrame({'A': [1, 2, 42, 5, 7], 'B': [5, 6, 7, 3, 2]})

    A  B
0   1  5
1   2  6
2  42  7
3   5  3
4   7  2

I am using the following list to filter the dataframe:

list_values = [5,7,1]

and get the following output using:

indexes = g[g['A'].isin(list_values)].index.values

Output:

array([0, 3, 4], dtype=int64)

How do I change the code so that indexes is the following?

array([3, 4, 0], dtype=int64)

Essentially, I am looking for a way to filter a DataFrame with a list and return the original index values in the order of the filter list.

Thanks!

I looked at this but did not find what I was looking for: Select rows of pandas dataframe from list, in order of list

Josh
  • @ALollz - I suppose that would be okay. The important thing is not losing the original index values and returning them in the same order as list_values. – Josh Feb 14 '20 at 22:26
  • `np.array([g.loc[g.A==i].index[0] for i in list_values])` works only if the values in 'A' are unique and all elements of the list exist in 'A' – Asetti sri harsha Feb 14 '20 at 22:29
  • @Asettisriharsha - That worked beautifully, thanks! – Josh Feb 16 '20 at 02:53
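
The list comprehension in the comment above depends on unique values in 'A' and on every list element being present. A rough sketch of one way to relax both assumptions follows; the value 99 is a hypothetical entry added only to show the missing-value case, and this is an illustration rather than the commenter's code:

```python
import numpy as np
import pandas as pd

g = pd.DataFrame({'A': [1, 2, 42, 5, 7], 'B': [5, 6, 7, 3, 2]})
list_values = [5, 7, 1, 99]  # 99 is hypothetical: it does not occur in 'A'

# For each value, in list order, collect every matching index label.
# A value with no match contributes an empty array instead of raising.
parts = [g.index[(g['A'] == v).to_numpy()].to_numpy() for v in list_values]
indexes = np.concatenate(parts)
print(indexes)
```

If 'A' contained duplicates, all of their index labels would appear together at the position of that value in list_values.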

1 Answer

You can use an ordered CategoricalDtype to enforce a custom sort order. Filter the column with isin, cast it to that dtype, and sort; the resulting index lists all indices for 5, then 7, then 1.

import pandas as pd

my_cat = pd.CategoricalDtype(categories=list_values, ordered=True)
#CategoricalDtype(categories=[5, 7, 1], ordered=True)

g.loc[g['A'].isin(list_values), 'A'].astype(my_cat).sort_values().index
#Int64Index([3, 4, 0], dtype='int64')
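
For reference, here is a self-contained sketch of the same idea that can be run as-is. It assumes the `g` and `list_values` from the question, and converts the result with `to_numpy()` to get an array like the one asked for; the intermediate name `matches` is introduced only for readability:

```python
import pandas as pd

g = pd.DataFrame({'A': [1, 2, 42, 5, 7], 'B': [5, 6, 7, 3, 2]})
list_values = [5, 7, 1]

# Ordered categorical whose category order is the order of the filter list.
my_cat = pd.CategoricalDtype(categories=list_values, ordered=True)

# Keep only the rows whose 'A' value is in the list, then sort by category order.
matches = g.loc[g['A'].isin(list_values), 'A']
indexes = matches.astype(my_cat).sort_values().index.to_numpy()
print(indexes)
```

The categories must be unique, so this sketch assumes list_values has no repeated entries.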
ALollz