2

The question of selecting pandas dataframe rows on the basis of column values was addressed in:

Select rows from a DataFrame based on values in a column in pandas

without regard to ordering of the rows based on the order of the column values.

As an example, consider:

import pandas as pd
import numpy as np
df = pd.DataFrame({'A': 'foo bar foo bar foo bar foo foo'.split(),
               'B': 'one one two three two two one three'.split(),
               'C': np.arange(8), 'D': np.arange(8) * 2})
print(df.loc[df['D'].isin([0,2])])

yields:

     A    B  C  D
0  foo  one  0  0
1  bar  one  1  2

as does:

print(df.loc[df['D'].isin([2,0])])

where the order of the column values has been reversed.

I'd like to know how I can modify this expression to respect the order of the desired column values such that the output would be:

     A    B  C  D
1  bar  one  1  2
0  foo  one  0  0
Community
  • 1
  • 1
rhz
  • 960
  • 14
  • 29

2 Answers2

1

You can put the values in a data frame, then do an inner join (merge by default) which should preserve the order on the left data frame:

D = pd.DataFrame({"D": [2, 0]})
D.merge(df)

#   D     A   B C
#0  2   bar one 1
#1  0   foo one 0

Or a more sure way to do this:

D.reset_index().merge(df).sort_values("index").drop("index", 1)
#   D     A   B C
#0  2   bar one 1
#1  0   foo one 0
Community
  • 1
  • 1
Psidom
  • 209,562
  • 33
  • 339
  • 356
0
df.loc[df['D'].isin([2,0])].sort_values(by='A')

Where 'A' can be any column the and ascending=False of you want reverse order.

Scott Boston
  • 147,308
  • 15
  • 139
  • 187
  • How would this change if I wanted to specify an **arbitrary** order on the column values (not necessarily monotonically increasing or decreasing)? Also as a variant, is there a way to modify the isin (call it requiredisin) to **require** each element in the list of columns? In other words, the code should barf if I write df.loc[df['D'].requiredisin([2,0, 2342])].sort_values(by='A') since 2342 is not one of the values in the D column? – rhz Apr 04 '17 at 17:53
  • I think you want data type [category](http://pandas.pydata.org/pandas-docs/stable/categorical.html#categorical-data). [isin](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.isin.html) returns boolean for each element, no requirement flag. – Scott Boston Apr 04 '17 at 18:39
  • df['A'] = df['A'].astype('category') df.A.cat.set_categories(['bar','foo'], ordered=True) – Scott Boston Apr 04 '17 at 19:51