
Let's say I have an unordered DataFrame:

import pandas
df = pandas.DataFrame({'A': [6, 2, 3, 5]})

I have an input :

input = 3

I want to find the rank of my input among the column's values. Here:

expected_rank_in_df(input) = 2
# Because 2 < 3 < 5 < 6

Assumption: the input is always present in the DataFrame. So, for example, I will never need to find the position of "4" in this df.

My first idea was to use rank(), as in "Pandas rank by column value":

df.rank()

But that seems like overkill to me, since I don't need to rank the whole column. Or maybe it isn't?
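For reference, a minimal sketch of what the full-column approach would look like on the example above. The names `value`, `ranks`, and `rank_of_value` are only illustrative, and `method='min'` is an assumption about how ties should be handled (it does not matter when the values are unique).

```python
import pandas as pd

df = pd.DataFrame({'A': [6, 2, 3, 5]})
value = 3  # the input (named `value` to avoid shadowing the built-in input)

# Rank every value in the column (1-based, ascending order),
# then pick the rank of the row that holds the input.
ranks = df['A'].rank(method='min')
rank_of_value = int(ranks[df['A'] == value].iloc[0])

print(rank_of_value)  # 2
```

This ranks all four values just to read one of them back, which is the overhead the question is asking about.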

Vincent
  • I think you need to insert {} inside the brackets when you initialize df. `pandas.DataFrame({'A': [1, 3, 5]})` – lhay86 Jul 11 '18 at 14:11

2 Answers


If you know for sure that the input is in the column, its rank is the number of values less than or equal to it:

(df['A'] <= input).sum()

Does that make sense? If you intend on calling this multiple times, it may be worth it to just sort the column. But this is probably faster if you only care about a few inputs.
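If many inputs need to be looked up, the "sort once" idea could be sketched as follows. This is only an illustration, not part of the original answer: `sorted_values` and `rank_of` are hypothetical names, and it swaps in `numpy.searchsorted` (a binary search) for the per-lookup step.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [6, 2, 3, 5]})

# Sort once up front; every later lookup is a binary search.
sorted_values = np.sort(df['A'].to_numpy())

def rank_of(value):
    """Return the 1-based rank of value, assuming it is present in the column."""
    # side='right' returns the number of elements <= value.
    return int(np.searchsorted(sorted_values, value, side='right'))

ranks = {v: rank_of(v) for v in [2, 3, 5, 6]}
print(ranks)  # {2: 1, 3: 2, 5: 3, 6: 4}
```

With duplicated values, `side='right'` gives the highest rank among the ties; `side='left'` plus 1 would give the lowest.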

Rushabh Mehta

You can sort the values and get the first position of the matched value with numpy.where on a boolean mask, adding 1 for a 1-based rank:

import numpy as np

a = 3

print (np.where(np.sort(df['A']) == a)[0][0] + 1)
2

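Alternatively, give the sorted values a fresh default RangeIndex, so that idxmax returns the 0-based position of the first match:

a = 3

print (df['A'].sort_values().reset_index(drop=True).eq(a).idxmax() + 1)
2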

Another idea is to count the True values with sum, here the values less than or equal to the input:

print (df['A'].le(3).sum())
2
jezrael