I have a numpy array df for which I calculate 10 quantiles using this:

np.quantile(df, np.arange(0.1, 1.1, 0.1))

which gives:

array([0.73518751, 0.73774966, 0.7415471 , 0.74462405, 0.74640229, 0.74952848, 0.7522163 , 0.75820953, 0.76289489, 0.76954299])

I also want to know the indices of these 10 values (or of the nearest values) in the original array, that is, which elements lie closest to each quantile.

How can I do this?

df

array([[0.74472344, 0.72122723, 0.75860745, 0.76498497, 0.70747197,
        0.74813282, 0.74042159, 0.74722695, 0.73554462, 0.75064671,
        0.74331897, 0.74716657, 0.73778194, 0.74615586, 0.76285201,
        0.75964642, 0.73241562, 0.75377691, 0.75396305, 0.75909799,
        0.73139387, 0.74406022, 0.7466802 , 0.74904978, 0.76954299,
        0.74861521, 0.73639923, 0.72030777, 0.76361781, 0.74158204,
        0.75146145, 0.7445026 , 0.75811005, 0.72869432, 0.74437159,
        0.73829693, 0.73757732, 0.75386381, 0.75234902, 0.76638651,
        0.75146788, 0.74095088, 0.72755104, 0.74487853, 0.74637753,
        0.73871291, 0.76512724, 0.75878656, 0.76340133, 0.75192058,
        0.74568671, 0.75875962, 0.74529624, 0.73688561, 0.75444639,
        0.74146557, 0.71298224, 0.76272094, 0.73547828, 0.73585916,
        0.73830914, 0.7449885 , 0.74602336, 0.75998384, 0.7620157 ,
        0.75025225, 0.75024652, 0.74897128, 0.736911  , 0.75200063,
        0.75215942, 0.75381088, 0.74398333, 0.74050236, 0.75152248,
        0.7435388 , 0.75403506, 0.76490963, 0.73895139, 0.74835008,
        0.76352328, 0.74488378, 0.75088626, 0.74712956, 0.76071763,
        0.75406158, 0.73689425, 0.73673624, 0.76328081, 0.73762053,
        0.74111873, 0.74642706, 0.76343733, 0.74263775, 0.73242599,
        0.74470502, 0.74260616, 0.74389631, 0.75280219, 0.73257053]])
maximusdooku
  • Possible duplicate of [How do I get the index of a specific percentile in numpy / scipy?](https://stackoverflow.com/questions/26070514/how-do-i-get-the-index-of-a-specific-percentile-in-numpy-scipy) – G. Anderson Nov 27 '18 at 22:29
  • @G.Anderson Both the answers deal with finding the index of a particular quantile, not a range of quantiles. – maximusdooku Nov 27 '18 at 22:33
  • @G.Anderson Can you please explain the duplicate tag? I have my answer, but I would like an explanation. – maximusdooku Nov 27 '18 at 22:46
  • I found another question that might either answer a similar problem or offer a solution that could be adapted. I flagged it both to see whether the admins consider it a duplicate and in case you find it helpful. – G. Anderson Nov 27 '18 at 22:55

1 Answer


You can use np.abs() and argmin():

import numpy as np

quantiles = np.quantile(df, np.arange(0.1, 1.1, 0.1))

# flat index of the element of df closest to each quantile
[(np.abs(df - i)).argmin() for i in quantiles]

Returns:

[58, 12, 29, 95, 44, 23, 70, 32, 14, 24]
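
Note that argmin() without an axis argument returns a position in the flattened array, which is why a 2-D array of shape (1, 100) yields plain integers here. As a minimal sketch (not part of the original answer), the same lookup can be done without a Python loop by broadcasting, and np.unravel_index can turn the flat positions back into (row, column) pairs. The small array `arr` below is made-up stand-in data for `df`, and np.linspace is swapped in for np.arange only to avoid floating-point surprises at the 1.0 endpoint:

```python
import numpy as np

# Made-up stand-in for the question's `df` (same layout: shape (1, N)).
arr = np.array([[0.5, 0.1, 0.9, 0.3, 0.7]])

# Ten probabilities 0.1, 0.2, ..., 1.0; linspace guarantees the last is exactly 1.0.
probs = np.linspace(0.1, 1.0, 10)
quantiles = np.quantile(arr, probs)

# Broadcast to an (N, 10) table of |element - quantile| and take the row
# position of the minimum in each column: one flat index per quantile.
flat_idx = np.abs(arr.ravel()[:, None] - quantiles).argmin(axis=0)

# Convert flat positions back to 2-D (row, col) indices and fetch the values.
rows, cols = np.unravel_index(flat_idx, arr.shape)
nearest = arr[rows, cols]

print(flat_idx)
print(nearest)
```

The broadcast version builds an N-by-10 temporary array, so for very large inputs the list comprehension above may be the more memory-friendly choice.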
rahlf23