6

I have the following table:

   year   pop1     pop2
0   0    100000    100000
1   1    999000    850000
2   2    860000    700000
3   3    770000    650000

I want to find for each pop (pop1 ,pop2) the year the pop was closest to a given number, for example, the year the pop was the closest to 830000.

Is there any way to find the nearest value inside column based on given value?

I have seen this post (How do I find the closest values in a Pandas series to an input number?_ but seems like here the result is above and below and I wat to get in the end only one number.

*I don't have code example because I don't find any argument to use to get the nearest

Ch3steR
  • 20,090
  • 4
  • 28
  • 58
Reut
  • 1,555
  • 4
  • 23
  • 55

1 Answers1

7

Convert column year to index, then subtract value, get absolute values and last index (here year) by nearest value - here minimal by DataFrame.idxmin:

val = 830000

s = df.set_index('year').sub(val).abs().idxmin()
print (s)
pop1    2
pop2    1
dtype: int64
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
  • could you please elaborate regard the sub() and abs() part? like why do we subtract the population size and from what? not sure I understood this part though it works. thanks:) – Reut Oct 24 '20 at 19:04
  • @Reut - Sure, we need difference of all values by `val`, so using `sub` for subtract, then get positive and negative values because like `830010 - 830000 = 10` and `820090 - 830000 = -10` so need absolute values for find minimal a bit greater or a bit less rows. – jezrael Oct 25 '20 at 04:29