Pandas find the nearest value for in a column

Question

I have the following table:

   year   pop1     pop2
0   0    100000    100000
1   1    999000    850000
2   2    860000    700000
3   3    770000    650000

I want to find for each pop (pop1 ,pop2) the year the pop was closest to a given number, for example, the year the pop was the closest to 830000.

Is there any way to find the nearest value inside column based on given value?

I have seen this post (How do I find the closest values in a Pandas series to an input number?_ but seems like here the result is above and below and I wat to get in the end only one number.

*I don't have code example because I don't find any argument to use to get the nearest

score 7 · Accepted Answer · answered Oct 23 '20 at 11:34

7

Convert column year to index, then subtract value, get absolute values and last index (here year) by nearest value - here minimal by DataFrame.idxmin:

val = 830000

s = df.set_index('year').sub(val).abs().idxmin()
print (s)
pop1    2
pop2    1
dtype: int64

answered Oct 23 '20 at 11:34

jezrael

822,522
95
1,334
1,252

could you please elaborate regard the sub() and abs() part? like why do we subtract the population size and from what? not sure I understood this part though it works. thanks:) – Reut Oct 24 '20 at 19:04
@Reut - Sure, we need difference of all values by `val`, so using `sub` for subtract, then get positive and negative values because like `830010 - 830000 = 10` and `820090 - 830000 = -10` so need absolute values for find minimal a bit greater or a bit less rows. – jezrael Oct 25 '20 at 04:29

Pandas find the nearest value for in a column

1 Answers1