I have a series A, e.g. 1.3, 4.5, 10.11, and a series B, e.g. 0.8, 5.1, 10.1, 0.3. For every element in B I would like the closest number from A, giving a series C: 1.3, 4.5, 10.11, 1.3

If it simplifies things, the closest number from A can instead be the closest number larger than the element of B, in which case the answer would be 1.3, 10.11, 10.11, 1.3

Related to: How do I find the closest values in a Pandas series to an input number?

ihadanny

2 Answers

4

Setup

import pandas as pd

A = pd.Series([1.3, 4.5, 10.11])
B = pd.Series([0.8, 5.1, 10.1, 0.3])

Option 1
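Use pd.Series.searchsorted
For each element of B, this finds the position in A at which that element would have to be inserted to keep A sorted. A must already be sorted in ascending order, and the result is the closest value in A that is greater than or equal to each element of B, which is not necessarily the nearest one.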

A.iloc[A.searchsorted(B)]

0     1.30
2    10.11
2    10.11
0     1.30
dtype: float64
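searchsorted returns len(A) for any element of B that is larger than every value in A, and iloc rejects that position. The following is a minimal sketch of one way to guard against it, assuming that "no larger value exists" should be reported as NaN (that choice, and the names pos and in_range, are illustrative and not part of the answer above):

```python
import numpy as np
import pandas as pd

A = pd.Series([1.3, 4.5, 10.11])           # assumed sorted in ascending order
B = pd.Series([0.8, 5.1, 10.1, 0.3, 30])   # 30 exceeds every value in A

# Insertion position of each element of B within A.
pos = A.searchsorted(B)

# A position equal to len(A) means A has no value >= that element.
in_range = pos < len(A)

# Clip the positions so the lookup stays in bounds, then mask the invalid hits.
C = pd.Series(A.to_numpy()[np.minimum(pos, len(A) - 1)], index=B.index)
C = C.where(in_range)
print(C)
```

Falling back to the last value of A instead of NaN would only require dropping the final where call.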

Option 2
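But to get the true nearest value, you could hack the pd.Series.reindex method: build a series whose index is its own values, then reindex it by B with method='nearest'. This needs the values of A to be unique and sorted.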

pd.Series(A.values, A.values).reindex(B.values, method='nearest')

0.8      1.30
5.1      4.50
10.1    10.11
0.3      1.30
dtype: float64
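Another route to the nearest value is pd.merge_asof with direction='nearest'. The sketch below is an alternative to the reindex trick, not something the answer itself proposes. merge_asof needs both sides sorted on their keys, so B is sorted first and its original order restored afterwards. The column names b, a and nearest are made up for the example.

```python
import pandas as pd

A = pd.Series([1.3, 4.5, 10.11])
B = pd.Series([0.8, 5.1, 10.1, 0.3])

# Left side: B sorted by value, with its original labels kept in the "index" column.
left = B.rename("b").reset_index().sort_values("b")

# Right side: sorted values of A as the join key, plus a copy to read back.
right = pd.DataFrame({"a": A.sort_values().to_numpy()})
right["nearest"] = right["a"]

# For every row of left, pick the row of right whose key is closest.
merged = pd.merge_asof(left, right, left_on="b", right_on="a", direction="nearest")

# Restore the original order of B.
C = merged.set_index("index")["nearest"].reindex(B.index)
print(C.tolist())
```

Passing direction='forward' instead should give the "closest larger value" variant from the question.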
piRSquared
  • Option 1 fails with `positional indexers are out-of-bounds` if `B = pd.Series([0.8, 5.1, 10.1, 0.3, 30])` – ihadanny Jan 03 '18 at 14:06
0

I tried the reindex method, but it raises an error when series A contains non-unique values.

import pandas as pd

A = pd.Series([1.0, 4.0, 10.0, 4.0, 5.0, 19.0, 20.0])
B = pd.Series([0.8, 5.1, 10.1, 0.3, 5.5])
pd.Series(A.values, A.values).reindex(B.values, method='nearest')

ValueError: cannot reindex a non-unique index with a method or limit

This workaround sorts the index and drops the duplicates before reindexing; I believe it could be useful to others.

A = pd.Series([1.0, 4.0, 10.0, 4.0, 5.0, 19.0, 20.0])
B = pd.Series([0.8, 5.1, 10.1, 0.3, 5.5])
pd.Series(A.values, A.values).sort_index().drop_duplicates().reindex(B.values, method='nearest')

0.8      1.0
5.1      5.0
10.1    10.0
0.3      1.0
5.5      5.0
dtype: float64
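The same sort-and-deduplicate idea can be written directly in NumPy, which avoids building an intermediate index. This is only a sketch under a few assumptions: nearest_in is a hypothetical helper name, A is non-empty and contains no NaN, and ties are resolved toward the smaller value.

```python
import numpy as np
import pandas as pd

def nearest_in(A, B):
    """For each element of B, return the closest value found in A (hypothetical helper)."""
    a = np.unique(A.to_numpy())   # sorted copy of A with duplicates removed
    b = B.to_numpy()
    # Index of the right-hand neighbour, clipped so a left-hand neighbour always exists.
    right = np.clip(np.searchsorted(a, b), 1, len(a) - 1)
    left = right - 1
    # Take the left neighbour when it is at least as close as the right one.
    pick_left = (b - a[left]) <= (a[right] - b)
    return pd.Series(np.where(pick_left, a[left], a[right]), index=B.index)

A = pd.Series([1.0, 4.0, 10.0, 4.0, 5.0, 19.0, 20.0])
B = pd.Series([0.8, 5.1, 10.1, 0.3, 5.5])
C = nearest_in(A, B)
print(C.tolist())
```

Because the positions are clipped, elements of B below the minimum or above the maximum of A map to the first or last value rather than raising an error.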
Debjit Bhowmick