Using pandas to look up the values of one column from the closest match between another column and multiple inputs

Question

I have two data frames one is a reference I am comparing the second data frame to the first column of the reference to find the closest matches and then returning the corresponding item from the second column of the reference data frame. I am trying to find a faster method to do this than what I’m currently doing which is a for loop the same as the one at the bottom which works but is there a better way to do it avoiding the iteration?

The expected results from looking up values for

a
1.1
2.1
2.9
3.1 
4.2
5.0

against reference values of

Would be

The method i have is

import numpy as np
import pandas as pd

def reference_df():
    A = [1, 2, 3, 4, 5]
    B = [10, 20, 30, 40, 50]
    df1 = pd.DataFrame(A, columns=['A'])
    df1['B'] = pd.Series(B, index=df1.index)
    return(df1)


def working_df():
    a = [1.1, 2.1, 2.9, 3.1, 4.2, 5.0]
    df1 = pd.DataFrame(a, columns=['a'])
    return(df1)


def Look_up():
    df1 = reference_df()
    df2 = working_df()
    A = df1['A']
    B = df1['B']
    a = df2['a']

    def Look_up_b(a):
        idx = (np.abs(A - a)).argmin()
        b = B[idx]
        return(b)

    b = []
    for i in a:
        b.append(Look_up_b(i))

    df3 = pd.DataFrame(a, columns=['a'])
    df3['b'] = pd.Series(b, index=df3.index)
    return(df3)

print(Look_up())

Why use a nested function here? As an aside, variable and function names should generally follow the `lower_case_with_underscores` style. — AMC, Apr 14 '20 at 15:15
Does this answer your question? [How do I find the closest values in a Pandas series to an input number?](https://stackoverflow.com/questions/30112202/how-do-i-find-the-closest-values-in-a-pandas-series-to-an-input-number) — AMC, Apr 14 '20 at 17:28

score 2 · Accepted Answer · answered Apr 14 '20 at 15:06

This is merge_asof:

# convert reference values to float
references['A'] = references['A'].astype('float64')

pd.merge_asof(df, references, 
              left_on='a', right_on='A',
              direction='nearest'
             )

Output:

     a    A   B
0  1.1  1.0  10
1  2.1  2.0  20
2  2.9  3.0  30
3  3.1  3.0  30
4  4.2  4.0  40
5  5.0  5.0  50

Using pandas to look up the values of one column from the closest match between another column and multiple inputs

1 Answers1