I have two data frames, one of which is a reference. I am comparing the values in the second data frame against the first column of the reference to find the closest match, and then returning the corresponding item from the second column of the reference. I am trying to find a faster method than what I'm currently doing, which is a for loop like the one at the bottom. It works, but is there a better way to do it that avoids the iteration?
The expected results from looking up values for
a
1.1
2.1
2.9
3.1
4.2
5.0
against reference values of
A B
1 10
2 20
3 30
4 40
5 50
would be
a B
1.1 10
2.1 20
2.9 30
3.1 30
4.2 40
5.0 50
The method I have is:
import numpy as np
import pandas as pd


def reference_df():
    # Reference table: column A is matched against, column B is returned.
    A = [1, 2, 3, 4, 5]
    B = [10, 20, 30, 40, 50]
    df1 = pd.DataFrame(A, columns=['A'])
    df1['B'] = pd.Series(B, index=df1.index)
    return df1


def working_df():
    # Values to look up against the reference.
    a = [1.1, 2.1, 2.9, 3.1, 4.2, 5.0]
    df1 = pd.DataFrame(a, columns=['a'])
    return df1


def Look_up():
    df1 = reference_df()
    df2 = working_df()
    A = df1['A']
    B = df1['B']
    a = df2['a']

    def Look_up_b(a):
        # Position of the reference value closest to a.
        idx = (np.abs(A - a)).argmin()
        # argmin is positional, so index B by position rather than by label.
        b = B.iloc[idx]
        return b

    # One lookup per value: this is the loop I would like to avoid.
    b = []
    for i in a:
        b.append(Look_up_b(i))
    df3 = pd.DataFrame(a, columns=['a'])
    df3['b'] = pd.Series(b, index=df3.index)
    return df3


print(Look_up())
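
For comparison, here is a minimal sketch of one way the loop might be replaced with NumPy broadcasting, which finds every nearest index in a single call. The names `reference`, `working` and `nearest_idx` are only illustrative, and the sketch assumes the reference is small enough that a len(working) x len(reference) matrix of differences fits in memory.

```python
import numpy as np
import pandas as pd

# Same data as above (variable names here are illustrative).
reference = pd.DataFrame({"A": [1, 2, 3, 4, 5], "B": [10, 20, 30, 40, 50]})
working = pd.DataFrame({"a": [1.1, 2.1, 2.9, 3.1, 4.2, 5.0]})

ref_keys = reference["A"].to_numpy()
values = working["a"].to_numpy()

# Column vector minus row vector broadcasts to a 2-D array of shape
# (len(values), len(ref_keys)) holding every pairwise distance.
distances = np.abs(values[:, None] - ref_keys[None, :])

# For each row (one lookup value), the position of the closest reference key.
nearest_idx = distances.argmin(axis=1)

# Pick the matching B values by position and attach them as a new column.
result = working.assign(b=reference["B"].to_numpy()[nearest_idx])
print(result)
```

If the reference is large, the pairwise matrix can get expensive. `pandas.merge_asof` with `direction="nearest"` may be another option in that case, provided both key columns are sorted and share a dtype.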