I have two pandas dataframes:
import pandas as pd
df1 = pd.DataFrame({'col1': [1.2574, 5.3221, 4.3215, 9.8841], 'col2': ['a', 'b', 'c', 'd']})
df2 = pd.DataFrame({'col1': [4.326, 9.89, 5.326, 1.2654], 'col2': ['w', 'x', 'y', 'z']})
Now I want to compare the values in col1 of both dataframes. Take 5.3221 from df1: I want to check whether this value exists in df2['col1'] within a tolerance of 0.005 (in this example, 5.326 from df2['col1'] should be considered equal to 5.3221), and then build a third dataframe holding both columns from df1 and df2 for the rows where this condition is true.
The expected output is:
   col1    col2  col1.1  col2.2
0  5.3221  b     5.326   y
1  4.3215  c     4.326   w
I have defined a function which is able to take care of the error condition:
def close(a, b, e=0.005):
    return round(abs(a - b), 3) <= e
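To make the intended result concrete, a loop-based version could be sketched as follows. The names rows and result are placeholders chosen for illustration, and the output column names are copied from the expected output above. This is only a reference for the behaviour I am after, not the solution I want.

```python
import pandas as pd

df1 = pd.DataFrame({'col1': [1.2574, 5.3221, 4.3215, 9.8841], 'col2': ['a', 'b', 'c', 'd']})
df2 = pd.DataFrame({'col1': [4.326, 9.89, 5.326, 1.2654], 'col2': ['w', 'x', 'y', 'z']})

def close(a, b, e=0.005):
    # True when the difference, rounded to 3 decimals, is within the tolerance e
    return round(abs(a - b), 3) <= e

# Compare every value of df1['col1'] with every value of df2['col1']
rows = []  # placeholder name: collects the matching pairs
for a_val, a_label in zip(df1['col1'], df1['col2']):
    for b_val, b_label in zip(df2['col1'], df2['col2']):
        if close(a_val, b_val):
            rows.append((a_val, a_label, b_val, b_label))

# Column names taken from the expected output
result = pd.DataFrame(rows, columns=['col1', 'col2', 'col1.1', 'col2.2'])
print(result)
```

On the sample data this keeps the pairs (5.3221, 5.326) and (4.3215, 4.326), while 9.8841 and 9.89 are rejected because their difference rounds to 0.006.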
But I don't know how to apply this to the data without using a for loop. I also know that I can use numpy.intersect1d, but I cannot figure out how.
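For context, numpy.intersect1d only returns values that are exactly equal in both arrays, so on this data it comes back empty. One possible direction, sketched here under assumptions, is a pairwise comparison with NumPy broadcasting. The names exact, mask, and matched, and the _df2 suffix, are placeholders for illustration. Note also that this sketch compares the raw difference against 0.005 without the rounding step, so it can differ from a rounded check for differences between 0.005 and 0.0055.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'col1': [1.2574, 5.3221, 4.3215, 9.8841], 'col2': ['a', 'b', 'c', 'd']})
df2 = pd.DataFrame({'col1': [4.326, 9.89, 5.326, 1.2654], 'col2': ['w', 'x', 'y', 'z']})

a = df1['col1'].to_numpy()
b = df2['col1'].to_numpy()

# Exact matching only: no value is identical in both columns, so this is empty
exact = np.intersect1d(a, b)

# Pairwise absolute differences via broadcasting: shape (len(df1), len(df2))
mask = np.abs(a[:, None] - b[None, :]) <= 0.005

# Row positions in df1 (i) and df2 (j) of every pair within the tolerance
i, j = np.nonzero(mask)

# Put the matching rows side by side; '_df2' is an assumed suffix
matched = pd.concat(
    [df1.iloc[i].reset_index(drop=True),
     df2.iloc[j].reset_index(drop=True).add_suffix('_df2')],
    axis=1,
)
print(matched)
```

The mask holds one boolean per pair of rows, so memory use grows with len(df1) * len(df2); that is fine for small frames but may be a concern for large ones.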
Any help would be appreciated :)
EDIT: The suggested duplicate doesn't address my problem. That question only deals with combining two dataframes based on similar-looking indices. Also, difflib is used to find word matches, not numbers. My scenario is completely different.