From looking through Stackoverflow, and other sources, I believe that changing my dataframes to data.tables and using setkey, or similar, will give what I want. But as of yet I have been unable to get a working Syntax.
I have two data frames, one containing 26000 rows and the other containing 6410 rows.
The first dataframe contains the following columns:
Customer name, Base_Code, Idenity_Number, Financials
The second dataframe holds the following:
Customer name, Base_Code, Idenity_Number, Financials, Lapse
Both sets of data have identical formatting.
My goal is to join the Lapse column in the second dataframe to first dataframe. The issue I have is that the numeric value in Financials does not match between the two datasets and I only want the closest match in DF1 to have the value in the Lapse column in DF2 against it.
There will be examples where there are multiple entries for the same customer ID and Base Code in each dataframe, so I need to merge the two based on Idenity_Number and Base_Code (which is exact) and then match against the nearest financial numeric match for each entry only.
There will never be more entries in the DF2 then held within DF1 for each Customer and Base_Code.
Here is an example of DF1:
Here is an example of DF2:
And finally, here is what I want end up with:
If we use Jessica Rabbit as the example we have a match against DF1 and DF2, the financial value of 1240 from DF1 was matched against 1058 in DF2 as that was the closest match.