Let's consider two dataframes df1 and df2. I would like to join dataframes based on the date difference only. For Example;
Dataframe 1: (df1)
| version_id | date_invoiced | product_id |
-------------------------------------------
| 1 | 03-07-2020 | 201 |
| 1 | 02-07-2020 | 2013 |
| 3 | 02-07-2020 | 2011 |
| 6 | 01-07-2020 | 2018 |
| 7 | 01-07-2020 | 201 |
Dataframe 2: (df2)
| validfrom | pricelist| pricelist_id |
------------------------------------------
|02-07-2020 | 10 | 101 |
|01-07-2020 | 20 | 102 |
|29-06-2020 | 30 | 103 |
|28-07-2020 | 10 | 104 |
|25-07-2020 | 5 | 105 |
I need to map the pricelist_id and the pricelist based on the the validfrom column present in df2. Say that, based on the least difference between the date_invoiced (df1) and validfrom (df2), the row should be mapped.
Expected Outcome:
| version_id | date_invoiced | product_id | date_diff | pricelist_id | pricelist |
----------------------------------------------------------------------------------
| 1 | 03-07-2020 | 201 | 1 | 101 | 10 |
| 1 | 02-07-2020 | 2013 | 1 | 102 | 20 |
| 3 | 02-07-2020 | 2011 | 1 | 102 | 20 |
| 6 | 01-07-2020 | 2018 | 1 | 103 | 30 |
| 7 | 01-07-2020 | 201 | 1 | 103 | 30 |
I need to map purely based on the difference and the difference should be the least. Always, the date_invoiced (df1), should have closest difference comparing to validfrom (df2). Thanks