0
  • I have two tables Order and Product.
  • Both tables have one ID field, 3 feature fields, and one Date field.
  • I want to create an output table that can help me map from the Order table to the Product table. The conditions to satisfy are:
    • The order should match all the 3 features with the Product and should be a one-one mapping.
    • The Delivery Date of the Product should be before the Order date.
    • The product-order combination should be unique. For 2 orders with the same combination of features, they will get assigned to the product which has the earliest delivery date. Ideally, the first order should get the first delivered product, and the second-order should get the second delivered product if they have the same set of features.

Please see the screenshots for an example:

image I tried using left join but I am getting one order mapped to multiple products.

I need to have the one-one mapping of an order to a single product in such a way that the delivery date is before the order date.

Please advise how to write code in Python.

import pandas as pd

order = pd.DataFrame({'order_id': [1,2,3,4,5], 
                       'Feature 1': ['A', 'A', 'B', 'B','B'],
                       'Feature 2': ['D', 'E', 'D', 'E','E'],
                       'Feature 3': ['G', 'G', 'H', 'H','H'],
                       'Delivery Date': ['01-Feb-2021','02-Apr-2021','03-May-2021','04-Jun-2021','05-Jun-2021']
                      })

product = pd.DataFrame({'product_id': [1,2,3,4,5,6,7,8,9,10,11,12], 
                       'Feature 1': ['A','A','A','B','B','B','B','A','B','B','B','B'],
                       'Feature 2': ['D','E','E','D','E','D','E','D','E','D','E','E'],
                       'Feature 3': ['G','H','G','G','G','H','H','G','H','H','H','H'],
                       'Delivery Date': ['15-Feb-2021','16-Feb-2021','17-Feb-2021','18-Feb-2021','19-Feb-2021','20-Jun-2021',
                                      '21-Jul-2021','01-Jan-2021','21-Feb-2021','20-Feb-2021','10-Feb-2021','21-Jul-2021']
                      })
Sreenivas
  • 1
  • 1
  • You should use inner join. `df1.merge(df2, how='inner', on='a')` – Irfanuddin Feb 02 '22 at 19:11
  • Does this answer your question? [pandas: merge (join) two data frames on multiple columns](https://stackoverflow.com/questions/41815079/pandas-merge-join-two-data-frames-on-multiple-columns) – Irfanuddin Feb 02 '22 at 19:15
  • Hi @irfanuddin, inner or left join will given multiple combinations as output. new_df = pd.merge(order, product, how='left', left_on=['Feature_1','Feature_2','Feature_3'], right_on =['Feature_1','Feature_2','Feature_3']), new_df1 = pd.merge(order, product, how='inner',on= ['Feature_1','Feature_2','Feature_3']). For example. Order 1 gets assigned with both Product 1 and 8. Ideally, it should only be assigned with 8 since product 8 has the earliest delivery date. Also, product 8 should no longer be available to match for the remaining orders in the list. – Sreenivas Feb 02 '22 at 19:42
  • I get it, I probably can write a piece of code in the morning :) – Irfanuddin Feb 02 '22 at 19:53

0 Answers0