0

I have following two dataframe in pandas

trans_df
code     price       date           time        product
12023    71.23       01-01-2018     06:23:00    MS  
12023    61          01-01-2018     07:56:00    HS
12023    71.23       01-01-2018     08:34:00    MS
12023    71.30       01-01-2018     06:03:00    MS  
12023    61          01-01-2018     11:43:00    HS
12023    71.23       01-01-2018     10:11:00    MS
12023    71.23       01-01-2018     04:23:00    MS

master_price
code  price  effective_date_from    effective_date_to  time_from   time_to  product
12023 71.23  01-01-2018             02-01-2018         06:00:00    05:59:00  MS     
12023 61     01-01-2018             02-01-2018         06:00:00    05:59:00  HS 
12023 72.23  01-01-2018             02-01-2018         10:00:00    05:59:00  MS   

My desired dataframe would be

trans_df
code     price       date           time        product   flag     price
12023    71.23       01-01-2018     06:23:00    MS        match    71.23
12023    61          01-01-2018     07:56:00    HS        match    61
12023    71.23       01-01-2018     08:34:00    MS        match    71.23
12023    71.30       01-01-2018     06:03:00    MS        mismatch 71.30
12023    61          01-01-2018     11:43:00    HS        match    61
12023    71.23       01-01-2018     10:11:00    MS        mismatch 72.23 
12023    71.23       01-01-2018     14:23:00    MS        mismatch 72.23

Logic is, I want to compare price set in trans_df for MS and HS is in the range of datetime and price in master_price There may be two different price changes in a day.

e.g. in master_price on 01-01-2018 at 06:00:00 which is 71.23 and next price change is at 10:00:00 so transactions happening between 06:00:00- 10:00:00 should charge the price 72.23 and after 10:00 it should be 71.23 till effective_date_to if we get the mismatch in price then we will flag it as mismatch with correct price.

Neil
  • 7,937
  • 22
  • 87
  • 145

1 Answers1

0

Following the approach in Merging two dataframes based on a date between two other dates without a common column,

This approach is to:

  1. Create a Cartesian product between the two tables by joining on the product field
  2. Filtering out rows where the dates do not coincide
  3. Create your flag column based on whether the rows match

This solution will be sub-optimal for a high number of rows.

rich
  • 520
  • 6
  • 21
  • This method will not work on large dataset because of cartesian product. – Neil Dec 07 '18 at 08:17
  • Ok, another alternative (I haven't tested) could be this https://stackoverflow.com/a/45510620/10290585 using a SQL style join. – rich Dec 07 '18 at 08:50