1

I was trying to merge two dataframes using a less-than operator. But I ended up using pandasql.

Is it possible to do the same query below using pandas functions? (Records may be duplicated, but that is fine as I'm looking for something similar to cumulative total later)

sql = '''select A.Name,A.Code,B.edate from df1 A
        inner join df2 B on A.Name = B.Name
        and A.Code=B.Code
        where A.edate < B.edate '''

df4 = sqldf(sql)

The suggested answer seems similar but couldn't get the result expected. Also the answer below looks very crisp.

sjd
  • 1,329
  • 4
  • 28
  • 48
  • Does this answer your question? [How to do a conditional join in python Pandas?](https://stackoverflow.com/questions/23508351/how-to-do-a-conditional-join-in-python-pandas) – smci Jun 22 '20 at 06:15

1 Answers1

2

Use:

df = df1.merge(df2, on=['Name','Code']).query('edate_x < edate_y')[['Name','Code','edate_y']]
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
  • 1
    jezrael, could you add this method to [How to do/workaround a conditional join in python Pandas?](https://stackoverflow.com/questions/23508351/how-to-do-workaround-a-conditional-join-in-python-pandas) – smci Jun 22 '20 at 06:20
  • @smci - I think it is something else, so not dupe, only similar. – jezrael Jun 22 '20 at 06:21
  • 3
    `df.query` is very useful but under-used. +1 – Ch3steR Jun 22 '20 at 06:23
  • rael, perhaps, but that other question could sorely use your mention of `df.query()`, so I recommend you add it there too. – smci Jun 22 '20 at 08:54
  • I faced a memory error though with large dataset while merging. Though it won't be an issue for small datasets. My actual datasets are 5M and 2M records. and it gave `MemoryError` while `sqldf` worked without error, – sjd Jun 22 '20 at 08:54
  • 1
    @sjd - Unfortunately pandas working not so nice like sql libs :( – jezrael Jun 22 '20 at 08:55
  • This works well for me. If people are facing a memory error, I would suggest subsetting your data (ie if working by user, do chunks of 500 users etc). Excellent suggestion I was stuck. – Christina Stebbins Jun 10 '22 at 19:07