0

i need a pandas equivalent to following SQL join which is right now to slow...

# SQL lite solution

import sqlite3

#Make the db in memory
conn = sqlite3.connect(':memory:')

#write the tables
df_weather.to_sql('weather', conn, index=False)
data.to_sql('flights', conn, index=False)

qry = '''
    select *
    from weather left join flights
    ON weather.AIRPORT_IATA = flights.destination
    AND flights.scheduled_arrival_time BETWEEN weather.VALID_FROM_TIME AND weather.VALID_TO_TIME
    '''
data = pd.read_sql_query(qry, conn)

is there a pandas way to merge two dfs with SQL like above? (using between)

scheduled_arrival_time, VALID_FROM_TIME and VALID_TO_TIME are timestamps.

a simple pandas merge won't work.

Bernouy
  • 55
  • 7
  • Please provide a [mcve](/help/mcve). – rpanai Oct 08 '19 at 11:44
  • Possible duplicate of [How to join two dataframes for which column values are within a certain range?](https://stackoverflow.com/questions/46525786/how-to-join-two-dataframes-for-which-column-values-are-within-a-certain-range) – rpanai Oct 08 '19 at 11:46
  • It's a duplicate as stated above. If there is a memory/speed problem you might consider using `dask` – rpanai Oct 08 '19 at 11:47
  • i asked for a pandas way. in the possible duplicate there is no clear pandas way. maybe there is a way now? – Bernouy Oct 08 '19 at 11:51
  • Hi @Bernouy, first if you can provide a [mcve](/help/mcve) it will be great. Then the answer I suggested uses pandas and AFAIK there is no a way to directly translate your query to `pandas`. Again you might consider to left join with [dask](https://docs.dask.org/en/latest/dataframe-joins.html) or [vaex](https://vaex.readthedocs.io/en/latest/tutorial.html#Joining) and then filter. – rpanai Oct 08 '19 at 12:06

0 Answers0