0

I'd like to run a query on a table that matches any of the following sets of conditions:

SELECT
id,
time
FROM
TABLE
WHERE
<condition1 is True> OR,
<condition2 is True> OR,
<condition3 is True> OR,
...

Each condition might look like:

id = 'id1' AND t > 20 AND t < 40

The values from each WHERE condition (id, 20, 40 above) are rows in a pandas dataframe - that is 20k rows long. I see two options that would technically work:

  • make 20k queries to the database, one for each condition, and concat the result
  • generate a (very long) query as above and submit

My question: what would be an idiomatic/performant way to accomplish this? I suspect neither of the above are appropriate approaches and this problem is somewhat difficult to google.

anon01
  • 10,618
  • 8
  • 35
  • 58
  • 2
    If you are really looking for the pandas way to do this, check out [Dynamically filtering a pandas dataframe](https://stackoverflow.com/questions/45925327/dynamically-filtering-a-pandas-dataframe). – cs95 May 02 '19 at 21:44
  • @cs95 the db is postgres and too large/too much io to pull from the server. It looks like I need to remove the pandas tag :) – anon01 May 02 '19 at 21:45

1 Answers1

0

I think it would be better to create a temporary table with columns id, t1, and t2 and put your 20k rows in there. Then just join to this temporary table:

SELECT DISTINCT TABLE.id, time
FROM TABLE 
JOIN TEMP_TABLE T2 ON
TABLE.ID = T2.ID AND TABLE.T > T1 AND TABLE.T < T2;
Jeremy
  • 6,313
  • 17
  • 20