Currently I know how to use the BigQuery Python API to make queries from <company_warehouse>.<table>
in the form
SELECT id, field1, field2, ...
FROM '<company_warehouse>.<table>'
WHERE field1==...
What I'd like to do is perform the above query but add in a restriction to search only on a subset of the total data restricted to a large set (list) of IDs I have in Python (as a list or pandas Series). i.e., the same as adding a clause WHERE id IN (id1,...,idn)
, but this method is inefficient for very large n.
This question has some SQL answers suggesting to do it with inner joins, but the answers aren't written with clearly usable code (even in other SQL DBMS's). How do I get this done in BigQuery?