I have a table of IP addresses and a table of IP address ranges (start ip, end ip) that I'd like to join together. I've been able to make this work with the following query:
SELECT * FROM `ips` i
JOIN `ranges` a
ON NET.SAFE_IP_FROM_STRING(i.ip)
BETWEEN NET.SAFE_IP_FROM_STRING(a.start_ip)
AND NET.SAFE_IP_FROM_STRING(a.end_ip)
The problem I'm having is that it scales really badly. To do it for 10 IPs takes around 8 seconds, 100 takes 30 seconds, and 1000 takes a few minutes. I'd like to be able to do this for tens of millions of rows. (I have tried writing the output of NET.SAFE_IP_FROM_STRING
to the ranges table, but it only speeds things up by around 10%, and doesn't help with scaling).
The ranges don't overlap, so for every row in the input table I expect 0 or 1 rows in the output table. A LATERAL JOIN
would let me do that and almost certainly speed things up, but I don't think BigQuery supports them. Is there any other way to make this query faster and scalable?