Table dataset.transactions
contains ~5.000.000 records.
1) This query takes ~ 3 seconds:
SELECT *
FROM dataset.transactions
WHERE customer = 'c1' AND year = 2017
2) This query takes about 10 seconds:
SELECT
salesrep_id AS id,
date AS lastUsedForFplDate,
fpl AS individual_fpl,
ANY_VALUE(salesrep_name) AS salesrep_name,
ANY_VALUE(customer) AS customer
FROM dataset.transactions VT1
WHERE date = (
SELECT
MAX(date)
FROM dataset.transactions VT2
WHERE
VT1.salesrep_id = VT2.salesrep_id
)
GROUP BY
salesrep_id,
date,
fpl
3) Whereas this one takes more than 200 seconds (tried few times, cancelled it each time after 200 seconds):
WITH transactions AS (
SELECT *
FROM dataset.transactions
WHERE customer = 'c1' AND year = 2017
)
SELECT
salesrep_id AS id,
date AS lastUsedForFplDate,
fpl AS individual_fpl,
ANY_VALUE(salesrep_name) AS salesrep_name,
ANY_VALUE(customer) AS customer
FROM transactions VT1
WHERE date = (
SELECT
MAX(date)
FROM transactions VT2
WHERE
VT1.salesrep_id = VT2.salesrep_id
)
GROUP BY
salesrep_id,
date,
fpl
Third query consist of the two queries above, except the #2 uses output of #1 as a source.
Why does it take so much time?