How do I compute the average duration of records whose end_date falls within the hour before the current record's start_date?
I can do this with a self-join:
SELECT AVG(p.duration) AS prior_duration
FROM `bigquery-public-data`.london_bicycles.cycle_hire c
JOIN `bigquery-public-data`.london_bicycles.cycle_hire p
  ON c.start_station_id = p.start_station_id
  AND p.end_date BETWEEN TIMESTAMP_SUB(c.start_date, INTERVAL 3600 SECOND)
                     AND c.start_date
But how can I do it more efficiently, without a self-join? Something along the lines of:
AVG(duration)
  OVER(PARTITION BY start_station_id
       ORDER BY UNIX_SECONDS(end_date) ASC
       RANGE BETWEEN 3600 PRECEDING AND CURRENT ROW) AS prior_duration
but with the window anchored on the current record's start_date instead of its end_date.
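To pin down the result I am after, here is a minimal Python sketch of the intended semantics on toy data. It is a brute-force reference, not a BigQuery solution. The function name `prior_durations` and the sample rows are made up for illustration, and timestamps are assumed to be integer Unix seconds. It returns one value per record, as the window version would.

```python
from collections import defaultdict


def prior_durations(records, window_seconds=3600):
    """Return, for each record, the average duration of same-station records
    whose end time lies in [start - window_seconds, start] (inclusive, like
    BETWEEN), or None if there are none.

    records: list of (station_id, start_ts, end_ts, duration) tuples,
    with timestamps as integer Unix seconds (an assumption of this sketch).
    """
    # Group (end_ts, duration) pairs by station, mirroring PARTITION BY.
    by_station = defaultdict(list)
    for station, _start, end, duration in records:
        by_station[station].append((end, duration))

    result = []
    for station, start, _end, _duration in records:
        # Keep records that ended within the hour before this record started.
        matches = [
            d for e, d in by_station[station]
            if start - window_seconds <= e <= start
        ]
        result.append(sum(matches) / len(matches) if matches else None)
    return result


# Hypothetical sample rows: (station_id, start_ts, end_ts, duration)
sample = [
    (1, 0, 600, 600),
    (1, 1000, 1900, 900),
    (1, 2000, 2300, 300),
    (2, 2000, 2500, 500),
    (1, 6000, 6100, 100),
]

print(prior_durations(sample))  # [None, 600.0, 750.0, None, None]
```

For example, the third row starts at 2000, so it averages the two station-1 rides that ended at 600 and 1900, giving 750.0. The last row starts at 6000, and no station-1 ride ended between 2400 and 6000, so it gets no value.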