0

I'm trying to compute a rolling average using SQL and I was able to compute said average using the OVER and ROWS BETWEEN commands. I obtained a rolling average using the rows between the preceding 1 and following 1 rows, code seen below along with output in figure 1.

How can I modify my SQL query below so that I compute this rolling average using a 6 hour window for the observed row value CHARTTIME (i.e., 3 hours before and 3 hours after). Any help or advice would be greatly appreciated.

sql_query <- "SELECT SUBJECT_ID, HADM_ID, chartevents.ITEMID, VALUE, CHARTTIME, LABEL, AVG(SAFE_CAST(VALUE AS INT)) OVER(
  PARTITION BY SUBJECT_ID
  ORDER BY CHARTTIME
  ROWS BETWEEN 1 PRECEDING and 1 FOLLOWING
) AS MV_AVG
FROM chartevents
INNER JOIN d_items ON chartevents.ITEMID = d_items.ITEMID
WHERE chartevents.ITEMID IN (211,220045)
ORDER BY SUBJECT_ID, CHARTTIME 
"

Figure 1 enter image description here

theneil
  • 488
  • 1
  • 4
  • 14
  • Consider using RANGE clause instead of ROWS; as in :https://cloud.google.com/bigquery/docs/reference/standard-sql/window-function-calls#def_window_frame – tinazmu Nov 30 '22 at 21:45
  • @tinazmu should've mentioned that I first tried that, and get error "ORDER BY key must be numeric in a RANGE-based window with OFFSET PRECEDING or OFFSET FOLLOWING boundaries, but has type DATETIME" – theneil Nov 30 '22 at 21:55
  • There is a work-around for that here: https://stackoverflow.com/a/58551911/11695049. If all fails you can always use a subquery; why does it have to be a window function? – tinazmu Nov 30 '22 at 21:59
  • @tiazmu, because I'm a noob with SQL and usually process all my data with R's `data.table` after doing a SQL pull; but, I'd rather preprocess it in SQL if possible. How would *you* do it with a subquery? – theneil Dec 01 '22 at 00:21

0 Answers0