I'm trying to query a table that uses a basic repeating field to store data like this:
+---+----------+------------+
| i | data.key | data.value |
+---+----------+------------+
| 0 | a | 1 |
| | b | 2 |
| 1 | a | 3 |
| | b | 4 |
| 2 | a | 5 |
| | b | 6 |
| 3 | a | 7 |
| | b | 8 |
+---+----------+------------+
I'm trying to figure out how to run a query that gets a result like
+---+----+----+
| i | a | b |
+---+----+----+
| 1 | 4 | 6 |
| 3 | 12 | 14 |
+---+----+----+
where each row represents a non-overlapping sum (i.e. i=1
is the sum of rows i=0
and i=1
) and the data has been pivoted such that data.key
is now a column.
Problem 1:
I did my best to convert this answer to use Standard SQL and ended up with:
SELECT
i,
(SELECT SUM(value) FROM UNNEST(data) WHERE key = 'a') as `a`,
(SELECT SUM(value) FROM UNNEST(data) WHERE key = 'b') as `b`
FROM
`dataset.testing.dummy`)
This works, but I'm wondering if there is a better way to do this, especially since it produces a particularly verbose query when trying to use analytic functions:
SELECT
i,
SUM(a) OVER (ORDER BY i ROWS BETWEEN 1 PRECEDING AND CURRENT ROW) AS `a`,
SUM(b) OVER (ORDER BY i ROWS BETWEEN 1 PRECEDING AND CURRENT ROW) AS `b`
FROM (
SELECT
i,
(SELECT SUM(value) FROM UNNEST(data) WHERE key = 'a') as `a`,
(SELECT SUM(value) FROM UNNEST(data) WHERE key = 'b') as `b`
FROM
`dataset.testing.dummy`)
ORDER BY
i;
Problem 2:
How do I write a ROW
or RANGE
statement such that resulting windows don't overlap. In the last query, I get a rolling sum over the data, which isn't quite what I'm looking to do.
+---+----+----+
| i | a | b |
+---+----+----+
| 0 | 1 | 2 |
| 1 | 4 | 6 |
| 2 | 8 | 10 |
| 3 | 12 | 14 |
+---+----+----+
The rolling sum produces a result for each row, whereas I'm attempting to reduce the number of rows returned.