Suppose a source table like this:
Year Long Lat Values
========================================
2015 8 1 12
2015 8 1 43
2015 8 1 11
...
The destiny table should be like this:
Year Long Lat Field1 Field2 Field3
========================================================================
2015 8 1 12 43 11
...
¿How can I do something like this in a single query (to use in Google BigQuery) ?
These post could be an approach of what I want: post1 post2
The "problem" is I will use this query in a Python Google Dataflow pipeline, so due to the restriction of read and write using apache beam I would need to do it in a single select (you can obviate the insert statement because I will use an apache beam function for that).
EXTRA
Using this select
select Fecha, Longitud, Latitud,
ARRAY_AGG(Valor) as vector
from `poc-cloud-209212.PruebasIoT.CanonicalFrames`
group by Fecha, Longitud, Latitud
If I try to use:
select Fecha, Longitud, Latitud,
ARRAY_AGG(Valor)[offset(0)] as field1,
ARRAY_AGG(Valor)[offset(1)] as field2
from `poc-cloud-209212.PruebasIoT.CanonicalFrames`
group by Fecha, Longitud, Latitud
I have the error:
Array index 1 is out of bounds (overflow)
The solution
This is what worked for my use case:
Simple query to understand easier the way you could do it base on the table of this post (which could have more rows, with different values than shows in the schema):
select year, long, lat,
array_agg(values)[SAFE_OFFSET(0)] as field1,
array_agg(values)[SAFE_OFFSET(1)] as field2,
array_agg(values)[SAFE_OFFSET(2)] as field3
from t
group by year, long, lat;