Question:
In BigQuery, standard SQL, if I run
SELECT *
FROM mytable
CROSS JOIN UNNEST(mytable.array)
Can I be certain that the resulting row order is the same as the array order?
Example:
Let's say I have the following table mytable
:
Row | id | prediction
1 | abcd | [0.2, 0.5, 0.3]
If I run SELECT * FROM mytable CROSS JOIN UNNEST(mytable.prediction)
, can I be certain that the row order is the same as the array order? I.e. will the resulting table always be:
Row | id | unnested_prediction
1 | abcd | 0.2
2 | abcd | 0.5
3 | abcd | 0.3
More background on use case (argmax):
I'm trying to find the array index with the largest value for the array in each row (argmax), i.e. the second element (0.5
) in the array above. My target output is thus something like this:
Row | id | argmax
1 | abcd | 2
Using CROSS JOIN
, a DENSE_RANK
window function ordered by the prediction
value and a ROW_NUMBER
window function to find the argmax, I am able to make this work with some test data. You can verify with this query:
WITH predictions AS (
SELECT 'abcd' AS id, [0.2, 0.5, 0.3] AS prediction
UNION ALL
SELECT 'efgh' AS id, [0.7, 0.2, 0.1] AS prediction
),
ranked_predictions AS (
SELECT
id,
ROW_NUMBER() OVER (PARTITION BY id) AS rownum, -- This is the ordering I'm curious about
DENSE_RANK() OVER (PARTITION BY id ORDER BY flattened_prediction DESC) AS array_rank
FROM
predictions P
CROSS JOIN
UNNEST(P.prediction) AS flattened_prediction
)
SELECT
id,
rownum AS argmax
FROM
ranked_predictions
WHERE array_rank = 1
It could just be a coincidence that ROW_NUMBER
behaves well in my tests (i.e. that it is ordered according to the unnested array), so it would be nice to be certain.