0

I am using redshift

I have a table like this : metric is a super type, built with the array() function within redshift

user metrics
red array(2021, 120)
red array(2020, 99)
blue array(2021, 151)

I would like to do : select user, max(metrics) from table group by user

and get this :

user metrics
red array(2021, 120)
blue array(2021, 151)

Sadly using this query, I only get null values

Do you know how to handle that ?

Thanks

alankar
  • 41
  • 3

1 Answers1

0

If you are familiar with Redshift Spectrum, the logic is very similar to unnest an array field when you query an external schema.

In your case, the query is pretty simple:

SELECT t.user, max(metric)
FROM my_schema.my_table as t, t.metrics as metric
GROUP BY 1

If the array contains types other than numerical ones, you can simply cast it to int or double like:

max(metric::int)

In this way, pure string such as "hello world" are considered as null, but string like "33333" is converted to int.

Hyruma92
  • 836
  • 8
  • 24
  • It seems here it would only send back the maximum value within the array for each user : | red | 2021 | and | blue | 2021 | It may be dangerous if the second element of array would have been a string for exemple (which would be considered as the max value) – alankar Jun 23 '21 at 15:37
  • @alankar I updated the answer to handle that case – Hyruma92 Jun 24 '21 at 09:31