1

This answer shows how to collapse several columns per group to one array of key:value pairs.

friends_map:
=================================
user_id    friend_id    confirmed
=================================
1          2            true
1          3            false
2          1            true
2          3            true
1          4            false
SELECT user_id, array_agg((friend_id, confirmed)) as friends
FROM friend_map
WHERE user_id = 1
GROUP BY user_id

returns

user_id |           friends            
--------+--------------------------------
      1 | {"(2,true)","(3,false)","(4,false)"}

My question

How to go the other way around, from the result table back to the initial table?

Guru Stron
  • 102,774
  • 10
  • 95
  • 132
Emman
  • 3,695
  • 2
  • 20
  • 44

1 Answers1

1

In PrestoSQL/Trino you can use unnest:

-- sample data
with dataset(user_id, friend_id, confirmed) as(
    values (1, 2, true),
    (1, 3, false),
    (2, 1, true),
    (2, 3, true),
    (1, 4, false)
),
friend_map as (
    SELECT user_id, array_agg((friend_id, confirmed)) as friends
    FROM dataset
    GROUP BY user_id
)

-- query
select user_id, friend_id, confirmed
from friend_map,
unnest(friends) as t(friend_id, confirmed);

UPD

It seems that currently Athena (do not have access to Athena ATM to test it myself) does not support automatic unnest of rows to columns so you can try:

select user_id, r.friend_id, r.confirmed 
from (
    select user_id, cast(r as row(friend_id INT, confirmed VARCHAR))) r
    from friend_map,
    unnest(friends) as t(r)
);

or

select user_id, r[1] friend_id, r[2] confirmed
from friend_map,
unnest(friends) as t(r)

P.S.

Note that Athena is not based on PostgreSQL, it is based on Presto/Trino (dependent on the engine version)

Guru Stron
  • 102,774
  • 10
  • 95
  • 132
  • Thanks. However, running this on Athena I get an error: _MISMATCHED_COLUMN_ALIASES: Column alias list has 2 entries but 't' has 1 columns available_ – Emman May 23 '23 at 16:24
  • @Emman yes, there is possibility of different `ROW` type handling in Athena and Trino (which I used and where the query works). Try `unnest(friends) as t(r)` and `select user_id, r`. If this works then you can try `select user_id, r.friend_id, r.friend_id` – Guru Stron May 23 '23 at 16:25
  • @Emman BTW what engine version are you running on? – Guru Stron May 23 '23 at 16:34
  • I'm using version 3 – Emman May 23 '23 at 16:37
  • So far, `select user_id, r from friend_map, unnest(friends) as t(r)` works. It gives a dictionary (curly braces) in each cell of `r`. However, `select user_id, r.friend_id, r.friend_id from friend_map, unnest(friends) as t(r)` results in an error :/ _COLUMN_NOT_FOUND: Column 'r.friend_id' cannot be resolved or requester is not authorized to access requested resources_ – Emman May 23 '23 at 16:38
  • @Emman what is result of `select typeof(friends)`? – Guru Stron May 23 '23 at 16:40
  • `array(row(integer, boolean))` – Emman May 23 '23 at 16:43
  • @Emman check the docs for [`ROW`](https://prestodb.io/docs/current/language/types.html#row) type, you might need to cast it (`cast(r as row(friend_id INT, confirmed VARCHAR)))`) to access fields. Also you can try indexers - `r[1]` (`r[1] friend_id , r[2] confirmed` or `r[0] friend_id , r[1] confirmed`, though access by index is Trino feature also). – Guru Stron May 23 '23 at 16:44
  • 1
    Eventually I used index (i.e., `r[1]`, `r[2]`), and it works great, thanks!! – Emman May 23 '23 at 16:53