14

So I have 5 rows like this

userid, col
--------------
1, a
1, b
2, c
2, d
3, e

How would I do query so it will look like this

userid, combined
1, a b
2, c d
3, e
Tom H
  • 46,766
  • 14
  • 87
  • 128
haoxu
  • 143
  • 1
  • 1
  • 4

4 Answers4

44

In hive you can use

SELECT userid, collect_set(combined) FROM tabel GROUP BY user_id;

collect_set removes duplicated. If you need to keep them you can check this post:

COLLECT_SET() in Hive, keep duplicates?

Community
  • 1
  • 1
PanchaGil
  • 2,091
  • 2
  • 15
  • 8
14

Use the GROUP_CONCAT aggregate function:

  SELECT yt.userid,
         GROUP_CONCAT(yt.col SEPARATOR ' ') AS combined
    FROM YOUR_TABLE yt
GROUP BY yt.userid

The default separator is a comma (","), so you need to specify the SEPARATOR of a single space to get the output you desire.

If you want to ensure the order of the values in the GROUP_CONCAT, use:

  SELECT yt.userid,
         GROUP_CONCAT(yt.col ORDER BY yt.col SEPARATOR ' ') AS combined
    FROM YOUR_TABLE yt
GROUP BY yt.userid
OMG Ponies
  • 325,700
  • 82
  • 523
  • 502
  • I am currently using Hive so I cant do the group by part because GROUP_CONCAT is not recognized as an aggregation function, anyway around it? – haoxu Sep 13 '10 at 20:47
  • @haoxu: Next option (from a SQL prespective) would be a cursor and string concatenation, but it would require a MySQL function. I'm not familiar with Hive - is there no way of using a native query? – OMG Ponies Sep 13 '10 at 21:11
  • 1
    Filed a JIRA request for GROUP_CONCAT in HiveQL: https://issues.apache.org/jira/browse/HIVE-1689 – Jeff Hammerbacher Oct 04 '10 at 11:25
  • 2
    GROUP_CONCAT doesn't actually exist in Hive yet... This solution is only applicable to MySQL. – thatha Oct 26 '12 at 22:24
  • Hi i am testing the same on sql lite https://sqliteonline.com/ GROUP_CONCAT is not working – BdEngineer Jun 16 '17 at 18:50
2
SELECT 
  userid,
  concat_ws(" ", collect_set(col)) AS combined
FROM table 
GROUP BY userid
formath
  • 319
  • 3
  • 17
2
  1. MySQL with duplicates: select col1, group_concat(col2) from table1 group by col1
  2. MySQL without duplicates: select col1, group_concat(distinct col2) from table1 group by col1
  3. Hive with duplicates: select col1, collect_list(col2) from table1 group by col1
  4. Hive without duplicates: select col1, collect_set(col2) from table1 group by col1
Safwan
  • 3,300
  • 1
  • 28
  • 33