2

I am trying to flatten a security table to make a single row per country. I am using Hive as the execution engine currently in hortonworks if this makes a difference to the SQL required.

An example of what im trying to achieve is below.

(country, Name)
(US, 'Matt'),
(US, 'Rocks'),
(GB, 'Stylus'),
(FR, 'Foo'),
(FR, 'Bar'),
(FR, 'Baz')

The query result I am after :

Country Name
----------- ------------------------------
GB           Stylus
FR           Bar, Baz, Foo
US           Matt, Rocks

I am really lost as to where to start. Does anyone know? I found some syntax on another thread that work fine in MSSQL but not here in hortonworks.

Optimal way to concatenate/aggregate strings

Many thanks

leftjoin
  • 36,950
  • 8
  • 57
  • 116
AndusM
  • 135
  • 5

2 Answers2

2

I'm not really familiar with Hadoop, but following Column to comma separated value in Hive you can make this:

SELECT country, collect_set(Name)
FROM your_table
GROUP BY country;
Blag
  • 5,818
  • 2
  • 22
  • 45
1

Use collect_set for string aggregation into array and concat_ws to convert array to the comma separated string

select country, concat_ws(',', collect_set(Name)) 
   from table
 group by country
leftjoin
  • 36,950
  • 8
  • 57
  • 116