0

Let's say I have a presto sql table as following:

Lunch choice | Person
Pasta | John
Burger | David
Pasta | Bob
Pizza | Lindet
Pasta | Hillary

I want to group by 'Lunch choice', show the number of distinct persons who choose the meal, and give examples of up to two of them (doesn't matter which ones), as the following:

Lunch choice | count | example
Pasta | 3 | John, Bob
Burger | 1 | David
Pizza | 1 | Lindet

I'm struggling with the example part (the distinct count is obviously very simple). Anyone with ideas?

Golden
  • 407
  • 2
  • 12
  • 1
    Does this answer your question? [Presto equivalent of MySQL group\_concat](https://stackoverflow.com/questions/44142356/presto-equivalent-of-mysql-group-concat) – OwlsSleeping Apr 12 '20 at 10:49
  • That's 80% of the solution thanks :) Do you know how to limit the array_agg? – Golden Apr 12 '20 at 10:52

2 Answers2

2

I would just use min() and max():

select lunch, count(*), min(person), nullif(max(person), min(person))
from t
group by lunch;

If you wanted all of them, then you can use array_agg():

select lunch, count(*), array_agg(person)
from t
group by lunch;

You could then extend this using slice() to get just two values:

select lunch, count(*), slice(array_agg(person), 1, 2)
from t
group by lunch;
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
0

For the missing part of the duplicate - you'll need to find the second comma using strpos, and then read only characters up to that point.

something like:

select count(*)   --other columns etc..
, substr([comma separated names], 1, strpos([comma separated names], ',', 2))

If it gets a bit unreadable with all those nested formulae, split it across lines with indentation or use a cte.

OwlsSleeping
  • 1,487
  • 2
  • 11
  • 19