I have a table which contains id, offset, text. Suppose input:
id offset text
1 1 hello
1 7 world
2 1 foo
I want output like:
id text
1 hello world
2 foo
I'm using:
df.groupby(id).agg(concat_ws("",collect_list(text))
But I don't know how to ensure the order in the text. I did sort
before groupby
the data, but I've heard that groupby
might shuffle the data. Is there a way to do sort
within group after groupby
data?