Can anyone advise on a Spark SQL query that combines multiple rows into one row per unique value of column1, ordered by date ascending? Below is the data.
This is the enrollment table, which holds data in this shape:
column1  column2     timeStamp
abc      enrolled    2022/09/01
abc      changed     2022/09/02
abc      registered  2022/09/04
abc      blocked     2022/09/05
abc      left        2022/09/06
def      enrolled    2022/09/20
def      changed     2022/09/21
def      changed     2022/09/21
def      changed     2022/09/24
def      left        2022/09/25
ghi      registered  2022/09/01
ghi      changed     2022/09/02
ghi      left        2022/09/03
ghi      returned    2022/10/03
The output of the query should look like this:
out_column1  out_column2
abc          enrolled-changed-registered-blocked-left
def          enrolled-changed-changed-left
ghi          registered-changed-left-returned
Note that MySQL's group_concat function is not available in PySpark.