The input data has the following format:
+------------+------+----------+
| date       | user | product  |
+------------+------+----------+
| 2016-10-01 | Tom  | computer |
+------------+------+----------+
| 2016-10-01 | Tom  | iphone   |
+------------+------+----------+
| 2016-10-01 | Jhon | book     |
+------------+------+----------+
| 2016-10-02 | Tom  | pen      |
+------------+------+----------+
| 2016-10-02 | Jhon | milk     |
+------------+------+----------+
The expected output has the following format:
+------+---------------------+
| user | products            |
+------+---------------------+
| Tom  | computer,iphone,pen |
+------+---------------------+
| Jhon | book,milk           |
+------+---------------------+
The output lists, for each user, all the products that user bought, ordered by purchase date.
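To pin down the expected behaviour, here is a minimal plain-Python sketch of the transformation (this is not Spark code). The function name `products_by_user` and the tuple layout are my own choices for illustration, and I am assuming that products bought on the same date should keep their input order.

```python
from collections import defaultdict

# Sample rows copied from the input table: (date, user, product).
rows = [
    ("2016-10-01", "Tom", "computer"),
    ("2016-10-01", "Tom", "iphone"),
    ("2016-10-01", "Jhon", "book"),
    ("2016-10-02", "Tom", "pen"),
    ("2016-10-02", "Jhon", "milk"),
]


def products_by_user(rows):
    """Return {user: "p1,p2,..."} with each user's products ordered by date."""
    grouped = defaultdict(list)
    # sorted() is stable, so rows sharing a date keep their input order
    # (an assumption about the desired tie-breaking, not stated in the tables).
    for date, user, product in sorted(rows, key=lambda r: r[0]):
        grouped[user].append(product)
    return {user: ",".join(products) for user, products in grouped.items()}


result = products_by_user(rows)
for user, products in result.items():
    print(user, products)
```

The dates are ISO-formatted strings, which sort chronologically as plain text, so the sketch does no date parsing.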
I want to process this data using Spark. Can anyone help me, please? Thank you.