I have a dataset like the following:
Input Dataset
Id, Parent_id, Data
-----------------------
1, NULL, favorite: 3
2, NULL, favorite: 4
Output Dataset
Id, Parent_id, Data
------------------------
1, NULL, favorite: 3
1_t1, 1, favorite: 3
1_t2, 1, favorite: 3
1_t3, 1, favorite: 3
2, NULL, favorite: 4
2_t1, 2, favorite: 4
2_t2, 2, favorite: 4
2_t3, 2, favorite: 4
2_t4, 2, favorite: 4
As you can see above, I am trying to explode the favorite count in the Data column into individual rows (one new row per count) and use the Parent_id column to point each new row back to its root record.
So far I have tried using the Spark SQL explode function to do this, but I wasn't able to get it working.
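To pin down exactly what I expect, here is a minimal plain-Python sketch of the transformation (not Spark code). The function name `expand_favorites` and the regular expression are my own illustrative assumptions; it assumes the Data column always looks like `favorite: <n>`.

```python
import re

def expand_favorites(rows):
    """Return each input row followed by one child row per favorite count.

    rows: list of (id, parent_id, data) tuples, where data looks like "favorite: 3".
    """
    out = []
    for row_id, parent_id, data in rows:
        # Keep the original (root) record unchanged.
        out.append((row_id, parent_id, data))
        # Assumption: the count is the integer after "favorite:".
        match = re.search(r"favorite:\s*(\d+)", data)
        count = int(match.group(1)) if match else 0
        # Child ids are "<id>_t1" .. "<id>_t<count>", each pointing at the root id.
        for i in range(1, count + 1):
            out.append((f"{row_id}_t{i}", row_id, data))
    return out

rows = [("1", None, "favorite: 3"), ("2", None, "favorite: 4")]
for row in expand_favorites(rows):
    print(row)
```

As far as I understand, explode only accepts array or map columns, so a Spark version would first need to turn the count into an array (for example with `sequence`, available in Spark 2.4+). That step is not shown here.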