I have a dataframe as follows:
+---------------+--------------------+
|IndexedArtistID| recommendations|
+---------------+--------------------+
| 1580|[[919, 0.00249262...|
| 4900|[[41749, 7.143963...|
| 5300|[[0, 2.0147272E-4...|
| 6620|[[208780, 9.81092...|
+---------------+--------------------+
I want to split the recommendations column so as to have a dataframe as follows:
+---------------+--------------------+
|IndexedArtistID| recommendations|
+---------------+--------------------+
| 1580|919 |
| 1580|0.00249262 |
| 4900|41749 |
| 4900|7.143963 |
| 5300|0 |
| 5300|2.0147272E-4 |
| 6620|208780 |
| 6620|9.81092 |
+---------------+--------------------+
So basically, I want to split the feature vector into columns and then merge those columns into a single column. The merging part is described in: How to split single row into multiple rows in Spark DataFrame using Java. Now, how do I carry out the splitting part in Java? For Scala it is explained here: Spark Scala: How to convert Dataframe[vector] to DataFrame[f1:Double, ..., fn: Double)], but I am not able to find a way to do the same thing in Java as shown in that link.
The schema of the dataframe is below; the value of IndexedUserID should go into the newly created recommendations column:
root
|-- IndexedArtistID: integer (nullable = false)
|-- recommendations: array (nullable = true)
| |-- element: struct (containsNull = true)
| | |-- IndexedUserID: integer (nullable = true)
| | |-- rating: float (nullable = true)
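What I have in mind, based on the schema above, is something like the following sketch (untested; it assumes Spark's Java API with the static helpers from org.apache.spark.sql.functions, and that df holds the dataframe shown above): first explode the array of structs into one row per struct, then stack the two struct fields into a single string column.

```java
import static org.apache.spark.sql.functions.*;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

// df: Dataset<Row> with the schema shown above (assumption)
Dataset<Row> result = df
    // one row per (IndexedUserID, rating) struct in the array
    .withColumn("rec", explode(col("recommendations")))
    // stack the two struct fields into one column; cast both to string
    // so they can share a column, matching the desired output
    .withColumn("value", explode(array(
        col("rec.IndexedUserID").cast("string"),
        col("rec.rating").cast("string"))))
    .select(col("IndexedArtistID"), col("value").alias("recommendations"));
```

Is this double-explode approach the right way to do it in Java, or is there a cleaner equivalent of the Scala solution from the linked question?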