I use Spark 1.5. I have a DataFrame A_DF as follows:
+--------------------+--------------------+
| id| interactions|
+--------------------+--------------------+
| id1 |30439831,30447866...|
| id2 |37597858,34499875...|
| id3 |30447866,32896718...|
| id4 |33029476,31988037...|
| id5 |37663606,37627579...|
| id6 |37663606,37627579...|
| id7 |36922232,37675077...|
| id8 |37359529,37668820...|
| id9 |37675077,37707778...|
+--------------------+--------------------+
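(For a reproducible example, a toy version of A_DF can be built like this; the ids and values below are made up, the real data is loaded from elsewhere:)

import org.apache.spark.sql.SQLContext

// Toy stand-in for A_DF (fabricated values; the real DataFrame comes from elsewhere)
val sqlContext = new SQLContext(sc)
import sqlContext.implicits._

val A_DF = Seq(
  ("id1", "30439831,30447866"),
  ("id2", "37597858,34499875"),
  ("id3", "30447866,32896718")
).toDF("id", "interactions")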
Here, interactions is a String. I want to explode the DataFrame by first splitting each interactions string at the commas into an array of trimmed substrings, which I tried as follows:
import org.apache.spark.sql.functions.udf

// UDF that splits a comma-separated string into trimmed substrings
val splitArr = udf { (s: String) => s.split(",").map(_.trim) }
val B_DF = A_DF.explode(splitArr($"interactions"))
but I am getting the following error:
error: missing arguments for method explode in class DataFrame;
follow this method with `_' if you want to treat it as a partially applied function
       A_DF.explode(splitArr($"interactions"))
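I don't understand that message. If I am reading the Spark 1.5 scaladoc correctly, DataFrame.explode is declared with a second, curried parameter list, roughly like this (paraphrased, so I may have the details wrong):

def explode[A <: Product : TypeTag](input: Column*)(f: Row => TraversableOnce[A]): DataFrame

so apparently it also wants a Row => TraversableOnce function, not just the column.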
So I tried to supply such a function explicitly, with something even more complicated:
val B_DF = A_DF.explode($"interactions") { case Row(interactions: String) =>
  interactions.split(",").map(_.trim)
}
for which I get an IDE inspection warning that reads:
Expression of Type Array[String] does not conform to expected type TraversableOnce[A_]
Any ideas?
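For what it's worth, the following select-based variant, which feeds the array produced by my UDF to the explode function from org.apache.spark.sql.functions, seems to compile for me, but I would still like to understand why the DataFrame.explode calls above fail:

import org.apache.spark.sql.functions.explode

// Workaround attempt: build an array column with the UDF, then expand it
// row-wise with the explode() generator from org.apache.spark.sql.functions.
val B_DF = A_DF.select(
  $"id",
  explode(splitArr($"interactions")).as("interaction")
)

I also suspect, given the Product bound in the signature above, that the function passed to DataFrame.explode has to return a collection of tuples (e.g. via Tuple1(_)) rather than plain Strings, but I am not sure.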