My DataFrame looks like this:

StationID | Extlist | Situation
5         | 3,2     | Situation_1

The column types are strings. I would like to split each "x,y" value in Extlist into separate rows, like this:

StationID | Extlist | Situation
5         | 3       | Situation_1
5         | 2       | Situation_1

Thanks in advance.

SimbaPK

1 Answer

You can split the string on the comma and then explode the resulting array into one row per element, as follows:

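// toDF and the $"..." column syntax need spark.implicits._,
// which spark-shell imports automatically.
import org.apache.spark.sql.functions._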

val df = Seq(
  (5, "3,2", "Situation_1"),
  (6, "4,3,2", "Situation_2")
).toDF("StationID", "Extlist", "Situation")

// split turns "3,2" into an array; explode emits one row per array element.
df.withColumn("Extlist", explode(split($"Extlist", ","))).
  show()
// +---------+-------+-----------+
// |StationID|Extlist|  Situation|
// +---------+-------+-----------+
// |        5|      3|Situation_1|
// |        5|      2|Situation_1|
// |        6|      4|Situation_2|
// |        6|      3|Situation_2|
// |        6|      2|Situation_2|
// +---------+-------+-----------+
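
If the lists might contain spaces after the commas (for example "3, 2"), or if you want the exploded values as integers rather than strings, a variation along these lines may help. This is only a sketch: the sample row with a space, the local SparkSession settings, and the names `raw` and `result` are assumptions for illustration and are not part of the question. The second argument of split is a regular expression, so it can absorb the surrounding whitespace, and the cast is done in a separate step because Spark does not allow a generator such as explode to be nested inside another expression.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, explode, split, trim}

// Assumption: a local session for a standalone run; spark-shell already provides `spark`.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("explode-sketch")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample data with a space after the comma.
val raw = Seq(
  (5, "3, 2", "Situation_1")
).toDF("StationID", "Extlist", "Situation")

val result = raw
  // Trim the whole string, then split on a comma with optional whitespace around it.
  .withColumn("Extlist", explode(split(trim(col("Extlist")), "\\s*,\\s*")))
  // Cast in a separate step: explode cannot be wrapped in another expression.
  .withColumn("Extlist", col("Extlist").cast("int"))

result.printSchema()
result.show()
```

A value that cannot be parsed as an integer becomes null after the cast (with Spark's default, non-ANSI settings), so it may be worth filtering or checking for nulls afterwards.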
Leo C