4

How can I explode multiple columns pairs into multiple rows?

I have a dataframe with the following

client, type, address,    type_2,   address_2
abc,    home, 123 Street, business, 456 Street

I want to have a final dataframe with the follow

client, type, address
abc, home, 123 Street
abc, business, 456 Street

I tried using this code below but it return me 4 records instead of the two records I want

df .withColumn("type", explode(array("type", "type_2"))) .withColumn("address", explode(array("address", "address_2")))

I can do this with two separate dataframe and perform an union but I wanted to see if there was another way I can do it within a single dataframe

Thanks

tartancub
  • 51
  • 3

2 Answers2

5

you can do it using structs:

df
  .withColumn("str",explode(
    array(
      struct($"type",$"address"),
      struct($"type_2".as("type"),$"address_2".as("address"))))
  )
  .select($"client",$"str.*")
  .show()

gives

+------+--------+----------+
|client|    type|   address|
+------+--------+----------+
|   abc|    home|123 Street|
|   abc|business|456 Street|
+------+--------+----------+
Raphael Roth
  • 26,751
  • 15
  • 88
  • 145
0

Here is technique I use for complicated transformations - map records on the dataframe and use scala to apply transformation of any complexity.

Here I am hard coding creation of 2 rows, however any logic can be put here to explode rows as needed. I used flatmap to split array of rows into rows.

    val df = spark.createDataFrame(Seq(("abc","home","123 Street","business","456 Street"))).toDF("client", "type", "address","type_2","address_2")

    df.map{ r =>
      Seq((r.getAs[String]("client"),r.getAs[String]("type"),r.getAs[String]("address")),
         (r.getAs[String]("client"),r.getAs[String]("type_2"),r.getAs[String]("address_2")))
    }.flatMap(identity(_)).toDF("client", "type", "address").show(false)

Result

+------+--------+----------+
|client|type    |address   |
+------+--------+----------+
|abc   |home    |123 Street|
|abc   |business|456 Street|
+------+--------+----------+
Salim
  • 2,046
  • 12
  • 13
  • How can this working code be downvoted? No comment left either. I wonder why should I answer – Salim Jan 23 '20 at 13:28
  • Thank you for your answer I wasn't the one to downvote but I gave you an upvote just now. I like the other answer more as the code looks cleaner to me – tartancub Jan 23 '20 at 15:17
  • Thanks @tartancub. This tip will help when you have complex operation. IMHO this approach is readable too as it is standard Scala code. You can make a function and just call that too. – Salim Jan 23 '20 at 16:35