
I have a column with arrays in it:

"subscriberPhoneNbrs" : [
    {
        "phoneType" : "HOM",
        "phoneNbr" : "9045682704"
    },
    {
        "phoneType" : "WRK",
        "phoneNbr" : "9045749004"
    }
]

I want to split the array into separate columns, one per phone type, as shown below:

"subHomePhone" : "9045682704",
"subWorkPhone" : "9045749004"

I tried using the explode function, but I am not getting the expected result.

Krzysztof Atłasik
Ga999
  • Possible duplicate of [Querying Spark SQL DataFrame with complex types](https://stackoverflow.com/questions/28332494/querying-spark-sql-dataframe-with-complex-types) – user10938362 May 21 '19 at 19:29

1 Answer


You can turn the array into a map keyed by phoneType and then generate the list of columns to select from that map:

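import org.apache.spark.sql.functions.{col, map_from_entries}
import spark.implicits._ // assumes a SparkSession named spark, as in spark-shell

case class Phone(phoneType: String, phoneNbr: String)

val df = List((0, List(Phone("HOM", "1234"), Phone("WRK", "5678")))).toDF("id", "subscriberPhoneNbrs")
df.show(false)

// Each struct becomes one map entry: phoneType is the key, phoneNbr is the value.
val dfMap = df.select(map_from_entries($"subscriberPhoneNbrs") as "phoneMap")

// Map each phone type to the name of the output column.
val renameMap = Map("WRK" -> "subWorkPhone", "HOM" -> "subHomePhone")
val newCols = renameMap.map { case (phoneType, name) => col(s"phoneMap.$phoneType").alias(name) }.toList

dfMap.select(newCols: _*).show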

This produces the following output:

+------------+------------+
|subWorkPhone|subHomePhone|
+------------+------------+
|        5678|        1234|
+------------+------------+

From the map_from_entries documentation (the function is available since Spark 2.4):

static Column map_from_entries(Column e)
Returns a map created from the given array of entries.
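
To make the two steps easier to picture, here is a minimal sketch of the same idea using ordinary Scala collections, with no Spark involved. The object PhoneColumns and its methods toPhoneMap and rename are hypothetical names introduced only for this illustration; they are not part of Spark. A missing phone type yields None here, which roughly corresponds to the null you would typically see in the Spark result. How Spark treats duplicate keys depends on its version and configuration, whereas this sketch simply lets the last entry win.

```scala
// Plain-Scala analogue of the Spark steps above (illustration only).
final case class Phone(phoneType: String, phoneNbr: String)

object PhoneColumns {
  // Hypothetical helper: like map_from_entries, turns (key, value) entries into a map.
  def toPhoneMap(phones: Seq[Phone]): Map[String, String] =
    phones.map(p => p.phoneType -> p.phoneNbr).toMap

  // Hypothetical helper: looks up each phone type and stores it under the new column name.
  def rename(
      phoneMap: Map[String, String],
      renameMap: Map[String, String]
  ): Map[String, Option[String]] =
    renameMap.map { case (phoneType, name) => name -> phoneMap.get(phoneType) }
}
```

For example, PhoneColumns.rename(PhoneColumns.toPhoneMap(Seq(Phone("HOM", "1234"), Phone("WRK", "5678"))), Map("WRK" -> "subWorkPhone", "HOM" -> "subHomePhone")) pairs subWorkPhone with 5678 and subHomePhone with 1234, matching the table above.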
Michel Lemay