I have a DataFrame in the following format, where the types column holds a JSON string:


id     types
---   -------
1     {"BMW":"10000","Skoda":"12345"}
2     {"Honda":"90000","BMW":"11000","Benz":"56000"}

I need to create a new DataFrame with one row per key/value pair, like this:

id   types     value
--- ------   -------
1    BMW      10000
1    Skoda    12345
2    Honda    90000
2    BMW      11000
2    Benz     56000
notNull
    Possible duplicate of [How to query JSON data column using Spark DataFrames?](https://stackoverflow.com/questions/34069282/how-to-query-json-data-column-using-spark-dataframes) – user10938362 May 20 '20 at 21:40

1 Answer

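Use from_json with MapType to parse the JSON string into a map, then explode the map. Exploding a map column produces one row per entry, with two columns named key and value.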

Example:

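import org.apache.spark.sql.types._
import org.apache.spark.sql.functions._
// Parse the JSON string into a map<string,string>, then explode it into key/value rows.
// In Scala, StringType is an object (no parentheses) and explode takes a Column.
df.withColumn("jsn", from_json(col("types"), MapType(StringType, StringType))).
select(col("id"), explode(col("jsn"))).
show()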
//+---+-----+-----+
//| id|  key|value|
//+---+-----+-----+
//|  1|  BMW|10000|
//|  1|Skoda|12345|
//|  2|Honda|90000|
//|  2|  BMW|11000|
//|  2| Benz|56000|
//+---+-----+-----+
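The exploded columns above come out as key and value, while the question asks for types and value. One way to get those names is to alias the exploded columns directly. The following is a minimal sketch, assuming a Spark version where from_json accepts a MapType schema and a local SparkSession; the object name ExplodeJsonMap and the helper explodeTypes are hypothetical names used only for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, explode, from_json}
import org.apache.spark.sql.types.{MapType, StringType}

// Hypothetical wrapper object, used only for this illustration.
object ExplodeJsonMap {

  // Parses the JSON string in "types" as map<string,string> and emits one
  // row per map entry, naming the two exploded columns "types" and "value".
  def explodeTypes(df: DataFrame): DataFrame =
    df.select(
      col("id"),
      explode(from_json(col("types"), MapType(StringType, StringType)))
        .as(Seq("types", "value"))
    )

  def main(args: Array[String]): Unit = {
    // Assumption: a local session is enough for trying this out.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("explode-json-map")
      .getOrCreate()
    import spark.implicits._

    // Sample rows copied from the question.
    val df = Seq(
      (1, """{"BMW":"10000","Skoda":"12345"}"""),
      (2, """{"Honda":"90000","BMW":"11000","Benz":"56000"}""")
    ).toDF("id", "types")

    explodeTypes(df).show()
    spark.stop()
  }
}
```

Note that the values stay strings here; if numeric values are needed, casting the value column (for example with col("value").cast("int")) after the explode should work.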