The schema of the data read from HDFS is:
root
|-- id: string
|-- ext_json: string
and the data in ext_json looks like this:
[{'a':'1','b':'2'},{'a':'3','b':'4'}]
Now I need to convert the data so that its schema is as follows:
root
|-- id: string
|-- ext_json: array
|    |-- element: struct
|    |    |-- a: string
|    |    |-- b: string
How can I do that?
The Spark version is 2.0.1.
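
To make the desired conversion concrete, here is a minimal sketch of one possible approach, assuming PySpark is being used (the question does not say which language binding is in play). As far as I know, the built-in `from_json` function only arrived in Spark 2.1, so this sketch falls back on a UDF. The names `parse_ext_json` and `convert` are hypothetical helpers introduced for illustration. Because the sample value uses single quotes, it is not strict JSON, so the sketch parses it with `ast.literal_eval` rather than `json.loads`.

```python
import ast


def parse_ext_json(text):
    """Parse a string like "[{'a':'1','b':'2'}]" into a list of (a, b) tuples.

    Hypothetical helper: each tuple corresponds to one struct element
    with fields a and b. Missing keys become None.
    """
    if text is None:
        return None
    # The sample data is single-quoted, so treat it as a Python literal.
    records = ast.literal_eval(text)
    return [(record.get("a"), record.get("b")) for record in records]


def convert(df):
    """Replace the string column ext_json with array<struct<a:string,b:string>>.

    Assumes df is a PySpark DataFrame with a string column named ext_json.
    PySpark is imported here so the parser above can be used without it.
    """
    from pyspark.sql.functions import udf
    from pyspark.sql.types import ArrayType, StringType, StructField, StructType

    element = StructType([
        StructField("a", StringType()),
        StructField("b", StringType()),
    ])
    parse_udf = udf(parse_ext_json, ArrayType(element))
    # Using the same column name overwrites the original string column.
    return df.withColumn("ext_json", parse_udf(df["ext_json"]))
```

Calling `convert(df).printSchema()` on the DataFrame described above should then show ext_json as an array of structs. A malformed string would raise inside the UDF, so real data may need a try/except around the parsing step.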