I want to create a new DataFrame from an existing DataFrame in PySpark. The DataFrame "df" contains a column named "data" whose rows each hold a dictionary, although the column's type in the schema is string. The keys of each dictionary are not fixed. For example, "name" and "address" are the keys of the first row's dictionary, but other rows may have different keys. The following is an example:
........................................................
data
........................................................
{"name": "sam", "address": "uk"}
........................................................
{"name": "jack", "address": "aus", "occupation": "job"}
........................................................
How do I convert this into a DataFrame with individual columns, like the following?
name    address    occupation
sam     uk
jack    aus        job
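
To pin down the result I am after, here is the same transformation sketched in plain Python on the two sample rows. This is only an illustration of the desired output, not a PySpark solution; the variable names (rows, parsed, columns, records) are made up for this sketch, and I am assuming every value in "data" is a valid JSON object string.

```python
import json

# The two sample rows from the "data" column, stored as JSON strings.
rows = [
    '{"name": "sam", "address": "uk"}',
    '{"name": "jack", "address": "aus", "occupation": "job"}',
]

# Parse each string into a dictionary.
parsed = [json.loads(r) for r in rows]

# Collect the union of all keys, keeping first-seen order.
columns = []
for d in parsed:
    for key in d:
        if key not in columns:
            columns.append(key)

# Build one record per row; a key that a row lacks becomes None.
records = [[d.get(c) for c in columns] for d in parsed]

print(columns)
for rec in records:
    print(rec)
```

In other words, the columns should be the union of the keys across all rows, and a row that lacks a key (like "occupation" for sam) should get an empty/null value in that column.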