I have an RDD as below:
rdd_1 = ['"columns":["date","appname","appenv","appstate"]']
I want to convert it to a DataFrame with one row per column name, like this:
+--------+
| columns|
+--------+
|    date|
| appname|
|  appenv|
|appstate|
+--------+
What I tried: I created the schema below and used it to build the DataFrame, but that did not work:
from pyspark.sql.types import StructType, StructField, ArrayType, StringType

rdd_1_schema = StructType([
    StructField('columns', ArrayType(StringType()))
])
Calling toDF with that schema produces the error below:
rdd_1.toDF(schema=rdd_1_schema).show()
Error:
TypeError: StructType can not accept object '"columns": in type <type 'str'>
Second attempt: I tried using flatMap:
rdd_1.flatMap(lambda x: map(lambda e: (x[0], e), x[1])).toDF().show()
but each element is a plain string, so x[0] and x[1] are single characters rather than a key and a list. The output looks like this:
+---+---+
| _1| _2|
+---+---+
| ''| c|
+---+---+
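For reference, here is a minimal sketch of one way the string could be parsed before building the DataFrame. It assumes every RDD element is the body of a JSON object with the outer braces missing, which is what the sample looks like. The helper names parse_columns and columns_to_df are made up for illustration, and columns_to_df needs an active SparkSession so that toDF is available on the RDD.

```python
import json


def parse_columns(line):
    """Turn '"columns":["a","b"]' into ['a', 'b'].

    Assumption: the element is a JSON object body without its outer
    braces, so wrapping it in {} yields valid JSON.
    """
    return json.loads("{" + line + "}")["columns"]


def columns_to_df(rdd):
    """Hypothetical helper: one DataFrame row per column name.

    flatMap emits each name separately, and each name is wrapped in a
    one-element tuple so that toDF sees a single-column record.
    """
    return rdd.flatMap(parse_columns).map(lambda name: (name,)).toDF(["columns"])


# The parsing step can be checked without Spark:
sample = '"columns":["date","appname","appenv","appstate"]'
names = parse_columns(sample)
```

With a real RDD this would be called as columns_to_df(rdd_1).show(). Both attempts above operate on the raw string, so the parsing step has to come first. If the elements are not valid JSON fragments, the parsing function would need to change.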