
I have a DataFrame with a fixed-size array column, like this:

[v1, v2, v3, v4]

I need to convert the array to a json of the following structure:

{
    v1: {
        Min: v2,
        Max: v3,
        Count: v4
    }
}

While it is easy to achieve the inner structure, having `v1` as the name of the property is more challenging.

I tried `to_json`, but the keys are taken from the column names. In my case, `v1` changes in each row.

Is it possible to achieve this in PySpark without using a UDF? If it helps, I am running on top of Databricks.
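
For illustration, a minimal sketch of the attempt described above (assuming the DataFrame is `df` and the array column is named `arr`): the inner object falls out of `struct` with aliases, but `to_json` takes its keys from the struct field names, so the value of `arr[0]` cannot become the key this way:

```python
from pyspark.sql import functions as F

# The inner object is straightforward: the alias names become the JSON keys.
inner = F.struct(
    F.col("arr")[1].alias("Min"),
    F.col("arr")[2].alias("Max"),
    F.col("arr")[3].alias("Count"),
)

# Produces {"Min": v2, "Max": v3, "Count": v4} -- but there is no way to make
# the runtime *value* of arr[0] act as a field name of the struct itself.
df.select(F.to_json(inner).alias("json")).show(truncate=False)
```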

My current approach is inspired by [this answer](https://stackoverflow.com/a/36167904/180650), where I construct the JSON by concatenation. Hopefully, as time goes by, we will have better primitives for this. – Vitaliy Aug 14 '18 at 06:09
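
A rough sketch of that concatenation approach (again assuming the array column is named `arr`, and that the value in `arr[0]` contains no characters that would need JSON escaping):

```python
from pyspark.sql import functions as F

# Build the outer object by hand: '{"<arr[0]>": <inner json>}'.
df2 = df.withColumn(
    "json",
    F.concat(
        F.lit('{"'),
        F.col("arr")[0].cast("string"),
        F.lit('": '),
        F.to_json(
            F.struct(
                F.col("arr")[1].alias("Min"),
                F.col("arr")[2].alias("Max"),
                F.col("arr")[3].alias("Count"),
            )
        ),
        F.lit("}"),
    ),
)
```

On Spark 2.4 and later, `to_json` also accepts map columns, so `F.to_json(F.create_map(F.col("arr")[0], inner))` with the same inner struct should produce the same shape without the string stitching.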
