
I have a DataFrame with a fixed-size array column, like this:

[v1, v2, v3, v4]

I need to convert the array to a json of the following structure:

{
    v1: {
        Min: v2,
        Max: v3,
        Count: v4
    }
}

While it is easy to achieve the inner structure, having `v1` as the name of the property is more challenging.

I tried `to_json`, but the keys are taken from the column names. In my case, `v1` changes in each row.

Is it possible to achieve this in PySpark without using a UDF? If it helps, I am running on top of Databricks.
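
For illustration, a minimal sketch of the attempt described above (assuming the DataFrame is `df` and the array column is named `arr`): the inner object falls out of `struct` with aliases, but `to_json` takes its keys from the struct field names, so the value of `arr[0]` cannot become the key this way:

```python
from pyspark.sql import functions as F

# The inner object is straightforward: the alias names become the JSON keys.
inner = F.struct(
    F.col("arr")[1].alias("Min"),
    F.col("arr")[2].alias("Max"),
    F.col("arr")[3].alias("Count"),
)

# Produces {"Min": v2, "Max": v3, "Count": v4} -- but there is no way to make
# the runtime *value* of arr[0] act as a field name of the struct itself.
df.select(F.to_json(inner).alias("json")).show(truncate=False)
```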

My current approach is inspired by [this answer](https://stackoverflow.com/a/36167904/180650), where I construct the JSON by concatenation. Hopefully, as time goes by, we will have better primitives for this. – Vitaliy Aug 14 '18 at 06:09
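
A rough sketch of that concatenation approach (again assuming the array column is named `arr`, and that the value in `arr[0]` contains no characters that would need JSON escaping):

```python
from pyspark.sql import functions as F

# Build the outer object by hand: '{"<arr[0]>": <inner json>}'.
df2 = df.withColumn(
    "json",
    F.concat(
        F.lit('{"'),
        F.col("arr")[0].cast("string"),
        F.lit('": '),
        F.to_json(
            F.struct(
                F.col("arr")[1].alias("Min"),
                F.col("arr")[2].alias("Max"),
                F.col("arr")[3].alias("Count"),
            )
        ),
        F.lit("}"),
    ),
)
```

On Spark 2.4 and later, `to_json` also accepts map columns, so `F.to_json(F.create_map(F.col("arr")[0], inner))` with the same inner struct should produce the same shape without the string stitching.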
