I have this dataframe sales_df:

id  year    month   total_sales
0   2020    1       200
1   2019    12      866474119
1   2019    10      555
1   2019    11      13073203
1   2020    2       5255259695
1   2020    1       13622027370

From this, I want to make a dictionary, as follows:

[
  {
    "2020": {
      "1": "200"
    },
    "id": "0"
  },
  {
    "2019": {
      "10": "555",
      "11": "13073203",
      "12": "866474119"
    },
    "2020": {
      "1": "13553473101",
      "2": "6000"
    },
    "id": "1"
  }
]

I can achieve the output I want by converting the DataFrame to pandas. How can I achieve the same result without converting?

moys
siva
  • I think even in PySpark you're going to have to use collect() to bring the rows to the driver node, and then use asDict() on your list of rows. Doing it the pandas way might be your best bet. I could be wrong. – murtihash Feb 13 '20 at 06:24
  • You can refer to: https://stackoverflow.com/questions/19798112/convert-pandas-dataframe-to-a-nested-dict – Prabhanj Feb 13 '20 at 07:08
  • d = {k: recur_dictify(g.ix[:,1:]) for k,g in grouped} ^ I am getting SyntaxError: invalid syntax – siva Feb 13 '20 at 07:37
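
The collect()/asDict() idea from the first comment could look roughly like the sketch below. It is plain Python, so it runs without Spark or pandas; the helper name rows_to_nested is made up for illustration, and the sample rows are copied from the table in the question. With a real PySpark DataFrame, the assumption is that you would pass something like (row.asDict() for row in sales_df.collect()) instead of the hard-coded list, which only makes sense if the data fits in driver memory.

```python
def rows_to_nested(rows):
    """Group records with id, year, month, total_sales into one dict per id.

    `rows` is any iterable of dict-like records. This helper is hypothetical,
    not part of PySpark or pandas.
    """
    by_id = {}
    for row in rows:
        years = by_id.setdefault(row["id"], {})
        months = years.setdefault(str(row["year"]), {})
        # If the same (id, year, month) appears twice, the last row wins.
        months[str(row["month"])] = str(row["total_sales"])

    result = []
    for id_, years in sorted(by_id.items()):
        record = {}
        for year, months in sorted(years.items()):
            # Sort months numerically so "2" comes before "10".
            record[year] = dict(sorted(months.items(), key=lambda kv: int(kv[0])))
        record["id"] = str(id_)
        result.append(record)
    return result


# Sample rows copied from the question's table.
# Assumption: with PySpark these would come from
# (row.asDict() for row in sales_df.collect()).
rows = [
    {"id": 0, "year": 2020, "month": 1, "total_sales": 200},
    {"id": 1, "year": 2019, "month": 12, "total_sales": 866474119},
    {"id": 1, "year": 2019, "month": 10, "total_sales": 555},
    {"id": 1, "year": 2019, "month": 11, "total_sales": 13073203},
    {"id": 1, "year": 2020, "month": 2, "total_sales": 5255259695},
    {"id": 1, "year": 2020, "month": 1, "total_sales": 13622027370},
]

nested = rows_to_nested(rows)
print(nested)
```

Note that for id 1 this yields the 2020 values from the table (13622027370 and 5255259695), which are not the same numbers as in the desired output shown in the question.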

0 Answers