1

I am fairly new to this and have spent the entire day reading numerous posts and figuring out how i can convert this flattened excel table to a nested json. Here is an example of the flattened nested table:

    {'Sample': {0: '1A',
  1: '1A',
  2: '1A',
  3: '1A',
  4: '1A',
  5: '1A',
  6: '1A',
  7: '2A',
  8: '2A',
  9: '2A',
  10: '2A',
  11: '2A',
  12: '2A',
  13: '2A'},
 'Substance category': {0: 'Additive',
  1: 'Additive',
  2: 'Alkali',
  3: 'Alkali',
  4: 'Alkali',
  5: 'Alkali',
  6: 'Alkali',
  7: 'Additive',
  8: 'Additive',
  9: 'Alkali',
  10: 'Alkali',
  11: 'Alkali',
  12: 'Alkali',
  13: 'Alkali'},
 'Substance': {0: 'Irgafos 168',
  1: 'Alkylphenylphosphate',
  2: 'Calcium',
  3: 'Kalium',
  4: 'Lithium',
  5: 'Magnesium',
  6: 'Natrium',
  7: 'Irgafos 168',
  8: 'Alkylphenylphosphate',
  9: 'Calcium',
  10: 'Kalium',
  11: 'Lithium',
  12: 'Magnesium',
  13: 'Natrium'},
 'Value': {0: 0,
  1: 0,
  2: 2,
  3: 2,
  4: 1,
  5: 2,
  6: 3,
  7: 2,
  8: 3,
  9: 2,
  10: 3,
  11: 1,
  12: 2,
  13: 3}}

This table looks like this Sample table

I used the following code to get a nested json, which was taken from this answer.

j = (df.groupby(['Sample','Substance category'])
       .apply(lambda x: x[['Substance','Value']].to_dict('records'))
       .reset_index()
       .rename(columns={0:'Substance'})
       .to_json(orient='records'))

I am getting the following json.

[
  {
    "Sample": "1A",
    "Substance": [
      {
        "Substance": "Irgafos 168",
        "Value": 0
      },
      {
        "Substance": "Alkylphenylphosphate",
        "Value": 0
      }
    ],
    "Substance category": "Additive"
  },
  {
    "Sample": "1A",
    "Substance": [
      {
        "Substance": "Calcium",
        "Value": 2
      },
      {
        "Substance": "Kalium",
        "Value": 2
      },
      {
        "Substance": "Lithium",
        "Value": 1
      },
      {
        "Substance": "Magnesium",
        "Value": 2
      },
      {
        "Substance": "Natrium",
        "Value": 3
      }
    ],
    "Substance category": "Alkali"
  },
  {
    "Sample": "2A",
    "Substance": [
      {
        "Substance": "Irgafos 168",
        "Value": 2
      },
      {
        "Substance": "Alkylphenylphosphate",
        "Value": 3
      }
    ],
    "Substance category": "Additive"
  },
  {
    "Sample": "2A",
    "Substance": [
      {
        "Substance": "Calcium",
        "Value": 2
      },
      {
        "Substance": "Kalium",
        "Value": 3
      },
      {
        "Substance": "Lithium",
        "Value": 1
      },
      {
        "Substance": "Magnesium",
        "Value": 2
      },
      {
        "Substance": "Natrium",
        "Value": 3
      }
    ],
    "Substance category": "Alkali"
  }
]

However what I actually want is to define an addition level for the 'Substance category'. Despite all my efforts, I just could not figure that out and none of the answers could help me.

Thank you very much in advance.

Emma
  • 8,518
  • 1
  • 18
  • 35
dannyboy
  • 11
  • 4
  • It's hard to tell what you are asking for. Please try rephrase and formulate your question, to make it more clear what you are asking for. – MrWorldWide Jul 01 '22 at 14:13
  • Please add an expected output. – Emma Jul 01 '22 at 15:10
  • +1 for work put into the question. The table has only 1 level (header), so what are you looking for please ? – D.L Jul 01 '22 at 15:28
  • Please provide enough code so others can better understand or reproduce the problem. – Community Jul 01 '22 at 18:41
  • As I am fairly new to this, I could not really paste images and file, otherwise I would just shared the original file to help everyone understand the Q. Sorry about that. – dannyboy Jul 04 '22 at 12:36

2 Answers2

0

This would be my process:

  • convert the dict to a dataframe.
  • writing to 'json' from a dataframe can be done with to_json()

so the code looks like this:

#%%
import pandas as pd

d = {'Sample': {0: '1A',
  1: '1A',
  2: '1A',
  3: '1A',
  4: '1A',
  5: '1A',
  6: '1A',
  7: '2A',
  8: '2A',
  9: '2A',
  10: '2A',
  11: '2A',
  12: '2A',
  13: '2A'},
 'Substance category': {0: 'Additive',
  1: 'Additive',
  2: 'Alkali',
  3: 'Alkali',
  4: 'Alkali',
  5: 'Alkali',
  6: 'Alkali',
  7: 'Additive',
  8: 'Additive',
  9: 'Alkali',
  10: 'Alkali',
  11: 'Alkali',
  12: 'Alkali',
  13: 'Alkali'},
 'Substance': {0: 'Irgafos 168',
  1: 'Alkylphenylphosphate',
  2: 'Calcium',
  3: 'Kalium',
  4: 'Lithium',
  5: 'Magnesium',
  6: 'Natrium',
  7: 'Irgafos 168',
  8: 'Alkylphenylphosphate',
  9: 'Calcium',
  10: 'Kalium',
  11: 'Lithium',
  12: 'Magnesium',
  13: 'Natrium'},
 'Value': {0: 0,
  1: 0,
  2: 2,
  3: 2,
  4: 1,
  5: 2,
  6: 3,
  7: 2,
  8: 3,
  9: 2,
  10: 3,
  11: 1,
  12: 2,
  13: 3}}


# make dataframe
df = pd.DataFrame(d)

# %%  send to excel
json_path = "C:\\test\\test.json"
df.to_json(json_path)

The dataframe (before the json) looks like this:

enter image description here

You can manipulate the dataframe as you wish from here.

Are you asking to create a multilevel dataframe ? if so, then the final part is answered here:

How to create a multilevel dataframe in pandas?

D.L
  • 4,339
  • 5
  • 22
  • 45
  • Well creating a multilevel df was not a problem. But when I exported that to a json, it did not maintain the nested structure of the indexes. Anyway, I finally found an answer here. It was just a matter to searching on google with the right keywords [link](https://stackoverflow.com/questions/52923685/convert-pandas-multiindex-series-to-json-python) @D.L Thank you for your time and comment anyway – dannyboy Jul 04 '22 at 12:37
  • that is a good find. Glad that the problem is solved. – D.L Jul 04 '22 at 14:11
0

Well creating a multilevel df was not a problem. But when I exported that to a json, it did not maintain the nested structure of the indexes. Anyway, I finally found an answer here. It was just a matter to searching on google with the right keywords link

dannyboy
  • 11
  • 4
  • Your answer could be improved with additional supporting information. Please [edit] to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers [in the help center](/help/how-to-answer). – Community Jul 06 '22 at 06:17