0

Want to convert Sample JSON data into CSV file using python. I am retrieving JSON data from API. As my JSON has nested objects, so it normally cannot be directly converted to CSV.I don't want to do any hard coding and I want to make a python code fully dynamic.

So, I have written a function that flatten my JSON Data but I am not able to work out how to iterate all records, finding relevant column names and then output those data into CSV.

In the Sample JSON file I have mentioned only 2 records but in actual there are 100 records. Sample JSON Look like this:

[
   {
      "id":"Random_Company_57",
      "unid":"75",
      "fieldsToValues":{
         "Email":"None",
         "occupation":"SO1 Change",
         "manager":"None",
         "First Name":"Bells",
         "employeeID":"21011.0",
         "loginRequired":"true",
         "superUser":"false",
         "ldapSuperUser":"false",
         "archived":"true",
         "password":"None",
         "externalUser":"false",
         "Username":"Random_Company_57",
         "affiliation":"",
         "Phone":"+16 22 22 222",
         "unidDominoKey":"",
         "externalUserActive":"false",
         "secondaryOccupation":"SO1 Change",
         "retypePassword":"None",
         "Last Name":"Christmas"
      },
      "hierarchyFieldAccess":[
         
      ],
      "userHierarchies":[
         {
            "hierarchyField":"Company",
            "value":"ABC Company"
         },
         {
            "hierarchyField":"Department",
            "value":"gfds"
         },
         {
            "hierarchyField":"Project",
            "value":"JKL-SDFGHJW"
         },
         {
            "hierarchyField":"Division",
            "value":"Silver RC"
         },
         {
            "hierarchyField":"Site",
            "value":"SQ06"
         }
      ],
      "locale":{
         "id":1,
         "dateFormat":"dd/MM/yyyy",
         "languageTag":"en-UA"
      },
      "roles":[
         "User"
      ],
      "readAccessRoles":[
         
      ],
      "preferredLanguage":"en-AU",
      "prefName":"Christmas Bells",
      "startDate":"None",
      "firstName":"Bells",
      "lastName":"Christmas",
      "fullName":"Christmas Bells",
      "lastModified":"2022-02-22T03:47:41.632Z",
      "email":"None",
      "docNo":"None",
      "virtualSuperUser":false
   },
   {
      "id":"xyz.abc@safe.net",
      "unid":"98",
      "fieldsToValues":{
         "Email":"xyz.abc@safe.net",
         "occupation":"SO1 Change",
         "manager":"None",
         "First Name":"Bells",
         "employeeID":"21011.0",
         "loginRequired":"false",
         "superUser":"false",
         "ldapSuperUser":"false",
         "archived":"false",
         "password":"None",
         "externalUser":"false",
         "Username":"xyz.abc@safe.net",
         "affiliation":"",
         "Phone":"+16 2222 222 222",
         "unidDominoKey":"",
         "externalUserActive":"false",
         "secondaryOccupation":"SO1 Change",
         "retypePassword":"None",
         "Last Name":"Christmas"
      },
      "hierarchyFieldAccess":[
         
      ],
      "userHierarchies":[
         {
            "hierarchyField":"Company",
            "value":"ABC Company"
         },
         {
            "hierarchyField":"Department",
            "value":"PUHJ"
         },
         {
            "hierarchyField":"Project",
            "value":"RPOJ-SDFGHJW"
         },
         {
            "hierarchyField":"Division",
            "value":"Silver RC"
         },
         {
            "hierarchyField":"Site",
            "value":"SQ06"
         }
      ],
      "locale":{
         "id":1,
         "dateFormat":"dd/MM/yyyy",
         "languageTag":"en-UA"
      },
      "roles":[
         "User"
      ],
      "readAccessRoles":[
         
      ],
      "preferredLanguage":"en-AU",
      "prefName":"Christmas Bells",
      "startDate":"None",
      "firstName":"Bells",
      "lastName":"Christmas",
      "fullName":"Christmas Bells",
      "lastModified":"2022-03-16T05:04:13.085Z",
      "email":"xyz.abc@safe.net",
      "docNo":"None",
      "virtualSuperUser":false
   }
]

What I have tried.

def flattenjson(b, delim):
    val = {}
    for i in b.keys():
        if isinstance(b[i], dict):
            get = flattenjson(b[i], delim)
            for j in get.keys():
                val[i + delim + j] = get[j]
        else:
            val[i] = b[i]
    print(val)        
    return val
    
json=[{Sample JSON String that mentioned above}]
flattenjson(json,"__")

I don't know it is a right way to deal this problem or not? My final aim is that all the above json data will output in a csv file.

1 Answers1

0

Based on this answer, you could loop through your list of json data and flatten each json with the given function (they always have the same structure?), then build a DataFrame and write the data to csv. That's the easiest way I can think of, try this:

import pandas as pd
import json
import collections

def flatten(dictionary, parent_key=False, separator='__'):

    items = []
    for key, value in dictionary.items():
        new_key = str(parent_key) + separator + key if parent_key else key
        if isinstance(value, collections.MutableMapping):
            items.extend(flatten(value, new_key, separator).items())
        elif isinstance(value, list):
            for k, v in enumerate(value):
                items.extend(flatten({str(k): v}, new_key).items())
        else:
            items.append((new_key, value))
    return dict(items)
    
with open('your_json.json') as f:
    data = json.load(f) # data is a the example you provided (list of dicts)
    
all_records=[]
for jsn in data:
    tmp = flatten(jsn)
    all_records.append(tmp)
    
df = pd.DataFrame(all_records)
out = df.to_csv('json_to_csv.csv')
Rabinzel
  • 7,757
  • 3
  • 10
  • 30