0

I have a very large JSON file with multiple individual JSON objects in the format shown below. I am trying to convert it to a CSV so that each row is a combination of the outer id/name/alphabet in a JSON object and 1 set of conversion: id/name/alphabet. This is repeated for all the sets of id/name/alphabet within an individual JSON object. So from the object below, 2 rows should be created where the first row is (outer) id/name/alphabet and 1st id/name/alphabet of conversion. The second row is again (outer) id/name/alphabet and now the 2nd id/name/alphabet of conversion.

Important note is that certain Objects in the file can have upwards of 50/60 conversion id/name/alphabet pairs.

What I tried so far was to flatten the JSON objects first which resulted in keys like conversion_id_0 and conversion_id_1 etc... so I can map the outer as its always constant but I am unsure how to map each corresponding number set to be a seperate row.

Any help or insight would be greatly appreciated!

[
    {
        "alphabet": "ABCDEFGHIJKL",
        "conversion": [
            {
                "alphabet": "BCDEFGHIJKL",
                "id": 18589260,
                "name": [
                    "yy"
                ]
            },
            {
                "alphabet": "EFGHIJEFGHIJ",
                "id": 18056632,
                "name": [
                    "zx",
                    "cd"
                ]
            }
        ],
        "id": 23929934,
        "name": [
            "x",
            "y"
        ]
    }
]
martineau
  • 119,623
  • 25
  • 170
  • 301
lucas305
  • 15
  • 2
  • 4
  • What you tried ? community is not for solving the task, is for help when you get stuck or run out of "fuel" – n1tk Sep 26 '16 at 23:56
  • 1
    Although you have said in broad terms what you have tried so far, it would be better to post the code that you have so far and try to explain where it is breaking....Having said that, you can pop off the last character of a string in many ways, which you could use as row indices when writing the row. For example my_string[-1] – Albert Rothman Sep 26 '16 at 23:59
  • check this code: http://stackoverflow.com/a/1872081/2027457 – n1tk Sep 26 '16 at 23:59
  • Possible duplicate of [How can I convert JSON to CSV with Python?](http://stackoverflow.com/questions/1871524/how-can-i-convert-json-to-csv-with-python) – OneCricketeer Sep 27 '16 at 00:33
  • Your description of the JSON to CSV output is unclear. You say you want each row to consist of two pairs of "id/name/alphabet` groups. However the "name" value in your outer and inter data is a list of one or more values. If there's more than one, how are they to represented in a single row of the CSV? – martineau Sep 27 '16 at 12:20

1 Answers1

0

Your question is unclear about exactly the mapping from input JSON data to rows of the CSV file, so I had to guess on what should happen when there's more than one "name" associated with an inner or outer object.

Regardless, hopefully the following will give you a general idea of how to solve such problems.

import csv

objects = [
    {
        "alphabet": "ABCDEFGHIJKL",
        "id": 23929934,
        "name": [
            "x",
            "y"
        ],
        "conversion": [
            {
                "alphabet": "BCDEFGHIJKL",
                "id": 18589260,
                "name": [
                    "yy"
                ]
            },
            {
                "alphabet": "EFGHIJEFGHIJ",
                "id": 18056632,
                "name": [
                    "zx",
                    "cd"
                ]
            }
        ],
    }
]

with open('converted_json.csv', 'wb') as outfile:
    def group(item):
        return [item["id"], item["alphabet"], ' '.join(item["name"])]

    writer = csv.writer(outfile, quoting=csv.QUOTE_NONNUMERIC)
    for obj in objects:
        outer = group(obj)
        for conversion in obj["conversion"]:
            inner = group(conversion)
            writer.writerow(outer + inner)

Contents of the CSV file generated:

23929934,"ABCDEFGHIJKL","x y",18589260,"BCDEFGHIJKL","yy"
23929934,"ABCDEFGHIJKL","x y",18056632,"EFGHIJEFGHIJ","zx cd"
martineau
  • 119,623
  • 25
  • 170
  • 301
  • Thanks! This is exactly what I was looking for. I didn't think to keep the json structure as is and just access each object within "conversion". I was attempting to accomplish this after flattening the json and was running into some trouble. – lucas305 Sep 28 '16 at 00:29