I am making an API call that returns a JSON response. The response is huge and I don't need all of the information it contains, so I am parsing only the required key:value pairs
into a dictionary, which I then use to write a CSV file. Is this good practice? Or should I parse the JSON data directly to create the CSV file?
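A minimal sketch of what I mean (the payload and field names here are made up; in my real code the data comes from `requests.get(url).json()`):

```python
import csv
import io
import json

# Stand-in for the API response; in practice this comes from requests.get(url).json()
payload = json.loads(
    '{"id": 7, "name": "alpha", "stats": {"score": 9.5, "noise": [1, 2, 3]}}'
)

# Pull out only the fields I need, including one nested value
row = {
    "id": payload["id"],
    "name": payload["name"],
    "score": payload["stats"]["score"],
}

out = io.StringIO()  # stands in for a real CSV file handle
writer = csv.DictWriter(out, fieldnames=["id", "name", "score"])
writer.writeheader()
writer.writerow(row)
print(out.getvalue())
```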

-
How are you parsing the JSON without converting to a dictionary? Regex? Substring? – OneCricketeer Aug 22 '17 at 05:19
-
Although unusual, JSON allows duplicate keys in an object, e.g. `{"key": 1, "key": 2}`. A Python dictionary can only hold one value per key, so one of the values gets overwritten: parsing the previous example yields `{'key': 2}`, with the first pair lost. https://stackoverflow.com/questions/21832701/does-json-syntax-allow-duplicate-keys-in-an-object – Alexander Aug 22 '17 at 05:26
-
@cricket_007 I'm using the json() method and then using the keys to get the values into the dictionary. Somewhat like this: `r = requests.get(...)` `dict[key] = r.json()[key1][key2]` – f0rtyseven Aug 22 '17 at 05:53
-
One optimization you **can** do is only call `r.json()` once and store the result in a variable. Each time you call `r.json()` it may be parsing the JSON response again. – Soviut Aug 22 '17 at 06:02
-
@Soviut Thanks a lot. That never occurred to me. I'll implement that right away. – f0rtyseven Aug 22 '17 at 06:09
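To illustrate Soviut's point about parsing once: here is a rough sketch with a fake response object (a stand-in for `requests.Response`, which is assumed to re-parse the body on each `json()` call) that counts how many times the body gets parsed:

```python
import json

class FakeResponse:
    """Stand-in for requests.Response; counts how often the body is parsed."""
    def __init__(self, text):
        self.text = text
        self.parse_count = 0

    def json(self):
        self.parse_count += 1
        return json.loads(self.text)

# Calling r.json() per lookup parses the body every time
r = FakeResponse('{"key1": {"key2": "a", "key3": "b"}}')
_ = r.json()["key1"]["key2"]
_ = r.json()["key1"]["key3"]
print(r.parse_count)  # 2

# Parsing once and indexing into the resulting dict does the work once
r2 = FakeResponse('{"key1": {"key2": "a", "key3": "b"}}')
data = r2.json()
_ = data["key1"]["key2"]
_ = data["key1"]["key3"]
print(r2.parse_count)  # 1
```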
2 Answers
Like all things performance-related, don't bother optimizing until it becomes a problem. What you're doing is the normal, simple approach, so keep doing it until you hit a real bottleneck. A "huge" response is relative: to some, "huge" means several kilobytes, while others might consider several megabytes, or hundreds of megabytes, to be huge.
If you ever do hit a bottleneck, the first thing you should do is profile your code to see where the performance problems are actually occurring, and optimize only those parts. Don't guess; for all you know, the CSV writer could turn out to be the poor performer.
Remember, those JSON libraries have been around a long time, have strong test coverage, and have been battle-tested in the field by many developers. Any custom solution you write will have none of that.

-
Thanks for the suggestion. Actually, I was a bit doubtful about getting a dictionary from JSON, since both use key-value pairs; I was wondering if it is redundant. – f0rtyseven Aug 22 '17 at 05:56
-
JSON is a string, not an object/dict. It's a serialized representation of the data, while the dictionary is the real data structure. You pretty much *have* to parse the JSON into a dict, otherwise you're just dealing with a raw string. – Soviut Aug 22 '17 at 05:59
If you want to write only particular key:value pairs to a CSV file, it is better to convert the JSON into a Python dictionary containing just the selected key:value pairs and write that to the CSV file.
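A rough sketch of this approach, assuming the response is a list of records and using made-up field names:

```python
import csv
import io
import json

# Stand-in for a parsed API response containing more fields than we need
records = json.loads(
    '[{"id": 1, "city": "Oslo", "noise": "x"},'
    ' {"id": 2, "city": "Lima", "noise": "y"}]'
)

wanted_keys = ["id", "city"]  # the only columns we care about

# Build dictionaries holding just the selected key:value pairs
rows = [{k: item[k] for k in wanted_keys} for item in records]

out = io.StringIO()  # stands in for open("out.csv", "w", newline="")
writer = csv.DictWriter(out, fieldnames=wanted_keys)
writer.writeheader()
writer.writerows(rows)
print(out.getvalue())
```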

-
Thanks for the answer. I will keep developing this way then :) – f0rtyseven Aug 22 '17 at 05:58