I am running a loop with data coming in and writing the data to a json file. Here's what it looks like as a minimal, reproducible example.
import json
import random
import string

dc_master = {}
for i in range(100):
    # Below mimics an API call that returns new data.
    name = ''.join(random.choice(string.ascii_uppercase) for _ in range(15))
    dc_info = {}
    dc_info['Height'] = 'NA'

    dc_master[name] = dc_info
    with open("myfile.json", "w") as filehandle:
        filehandle.write(json.dumps(dc_master))
As you can see from the above, every time through the loop it creates a new dc_info, which becomes the value of the new key-value pair (with the name as the key) that gets written to the json file.
The one disadvantage of the above is that when it fails and I restart, I have to start again from the very beginning. Should I open the json file and read it into dc_master, add a name:dc_info pair to the dictionary, and then write dc_master back to the json file on every turn of the loop? Or should I just append to the json file, even if that creates duplicates, and rely on the fact that when I need the data I will load it back into a dictionary, which takes care of duplicates automatically?
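For the first option, this is roughly what I have in mind (just a sketch; fetch_info is a made-up stand-in for the real API call, and the name check is how I would skip entries that were already saved before the failure):

import json
import os
import random
import string

def fetch_info(name):
    # Stand-in for the real API call.
    return {'Height': 'NA'}

dc_master = {}
if os.path.exists("myfile.json"):
    with open("myfile.json") as filehandle:
        dc_master = json.load(filehandle)   # pick up whatever was saved before the failure

for i in range(100):
    name = ''.join(random.choice(string.ascii_uppercase) for _ in range(15))
    if name in dc_master:
        continue                            # already processed in a previous run, skip it
    dc_master[name] = fetch_info(name)
    with open("myfile.json", "w") as filehandle:
        json.dump(dc_master, filehandle)    # rewrite the whole file each iteration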
Additional information: there are occasionally timeouts, so I want to be able to restart somewhere in the middle if needed. The number of key-value pairs in each dc_info is about 30, and the number of overall name:dc_info pairs is about 1000, so it's not huge. Reading it out and writing it back in again is not onerous, but I would like to know if there's a more efficient way of doing it.
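And for the second (append) option, this is roughly what I mean (again just a sketch; fetch_info is the same made-up stand-in, and writing one JSON object per line to a file I've called myfile.jsonl is my own choice, since appending to a single JSON document wouldn't stay valid JSON):

import json
import random
import string

def fetch_info(name):
    # Stand-in for the real API call.
    return {'Height': 'NA'}

# Append one small JSON record per line; duplicates are allowed.
with open("myfile.jsonl", "a") as filehandle:
    for i in range(100):
        name = ''.join(random.choice(string.ascii_uppercase) for _ in range(15))
        filehandle.write(json.dumps({name: fetch_info(name)}) + "\n")

# Later, when the data is needed, rebuild the dictionary;
# a later record for the same name simply overwrites the earlier one.
dc_master = {}
with open("myfile.jsonl") as filehandle:
    for line in filehandle:
        dc_master.update(json.loads(line))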