I have a JSON file of the following form:
{"query": {"tool": "domainquery", "query": "example.org"},
 "response": {"result_count": "1",
              "total_pages": "1",
              "current_page": "1",
              "matches": [{"domain": "example2.org",
                           "created_date": "2015-07-25",
                           "registrar": "registrar_10"}]}}
I have a list of the following form:
removal_list = ["example2.org", "example3.org", ...]
I am trying to loop through removal_list and remove every matching entry from the JSON file. The problem is runtime: removal_list contains 110,000 items. I tried converting it to a set() and using isdisjoint, but that does not seem to make it any faster.
The code I currently have to do this is:
removal_set = set(removal_list)
for domain in removal_set:
    for i in range(len(JSON_file)):
        if int(JSON_file[i]['response']['result_count']) > 0:
            for j in range(len(JSON_file[i]['response']['matches'])):
                if not removal_set.isdisjoint(JSON_file[i]['response']['matches'][j]['domain']):
                    del JSON_file[i]['response']['matches'][j]['domain']
Does anyone have any suggestions on how to speed this process up? Thanks in advance.
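Edit: to clarify what I'm after, here is a small self-contained sketch of the direction I've been experimenting with: a single pass that rebuilds each 'matches' list, testing whole domain strings against the set rather than looping over the removal list. Variable names (records, removal_set) are made up for illustration; the data would really come from json.load. This runs fast on a small sample, but I'm not sure it's the right approach:

```python
import json

# Sample data in the same shape as my file (hypothetical variable names).
records = [
    {"query": {"tool": "domainquery", "query": "example.org"},
     "response": {"result_count": "1",
                  "total_pages": "1",
                  "current_page": "1",
                  "matches": [{"domain": "example2.org",
                               "created_date": "2015-07-25",
                               "registrar": "registrar_10"}]}},
]

removal_set = {"example2.org", "example3.org"}

# Single pass over the data: keep only matches whose whole domain string
# is not in the set.  Set membership is O(1) on average, so total cost is
# one lookup per match, independent of how large removal_set is.
for record in records:
    matches = record["response"]["matches"]
    record["response"]["matches"] = [
        m for m in matches if m["domain"] not in removal_set
    ]
```

My understanding is that the slow part of my original code is the outer loop over the 110,000-item list (and that isdisjoint on a string compares individual characters, not the whole domain), so a membership test like `m["domain"] not in removal_set` should avoid both. Is that correct?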