I'm trying to pull information from NIST's NVD, and I'm having trouble appending to an existing JSON file as I grab data. I can see that the code overwrites existing data, but I'm unsure how to tell the for loop to append instead of overwriting.

#Block of code:

jsonPathMedium = (filepath)
jsonPathMediumCompressed = (filepath)

base_url = "https://services.nvd.nist.gov/rest/json/cves/2.0?cvssV3Severity=MEDIUM&resultsPerPage=2000&"
headers = {"Accept": "application/json", "Authorization": "Bearer 123456  "}
ids = ['startIndex=0', 'startIndex=2000']
jsonAppend = []
for id in ids:
    responseMedium = requests.get(base_url + str(id), headers=headers)
    jsonAppend.append(responseMedium)
    print('Grabbed data, making next request.')
print('Finishing pulling data.')
#converts data into json
jsonPrintMedium = responseMedium.json()

jsonObjectMedium = json.dumps(jsonPrintMedium, indent=4)

with open(jsonPathMedium, "w") as jsonFileMedium:
    jsonFileMedium.write(str(jsonObjectMedium))
    jsonFileMedium.close
    print('Wrote to medium severity JSON file.')
    mediumIn= open(jsonPathMedium, 'rb')
    mediumOut = gzip.open(jsonPathMediumCompressed, 'wb')
    mediumOut.writelines(mediumIn)
    mediumOut.close
    mediumIn.close
    print('Compressed medium severity JSON file.')
  • When you say "append to a json file", what exactly are you expecting the results to be? Will you [edit] your question to show an example of what the file looks like before you run your program and what you want it to look like afterwards? – Code-Apprentice Jan 24 '23 at 20:59
  • https://stackoverflow.com/questions/4706499/how-do-i-append-to-a-file might answer your question. However, just appending to the existing file might not give you the desired results. – Code-Apprentice Jan 24 '23 at 21:00
  • If you want to preserve list or object syntax in the resulting JSON file, then you will need to read the file, parse it into a dictionary or list, add data to it, then write the entire thing back to a file again. – Code-Apprentice Jan 24 '23 at 21:01
  • Adding the desired output will take up too many characters on here, but if you put the link to the API into your web browser you can see what the output will be. Essentially trying to take that output, combine all API requests, and dump all of that into a JSON. – tortillas21 Jan 24 '23 at 21:40
  • The beginning of the dataset is at startIndex 0 and the end of the dataset is whatever the “totalResults” count is for the query. Example: https://services.nvd.nist.gov/rest/json/cves/2.0 The default pagination is 2000 records. The startIndex is 0 (not 1) so the first page contains records 0 to 1999 – tortillas21 Jan 24 '23 at 21:43
  • To get the next page you will need to set startIndex to the next item you want (in this case, 2000): https://services.nvd.nist.gov/rest/json/cves/2.0?startIndex=2000 Continue this process of iterating the startIndex parameter until you have exceeded the value of “totalResults”. At the time of this comment the totalResults is 2270, which means you would need to perform 2 requests to pull down all CVE related data, which is shown in the script above. – tortillas21 Jan 24 '23 at 21:44
  • Perhaps I misunderstood what you are trying to do here. Does this program run only one time to create a file? Or do you want to run it multiple times and keep adding to an existing file? – Code-Apprentice Jan 24 '23 at 23:38
  • Apologies - this version is only supposed to run once in order to populate the initial dataset. The reason I want to append is because when the for loop goes through the second pass, it overwrites what it initially pulled. I'm trying to make it so it does this: open JSON -> write -> append as the loop progresses -> close JSON when no more data is found. – tortillas21 Jan 25 '23 at 14:59
  • "The reason I want to append is because when the for loop goes through the second pass" This is because you are only getting the JSON from `responseMedium` after the for loop. This will only use the last response. To get the JSON from each response, you need to do `responseMedium.json()` inside the loop instead of after it. – Code-Apprentice Jan 25 '23 at 18:26
  • On a side note, `jsonFileMedium.close` does nothing. In order to actually close the file, you need parentheses: `jsonFileMedium.close()`. Also, in this particular case, the explicit call to `close()` is redundant because you are using `with` which will automatically close the file after the end of the `with` block. – Code-Apprentice Jan 25 '23 at 18:28

1 Answer

Let's think about this in words. If I understand correctly, you want to do something like this:

for each id in a list of ids
    get the JSON from an HTTP request for that specific id
    append this JSON to a list of all of the results
write the list of results to a file

You already have some of the code for this, so I will borrow from it. There are a few details you are not quite getting right and I'll comment on those later. Here's what I suggest the code should be:

import json
import requests

base_url = "https://services.nvd.nist.gov/rest/json/cves/2.0?cvssV3Severity=MEDIUM&resultsPerPage=2000&"
headers = {"Accept": "application/json", "Authorization": "Bearer 123456  "}
ids = ['startIndex=0', 'startIndex=2000']
jsonAppend = []
for id in ids:
    responseMedium = requests.get(base_url + str(id), headers=headers)
    jsonAppend.append(responseMedium.json()) # <----- parse the JSON from each response here, inside the loop
    print('Grabbed data, making next request.')

# json.dump takes the object first and an open file object second, not a path
with open(jsonPathMedium, "w") as jsonFileMedium:
    json.dump(jsonAppend, jsonFileMedium, indent=4) # <---- write all of the list to a single file, no need to make this more complicated
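The hard-coded `ids` list only covers two pages. If the number of pages is not known in advance, the loop can instead advance `startIndex` until it passes `totalResults`, as described in the comments. The following is only a sketch under assumptions: `fetch_all` and `fetch_page` are hypothetical names, and it assumes each parsed response carries its records under a `"vulnerabilities"` key next to `"totalResults"`. A fake fetch function stands in for the real HTTP request so the example runs offline.

```python
def fetch_all(fetch_page, page_size=2000):
    """Collect the records from every page into one flat list.

    fetch_page is a hypothetical callable that takes a startIndex and
    returns the parsed JSON (a dict) for that page.
    """
    records = []
    start_index = 0
    while True:
        page = fetch_page(start_index)
        # Assumed response layout: records live under "vulnerabilities"
        records.extend(page.get("vulnerabilities", []))
        start_index += page_size
        # Stop once the next startIndex is at or past the reported total
        if start_index >= page.get("totalResults", 0):
            break
    return records


def fake_fetch(start_index):
    """Stand-in for the real request: serves 5 made-up records, 2 per page."""
    data = list(range(5))
    return {
        "totalResults": len(data),
        "vulnerabilities": data[start_index:start_index + 2],
    }


all_records = fetch_all(fake_fetch, page_size=2)
print(all_records)
```

With the real API, `fetch_page` could wrap the request from the code above, for example `lambda i: requests.get(base_url + f"startIndex={i}", headers=headers).json()`. The result is a single list of records rather than a list of whole response objects, which may or may not be the shape you want in the file.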

If you want to write the JSON to a compressed file, then I recommend you just do that directly. Don't write it to an uncompressed file first. I will leave the implementation as an exercise for the reader.
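For readers who want a starting point, here is one possible sketch of writing JSON straight into a gzip file with the standard library. The helper names `write_json_gz` and `read_json_gz` are made up for illustration, and the sample data and temporary path are placeholders for `jsonAppend` and `jsonPathMediumCompressed`.

```python
import gzip
import json
import os
import tempfile


def write_json_gz(data, path):
    """Serialize data as JSON directly into a gzip-compressed file."""
    # "wt" opens the gzip stream in text mode, so json.dump can write str to it
    with gzip.open(path, "wt", encoding="utf-8") as gz_file:
        json.dump(data, gz_file, indent=4)


def read_json_gz(path):
    """Load JSON back from a gzip-compressed file."""
    with gzip.open(path, "rt", encoding="utf-8") as gz_file:
        return json.load(gz_file)


# Placeholder data and path, standing in for jsonAppend and jsonPathMediumCompressed
sample = [{"startIndex": 0, "items": ["a", "b"]}, {"startIndex": 2, "items": ["c"]}]
gz_path = os.path.join(tempfile.mkdtemp(), "medium.json.gz")

write_json_gz(sample, gz_path)
restored = read_json_gz(gz_path)
print(restored == sample)
```

Because the `with` block closes the file on exit, no explicit `close()` call is needed, and no uncompressed intermediate file is ever written.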

Code-Apprentice