I am trying to add row items to the dataframe, but I am not able to update it. What I tried so far is commented out because it doesn't do what I need.

I simply want to download the JSON file and store it in a dataframe with the given columns. It seems I am not able to extract the child components from the JSON file and store them in a brand new dataframe.

Please find my code below:

import requests, json, urllib
import pandas as pd
url = "https://www.cisa.gov/sites/default/files/feeds/known_exploited_vulnerabilities.json"

data = pd.read_json(url)
headers = []
df = pd.DataFrame()
for key, item in data['vulnerabilities'].items():
    for k in item.keys():
        headers.append(k)
col = list(set(headers))
        
new_df = pd.DataFrame(columns=col)


for item in data['vulnerabilities'].items():
    print(item[1])
#     new_df['product'] = item[1]['product']
#     new_df['vendorProject'] = item[1]['vendorProject']
#     new_df['dueDate'] = item[1]['dueDate']
#     new_df['shortDescription'] = item[1]['shortDescription']
#     new_df['dateAdded'] = item[1]['dateAdded']
#     new_df['vulnerabilityName'] = item[1]['vulnerabilityName']
#     new_df['cveID'] = item[1]['cveID']
#     new_df.append(item[1], ignore_index = True)

new_df

At the end, my new_df is still empty.

2 Answers

The nested JSON data can be directly converted to a flattened dataframe using pd.json_normalize(). The headers are extracted from the JSON itself.

new_df = pd.json_normalize(data['vulnerabilities'])

UPDATE: Unnested the vulnerabilities column specifically.
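
For anyone who wants to try this without hitting the network, here is a minimal sketch of the same idea. The payload below is a made-up stand-in for the feed (the keys are borrowed from the question, the values are invented), and it assumes the raw JSON has already been loaded into a plain dict rather than through pd.read_json.

```python
import pandas as pd

# Hypothetical payload shaped like the feed: a top-level dict whose
# "vulnerabilities" key holds a list of records. All values are invented.
payload = {
    "title": "Example catalog",
    "count": 2,
    "vulnerabilities": [
        {"cveID": "CVE-0000-0001", "vendorProject": "ExampleVendor",
         "product": "ExampleProduct", "dateAdded": "2022-01-01"},
        {"cveID": "CVE-0000-0002", "vendorProject": "OtherVendor",
         "product": "OtherProduct", "dateAdded": "2022-01-02"},
    ],
}

# record_path tells json_normalize which nested list holds the rows;
# the column names come from the keys of each record.
flat = pd.json_normalize(payload, record_path="vulnerabilities")
print(flat)
```

With the real feed, the same call should work on the dict returned by something like requests.get(url).json(), though I have only checked it against the toy payload above.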

Kabilan Mohanraj

It worked with this:


import requests, json, urllib
import pandas as pd
url = "https://www.cisa.gov/sites/default/files/feeds/known_exploited_vulnerabilities.json"

data = pd.read_json(url)
headers = []
df = pd.DataFrame()
for key, item in data['vulnerabilities'].items():
    for k in item.keys():
        headers.append(k)
col = list(set(headers))
        
new_df = pd.DataFrame(columns=col)


for item in data['vulnerabilities'].items():
    # Add each vulnerability record as a new row at the next integer label
    new_df.loc[len(new_df.index)] = item[1]  # <=== THIS


new_df.head()
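
As a side note on why the commented-out attempt left the frame empty: assigning a scalar to a column of a frame with zero rows has nothing to broadcast over, and DataFrame.append returned a new frame instead of modifying new_df in place (it was later removed in pandas 2.0). The sketch below shows the first point and a loop-free way to build the frame. The records Series is a hypothetical stand-in for data['vulnerabilities'], with invented values.

```python
import pandas as pd

# Hypothetical stand-in for data['vulnerabilities']: a Series of dicts
records = pd.Series([
    {"cveID": "CVE-0000-0001", "vendorProject": "ExampleVendor", "product": "ExampleProduct"},
    {"cveID": "CVE-0000-0002", "vendorProject": "OtherVendor", "product": "OtherProduct"},
])

# Assigning a scalar to a column of an empty frame adds no rows,
# which is why the original new_df stayed blank.
empty = pd.DataFrame(columns=["cveID", "vendorProject", "product"])
empty["product"] = "ExampleProduct"
rows_after_scalar_assignment = len(empty)

# Building the frame in one call from the list of dicts avoids the
# row-by-row loop entirely.
new_df = pd.DataFrame(records.tolist())
print(rows_after_scalar_assignment, new_df.shape)
```

Growing a frame one row at a time with .loc works, but each assignment enlarges the frame, so for larger inputs building it from the list in one call is usually the cheaper option.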