Pandas - Add items to dataframe

Question

I am trying to add row items to a dataframe, but I am not able to update it. What I have tried so far is commented out because it doesn't do what I need.

I simply want to download the JSON file and store it in a dataframe with the given columns. It seems I am not able to extract the child components from the JSON file and store them in a brand new dataframe.

Please find my code below:

import requests, json, urllib
import pandas as pd
url = "https://www.cisa.gov/sites/default/files/feeds/known_exploited_vulnerabilities.json"

data = pd.read_json(url)
headers = []
df = pd.DataFrame()
for key, item in data['vulnerabilities'].items():
    for k in item.keys():
        headers.append(k)
col = list(set(headers))
        
new_df = pd.DataFrame(columns=col)


for item in data['vulnerabilities'].items():
    print(item[1])
#     new_df['product'] = item[1]['product']
#     new_df['vendorProject'] = item[1]['vendorProject']
#     new_df['dueDate'] = item[1]['dueDate']
#     new_df['shortDescription'] = item[1]['shortDescription']
#     new_df['dateAdded'] = item[1]['dateAdded']
#     new_df['vulnerabilityName'] = item[1]['vulnerabilityName']
#     new_df['cveID'] = item[1]['cveID']
#     new_df.append(item[1], ignore_index = True)

new_df

At the end, new_df is still empty.

Kabilan Mohanraj · Accepted Answer · 2022-01-23T12:04:12.020


The nested JSON data can be directly converted to a flattened dataframe using pd.json_normalize(). The headers are extracted from the JSON itself.

new_df = pd.json_normalize(data['vulnerabilities'])  # json_normalize already returns a DataFrame

UPDATE: Unnested the vulnerabilities column specifically.
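To illustrate the unnesting without hitting the network, here is a minimal sketch. The `payload` dict is a made-up stand-in for the feed (the key names come from the question, the values are invented, and other top-level fields are left out), and `flat_df` is just an illustrative name. Here `record_path` tells `pd.json_normalize()` which nested list to expand into rows.

```python
import pandas as pd

# Hypothetical miniature payload shaped like the feed; all values are invented.
payload = {
    "vulnerabilities": [
        {"cveID": "CVE-0000-0001", "vendorProject": "VendorA", "product": "ProductA"},
        {"cveID": "CVE-0000-0002", "vendorProject": "VendorB", "product": "ProductB"},
    ],
}

# record_path selects the nested list; each dict in it becomes one row,
# and the dict keys become the column headers.
flat_df = pd.json_normalize(payload, record_path="vulnerabilities")

print(flat_df.columns.tolist())
print(len(flat_df))
```

With the real feed, the same call should work on the parsed JSON (for example the result of `requests.get(url).json()`), assuming the top-level object still holds a `vulnerabilities` list.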


edited Jan 23 '22 at 12:04

answered Jan 23 '22 at 11:12


Thanks, the data was nested. I needed the vulnerabilities – Jan 23 '22 at 11:18
@Gelato I missed that. I will update the answer shortly. Thanks for letting me know. – Kabilan Mohanraj Jan 23 '22 at 11:24
@Gelato I have updated my answer. You can take a look. – Kabilan Mohanraj Jan 23 '22 at 11:50
@Gelato Can you upvote my answer if it was useful? Since you self-resolved the issue, you can accept your answer as well. – Kabilan Mohanraj Jan 23 '22 at 19:50

Your code is better! Thank you @Kabilan! – Jan 24 '22 at 04:30

score 0 · Answer 2 · answered Jan 23 '22 at 11:14

It worked with this:


import requests, json, urllib
import pandas as pd
url = "https://www.cisa.gov/sites/default/files/feeds/known_exploited_vulnerabilities.json"

data = pd.read_json(url)
headers = []
df = pd.DataFrame()
for key, item in data['vulnerabilities'].items():
    for k in item.keys():
        headers.append(k)
col = list(set(headers))
        
new_df = pd.DataFrame(columns=col)


for item in data['vulnerabilities'].items():

    new_df.loc[len(new_df.index)] = item[1]  # <=== THIS: assign each record as a new row at the next index


new_df.head()
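Assigning one row at a time with `.loc` works, but it tends to get slow as the dataframe grows. A possible alternative, sketched below, is to hand all the records to `pd.DataFrame` in a single call, which also makes collecting the headers beforehand unnecessary. The `vulns` Series is a made-up stand-in for `data['vulnerabilities']`, with invented values.

```python
import pandas as pd

# Stand-in for data['vulnerabilities']: a Series whose values are dicts.
# The values are invented for illustration.
vulns = pd.Series([
    {"cveID": "CVE-0000-0001", "product": "ProductA"},
    {"cveID": "CVE-0000-0002", "product": "ProductB", "dueDate": "2022-01-01"},
])

# Build every row in one call; the columns are the union of all keys,
# and a key missing from a record is filled with NaN.
new_df = pd.DataFrame(vulns.tolist())

print(new_df.shape)
```

This should give the same rows as the loop, without creating an empty dataframe first.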
