-1

I'm trying to parse some JSON to get some values, for example the value of 'Count'. I can successfully retrieve the JSON and convert it to text, but .get() returns None for any key I try to look up. The code I am using is:

from urllib import request
from bs4 import BeautifulSoup
import json
url = 'https://irus.jisc.ac.uk/api/sushilite/v1_7/GetReport/?Report=IR1&Release=4&RequestorID=Cambridge&BeginDate=2020-01&EndDate=2020-05&ItemIdentifier=irusuk%3A1861749&Granularity=Monthly&Pretty=Pretty'
html = request.urlopen(url).read()
soup = BeautifulSoup(html,'html.parser')
site_json=json.loads(soup.text)
x = site_json.get('Count')
print(x)
tmnsnmt
  • 'Count' is in an inner dictionary, so you can't get it with just `site_json.get('Count')`. You have to use the full path. – deadshot Jun 25 '20 at 10:05
  • Does this answer your question? [Is there a recursive version of the dict.get() built-in?](https://stackoverflow.com/questions/28225552/is-there-a-recursive-version-of-the-dict-get-built-in) – deadshot Jun 25 '20 at 10:08
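
The "full path" lookup mentioned in the comments can be sketched as a small helper. This is only an illustration: `deep_get` is a hypothetical name, and `sample` is a hand-made stand-in shaped after the key path used in the answers below, not the real API response (the integer type of `Count` is also an assumption).

```python
def deep_get(data, path, default=None):
    """Follow `path` through nested dicts/lists; return `default` if a step is missing."""
    current = data
    for key in path:
        try:
            current = current[key]  # dict key or list index
        except (KeyError, IndexError, TypeError):
            return default
    return current


# Hypothetical stand-in for the parsed response (assumed structure).
sample = {
    "ReportResponse": {"Report": {"Report": {"Customer": {"ReportItems": [
        {"ItemPerformance": [{"Instance": {"Count": 1001}}]}
    ]}}}}
}

path = ["ReportResponse", "Report", "Report", "Customer",
        "ReportItems", 0, "ItemPerformance", 0, "Instance", "Count"]

print(deep_get(sample, path))       # value found at the end of the full path
print(deep_get(sample, ["Count"]))  # top-level lookup finds nothing, like .get('Count')
```

The second call shows why `site_json.get('Count')` comes back as `None`: the key only exists several levels down.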

2 Answers

1

You should walk the nested JSON structure down to the 'Count' key:

try:
    for values in site_json["ReportResponse"]["Report"]["Report"]["Customer"]["ReportItems"][0]["ItemPerformance"]:
        print(values["Instance"]["Count"])
except KeyError as e:
    print("Key not found" , e)

Another approach:

If the ReportItems key holds multiple values in its list, you can simply iterate over them.

try:
    for values in site_json["ReportResponse"]["Report"]["Report"]["Customer"]["ReportItems"]:
        for performance in values["ItemPerformance"]:
            print(performance["Instance"]["Count"])
except KeyError as e:
    print("Key not found" , e)

Output:

1001
6273
2128
993
1365
Narendra Prasath
  • Thanks this worked! I figured it had been converted to text so didn't think to parse the JSON. Relatively new to Python so apologies if this was a stupid question. – tmnsnmt Jun 25 '20 at 10:17
  • A quick follow up question: why is the '[0]' necessary in this code? – tmnsnmt Jun 28 '20 at 10:41
  • @tmnsnmt The value of the `["ReportItems"]` key is a `list`, so `[0]` takes the first element of that list. You can also iterate over `["ReportItems"]` if the list has multiple values. – Narendra Prasath Jun 28 '20 at 10:43
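
To make the `[0]` point concrete, here is a minimal sketch. The `customer` dict is a made-up stand-in for the "Customer" part of the response (assumed structure and values), not real data.

```python
# Hypothetical stand-in for site_json[...]["Customer"] (assumed structure).
customer = {
    "ReportItems": [
        {"ItemPerformance": [{"Instance": {"Count": 1001}}]}
    ]
}

report_items = customer["ReportItems"]
print(type(report_items).__name__)  # the value is a list, not a dict

# A list is indexed by position, so [0] selects its first element (a dict).
first_item = report_items[0]
print(first_item["ItemPerformance"][0]["Instance"]["Count"])

# Skipping the [0] and using a string key directly on the list fails.
try:
    report_items["ItemPerformance"]
except TypeError as error:
    print("TypeError:", error)
```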
1

You can try iterating over the JSON object:

from urllib import request
from bs4 import BeautifulSoup
import json

url = 'https://irus.jisc.ac.uk/api/sushilite/v1_7/GetReport/?Report=IR1&Release=4&RequestorID=Cambridge&BeginDate=2020-01&EndDate=2020-05&ItemIdentifier=irusuk%3A1861749&Granularity=Monthly&Pretty=Pretty'
html = request.urlopen(url).read()
soup = BeautifulSoup(html,'html.parser')
site_json=json.loads(soup.text)
for itemIdentifier in site_json["ReportResponse"]["Report"]["Report"]['Customer']["ReportItems"]:
    for itemPerformance in itemIdentifier["ItemPerformance"]:
        print(itemPerformance["Instance"]["Count"])

This outputs all the counts:

1001
6273
2128
993
1365

With these nested loops you can add your own logic to pick out a specific count or to collect all of the counts.
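
One possible way to do that is to gather the counts into a list first and then pick from it. This is a sketch under assumptions: `collect_counts` is a hypothetical helper, and `sample` is a hand-built stand-in that reuses the five counts printed above, not the actual API response. `int()` is applied in case the API delivers the counts as strings.

```python
def collect_counts(site_json):
    """Return every Instance Count under ReportItems as a list of ints."""
    customer = site_json["ReportResponse"]["Report"]["Report"]["Customer"]
    return [
        int(performance["Instance"]["Count"])
        for item in customer["ReportItems"]
        for performance in item["ItemPerformance"]
    ]


# Hypothetical stand-in for the parsed response (assumed structure).
sample = {
    "ReportResponse": {"Report": {"Report": {"Customer": {"ReportItems": [
        {"ItemPerformance": [
            {"Instance": {"Count": value}}
            for value in (1001, 6273, 2128, 993, 1365)
        ]}
    ]}}}}
}

counts = collect_counts(sample)
print(counts)       # all of the counts
print(counts[0])    # one specific count, selected by position
print(sum(counts))  # total across the list
```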

Leo Arad