0

I am trying to scrape a value from a website. The value changes constantly, and I want to get its current value.

I tried this:

my_url = requests.get('https://www.telekom.hu/shop/categoryresults/https://www.telekom.hu/shop/categoryresults/?N=10994&contractType=list_price&instock_products=1&Ns=sku.sortingPrice%7C0%7C%7Cproduct.displayName%7C0&No=0&Nrpp=9&paymentType=FULL')

data = my_url.text
parsed = json.loads(data)
my_number = parsed["totalNumRecs"]
print(my_number)

But I get this error message:

"my_number = parsed["totalNumRecs"]
KeyError: 'totalNumRecs'"

What am I doing wrong? Why can't I get the number stored in totalNumRecs?

Mr.D

2 Answers

1

The reason you get a key error is the nested structure of your returned dictionary. totalNumRecs is in fact present but not at the top level of the dict. Have a look at:

Find all occurrences of a key in nested python dictionaries and lists

This is a way of traversing a dictionary of unknown structure and finding all occurrences of a specific key. I was able to find your desired key and its value with the following code, inspired by the aforementioned link:

import requests
import json


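def gen_dict_extract(key, var):
    # Recursively yield every value stored under `key` in nested dicts and lists.
    if hasattr(var, 'items'):
        # items() works on Python 2 and 3; iteritems() exists only on Python 2.
        for k, v in var.items():
            if k == key:
                yield v
            if isinstance(v, dict):
                for result in gen_dict_extract(key, v):
                    yield result
            elif isinstance(v, list):
                for d in v:
                    for result in gen_dict_extract(key, d):
                        yield result
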
my_url = requests.get('https://www.telekom.hu/shop/categoryresults/https://www.telekom.hu/shop/categoryresults/?N=10994&contractType=list_price&instock_products=1&Ns=sku.sortingPrice%7C0%7C%7Cproduct.displayName%7C0&No=0&Nrpp=9&paymentType=FULL')

data = my_url.text
parsed = json.loads(data)

result = gen_dict_extract('totalNumRecs', parsed)

for i in result:
    print(i)
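
To see the same behaviour without hitting the site, here is a minimal offline sketch in Python 3. The `SAMPLE` string and the value 42 are made up for illustration (the real response has many more fields), and `find_key` is a hypothetical condensed variant of the generator above, not the code from the linked answer.

```python
import json

# Hypothetical sample: only the nesting matters, the value 42 is invented.
SAMPLE = '{"MainContent": [{"contents": [{"totalNumRecs": 42, "records": []}]}]}'


def find_key(key, node):
    """Yield every value stored under `key` anywhere in nested dicts and lists."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            # Recurse into the value; non-container values yield nothing.
            yield from find_key(key, v)
    elif isinstance(node, list):
        for item in node:
            yield from find_key(key, item)


parsed = json.loads(SAMPLE)
top_level_keys = sorted(parsed)   # the only keys parsed["..."] can reach directly
matches = list(find_key("totalNumRecs", parsed))

print(top_level_keys)
print(matches)
```

Printing the top-level keys first is usually the quickest way to confirm that a KeyError comes from nesting rather than from a missing field.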
kb-0
1

You need to specify the complete "path" to the required key:

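import requests

my_url = requests.get('https://www.telekom.hu/shop/categoryresults/https://www.telekom.hu/shop/categoryresults/?N=10994&contractType=list_price&instock_products=1&Ns=sku.sortingPrice%7C0%7C%7Cproduct.displayName%7C0&No=0&Nrpp=9&paymentType=FULL')
# .json() parses the response body, so a separate json.loads() call is not needed.
data = my_url.json()
# totalNumRecs sits inside nested lists and dicts, so index each level in turn.
my_number = data['MainContent'][0]['contents'][0]['totalNumRecs']
print(my_number)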
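
A hard-coded path like this raises KeyError or IndexError as soon as the site changes its layout. Below is a rough sketch of a more defensive lookup in Python 3. `get_path` is a hypothetical helper, not part of requests or json, and the `sample` dict with the value 42 is invented to mirror the path used above.

```python
def get_path(data, path, default=None):
    """Follow a sequence of dict keys and list indices; return `default` if a step is missing."""
    current = data
    for step in path:
        try:
            current = current[step]
        except (KeyError, IndexError, TypeError):
            # KeyError: missing dict key; IndexError: list too short;
            # TypeError: tried to index a value that is not a dict or list.
            return default
    return current


# Invented data shaped like the path in the answer above.
sample = {"MainContent": [{"contents": [{"totalNumRecs": 42}]}]}
path = ["MainContent", 0, "contents", 0, "totalNumRecs"]

found = get_path(sample, path)
missing = get_path({"MainContent": []}, path)

print(found, missing)
```

With the real response you would pass `my_url.json()` as `data`; a result of `None` then signals that the structure no longer matches the path.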
Andersson