I am currently writing a scraper that reads from an API that returns JSON. Calling response.json() returns a dict, so I can use e.g. response["object"] to get the value I want. The current mock data looks like this:

data = {
    'id': 336461,
    'thumbnail': '/images/product/123456?trim&h=80',
    'variants': None,
    'name': 'Testing',
    'data': {
        'Videoutgång': {
            'Typ av gränssnitt': {
                'name': 'Typ av gränssnitt',
                'value': 'PCI Test'
            }
        }
    },
    'stock': {
        'web': 0,
        'supplier': None,
        'displayCap': '50',
        '1': 0,
        'orders': {
            'CL': {
                'ordered': -10,
                'status': 1
            }
        }
    }
}

The API response sometimes contains "orders -> CL" and sometimes doesn't. I need to handle both the happy path and the unhappy path, and I am looking for the fastest way to get a value out of the dict in both cases.

I have currently done something like this:

if (
        "stock" in data
        and "orders" in data["stock"]
        and "CL" in data["stock"]["orders"]
        and "status" in data["stock"]["orders"]["CL"]
        and data["stock"]["orders"]["CL"]["status"]
):
    print(f'{data["stock"]["orders"]["CL"]["status"]}: {data["stock"]["orders"]["CL"]["ordered"]}')

1: -10

My question is: what is the fastest way to get the data from a dict, given that it may or may not be in the dict?

Alex Waygood
PythonNewbie
  • see https://stackoverflow.com/questions/3437708/python-nested-dictionary-lookup-with-default-values – balderman Sep 28 '21 at 21:34
  • You might be confusing "fast" with "readable". [PEP 505](https://www.python.org/dev/peps/pep-0505/) would introduce a handy null-coalescing operator. Until it arrives, what exactly are the keys that might be missing from a given `data` object? Some of the checks you are making don't seem necessary. – Carlos Menezes Sep 28 '21 at 21:37
  • @CarlosMenezes Hi! In my case I want to know whether there is any value for the key `status`. If there is, I would like to print both `status` and `ordered`, but in some scenarios `orders {...}` is not in the API response at all. E.g. if you remove `'orders': { 'CL': { 'ordered': -10, 'status': 1 } }` from the data, that is one example of what the API could return as well. – PythonNewbie Sep 28 '21 at 21:39
  • Finding any key in a Python dictionary already has O(1) time complexity. I don't think there is any specific fastest way of doing it. – DrDoggo Sep 28 '21 at 21:40
  • @HritikSingh What I was thinking is that there are different approaches I could use, e.g. a for loop over the API response to see if it contains the key, try/except, .get(), etc. I would like to know which of these is the fastest way to get the value for a given key. – PythonNewbie Sep 28 '21 at 21:42

2 Answers

I see your point. Since your stock dict has just 4 keys, it is hard to say whether the .get() method will be measurably faster than looping over the keys. With many more items, .get() would certainly be much faster, because a hash lookup does not scan the keys, but with so few keys a loop will not make much difference.
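
As a rough way to check this on your own machine, the sketch below times a direct `.get()` lookup against a hand-written loop over the items, using the standard `timeit` module. The helper names `lookup_get` and `lookup_loop` and the dict sizes are made up for illustration; absolute timings will vary by machine and Python version.

```python
import timeit


def lookup_get(d, key):
    """Hash-based lookup: does not scan the keys."""
    return d.get(key)


def lookup_loop(d, key):
    """Hand-written scan over all items: O(n) in the number of keys."""
    for k, v in d.items():
        if k == key:
            return v
    return None


for size in (4, 1_000):
    d = {f"key{i}": i for i in range(size)}
    last_key = f"key{size - 1}"  # last inserted key: worst case for the loop
    t_get = timeit.timeit(lambda: lookup_get(d, last_key), number=1_000)
    t_loop = timeit.timeit(lambda: lookup_loop(d, last_key), number=1_000)
    print(f"{size:>5} keys: get={t_get:.6f}s  loop={t_loop:.6f}s")
```

Both helpers return the same value; only the time per call differs as the dict grows.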

DrDoggo
  • Doing dictionary look-ups is ***O(1)***, meaning the amount of time it takes is independent of how many items there are. They will almost *always* be faster than anything involving a user-written loop, because the lookup is implemented internally in C or whatever underlying language the interpreter is written in. – martineau Sep 28 '21 at 23:00
Lookups are fast in dictionaries because Python implements them using hash tables. In Big O terms, a dictionary lookup has constant time complexity on average, O(1). Here is another approach, using the .get() method:

data = {
    'id': 336461,
    'thumbnail': '/images/product/123456?trim&h=80',
    'variants': None,
    'name': 'Testing',
    'data': {
        'Videoutgång': {
            'Typ av gränssnitt': {
                'name': 'Typ av gränssnitt',
                'value': 'PCI Test'
            }
        }
    },
    'stock': {
        'web': 0,
        'supplier': None,
        'displayCap': '50',
        '1': 0,
        'orders': {
            'CL': {
                'ordered': -10,
                'status': 1
            }
        }
    }
}
if data.get('stock', {}).get('orders', {}).get('CL'):
    print(f'{data["stock"]["orders"]["CL"]["status"]}: {data["stock"]["orders"]["CL"]["ordered"]}')

Here is a nice writeup on lookups in Python with list and dictionary as example.
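
One caveat with chained `.get()` calls: the `{}` default is only used when a key is missing, not when the key is present with the value `None` (as `'supplier'` and `'variants'` are in the mock data). If the API could ever return `'orders': None`, a try/except is one way to cover both cases. This is only a sketch; the helper name `get_cl` and the three sample dicts are made up for illustration.

```python
def get_cl(data):
    """Return data['stock']['orders']['CL'], or None if any level is missing or None."""
    try:
        return data["stock"]["orders"]["CL"]
    except (KeyError, TypeError):
        # KeyError: a key is missing; TypeError: an intermediate value is None
        return None


present = {"stock": {"orders": {"CL": {"ordered": -10, "status": 1}}}}
missing = {"stock": {"web": 0}}
null_orders = {"stock": {"orders": None}}

print(get_cl(present))      # {'ordered': -10, 'status': 1}
print(get_cl(missing))      # None
print(get_cl(null_orders))  # None
```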

Bhagyesh Dudhediya
  • Hi! In that case, wouldn't it be better to write the if statement with a walrus, `if found_data := data.get('stock', {}).get('orders', {}).get('CL'):`, and then just use `found_data["status"]` and `found_data["ordered"]`? – PythonNewbie Sep 29 '21 at 07:14
  • It doesn't matter how you extract it, it will be O(1); you can use whichever approach you want. – Bhagyesh Dudhediya Sep 29 '21 at 07:31
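
For reference, the walrus version suggested in the comment above could look like the sketch below (it needs Python 3.8+). The function name `describe_cl` is hypothetical, and it also uses `.get()` on the innermost dict, as an assumption, so that a missing `status` or `ordered` does not raise.

```python
def describe_cl(data):
    """Return 'status: ordered' when CL exists with a truthy status, else None."""
    # Walk the nested dicts once and keep the result in `cl`
    if (cl := data.get("stock", {}).get("orders", {}).get("CL")) and cl.get("status"):
        return f'{cl["status"]}: {cl.get("ordered")}'
    return None


print(describe_cl({"stock": {"orders": {"CL": {"ordered": -10, "status": 1}}}}))  # 1: -10
print(describe_cl({"stock": {"web": 0}}))  # None
```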