1

I am doing some graph theory and I use the NetworkX library.

In some algorithms, one needs to get an arbitrary item from a dictionary to start it (for instance, one of the neighbors of a given node, and the neighborhoods are a dict in NetworkX).

By arbitrary element I mean any element, I don't care which one (not a given element, not a random element, etc.)

Currently I do the following:

D = {i:i**2 for i in range(N)}  # an example of a dictionary where N is a positive integer
any_item_of_D = list(D)[0]      # getting a item of D

But it appears that the time complexity of any_item_of_D = list(D)[0] is linearly growing as N tends to infinity (a O(N) complexity).

Is it possible to get an arbitrary item from a dictionary with a O(1) complexity in Python?

wim
  • 338,267
  • 99
  • 616
  • 750
Héhéhé
  • 123
  • 5
  • 2
    It's not clear if you really mean `arbitrary item` from the dict, or if you are looking for a `random item` from the dict. It think you will find `D[next(iter(D.keys()))]` to give constant time *arbitrary* elements. – Mark May 20 '21 at 02:48
  • 1
    you can use ordereddict to get fist item in O(1) – leaf_yakitori May 20 '21 at 02:48
  • 1
    Mark: as written in my question, I am looking for an arbitrary item, not a random item. But thanks you for your answer, it seems to work :) – Héhéhé May 20 '21 at 03:19
  • leaf_yakitori: NetworkX does not use ordereddict. – Héhéhé May 20 '21 at 03:20
  • Have you tried converting it to set instead of a list? – pyNeophyte May 20 '21 at 03:30
  • I do not understand how you interpret the term "arbitrary". Do you mean something like getting the value pointed by the `i`th key in the dictionary? Of course, assuming the dictionary preserves the keys in insertion-order: https://stackoverflow.com/a/40007169/937153 – Unni May 20 '21 at 03:34
  • pyNeophyte: but how do you get an arbitrary item from a set in O(1) ? – Héhéhé May 20 '21 at 04:04
  • Unni: by arbitrary I mean any element, I don't care which one. – Héhéhé May 20 '21 at 04:05
  • @Mark That is only O(1) for an OrderedDict, for a std dict that is O(n) in worst-case. – wim Mar 17 '22 at 02:18
  • Hi @wim...this is a pretty old thread. I wonder if you could clarify the worst-case. What is the case where `D[next(iter(D.keys()))]` is O(n)? Where does the linear behavior come from? – Mark Mar 17 '22 at 02:39
  • @Mark It comes from the possibility that you have to skip past a bunch of "empty markers" left from items which have been removed from the dict – wim Mar 17 '22 at 02:45
  • @wim...sorry if I'm being slow. Are you saying `next(iter(D.keys()))` will possibly have to iterate over a bunch of values — like `keys()` produces values for deleted items? Or is there some performance case in dicts where `D[some_key]` has to skip a bunch of 'markers'? I've honestly never heard of "empty markers" in this context? – Mark Mar 17 '22 at 02:55
  • 1
    @Mark When you delete an item from a dict, it doesn't (usually) shift or resize the underlying data storage, it just puts an empty marker into that position. So in iterating a dict, at each step of the iteration you have to check for possibility of an empty 'tombstone' left from where there was an item in the dict but it's now deleted, and skip over those. You can see that in CPython [here](https://github.com/python/cpython/blob/424dfc723137fb1ca8f3b5e4eb12ba3e669a9f52/Objects/dictobject.c#L4079-L4082). To get the _O(n)_ case for `next(iter(D.keys()))`, create big dict then remove n items. – wim Mar 17 '22 at 17:34

2 Answers2

2

Pop item returns an arbitrary item in O(1), which you can then put back into the dict (also O(1)).

key, val = D.popitem()
D[key] = val

Since Python 3.7 LIFO order is guaranteed for dict.popitem, so you don't have to worry about it messing up ordering either.

This is significantly faster than creating an item iterator just to get the first item:

>>> timeit k,v = D.popitem(); D[k] = v
107 ns ± 0.0716 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
>>> timeit next(iter(D.items()))
164 ns ± 0.0687 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

Note that next(iter(ordered_dict)) is O(1) but next(iter(std_dict)) can be O(n) in worst-case.

wim
  • 338,267
  • 99
  • 616
  • 750
0

I have "always" done next(iter(D.items())), but to be honest, I have never checked whether it is the most efficient way...

Ture Pålsson
  • 6,088
  • 2
  • 12
  • 15