1

Assuming I have the following code, I have some questions.

>>> asd = {}
>>> asd[1] ='a'
>>> asd[2] = 'b'
>>> asd[3] = 'c'
>>> asd
{1: 'a', 2: 'b', 3: 'c'}

>>> dict((v,k) for k, v in asd.iteritems())
{'a': 1, 'c': 3, 'b': 2}

>>> if 'a' in asd:
    print("1")


>>> if 'a' in dict((v,k) for k, v in asd.iteritems()):
    print("1")
1

When I reverse a dictionary how much time will it take assuming my dictionary contains 10gb+ of data.

If I do not store the reversed dictionary to another dict then reversing it by itself as an instance will it consume space over memory?

I need the reversed dictionary because I want O(1) lookups over values, for some operations. Some others require key lookups.

bill
  • 728
  • 3
  • 6
  • 15
  • Your dictionary contains 10gb+ of data?? You really should consider databases when such a large amount of data is involved. – Jayanth Koushik May 01 '14 at 06:42
  • On my algorithm I use dictionary till Memory exists 90% and then I store the remaining data into a database. I dont want to order the dict , I just want to have O(1) lookups over values for some occasions. – bill May 01 '14 at 06:44
  • @bill Do you want O(1) lookups of the values or O(1) reverse lookups? – thefourtheye May 01 '14 at 06:45
  • Calling `dict(...)` is linear time and space. You might want to look at [bidict](http://stackoverflow.com/a/21894086/198633) – inspectorG4dget May 01 '14 at 06:46
  • O(1) lookups over values. I thought that making values (always unique) into keys I will get the O(1) instead of O(N) – bill May 01 '14 at 06:47
  • 1
    Looking up by a value after reversing, takes O(1) time. However, reversing the dict takes O(n) time and O(n) space. If reversing is something you do often, then bidict would be my first guess – inspectorG4dget May 01 '14 at 06:51

2 Answers2

0

When I reverse a dictionary how much time will it take assuming my dictionary contains 10gb+ of data.

The only valid answer is "run it and check it". It depends on your computer's architecture. From the theoretical point of view you need a linear time if the dict is hash-based and O(nlogn) if it is tree based.

If I do not store the reversed dictionary to another dict then reversing it by itself as an instance will it consume space over memory?

You will need temporarly the memory for both dicts, one of which will be discarded after the process (if you use the code provided). It would be however possible to make it without additonal memory by performing iterative process ("take first element from dict"; "remove it"; "add to new one")

lejlot
  • 64,777
  • 8
  • 131
  • 164
0

Whenever you construct a new container object in Python, using its comprehension notation will be slightly faster than any other means. In this case, if you want to construct reverse-lookup dictionary, you can use dictionary comprehension, like this

d = {i: i * 2 for i in range(10000)}
from timeit import timeit
print timeit("{d[k]: k for k in d}", "from __main__ import d", number = 10000)
# 7.22010397911
print timeit("dict((v, k) for k, v in d.iteritems())", "from __main__ import d", number = 10000)
# 10.6085851192

For value lookups, I would recommend using dict.viewvalues like this

d = {i: i * 2 for i in range(10000)}
print 10 in d.viewvalues()
# True

But if the dictionary is not gonna change over time, then converting the values to a set would be a better option.

values_set = set(d.viewvalues())
thefourtheye
  • 233,700
  • 52
  • 457
  • 497