There is already an interesting question on How are Python's Built In Dictionaries Implemented ...
but why not try to measure it here (after having read about time complexity in the Python implementation):
#! /usr/bin/env python
from __future__ import division, print_function
import dis # for disassembling (bottom)
import string # string.printable as sample character source
import timeit
chars = tuple(string.printable)
cartesian = tuple(a + b for a in chars for b in chars)
assert len(cartesian) == 10000, len(cartesian)
d = dict((a + b, b) for a in cartesian for b in chars)
assert len(d) == 1000000, len(d)
assert d['zzz'] == 'z'
setup = """
import string
chars = tuple(string.printable)
d = dict((a + b, b) for a in chars for b in chars)
"""
assert 1000000 / 10000 == 100
setup_100x = """
import string
chars = tuple(string.printable)
cartesian = tuple(a + b for a in chars for b in chars)
d = dict((a + b, b) for a in cartesian for b in chars)
"""
stmt = """
'zzz' in d
"""
t = timeit.timeit(stmt=stmt, setup=setup, timer=timeit.default_timer,
number=timeit.default_number)
print("# Timing[secs] for 1x10000:", t)
t_100x = timeit.timeit(stmt=stmt, setup=setup_100x, timer=timeit.default_timer,
number=timeit.default_number)
print("# Timing[secs] for 100x10000:", t_100x)
disassemble_me = "'zzz' in {'a': 'b'}"
print("# Disassembly(lookup in dict with 1 string entry):")
print("#", disassemble_me)
dis.dis(disassemble_me)
disassemble_me = "'zzz' in {'a': 'b', 'c': 'd'}"
print("# Disassembly(lookup in dict with 2 string entries):")
print("#", disassemble_me)
dis.dis(disassemble_me)
disassemble_me = "'zzz' in {'a': 'b', 'c': 'd', 'e': 'f'}"
print("# Disassembly(lookup in dict with 3 string entries):")
print("#", disassemble_me)
dis.dis(disassemble_me)
On my machine with Python 2.7.11 this gives:
# Timing[secs] for 1x10000: 0.0406861305237
# Timing[secs] for 100x10000: 0.0472030639648
# Disassembly(lookup in dict with 1 string entry):
# 'zzz' in {'a': 'b'}
0 <39>
1 SETUP_FINALLY 31354 (to 31358)
4 <39>
5 SLICE+2
6 BUILD_MAP 8302
9 <123> 24871
12 <39>
13 INPLACE_DIVIDE
14 SLICE+2
15 <39>
16 DELETE_GLOBAL 32039 (32039)
# Disassembly(lookup in dict with 2 string entries):
# 'zzz' in {'a': 'b', 'c': 'd'}
0 <39>
1 SETUP_FINALLY 31354 (to 31358)
4 <39>
5 SLICE+2
6 BUILD_MAP 8302
9 <123> 24871
12 <39>
13 INPLACE_DIVIDE
14 SLICE+2
15 <39>
16 DELETE_GLOBAL 11303 (11303)
19 SLICE+2
20 <39>
21 DUP_TOPX 14887
24 SLICE+2
25 <39>
26 LOAD_CONST 32039 (32039)
# Disassembly(lookup in dict with 3 string entries):
# 'zzz' in {'a': 'b', 'c': 'd', 'e': 'f'}
0 <39>
1 SETUP_FINALLY 31354 (to 31358)
4 <39>
5 SLICE+2
6 BUILD_MAP 8302
9 <123> 24871
12 <39>
13 INPLACE_DIVIDE
14 SLICE+2
15 <39>
16 DELETE_GLOBAL 11303 (11303)
19 SLICE+2
20 <39>
21 DUP_TOPX 14887
24 SLICE+2
25 <39>
26 LOAD_CONST 11303 (11303)
29 SLICE+2
30 <39>
31 LOAD_NAME 14887 (14887)
34 SLICE+2
35 <39>
36 BUILD_TUPLE 32039
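A caveat on the listings above: in Python 2, `dis.dis` given a plain string treats the string's own characters as raw bytecode, so what is shown is a decoding of the source text itself, not of compiled code (for example, `ord("'") == 39` is the `<39>` entry, and `ord('z') == 122` is `SETUP_FINALLY` in Python 2's opcode table). To disassemble the actual lookup, compile the expression first; a minimal sketch (in Python 3, `dis.dis` also accepts the source string directly and compiles it for you):

```python
import dis

# Passing a raw str to dis.dis in Python 2 decodes the string's own
# characters as opcodes; compiling first yields the bytecode of the
# expression itself.
code = compile("'zzz' in {'a': 'b'}", '<string>', 'eval')
dis.dis(code)
```

With the compiled code object you see the expected small sequence (constant loads, a map build, and the containment test) instead of the pseudo-opcodes above.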
So looking up 'zzz' one million times (timeit.default_number == 1000000) in the 10^4-entry dict takes approx. 40 milliseconds in total, and below 50 milliseconds in the dict with 100 times as many entries, that is 10^6 entries.
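Note that timeit.timeit returns the total time for `number` executions, so the per-lookup cost follows by simple division from the figures above:

```python
# timeit reports the total for `number` executions
# (timeit.default_number == 1000000), so per-lookup cost is total/number:
total_small = 0.0406861305237   # seconds, 10**6 lookups, 10**4-entry dict
total_large = 0.0472030639648   # seconds, 10**6 lookups, 10**6-entry dict
print(total_small / 10**6)      # ~41 nanoseconds per lookup
print(total_large / 10**6)      # ~47 nanoseconds per lookup
```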
# Timing[secs] for 1x10000: 0.0406861305237
# Timing[secs] for 100x10000: 0.0472030639648
Measuring means repeatability :-) thus running it again:
# Timing[secs] for 1x10000: 0.0441079139709
# Timing[secs] for 100x10000: 0.0460820198059
it settles (together with other runs not shown here) around the conclusion that for dicts with these key types and size relations (note the keys of the larger dict are also longer!) no linear worst case is realized. It is more like 10% longer run times for the dict with 100 times as many entries and 50% longer keys.
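To see the (absence of) scaling more directly, here is a minimal sketch of my own (string keys, always a hit lookup) that times the same test against dicts of growing size:

```python
import timeit

# Time the same hit lookup in dicts of growing size; amortized O(1)
# hashing predicts roughly flat timings, while O(n) lookups would
# grow with the dict size.
for n in (10**3, 10**4, 10**5):
    setup = "d = {str(i): i for i in range(%d)}; key = str(%d)" % (n, n - 1)
    print(n, timeit.timeit("key in d", setup=setup, number=10**5))
```

On typical runs the three timings stay in the same ballpark despite the 100x size spread, matching the measurement above.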
Looks good. I suggest always measuring, and disassembling, when in doubt.
HTH.
PS: The OP might still want to provide some more code context in future questions, as a data structure is best selected knowing how it will be used ;-)
PPS: Raymond Hettinger et al. have often optimized the CPython implementation with great attention to detail, so always expect specialized, unrolled implementations for small-"size" problems; that is why the disassembly of a toy variant of a problem may differ substantially from the one realized for huge-sized tasks. That is why I often prefer timeit (and profile) measurements over disassembly, but we should get used to reading byte code, to get ideas when measured performance fails to meet our expectations.
Otherwise: Enjoy the disassembly of the lookup :-)
Update: ... and if you change the timed statement so that the small dict receives a hit ('zz')
and the large one a miss (vice versa to above), you may also encounter these timings:
# Timing[secs] for 1x10000: 0.0533709526062
# Timing[secs] for 100x10000: 0.0458760261536
where the test 'zz' in d
takes 53 milliseconds against the small dict and 46 milliseconds against the large one (totals over 1000000 trials).
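That hits and misses can land on either side is plausible: both are O(1) on average, and the small difference comes from probe-sequence lengths and the string comparisons performed along the way. A small sketch of my own (arbitrary sizes and keys, not the exact dicts from above) to reproduce the hit-versus-miss comparison:

```python
import timeit

# Compare a hit and a miss in the same 10**4-entry dict; both are
# O(1) on average, and the small difference comes from probe
# sequences and the key comparisons performed along the way.
setup = "d = {str(i): i for i in range(10**4)}"
hit = timeit.timeit("'5000' in d", setup=setup, number=10**5)
miss = timeit.timeit("'no such key' in d", setup=setup, number=10**5)
print(hit, miss)
```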