No, there is no faster method available for dictionaries.
That's because the performance cost is all in processing each item from the iterator, computing its hash and slotting the key into the dictionary data hash table structures (including growing those structures dynamically). Executing the dictionary comprehension bytecode is really insignificant in comparison.
dict(zip(it, it))
, {k: k for k in it}
and dict.fromkeys(it)
are all close in speed:
>>> from timeit import Timer
>>> tests = {
... 'dictcomp': '{k: k for k in it}',
... 'dictzip': 'dict(zip(it, it))',
... 'fromkeys': 'dict.fromkeys(it)',
... }
>>> timings = {n: [] for n in tests}
>>> for magnitude in range(2, 8):
... it = range(10 ** magnitude)
... for name, test in tests.items():
... peritemtimes = []
... for repetition in range(3):
... count, total = Timer(test, 'from __main__ import it').autorange()
... peritemtimes.append(total / count / (10 ** magnitude))
... timings[name].append(min(peritemtimes)) # best of 3
...
>>> for name, times in timings.items():
... print(f'{name:>8}', *(f'{t * 10 ** 9:5.1f} ns' for t in times), sep=' | ')
...
dictcomp | 46.5 ns | 47.5 ns | 50.0 ns | 79.0 ns | 101.1 ns | 111.7 ns
dictzip | 49.3 ns | 56.3 ns | 71.6 ns | 109.7 ns | 132.9 ns | 145.8 ns
fromkeys | 33.9 ns | 37.2 ns | 37.4 ns | 62.7 ns | 87.6 ns | 95.7 ns
That's a table of the per-item cost for each technique, from 100 to 10 million items. The timings go up as the additional cost of growing the hash table structures accumulate.
Sure, dict.fromkeys()
can process items a little bit faster, but it's not an order of magnitude faster than the other processes. It's (small) speed advantage does not come from being able to iterate in C here; the difference lies purely in not having to update the value pointer each iteration; all keys point to the single value reference.
zip()
is slower because it builds additional objects (creating a 2-item tuple for each key-value pair is not a cost-free operation), and it increased the number of iterators involved in the process, you go from a single iterator for the dictionary comprehension and dict.fromkeys()
, to 3 iterators (the dict()
iteration delegated, via zip()
, to two separate iterators for the keys and values).
There is no point in adding a separate method to the dict
class to handle this in C, because
- is not a common enough use case anyway (creating a mapping with keys and values equal is not a common need)
- not going to be significantly faster in C than it would be with a dictionary comprehension anyway.