8

Is there a way to get the original, consistent list of keys from a defaultdict even after non-existent keys have been requested?

>>> from collections import defaultdict
>>> d = defaultdict(lambda: 'default', {'key1': 'value1', 'key2': 'value2'})
>>>
>>> d.keys()
['key2', 'key1']
>>> d['bla']
'default'
>>> d.keys() # how to get the same: ['key2', 'key1']
['key2', 'key1', 'bla']
Vitali Bichov

3 Answers

10

You have to exclude the keys that have the default value!

>>> [i for i in d if d[i]!=d.default_factory()]
['key2', 'key1']
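
The same filtering idea can be sketched in Python 3 roughly as follows. The helper name `keys_without_default` is hypothetical, not part of `collections`. The sketch assumes the factory returns an equal value on every call, and it will also drop any original key whose stored value happens to equal the default.

```python
from collections import defaultdict


def keys_without_default(d):
    """Return the keys of d whose value differs from the factory's default.

    Hypothetical helper; assumes d.default_factory is set and returns
    an equal value on every call.
    """
    default = d.default_factory()  # computed once, outside the loop
    # Iterating over items() never triggers __missing__, so no keys are added.
    return [k for k, v in d.items() if v != default]


d = defaultdict(lambda: 'default', {'key1': 'value1', 'key2': 'value2'})
d['bla']  # the lookup inserts 'bla' with the default value
print(keys_without_default(d))  # ['key1', 'key2'] (insertion order, Python 3.7+)
```

The default is computed once before the loop, so the factory is not called again for every key.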

Time comparison with the method suggested by Jean:

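>>> import time
>>> def funct(a=None,b=None,c=None):
...     s=time.time()
...     eval(a)  # evaluate the expression passed in as a string
...     print time.time()-s  # elapsed wall-clock time in seconds
...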
>>> funct("[i for i in d if d[i]!=d.default_factory()]")
9.29832458496e-05
>>> funct("[k for k,v in d.items() if v!=d.default_factory()]")
0.000100135803223
>>> # Storing the default value in a variable and reusing it in the list comprehension reduces the time to some extent!
>>> defa=d.default_factory()
>>> funct("[i for i in d if d[i]!=defa]")
8.82148742676e-05
>>> funct("[k for k,v in d.items() if v!=defa]")
9.79900360107e-05
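
Single `time.time()` measurements of this size are very noisy. A possibly more stable comparison, sketched here for Python 3 with the standard `timeit` module, repeats each variant many times and keeps the best run. The function names `by_lookup` and `by_items` and the repeat counts are assumptions chosen for illustration; the actual timings depend on the machine and are not shown.

```python
import timeit
from collections import defaultdict

d = defaultdict(lambda: 'default', {'key1': 'value1', 'key2': 'value2'})
d['bla']                       # adds the unwanted key
default = d.default_factory()  # pre-computed default value


def by_lookup():
    # Index into the dict for every key.
    return [k for k in d if d[k] != default]


def by_items():
    # Iterate over (key, value) pairs directly.
    return [k for k, v in d.items() if v != default]


for fn in (by_lookup, by_items):
    # 5 repeats of 10,000 calls each; the minimum is the least noisy figure.
    best = min(timeit.repeat(fn, number=10_000, repeat=5))
    print(f"{fn.__name__}: {best:.6f} s per 10,000 calls")
```

Both variants return the same list, so any difference is purely in the timing.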
Keerthana Prabhakaran
  • The `default_factory` bit is gold, was looking for it all over the place. Thanks! :) – Anomitra Jul 30 '17 at 09:17
  • 1
    `dir(defaultdict)` helped! – Keerthana Prabhakaran Jul 30 '17 at 09:21
  • 1
    could be faster with: `[k for k,v in d.items() if v!=d.default_factory()]`, and pre-compute `default_factory` value. – Jean-François Fabre Jul 30 '17 at 09:22
  • I've added a time comparison of both methods! Precomputing the default_factory value definitely reduces the time! Plus one for that! – Keerthana Prabhakaran Jul 30 '17 at 09:31
  • What do you mean by time comparison here? Execution time, or speed? – Mohideen bin Mohammed Jul 30 '17 at 12:27
  • @MohideenibnMohammed What is the difference between execution time and speed? – Keerthana Prabhakaran Jul 30 '17 at 14:07
  • I'm surprised that using `d.items()` slows that down. Maybe using more iterations would allow a better measure (like done in the `timeit` module) – Jean-François Fabre Jul 30 '17 at 14:25
  • @KeerthanaPrabhakaran First of all, thanks for your clear explanation and great answer. But the time printed by your command reflects how you ran it manually rather than the real duration; if you check your code again you will see that. – Mohideen bin Mohammed Jul 30 '17 at 15:31
  • >>> def funct(a=None,b=None,c=None): ... s=time.time() ... eval(a) ... print time.time()-s ... >>> funct("[k for k,v in d.items() if v!=d.default_factory()]") 5.31673431396e-05 >>> funct("[i for i in d if d[i]!=d.default_factory()]") 0.000155210494995 >>> funct("[i for i in d if d[i]!=d.default_factory()]") 0.000157117843628 >>> funct("[k for k,v in d.items() if v!=d.default_factory()]") 0.000166893005371 >>> – Mohideen bin Mohammed Jul 30 '17 at 15:31
  • If there is a delay in executing (human delay), it will cause a time difference, which does not mean items() is slow. Check my comment above. Sorry for being unclear; I can't make it clear in comments. – Mohideen bin Mohammed Jul 30 '17 at 15:33
  • And human delay isn't possible. The method calculates the start and end time itself, so where would human delay come from? >>> funct("[k for k,v in d.items() if v!=d.default_factory()]") 0.000102996826172 >>> funct("[i for i in d if d[i]!=d.default_factory()]") 4.79221343994e-05 >>> funct("[i for i in d if d[i]!=d.default_factory()]") 3.50475311279e-05 >>> funct("[k for k,v in d.items() if v!=d.default_factory()]") 5.50746917725e-05 – Keerthana Prabhakaran Jul 31 '17 at 03:00
  • @Jean-FrançoisFabre `Originally, Python items() built a real list of tuples and returned that. That could potentially take a lot of extra memory.` as quoted on [stackoverflow reference](https://stackoverflow.com/questions/10458437/what-is-the-difference-between-dict-items-and-dict-iteritems) – Keerthana Prabhakaran Jul 31 '17 at 03:03
  • @KeerthanaPrabhakaran If you run it again you will understand. Just swap the order: run .items() first and the other one next. The timing starts from your function definition. You ran it through the terminal, so you entered the calls one by one; that is where its start and end time come from. – Mohideen bin Mohammed Jul 31 '17 at 13:50
0

[key for key in d.keys() if d[key] != 'default']

Anomitra
0

default_factory is a callable and need not return the same value each time, so comparing values against d.default_factory() is not reliable!

>>> from collections import defaultdict
>>> from random import random
>>> d = defaultdict(lambda: random())
>>> d[1]
0.7411252345322932
>>> d[2]
0.09672701444816645
>>> d.keys()
dict_keys([1, 2])
>>> d.default_factory()
0.06277993247659297
>>> d.default_factory()
0.4388136209046052
>>> d.keys()
dict_keys([1, 2])
>>> [k for k in d.keys() if d[k] != d.default_factory()]
[1, 2]
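
One way around this, sketched here under assumptions rather than offered as a definitive fix, is to record which keys were created by a missing lookup instead of comparing values. The class name `TrackingDefaultDict`, its `auto_keys` attribute, and the `explicit_keys` method are hypothetical names introduced for illustration; only `defaultdict.__missing__` is part of the standard library.

```python
from collections import defaultdict
from random import random


class TrackingDefaultDict(defaultdict):
    """Hypothetical defaultdict subclass that remembers auto-created keys."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.auto_keys = set()  # keys inserted by a failed lookup

    def __missing__(self, key):
        # The parent calls default_factory and stores the result under key.
        value = super().__missing__(key)
        self.auto_keys.add(key)
        return value

    def explicit_keys(self):
        # Keys that were set explicitly, in insertion order.
        return [k for k in self if k not in self.auto_keys]


d = TrackingDefaultDict(random, {'key1': 'value1', 'key2': 'value2'})
d['bla']                  # creates 'bla' with a random value
print(d.explicit_keys())  # ['key1', 'key2']

# Membership tests and get() never call __missing__, so they add no keys.
print('other' in d)       # False
print(d.get('other'))     # None
```

A limitation of this sketch: a key that is auto-created and later assigned explicitly stays in `auto_keys`. If no lookup should ever add a key, testing with `in` or reading with `get()` avoids the problem without a subclass.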