20

I am trying to do something similar to this:

from collections import defaultdict
import hashlib

def factory():
    key = 'aaa'
    return { 'key-md5' : hashlib.md5('%s' % (key)).hexdigest() }

a = defaultdict(factory)
print a['aaa']

(The real reason I need access to the key in the factory is not to compute an MD5 hash; this is just an example.)

As you can see, in the factory I have no access to the key: I am just forcing it, which makes no sense whatsoever.

Is it possible to use defaultdict in a way that I can access the key in the factory?

falsetru
blueFast
  • Out of curiosity, why do you need `defaultdict`, is it only for correcting missing values? Because that's all it does aside from just returning `{}`? – Torxed Oct 16 '13 at 09:00
  • Yes, it is to provide values for missing keys in the dictionary. That is the whole point of a `defaultdict`, isn't it? The problem is that my (real) data structure, which I am storing in the defaultdict, has fields depending on the key. So, whenever I am trying to access a non-existing element, I need to create it in the factory using as a parameter the key of the `defaultdict`. – blueFast Oct 16 '13 at 09:05
  • I was afraid you were doing something a little out of the ordinary. See @falsetru's solution, because that is what I was going to suggest for custom dictionary behavior :) – Torxed Oct 16 '13 at 09:07
  • Possible duplicate of [Is there a clever way to pass the key to defaultdict's default_factory?](http://stackoverflow.com/questions/2912231/is-there-a-clever-way-to-pass-the-key-to-defaultdicts-default-factory) – eickenberg Jan 12 '17 at 14:48

3 Answers

30

defaultdict's __missing__ method does not pass the key to the factory function. From the documentation:

If default_factory is not None, it is called without arguments to provide a default value for the given key, this value is inserted in the dictionary for the key, and returned.

Instead, write your own dictionary class with a custom __missing__ method.

>>> class MyDict(dict):
...     def __init__(self, factory):
...         self.factory = factory
...     def __missing__(self, key):
...         self[key] = self.factory(key)
...         return self[key]
... 
>>> d = MyDict(lambda x: -x)
>>> d[1]
-1
>>> d
{1: -1}
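
For Python 3, the same idea can be sketched with the initial-data arguments forwarded to dict. The class name KeyDefaultDict is made up for this illustration; it is one possible shape under that assumption, not a standard-library API.

```python
class KeyDefaultDict(dict):
    """dict that builds a missing value by calling factory(key)."""

    def __init__(self, factory, *args, **kwargs):
        # Forward any initial data to dict so existing entries are kept.
        super().__init__(*args, **kwargs)
        self.factory = factory

    def __missing__(self, key):
        # Called by dict.__getitem__ only when the key is absent.
        value = self.factory(key)
        self[key] = value
        return value


d = KeyDefaultDict(lambda k: -k, {10: 'preset'})
print(d[1])      # -1, built from the missing key
print(d[10])     # preset, the factory is not called
print(d.get(2))  # None, because .get() does not go through __missing__
```

Note that only subscript access (d[key]) triggers __missing__; get() and the in operator leave the dictionary unchanged.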
falsetru
  • I still do not understand why defaultdict does not support this type of factory. Can anyone explain? – midas Jan 04 '16 at 18:45
  • @midas I'm not sure, but my guess would be that there is no nice way to do it while keeping the ability to just insert any type with a constructor that doesn't need parameters (`defaultdict(int)`, `defaultdict(MyClass)`). – ralokt Sep 16 '16 at 17:01
  • @ralokt, and can anyone explain why then to take a factory and not an object, like `defaultdict(0)`? – Alexey May 19 '21 at 07:42
  • @Alexey because sometimes a static value doesn't cut it, is my guess. That would be even worse than not passing the missing key, there would be nothing it could be passed to. Also, static values with mutable types would be a complete no-go, every missing key would point to the same object. – ralokt May 20 '21 at 09:52
  • I agree that not having access to the `key` is silly. For cases where you don't need the key you could simply ignore it, e.g. `defaultdict(lambda key: 0)` – sam-6174 May 30 '21 at 03:33
  • Another plus, in my opinion, of defining your own dictionary subclass, is that it will still look like regular dictionary when you print it (unlike `defaultdict`). – martineau Jul 16 '21 at 23:13
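
The shared-object concern raised in the comments above can be sketched as follows. defaultdict does not accept a static value, so the "static default" is emulated here with a factory that always returns the same list; that emulation is an assumption made purely for illustration.

```python
from collections import defaultdict

# Emulated "static default": every missing key receives the same list object.
shared = []
static_like = defaultdict(lambda: shared)
static_like['a'].append(1)
static_like['b'].append(2)
print(static_like['a'] is static_like['b'])  # True
print(static_like['a'])                      # [1, 2]

# A real factory builds a fresh object for each missing key.
per_key = defaultdict(list)
per_key['a'].append(1)
per_key['b'].append(2)
print(per_key['a'] is per_key['b'])          # False
print(per_key['a'])                          # [1]
```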
6

Unfortunately not directly, as defaultdict specifies that default_factory must be called with no arguments:

http://docs.python.org/2/library/collections.html#collections.defaultdict

But it is possible to use defaultdict as a base class that has the behavior you want:

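class CustomDefaultdict(defaultdict):
    def __missing__(self, key):
        if self.default_factory is not None:
            # Pass the missing key to the factory, unlike defaultdict.
            dict.__setitem__(self, key, self.default_factory(key))
            return self[key]
        else:
            # No factory: defer to defaultdict, which raises KeyError.
            return defaultdict.__missing__(self, key)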

This works for me (with factory changed to accept the key as its single argument):

>>> a = CustomDefaultdict(factory)
>>> a
defaultdict(<function factory at 0x7f0a70da11b8>, {})
>>> print a['aaa']
{'key-md5': '47bce5c74f589f4867dbd57e9ca9f808'}
>>> print a['bbb']
{'key-md5': '08f8e0260c64418510cefb2b06eee5cd'}
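
A Python 3 version of this approach might look like the sketch below. The factory(key) signature is an assumption, since the modified factory is not shown above, and the key is encoded to bytes because hashlib.md5 requires bytes in Python 3.

```python
from collections import defaultdict
import hashlib


def factory(key):
    # Assumed signature: takes the missing key as its only argument.
    return {'key-md5': hashlib.md5(key.encode('utf-8')).hexdigest()}


class CustomDefaultdict(defaultdict):
    def __missing__(self, key):
        if self.default_factory is None:
            # Same behavior as defaultdict when no factory is set.
            raise KeyError(key)
        value = self[key] = self.default_factory(key)
        return value


a = CustomDefaultdict(factory)
print(a['aaa'])
print(a['bbb'])
```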
Meridius
  • No need to use `defaultdict` as base class: [since Python 2.2](https://docs.python.org/2/library/userdict.html) `dict` can be subclassed directly. – Anton Bryzgalov Mar 02 '20 at 21:33
1

In several cases where I wanted a defaultdict with the key in the factory, I found that functools.lru_cache also solved my problem:

import functools

@functools.lru_cache(maxsize=None)
def use_func_as_dict(key=''):  # Or whatever type
    with open(key, 'r') as ifile:
        return ifile.readlines()

f1 = use_func_as_dict('test.txt')
f2 = use_func_as_dict('test2.txt')
# This will reuse the old value instead of re-reading the file
f3 = use_func_as_dict('test.txt')
assert f3 is f1

This actually makes more sense theoretically, since you're after a function of the input rather than a consistent dummy fallback.
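
A file-free variant makes the caching visible. The names build_record and calls are hypothetical and exist only for this sketch.

```python
import functools

calls = []  # records each real invocation, for illustration only


@functools.lru_cache(maxsize=None)
def build_record(key):
    calls.append(key)
    return {'key': key, 'length': len(key)}


r1 = build_record('aaa')
r2 = build_record('bbb')
r3 = build_record('aaa')  # served from the cache, the body does not run again
print(r3 is r1)                        # True
print(calls)                           # ['aaa', 'bbb']
print(build_record.cache_info().hits)  # 1
```

Unlike a dictionary, the cached function has no item assignment or iteration over stored keys, and its arguments must be hashable.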

ntjess