While trying to create a custom case-insensitive dictionary, I came across the following inconvenient and (from my point of view) unexpected behaviour: when a class derives from dict, its overridden __iter__, keys and values methods are ignored when converting back to dict. I have condensed it to the following test case:

import collections

class Dict(dict):
    def __init__(self):
        super(Dict, self).__init__(x = 1)

    def __getitem__(self, key):
        return 2

    def values(self):
        return 3

    def __iter__(self):
        yield 'y'

    def keys(self):
        return 'z'

    if hasattr(collections.MutableMapping, 'items'):
        items = collections.MutableMapping.items
    if hasattr(collections.MutableMapping, 'iteritems'):
        iteritems = collections.MutableMapping.iteritems

d = Dict()
print(dict(d))              # {'x': 1}
print(dict(d.items()))      # {'y': 2}

The return values of keys, values, __iter__ and __getitem__ are deliberately inconsistent, purely to demonstrate which methods are actually called.

The documentation for dict.__init__ says:

If a positional argument is given and it is a mapping object, a dictionary is created with the same key-value pairs as the mapping object. Otherwise, the positional argument must be an iterator object.

I guess it has something to do with the first sentence and maybe with optimizations for builtin dictionaries.

Why exactly does the call to dict(d) not use any of keys, __iter__? Is it possible to overload the 'mapping' somehow to force the dict constructor to use my presentation of key-value pairs?

Why did I use this? For a case-insensitive but case-preserving dictionary, I wanted to:

  • store (lowercase => (original_case, value)) internally, while appearing as (any_case => value).
  • derive from dict in order to work with some external library code that uses isinstance checks
  • avoid two dictionary lookups: lower_case=>original_case, followed by original_case=>value (this is the solution I am using now instead)
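
A minimal sketch of the storage layout from the first bullet, assuming Python 3 and string keys. The class name `CaseInsensitiveDict` and the `_store` attribute are hypothetical, and the class builds on `collections.abc.MutableMapping` rather than `dict`, so it deliberately does not satisfy the `isinstance` requirement from the second bullet:

```python
from collections.abc import MutableMapping


class CaseInsensitiveDict(MutableMapping):
    """Hypothetical sketch: case-insensitive lookup, case-preserving keys."""

    def __init__(self, *args, **kwargs):
        # Internal layout: lowercase key => (original-case key, value)
        self._store = {}
        self.update(*args, **kwargs)

    def __getitem__(self, key):
        # One lookup in the internal dict, whatever the case of `key`.
        return self._store[key.lower()][1]

    def __setitem__(self, key, value):
        self._store[key.lower()] = (key, value)

    def __delitem__(self, key):
        del self._store[key.lower()]

    def __iter__(self):
        # Yield keys in their original spelling.
        return (original for original, _ in self._store.values())

    def __len__(self):
        return len(self._store)


d = CaseInsensitiveDict()
d['Content-Type'] = 'text/plain'
print(d['content-type'])  # found regardless of case
print(dict(d))            # keys keep their original spelling
```

Because this class is not a dict subclass, `dict(d)` goes through `keys()` and `__getitem__`, so the conversion sees the any_case => value view rather than the internal tuples.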

If you are interested in the application case: here is the corresponding branch

coldfix
  • Why are you making an inherited dictionary when you convert it back to a simple dict object the next time? I mean, once you use `dict(d)` it is transformed back into a normal dictionary. You should use `d.items()` instead. – Peter Varo Aug 19 '13 at 15:49
  • Being an inherent type, redefining the semantics of `dict` would certainly cause some astonishment (if not outright breakage) elsewhere. – msw Aug 19 '13 at 15:54
  • I do not quite understand why this question is marked as primarily opinion-based. The second answer and first comment are (which even got upvotes anyway). The accepted answer though states the 'fact', that it is not possible due to implementation details. Furthermore the answer to this question is far from obvious as in an object oriented language it is unexpected behaviour, that overriding dictionary `__iter__` is not possible. – coldfix Aug 22 '13 at 00:03

2 Answers

In the file dictobject.c, the relevant code starts at line 1795:

static int
dict_update_common(PyObject *self, PyObject *args, PyObject *kwds, char *methname)
{
    PyObject *arg = NULL;
    int result = 0;

    if (!PyArg_UnpackTuple(args, methname, 0, 1, &arg))
        result = -1;

    else if (arg != NULL) {
        _Py_IDENTIFIER(keys);
        if (_PyObject_HasAttrId(arg, &PyId_keys))
            result = PyDict_Merge(self, arg, 1);
        else
            result = PyDict_MergeFromSeq2(self, arg, 1);
    }
    if (result == 0 && kwds != NULL) {
        if (PyArg_ValidateKeywordArguments(kwds))
            result = PyDict_Merge(self, kwds, 1);
        else
            result = -1;
    }
    return result;
}

This tells us that if the object has a keys attribute, the constructor simply calls PyDict_Merge(). The code called there (l. 1915 ff.) distinguishes between real dicts (which includes instances of dict subclasses) and other objects. In the case of real dicts, the items are read out with PyDict_GetItem(), which is the innermost interface to the object and does not bother with any user-defined methods.
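
The distinction can be observed from Python. The following is a small sketch, assuming CPython 3; the class names `Doubling` and `DuckMapping` are hypothetical. It overrides only `__getitem__`, because the handling of subclasses that also override `__iter__` may differ between CPython versions:

```python
class Doubling(dict):
    """dict subclass that overrides only __getitem__ (hypothetical name)."""

    def __getitem__(self, key):
        return 2 * super().__getitem__(key)


class DuckMapping:
    """Not a dict: exposes only keys() and __getitem__ (hypothetical name)."""

    def keys(self):
        return ['a']

    def __getitem__(self, key):
        return 'from __getitem__'


sub = Doubling(a=1)
print(sub['a'])             # explicit indexing goes through the override
print(dict(sub))            # the stored values are copied directly
print(dict(DuckMapping()))  # built via keys() and __getitem__
```

The dict subclass takes the fast path that copies the stored entries, while the unrelated object with a `keys` attribute is read through its own methods.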

So instead of inheriting from dict, you should use the UserDict module (collections.UserDict in Python 3).
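
A short sketch of that suggestion, assuming Python 3, where the class lives at `collections.UserDict`. The subclass name `Shouting` is hypothetical and only illustrates that an overridden `__getitem__` is honoured during conversion:

```python
from collections import UserDict


class Shouting(UserDict):
    """Hypothetical UserDict subclass whose lookups upper-case the value."""

    def __getitem__(self, key):
        return super().__getitem__(key).upper()


d = Shouting(x='quiet')
print(d.data)               # the underlying plain dict is unchanged
print(dict(d))              # values come from the overridden __getitem__
print(isinstance(d, dict))  # False: UserDict is not a dict subclass
```

The trade-off is the last line: code that relies on `isinstance(d, dict)` checks will not accept such an object.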

glglgl
  • I see, the problem is with `PyDict_Check()`. I feel like the implementation should not only check for inheritance but also for overridden methods in order to yield more consistent behaviour. Inheriting from `UserDict` (correctly) gives False in `isinstance(d, dict)` checks, which is why I wanted to derive from `dict`. – coldfix Aug 19 '13 at 16:24
  • Strange, the source code I see using the link in your answer is not exactly the same as what you show and the line numbers are different... – martineau Aug 19 '13 at 17:43
  • In that case it seems like it might be possible to get it to work by making a derived class with a `__getitem__()` method that returned the proper values. – martineau Aug 19 '13 at 18:58
  • @martineau Strange indeed; it seems I had a different revision of the file as I wrote that. – glglgl Aug 19 '13 at 19:39
  • I just changed the link to a concrete revision, so we shouldn't get any discrepancies any longer. But as the call chain is `dict_init() -> dict_update_common() -> PyDict_Merge() -> PyDict_Merge()`, overriding `__getitem__()` won't help either. – glglgl Aug 19 '13 at 19:45
  • @coldfix Maybe it could work by creating an [ABC](http://docs.python.org/2/library/abc.html) which implements [`__subclasshook__()`](http://docs.python.org/2/library/abc.html#abc.ABCMeta.__subclasshook__) in order to claim to be a subclass of a given other class. (But I think it works the other way round, so it doesn't work either, alas.) I think you are stuck doing `dict(d.items())`... – glglgl Aug 19 '13 at 19:49
  • @glglgl I experimented using `metaclass=ABCMeta` (see branch delegation in the repo) on a class derived from `dict` and registering it as virtual superclass of my Dict class. It turns out, this doesn't work, because `isinstance`/`issubclass` are not transitive in this situation. (Which is quite unexpected as well!) – coldfix Aug 19 '13 at 20:47

Is it possible to overload the 'mapping' somehow to force the dict constructor to use my presentation of key-value pairs?

No.

Being an inherent type, redefining the semantics of dict would certainly cause outright breakage elsewhere.

You're stuck with a library in which you can't override the behavior of dict. That's tough, but redefining the language primitives isn't the answer. You'd probably find it irksome if someone screwed with the commutative property of integer addition behind your back; that's why they can't.

And with regard to your comment "UserDict (correctly) gives False in isinstance(d, dict) checks", of course it does because it isn't a dict and dict has very specific invariants which UserDict can't assure.

msw
  • That's what `type(d) is dict` is used for: to get exactly the guarantees of `dict`. For subclasses (`isinstance`) you would expect that they can override functionality. Isn't that the whole point of polymorphism? – coldfix Aug 22 '13 at 01:57
  • If you want to argue that the language should be other than it is, I suggest you submit a [Python Enhancement Proposal](http://www.python.org/dev/peps/). You asked a question about how the language is and I answered; I'm sorry that you don't like the answer, but that doesn't affect its validity. – msw Aug 22 '13 at 04:03