
I would like a Python object that can flexibly take any keys and that I can access by key, like a dictionary, but that is immutable. One option would be to generate a namedtuple dynamically, but is it bad practice to do this? In the example below, a linter would not expect nt to have an attribute a, for example.

Example:

from collections import namedtuple

def foo(bar):
    MyNamedTuple = namedtuple("MyNamedTuple", [k for k in bar.keys()])
    d = {k: v for k, v in bar.items()}
    return MyNamedTuple(**d)

>>> nt = foo({"a": 1, "b": 2})
jedge

2 Answers


As I mentioned in the comments, I'm not sure why this is needed.
But one could simply override __setitem__ in a dictionary subclass, although this might (most likely) cause problems down the line. A minimal example of this would be:

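class autodict(dict):
    def __init__(self, *args, **kwargs):
        # Populate the dict as usual; only later item assignment is blocked.
        super(autodict, self).__init__(*args, **kwargs)

    def __getitem__(self, key):
        # Plain lookup, same as dict.
        val = dict.__getitem__(self, key)
        return val

    def __setitem__(self, key, val):
        # Silently ignore x[key] = val instead of storing it.
        pass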

x = autodict({'a' : 1, 'b' : 2})
x['c'] = 3
print(x)

This prints {'a': 1, 'b': 2}, so the x['c'] = 3 assignment is ignored.
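One of the problems "down the line" is that overriding __setitem__ alone does not stop the other mutating methods of dict. The following is only a sketch: StrictFrozenDict and _blocked are hypothetical names, not part of the answer above, and the approach is not airtight (calling dict.__setitem__(y, ...) directly would still get through).

```python
class autodict(dict):
    """Same idea as above: item assignment is silently ignored."""
    def __setitem__(self, key, val):
        pass


class StrictFrozenDict(dict):
    """Hypothetical variant: every in-place mutator raises TypeError."""

    def _blocked(self, *args, **kwargs):
        raise TypeError(f"{type(self).__name__} is read-only")

    # Route the mutating methods of dict to the same error.
    __setitem__ = __delitem__ = _blocked
    update = pop = popitem = clear = setdefault = _blocked
    __ior__ = _blocked  # the |= operator (Python 3.9+)


x = autodict({'a': 1, 'b': 2})
x['c'] = 3       # ignored by the overridden __setitem__
x.update(c=3)    # not ignored: dict.update() does not go through __setitem__
print(x)         # {'a': 1, 'b': 2, 'c': 3}

y = StrictFrozenDict({'a': 1, 'b': 2})
try:
    y.update(c=3)
except TypeError as exc:
    print(exc)   # StrictFrozenDict is read-only
```

Raising instead of silently passing is a design choice here; it makes accidental writes visible rather than dropping them.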


Some benefits

Creating the dictionary subclass is somewhere between 40 and 1000 times faster than creating the named tuples. (See below for crude speed tests.)

The in operator checks keys on dictionaries, but on named tuples it checks the values, so it does not work for field names when used like this:

'a' in nt   # False
'a' in x    # True

You can use key access dictionary style instead of (for lack of a better term) JavaScript style

x['a'] == nt.a

Although that's a matter of taste.

You also don't have to be picky about keys, since dictionaries support essentially any key identifier:

x = autodict({1 : 'a number'})
nt = foo({1 : 'a number'})

With named tuples the second line raises ValueError: Type names and field names must be valid identifiers: '1'
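The named tuple side of these points can be checked with a short sketch. It reuses the foo helper from the question (slightly shortened); the Renamed class and its field list are made up purely for illustration.

```python
from collections import namedtuple

def foo(bar):
    # Same helper as in the question: one namedtuple class per call.
    MyNamedTuple = namedtuple("MyNamedTuple", list(bar))
    return MyNamedTuple(**bar)

nt = foo({"a": 1, "b": 2})

# `in` on a named tuple tests the values, like on any tuple.
print("a" in nt)           # False
print(1 in nt)             # True
print("a" in nt._fields)   # True: the field names live in _fields

# Keys that are not valid identifiers are rejected.
try:
    foo({1: "a number"})
except ValueError as exc:
    print(exc)

# rename=True replaces invalid field names with positional ones.
Renamed = namedtuple("Renamed", ["1", "b"], rename=True)
print(Renamed._fields)     # ('_0', 'b')
```

With rename=True the original key is lost, so this only helps when the exact key names do not matter.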


Optimizations (timing the thing)

Now, the results will vary a lot depending on the system, the place of the moon in the sky, etc. But as a crude example:

import time
from collections import namedtuple

class autodict(dict):
    def __init__(self, *args, **kwargs):
        super(autodict, self).__init__(*args, **kwargs)
        #self.update(*args, **kwargs)

    def __getitem__(self, key):
        val = dict.__getitem__(self, key)
        return val

    def __setitem__(self, key, val):
        pass

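    # Note: __type__ is not a special method that Python looks up,
    # so this has no effect on type(x) or isinstance(x, dict).
    def __type__(self, *args, **kwargs):
        return dict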

def foo(bar):
    MyNamedTuple = namedtuple("MyNamedTuple", [k for k in bar.keys()])
    d = {k: v for k, v in bar.items()}
    return MyNamedTuple(**d)

start = time.time()
for i in range(1000000):
    nt = foo({'x'+str(i) : i})
end = time.time()
print('Named tuples:', end - start,'seconds.')

start = time.time()
for i in range(1000000):
    x = autodict({'x'+str(i) : i})
end = time.time()
print('Autodict:', end - start,'seconds.')

Results in:

Named tuples: 59.21987843513489 seconds.
Autodict: 1.4844810962677002 seconds.

The dictionary setup is, in my book, insanely quicker. That most likely comes from foo building a brand-new named tuple class on every call, on top of its extra loops over the keys, and it can probably be remedied fairly easily. But for a basic understanding this is a big difference. The example obviously doesn't test larger one-time creations or access times, just: "if you use these options to create data sets over a period of time, how much time would you lose?" :)
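One possible remedy, sketched under the assumption that the same set of keys comes up repeatedly: cache the generated class per tuple of field names so it is built only once. The names _class_for and foo_cached are hypothetical. This would not help the benchmark above, where every iteration uses a different key.

```python
from collections import namedtuple
from functools import lru_cache
import timeit

@lru_cache(maxsize=None)
def _class_for(fields):
    # Hypothetical helper: build the class once per distinct tuple of field names.
    return namedtuple("MyNamedTuple", fields)

def foo_cached(bar):
    # Look up (or create) the class for these keys, then instantiate it.
    return _class_for(tuple(bar))(**bar)

def foo(bar):
    # Original approach: a brand-new class on every call.
    return namedtuple("MyNamedTuple", list(bar))(**bar)

data = {"a": 1, "b": 2}
print("uncached:", timeit.timeit(lambda: foo(data), number=1_000), "seconds")
print("cached:  ", timeit.timeit(lambda: foo_cached(data), number=1_000), "seconds")
```

A side effect of the cache is that two results with the same keys share one class, which the uncached version does not guarantee.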

Bonus: What if you have a large base dictionary, and want to freeze it?

base_dict = {'x'+str(i) : i for i in range(1000000)}

start = time.time()
nt = foo(base_dict)
end = time.time()
print('Named tuples:', end - start,'seconds.')

start = time.time()
x = autodict(base_dict)
end = time.time()
print('Autodict:', end - start,'seconds.')

Well, the difference was bigger than I expected: the autodict was about 1038.5 times faster.
(I was using the CPU for other stuff, but I think this is fair game.)

Named tuples: 154.0662612915039 seconds.
Autodict: 0.1483476161956787 seconds.
Torxed
  • thanks @Torxed! Though what advantages does this have over the `foo` example above? – jedge Jun 10 '20 at 11:13
  • @JamieEdgecombe Not many that I can think of, other than you could define [custom metaclass references](https://realpython.com/python-metaclasses/) and fool applications into thinking it's an actual dict and not `` - although kind of pointless in most scenarios. – Torxed Jun 10 '20 at 11:17
  • Just for the giggles, I'll run some speed comparisons as well between this and named tuples, although I think they should be the same. – Torxed Jun 10 '20 at 11:18
  • @JamieEdgecombe One advantage that I completely overlooked is that you can use this as an actual dict, named tuples to my knowledge can't be accessed with `nt["a"]` for instance. Which, if you're used to working with dictionaries and expect it to behave like one - is pretty nice. But if you don't care and the lookup times aren't important - either way works. – Torxed Jun 10 '20 at 11:19
  • thanks again @Torxed! Is that really an advantage though in comparison to say calling it like `nt.a`? – jedge Jun 10 '20 at 11:22
  • @JamieEdgecombe Define advantage, it's a matter of taste and expectations. If you expect a "frozen dictionary" to behave like a dictionary, it's an advantage. Otherwise it doesn't really matter. – Torxed Jun 10 '20 at 11:24
  • @JamieEdgecombe I'll keep adding some benefits, another one I just realized is that you can have numerals as keys in dictionaries, named tuples don't support this. – Torxed Jun 10 '20 at 11:28
  • fair points - thanks @Torxed! – jedge Jun 10 '20 at 11:29
  • @JamieEdgecombe You're welcome. dicts are also *(ballpark figure)* x40 times faster *(assuming my third grade math is correct)*. – Torxed Jun 10 '20 at 11:35
  • another advantage for, i guess, many use cases is that you can add a key that is not already present to your `x` but not to `nt` – jedge Jun 10 '20 at 11:54
  • The advantage of a frozen dictionary is that you can use it with the @cache decorator, the "go to" way to memoize a function. – Campbell Hutcheson Nov 15 '21 at 05:59
  • @CampbellHutcheson Cool, I believe that functionality was added after this post was made so people have to have some form of acceptance for that since I can't go back and change comments on individual posts :) – Torxed Nov 15 '21 at 11:17

You can make a minimal class using frozenset() to store the data and then add a custom __getitem__() method.

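class Idict:
    def __init__(self, d):
        # Store the items as an immutable set of (key, value) pairs;
        # this requires the values to be hashable as well.
        self.d = frozenset(d.items())

    def __getitem__(self, k):
        # Linear scan over the pairs; raise KeyError like a dict would.
        for _k, v in self.d:
            if _k == k:
                return v
        raise KeyError(k)


d = {'a':1, 'b':2}
a = Idict(d)
a['a'] #1
a['h'] = 0 #TypeError: 'Idict' object does not support item assignment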
alec_djinn
  • The problem with this is that it achieves immutability at the cost of losing a dictionary's `O(1)` lookup. – John Coleman Jun 10 '20 at 11:16
  • @JohnColeman Yes, it is true. It is not an efficient implementation. – alec_djinn Jun 10 '20 at 11:17
  • On the other hand, the only natural use-case I can think of for frozen dictionaries would be for small packets of labeled data, perhaps passed to a `kwargs` in which case asymptotics are not that relevant. – John Coleman Jun 10 '20 at 11:20
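On the O(1) point raised in these comments, a read-only wrapper can keep the usual dictionary lookup by storing a private dict and exposing only the read methods. This is a minimal sketch, not the answerer's code: FrozenDict is a hypothetical name, it builds on collections.abc.Mapping instead of frozenset, and the underlying _data attribute is private by convention only.

```python
from collections.abc import Mapping

class FrozenDict(Mapping):
    """Hypothetical read-only mapping backed by a private dict."""
    __slots__ = ("_data",)

    def __init__(self, *args, **kwargs):
        # Copy the input so later changes to the source dict are not seen.
        self._data = dict(*args, **kwargs)

    def __getitem__(self, key):
        # Plain dict lookup; raises KeyError for a missing key.
        return self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)

    def __repr__(self):
        return f"FrozenDict({self._data!r})"


a = FrozenDict({'a': 1, 'b': 2})
print(a['a'])      # 1
print('a' in a)    # True
try:
    a['h'] = 0
except TypeError as exc:
    print(exc)     # 'FrozenDict' object does not support item assignment
```

Mapping supplies keys(), items(), values(), get() and the in operator from the three methods defined here, so the object can be read like a dict without inheriting any of its mutators.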