0

I need to build a data structure like this one:

{
    key: {k: v for k in range(fixed_small_number)}
    for key in range(fixed_large_number)
}

The thing is I'm building it in an "eclectic" way, where every time get one more item to put in a random k for a random key, i.e. I need random-access, and I need the inner dict to be mutable.

So my question is divided into two:

  1. The recommended type for the outer dict.

  2. The recommended type for the inner dict.

the "best" solution for me would be an array of mutable namedtuples, only this doesn't exist.

I could use a list of namedtuples, and then recreate each one with the new data, but that sounds super-wasteful with the lists not being random-access-efficient and all the rewrites of the same data.

Is there some magical new structure I'm not aware of?

EDIT: example of usage:

for key, k, v in [('a', 1, 2), ('b', 1, 3), ('a', 2, 1), ('a', 3, 1), ('b', 3, 1) ...]:
    my_structre[key][k] = v

EDIT2:

it turns out that lists actually DO support random access

martineau
  • 119,623
  • 25
  • 170
  • 301
phistakis
  • 211
  • 1
  • 9
  • I'm not sure this is possible if I understand the question correctly. Mutable objects such as `dict`s can't be dictionary keys. – millimoose Mar 10 '13 at 13:29
  • Anyway, it's not clear what you're trying to accomplish. Can you show an example of how this data structure would be built "eclectically"? I.e. the before and after states for a given single update? – millimoose Mar 10 '13 at 13:30
  • i'm not sure i understand.. where did i suggest that dicts will function as keys? – phistakis Mar 10 '13 at 13:31
  • Where does `v` come from? – Inbar Rose Mar 10 '13 at 13:32
  • @phistakis The nested dict comprehension? TBH it doesn't look like valid Python to begin with, you should really expand your code sample to show what you're trying to do. Write out actual dummy data maybe? – millimoose Mar 10 '13 at 13:34
  • i know that this was not proper python syntax. i just wrote it like this to explain how the structure looks like. the v comes from somewhere else as shown in the added example – phistakis Mar 10 '13 at 13:40
  • Would you even need to access `my_structure[x]` on its own, without accessing `my_structure[x][y]`? – Eric Mar 10 '13 at 13:45

3 Answers3

6

You could build a custom class, using __slots__ to limit the amount of memory used perhaps:

class MutableEfficientNamedList(object):
    __slots__ = ('field1', 'field2', 'field3')

    def __init__(self, *values):
        for k, v in zip(self.__slots__, values):
            setattr(self, k, v)

    def __getitem__(self, i):
        return getattr(self, self.__slots__[i])

    def __setitem__(self, i, v):
        return setattr(self, self.__slots__[i], v)

    def __repr__(self):
        return '{}({})'.format(type(self).__name__, 
            ', '.join(repr(getattr(self, s)) for s in self.__slots__))

then use those in your structure. They can be used just like named tuples (allow access by index and by name) but they allow mutation. By using __slots__ the memory footprint of each instance remains low:

>>> menl = MutableEfficientNamedList('foo', 'bar', 'baz')
>>> menl
MutableEfficientNamedList('foo', 'bar', 'baz')
>>> menl.field1
'foo'
>>> menl[0]
'foo'
>>> menl[1]
'bar'
>>> menl[1] = 'spam'
>>> menl.field2
'spam'

You of course give the slots meaningful names, and please do pick a better name for your class than what I used in my example. :-)

To expand on the namedtuple() pattern, here is a generic factory function:

def namedlist(name, *attrs):
    """Create a named list class named `name` with attributes `attrs`.
       `attrs` must be strings representing valid Python identifiers.
    """
    class MutableEfficientNamedList(object):
        __slots__ = attrs

        def __init__(self, *values):
            for k, v in zip(self.__slots__, values):
                setattr(self, k, v)

        def __getitem__(self, i):
            return getattr(self, self.__slots__[i])

        def __setitem__(self, i, v):
            return setattr(self, self.__slots__[i], v)

        def __repr__(self):
            return '{}({})'.format(type(self).__name__, 
                ', '.join(repr(getattr(self, s)) for s in self.__slots__))

    MutableEfficientNamedList.__name__ = name
    return MutableEfficientNamedList

MyList = namedlist('MyList', 'foo', 'bar', 'baz')
nl = MyList(1, 2, 3)
print nl  # MyList(1, 2, 3)
print nl.bar  # 2
print nl[1]  # 2
martineau
  • 119,623
  • 25
  • 170
  • 301
Martijn Pieters
  • 1,048,767
  • 296
  • 4,058
  • 3,343
2

defaultdict feels right here:

from collections import defaultdict

d = defaultdict(lambda: defaultdict(int))

d[3][4] = 10

If you want fixed-sized lists, defaultdict has you covered:

d = defaultdict(lambda: [None]*fixed_small_number)

d[3][4] = 10
# d[3] is now [None, None, None, None, 10, None, None, ...]
nneonneo
  • 171,345
  • 36
  • 312
  • 383
0

Given your example:

for key, k, v in [('a', 1, 2), ('b', 1, 3), ('a', 2, 1), ('a', 3, 1), ('b', 3, 1) ...]:
    my_structre[key][k] = v

The solution would indeed be by using defaultdict.

from collections import defaultdict

d = defaultdict(dict)
for key, k, v in [('a', 1, 2), ('b', 1, 3), ('a', 2, 1), ('a', 3, 1), ('b', 3, 1)]:
    d[key][k] = v

Answer:

{'a': {1: 2, 2: 1, 3: 1}, 'b': {1: 3, 3: 1}}

As a function:

def method(iter_of_3_item_iters):
    d = defaultdict(dict)
    for (a, b, c) in iter_of_3_item_iters:
        d[a][b] = c
    return d
Inbar Rose
  • 41,843
  • 24
  • 85
  • 131