It hasn't been long since I started learning Python, but I really want to dig into it. And dig hard. So here is a task I've been studying for a while but haven't cracked yet:
I am given a mixed combination of nested dictionaries and lists (let's call it "combination"), and I need to implement a function that allows accessing nested elements as object attributes, while also treating combination elements as iterables. It should look something like this:

combination = {
    'item1': 3.14,
    'item2': 42,
    'items': [
        'text text text',
        {
            'field1': 'a',
            'field2': 'b',
        },
        {
            'field1': 'c',
            'field2': 'd',
        },
    ]
}

def function(combination):
    ...

so that
list(function(combination).items.field1) will give: ['a', 'c'], and
list(function(combination).item1) will give: [3.14].
Edit: As mentioned by @FM, I left out how non-dict elements should be handled: list(function(combination).items[0]) should give ['text text text'].


I tried implementing a class (kudos to Marc) to help me:

class Struct:
    def __init__(self, **entries): 
        self.__dict__.update(entries)

and then using it in the function like return Struct(**combination)
While being very nifty, it is only the first step to the desired result.
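
To make the gap concrete, here is a minimal sketch of where the flat Struct stops. The shortened `combination` and the name `result` are only for illustration; the point is that nested values are stored untouched, so the list under `items` stays a plain list:

```python
class Struct:
    def __init__(self, **entries):
        self.__dict__.update(entries)


# Shortened version of the data above, for illustration only.
combination = {
    'item1': 3.14,
    'items': ['text text text', {'field1': 'a'}, {'field1': 'c'}],
}

result = Struct(**combination)
print(result.item1)                    # 3.14
print(isinstance(result.items, list))  # True: nested data is not converted

try:
    result.items.field1
except AttributeError:
    # Plain lists do not support attribute access to their elements' keys.
    print("no attribute access below the top level")
```
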
But as the next step needs to go deeper, it overwhelms me and I can't manage it on my own.
Therefore, I kindly ask for your help.

Michael.

  • +1 for an interesting question. But your desire to jump directly from a key like `items` to a key like `field1` seems to be in tension with a general approach that preserves all information. Is the latter an important goal for you? More specifically, how do you envision accessing the non-dict elements stored under `items` (`'text text text'`)? – FMc Mar 13 '11 at 18:42
  • @FM _combination_ is not based on a specific real information scheme, so any logical inconsistencies do not mean a lot (if I understood you correctly). Accessing non-dict elements is an important point. I think `result.items` should support slicing and such. I'll edit question. – Michael Spring Mar 13 '11 at 19:04

4 Answers

How about:

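try:
    string_types = basestring  # Python 2: matches both str and unicode
except NameError:
    string_types = str  # Python 3 has no basestring

class ComboParser(object):
    def __init__(self, data):
        self.data = data
    def __getattr__(self, key):
        try:
            # self.data is a dict: look the key up directly
            return ComboParser(self.data[key])
        except TypeError:
            # self.data is a list: gather key from every element that has it
            result = []
            for item in self.data:
                try:
                    if key in item:
                        result.append(item[key])
                except TypeError:
                    # item is a number, or a string that merely contains key
                    pass
            return ComboParser(result)
    def __getitem__(self, key):
        # supports both indexing and slicing of the wrapped data
        return ComboParser(self.data[key])
    def __iter__(self):
        if isinstance(self.data, string_types):
            # a string is yielded whole, not character by character
            yield self.data
        else:
            try:
                # self.data might be a list or tuple
                for item in self.data:
                    yield item
            except TypeError:
                # self.data might be an int or float
                yield self.data
    def __length_hint__(self):
        return len(self.data)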

which yields:

combination = {
    'item1': 3.14,
    'item2': 42,
    'items': [
        'text text text',
        {
            'field1': 'a',
            'field2': 'b',
            },
        {
            'field1': 'c',
            'field2': 'd',
            },
        {
            'field1': 'e',
            'field3': 'f',
            },        
        ]
    }
print(list(ComboParser(combination).item1))
# [3.14]
print(list(ComboParser(combination).items))
# ['text text text', {'field2': 'b', 'field1': 'a'}, {'field2': 'd', 'field1': 'c'}, {'field3': 'f', 'field1': 'e'}]
print(list(ComboParser(combination).items[0]))
# ['text text text']
print(list(ComboParser(combination).items.field1))
# ['a', 'c', 'e']
unutbu
  • +1: Wow, very clever recursive use of Martelli's (and Doug Hudgeon's) `Bunch` collection/container class. Minor cosmetic suggestion though, I'd make `convert()` a `@classmethod` which would require the call to it to become `result=Bunch.convert(combination)` -- to make things just a little more explicit. – martineau Mar 13 '11 at 19:13
  • How can I make `result.items` iterable? I feel that it must be simple, but I'm still not so good at this. – Michael Spring Mar 13 '11 at 19:55
  • Also, the item `'text text text'` becomes unavailable after _merge_. I am sorry that I forgot to mention the importance of non-dict accessibility from the start (edited later). **In general**, do you think it is possible to make _function_ act like a generator? I mean generating those output results on the fly when they are asked for, rather than pre-evaluating them. Would that be _pythonic_? – Michael Spring Mar 13 '11 at 20:36
  • Thank you very much! (Beating recursion like a boss) – Michael Spring Mar 14 '11 at 10:36

For example:

class Struct:
    def __init__(self, **entries):
        for key, value in entries.items():
            value2 = (Struct(**value) if isinstance(value, dict) else value)
            self.__dict__[key] = value2

entries = {
    "a": 1,
    "b": {
        "c": {
            "d": 2
        }
    }
}

obj = Struct(**entries)
print(obj.a) #1
print(obj.b.c.d) #2
tokland

I think there are basically two options you could pursue here:

  1. Make function convert the nested data structure into a series of objects linked together that implement the protocols for supporting list() and dict() (the objects must implement a number of functions, including at least __iter__, __len__, __getitem__, etc). To create the objects, you need to either define classes that implement these behaviors, and assemble them recursively, or create classes on the fly using type().

  2. Make function return an instance of a class that proxies access to the underlying data structure. To implement a class that allows attribute access for names that are not real attributes (i.e. doing function(combination).items), you override __getattr__. A single call to __getattr__ only ever sees one name, not the "full dotted path", so it has to operate recursively and return a new proxy instance at each level of the dotted path. I believe this approach will be simpler than the first.
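
For comparison, the first option might be sketched roughly as follows. This is only an illustration under stated assumptions: the names `wrap`, `DictNode` and `ListNode` are hypothetical, scalars are returned unwrapped (so `root.item1` is a plain float rather than something iterable), and a data key called `keys` would be shadowed by the `keys()` method.

```python
def wrap(value):
    """Wrap dicts and lists in node objects; leave everything else as-is."""
    if isinstance(value, dict):
        return DictNode(value)
    if isinstance(value, (list, tuple)):
        return ListNode(value)
    return value


class DictNode(object):
    """Hypothetical wrapper exposing dict keys as attributes."""

    def __init__(self, mapping):
        self._fields = {key: wrap(val) for key, val in mapping.items()}

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        if name.startswith('_'):
            raise AttributeError(name)
        try:
            return self._fields[name]
        except KeyError:
            raise AttributeError(name)

    def __getitem__(self, key):
        return self._fields[key]

    def __iter__(self):
        return iter(self._fields)

    def __len__(self):
        return len(self._fields)

    def keys(self):
        # Together with __getitem__, this lets dict(node) work.
        return self._fields.keys()


class ListNode(object):
    """Hypothetical wrapper supporting len(), iteration, indexing, slicing."""

    def __init__(self, items):
        self._items = [wrap(item) for item in items]

    def __getattr__(self, name):
        if name.startswith('_'):
            raise AttributeError(name)
        # Fan out: collect the value from every DictNode that has this key.
        return ListNode([item[name] for item in self._items
                         if isinstance(item, DictNode) and name in item])

    def __getitem__(self, index):
        result = self._items[index]
        # A slice produces a plain list, so wrap it again.
        return ListNode(result) if isinstance(index, slice) else result

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)


combination = {
    'item1': 3.14,
    'item2': 42,
    'items': [
        'text text text',
        {'field1': 'a', 'field2': 'b'},
        {'field1': 'c', 'field2': 'd'},
    ],
}

root = wrap(combination)
print(list(root.items.field1))  # ['a', 'c']
print(root.items[0])            # text text text
print(len(root.items))          # 3
```

Unlike the proxy approach, this converts the whole structure up front, which costs more at construction time but makes each later access a simple lookup.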

dcrosta

What you probably need to do then is look at each item that you assign to your object's __dict__ to see if it is itself a dict or iterable.

class Struct:
    def __init__(self, **entries):
        for k, v in entries.items():
            # recurse into nested dicts so they become Struct objects too
            if isinstance(v, dict):
                v = Struct(**v)
            setattr(self, k, v)

so that you're using a recursive arrangement. Looks something like this:

>>> b = Struct(a=1, b={'a':1})
>>> b.b.a
1
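
The "or iterable" part is not covered by the class above. One possible way to extend it to lists is sketched below; the helper name `convert` is hypothetical, and the sketch assumes all dict keys are strings (otherwise `Struct(**value)` fails):

```python
class Struct(object):
    def __init__(self, **entries):
        for key, value in entries.items():
            setattr(self, key, convert(value))


def convert(value):
    """Turn dicts into Struct objects, descending into lists and tuples."""
    if isinstance(value, dict):
        return Struct(**value)
    if isinstance(value, (list, tuple)):
        # Lists stay plain lists, but their dict elements are converted.
        return [convert(item) for item in value]
    return value


obj = Struct(a=1, b=[{'c': 2}, 'text', {'c': 3}])
print(obj.b[0].c)  # 2
print(obj.b[1])    # text
```

Because lists remain plain lists here, indexing and slicing work as usual, but jumping straight from the list to a key (as in `obj.b.c`) does not.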
wheaties