0

I have a class that inherits the dict object.

my_subclassed_dict = SubclassedDictionary({
        "id": {"value1": 144
               "value2": "steve",
               "more" {"id": 114}
        },
        "attributes": "random"
})

On initialization of SubclassedDictionary, I would like paths generated which match a certain condition.

Hypothetically, if I was to make this condition, 'index all numbers above 100' This could perhaps then access my_subclassed_dict.get_paths(), which would then return some kind of structure resembling this:

[
    ['id', 'value1'], 
    ['id', 'more', 'id',]
]

In short, how can I subclass dict which generates paths for keys matching a certain condition, on instantiation?

EDIT

Since someone asked for an example implementation. However the problem with this is that it doesn't handle nested dictionaries.

class SubclassedDictionary(dict):
    paths = []

    def __init__(self, *args, **kwargs):
        self.update(*args, **kwargs)  # use the free update to set keys

    def update(self, *args, **kwargs):
        temp = args[0]
        for key, value in temp.items():
            if isinstance(value, int):
                if value > 100:
                    self.paths.append(key)
        super(SubclassedDictionary, self).update(*args, **kwargs)

dictionary = {
   "value1": 333,
   "v2": 99,
   "v2": 129,
   "v3": 30,
   "nested": {
      "nested_value" 1000
   }
}

new_dict = SubclassedDictionary(dictionary)

print(new_dict.paths) # outputs: ['v2','value1']

If it did work as intended.

print(new_dict.paths) 

Would output

[
   ['v2'],
   ['value1'],
   ['nested', 'nested_value']
]
  • How exactly would this be used, if you had it? Can you show an example? What have you tried so far, and what exactly is the problem with it? – jonrsharpe May 27 '15 at 16:28
  • What's wrong with the example I've given? –  May 27 '15 at 16:31
  • Well it's not actually code. Would the input literally be `'index all numbers about 100'`? Where would it be set? Without your code so far, it's not clear which step you're stuck on. – jonrsharpe May 27 '15 at 16:33
  • Hey added an example implementation (that doesn't work). –  May 27 '15 at 16:55
  • So the issue is handling nesting? Why not make the nested dictionaries `SubclassedDictionary` instances too, and make `paths` a method (or property) rather than a standard attribute, that includes aggregating over the nested objects? – jonrsharpe May 27 '15 at 16:56
  • That's kind of my question, I don't know how to do that in python. –  May 27 '15 at 17:01
  • Which part? You clearly know how to create instances and methods, so if it's properties you're stuck on search specifically for that. Also, note that you're in danger of falling foul of this: http://stackoverflow.com/q/1680528/3001761 – jonrsharpe May 27 '15 at 17:02
  • I'd like to find a solution to **this** problem. I've run into several smaller problems which I don't know how to get around. I can't continue this back and forth because this isn't a forum. Perhaps you can try answering. –  May 27 '15 at 17:06
  • As well as not being a forum, this isn't a tutorial or code-writing service. *"How do I do..."* questions aren't a good fit for SO; if you have *specific errors* ask *specific questions*. That said, I'd recommend you look into your `update` implementation: you already handle the case where `isinstance(value, int)`, perhaps you could do something recursivelhere in the case `isinstance(value, dict)`? – jonrsharpe May 27 '15 at 17:10
  • "How do I..." questions are a fine fit for StackOverflow, I'm not asking "How do I create an iPhone app", I'm asking "how do I create a specific data structure that I can do X with". Forgive my attitude when it comes to this, but would people stop criticizing questions because the question is more abstract than, 'this not working how do I fix'. –  May 27 '15 at 17:22
  • What makes you think the solution isn't immediately obvious? I've given you several hints towards it. I don't have a problem with questions that aren't just *"this not working"*, but it's nice to see a bit more effort and thinking on the OP's part. – jonrsharpe May 27 '15 at 17:24

2 Answers2

1

From what I understand, you want a dictionary that is capable of returning the keys of dictionaries within dictionaries if the value the key's are associated with match a certain condition.

class SubclassedDictionary(dict):
    def __init__(self, new_dict, condition=None, *args, **kwargs):
        super(SubclassedDictionary, self).__init__(new_dict, *args, **kwargs)
        self.paths = []
        self.get_paths(condition)

    def _get_paths_recursive(self, condition, iterable, parent_path=[]):
        path = []
        for key, value in iterable.iteritems():
            # If we find an iterable, recursively obtain new paths.
            if isinstance(value, (dict, list, set, tuple)):
                # Make sure to remember where we have been (parent_path + [key])
                recursed_path = self._get_paths_recursive(condition, value, parent_path + [key])
                if recursed_path:
                    self.paths.append(parent_path + recursed_path)
            elif condition(value) is True:
                self.paths.append(parent_path + [key])

    def get_paths(self, condition=None):
        # Condition MUST be a function that returns a bool!
        self.paths = []
        if condition is not None:
            return self._get_paths_recursive(condition, self)

def my_condition(value):
    try:
        return int(value) > 100
    except ValueError:
        return False



my_dict = SubclassedDictionary({"id": {"value1": 144,
                                       "value2": "steve",
                                       "more": {"id": 114}},
                                "attributes": "random"},
                               condition=my_condition)

print my_dict.paths  # Returns [['id', 'value1'], ['id', 'more', 'id']]

There's a few benefits to this implementation. One is that you can change your condition whenever you want. In your question it sounded like this may be a feature that you were interested in. If you want a different condition you can easily write a new function and pass it into the constructor of the class, or simply call get_paths() with your new condition.

When developing a recursive algorithm there are 3 things you should take into account.

1) What is my stopping condition? In this case your literal condition is not actually your stopping condition. The recursion stops when there are no longer elements to iterate through.

2) Create a non-recursive function This is important for two reasons (I'll get to the second later). The first reason is that it is a safe way to encapsulate functionality that you don't want consumers to use. In this case the _get_paths_recursive() has extra parameters that if a consumer got a hold of could ruin your paths attribute.

3) Do as much error handling before recursion (Second reason behind two functions) The other benefit to a second function is that you can do non-recursive operations. More often than not, when you are writing a recursive algorithm you are going to have to do something before you start recursing. In this case I make sure the condition parameter is valid (I could add more checking to make sure its a function that returns a bool, and accepts one parameter). I also reset the paths attribute so you don't end up with a crazy amount of paths if get_paths() is called more than once.

Righteous Mullet
  • 174
  • 1
  • 13
  • Thanks, recursion was one of the problems that was getting in my way. Before I accept, I'm going to look into `collections.MutableMaping` because I've heard that's a little cleaner than subclassing `dict`. –  May 27 '15 at 20:17
  • Sure. If you are interested in writing your data structure from scratch, then an ABC (Abstract Base Class) is fine. In the majority of Python cases it is common to subclass `dict` because it gives you a few things for free. Depending on what you are going to use this dictionary for I would recommend subclassing `dict`. Read: http://stackoverflow.com/a/7148602/3221505 for more information. – Righteous Mullet May 27 '15 at 21:40
0

The minimal change is something like:

class SubclassedDictionary(dict):

    def __init__(self, *args, **kwargs):
        self.paths = []  # note instance, not class, attribute
        self.update(*args, **kwargs)  # use the free update to set keys

    def update(self, *args, **kwargs):
        temp = args[0]
        for key, value in temp.items():
            if isinstance(value, int):
                if value > 100:
                    self.paths.append([key])  # note adding a list to the list
            # recursively handle nested dictionaries
            elif isinstance(value, dict):
                for path in SubclassedDictionary(value).paths:
                    self.paths.append([key]+path)
        super(SubclassedDictionary, self).update(*args, **kwargs)

Which gives the output you're looking for:

>>> SubclassedDictionary(dictionary).paths
[['v2'], ['value1'], ['nested', 'nested_value']]

However, a neater method might be to make paths a method, and create nested SubclassedDictionary instances instead of dictionaries, which also allows you to specify the rule when calling rather than hard-coding it. For example:

class SubclassedDictionary(dict):

    def __init__(self, *args, **kwargs):
        self.update(*args, **kwargs)  # use the free update to set keys

    def update(self, *args, **kwargs):
        temp = args[0]
        for key, value in temp.items():
            if isinstance(value, dict):
                temp[key] = SubclassedDictionary(value)
        super(SubclassedDictionary, self).update(*args, **kwargs)

    def paths(self, rule):
        matching_paths = []
        for key, value in self.items():
            if isinstance(value, SubclassedDictionary):
                for path in value.paths(rule):
                    matching_paths.append([key]+path)
            elif rule(value):
                matching_paths.append([key])
        return matching_paths

In use, to get the paths of all integers larger than 100:

>>> SubclassedDictionary(dictionary).paths(lambda val: isinstance(val, int) and val > 100)
[['v2'], ['value1'], ['nested', 'nested_value']]

One downside is that this recreates the path list every time it's called.


It's worth noting that you don't currently handle the kwargs correctly (so neither does my code!); have a look at e.g. Overriding dict.update() method in subclass to prevent overwriting dict keys where I've provided an answer that shows how to implement an interface that matches the basic dict's. Another issue that your current code has is that it doesn't deal with keys subsequently being removed from the dictionary; my first snippet doesn't either, but as the second rebuilds the path list each time it's not a problem there.

Community
  • 1
  • 1
jonrsharpe
  • 115,751
  • 26
  • 228
  • 437