
First question on SO! Bear with me, there is a bit of background context needed.

I started using classes to make a data storage container similar to a struct in MATLAB. Since going open source with Python, I have not completely replaced this approach: it is very useful when combining more than just numeric array data, or when it simply makes more sense to reference things by name instead of by index.

I realized using classes was not the best thing to do for several layers deep (right? mostly just confusing and maybe slow), and I need a generic tree data type. I am working on a project to automate Excel, and data must be stored based on what is in the cell, later operated on, and potentially rewritten back to some areas of the spreadsheet. And who wants to write in VBA when we can use xlwings and openpyxl to leverage other stuff we have written in an adaptable, OS portable language like Python?!

Building upon this post: Looking for a good Python Tree data structure

I liked the expandability of this:

import collections
def Tree():
     return collections.defaultdict(Tree)
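
As a quick illustration of what "arbitrary layers" means here, a minimal usage sketch of that recipe follows. The variable name `cars` and the keys are only examples. It also shows one caveat that is easy to miss: simply reading a missing key creates it.

```python
import collections

def Tree():
    # Every missing key produces another Tree, so nesting depth is unlimited
    return collections.defaultdict(Tree)

cars = Tree()
# Intermediate levels are created automatically on assignment
cars['toyota']['corolla']['grey']['location'] = 'B2'

# Caveat: looking up a missing key (without assigning) also creates it
_ = cars['honda']
print(sorted(cars))  # ['honda', 'toyota']
```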

I can make arbitrary layers of any type. I also wanted to include some functions to manage it, like I had in my class storage containers, as demonstrated in the UserData class at the bottom of this post. There is an example of it being used in a class on that page, which sort of works:

from collections import defaultdict

class Tree(defaultdict):
    def __call__(self):
        return Tree(self)

    def __init__(self, parent):
        self.parent = parent
        self.default_factory = self

So I experimented, worked out kinks, and made this class:

import collections
class TreeClass(collections.defaultdict):

    def __call__(self):
        return TreeClass(self)

    def Tree(self):
        return collections.defaultdict(self.Tree)

    def __init__(self, parent):
        self.parent = parent
        self.default_factory = self
        #self.x = 'xvar'
        #self._locations_idx=[]
        self['someInitKey'] = 'value'

The reason I want the class and the functionality of the data structure operating on itself is to do something like this:

import openpyxl.utils as opx

class UserData():
    '''For more info on decorators:
        https://stackoverflow.com/questions/27571546/dynamically-update-attributes-of-an-object-that-depend-on-the-state-of-other-att
        https://stackoverflow.com/questions/17330160/how-does-the-property-decorator-work
        '''

    def __init__(self):
        self.locations=[]
        self._locations_idx=[]

    @property # Recompute locations_idx from locations on every access
    def locations_idx(self):
        # Build a new list: assigning self.locations directly would alias it
        # and overwrite the original location strings
        self._locations_idx = []
        for current_loc in self.locations:
            # Split e.g. 'B22' into column letters 'B' and row digits '22'
            col = current_loc.rstrip('0123456789')
            row = current_loc[len(col):]
            # column_index_from_string is 1-based ('A' -> 1), so subtract 1
            # from both column and row to get zero-based indices
            self._locations_idx.append((opx.column_index_from_string(col) - 1, int(row) - 1))
        return self._locations_idx

where opx.column_index_from_string is a function from openpyxl that returns the column index for a column letter. Note that it is 1-based ('A' gives 1), so 1 has to be subtracted from both the column and the row to turn a list of 'A1', 'B2' into a list of zero-based (column, row) tuples (0,0), (1,1), etc.

This way, though the class is initialized with an empty locations list, when the list is populated with 'C4', 'B22' etc, the myUserData.locations_idx contains an updated list of these indices which are very useful for the rest of the program when you don't want to reference contents by 'excel location'.
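
To make the conversion concrete without depending on openpyxl, here is a minimal pure-Python sketch of the same idea. The helper name `cell_to_index` is hypothetical (it is not an openpyxl function); it assumes plain references such as 'B22' with no '$' signs or sheet names, and returns zero-based (column, row) tuples.

```python
import re

def cell_to_index(cell):
    """Convert an Excel-style reference such as 'B22' to a zero-based (column, row) tuple."""
    match = re.fullmatch(r'([A-Za-z]+)([0-9]+)', cell)
    if match is None:
        raise ValueError(f'not a cell reference: {cell!r}')
    letters, digits = match.groups()
    col = 0
    for ch in letters.upper():
        # Column letters are a base-26 number with A=1 ... Z=26
        col = col * 26 + (ord(ch) - ord('A') + 1)
    # Shift both values from Excel's 1-based numbering to zero-based
    return (col - 1, int(digits) - 1)

print([cell_to_index(c) for c in ['A1', 'B2', 'C4', 'B22', 'AA10']])
# [(0, 0), (1, 1), (2, 3), (1, 21), (26, 9)]
```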

Now for my actual question:

Say I use this to make a defaultdict with the schema defaultdict[carMake][carModel][carColor][locationInSpreadsheet], like the following:

myUserData['toyota']['corolla']['grey']['location']='B2'

myUserData['chevy']['Volt']['blue']['location']='B6'

and upon adding a 'location' key/value, I would like to dynamically create corresponding:

myUserData['chevy']['Volt']['blue']['location_idx']

which returns (1,5).
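
To pin down the behaviour being described, here is a rough sketch of one way it could work, not a definitive implementation. It assumes a plain dict subclass instead of defaultdict, using `__missing__` for the nesting and `__setitem__` to react to the 'location' key. The names `LocationTree` and `cell_to_index` are hypothetical, and the coordinate helper is a pure-Python stand-in for the openpyxl call.

```python
def cell_to_index(cell):
    """Zero-based (column, row) for a plain reference like 'B6' (hypothetical helper)."""
    letters = cell.rstrip('0123456789')
    col = 0
    for ch in letters.upper():
        col = col * 26 + (ord(ch) - ord('A') + 1)  # base-26 with A=1
    return (col - 1, int(cell[len(letters):]) - 1)

class LocationTree(dict):
    """Nested mapping that keeps 'location_idx' in sync with 'location'."""

    def __missing__(self, key):
        # Called by dict lookup when key is absent: create a nested level
        child = self[key] = LocationTree()
        return child

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        if key == 'location':
            # Derive the index tuple whenever a location is assigned
            super().__setitem__('location_idx', cell_to_index(value))

myUserData = LocationTree()
myUserData['toyota']['corolla']['grey']['location'] = 'B2'
myUserData['chevy']['Volt']['blue']['location'] = 'B6'
print(myUserData['chevy']['Volt']['blue']['location_idx'])  # (1, 5)
```

One limitation of this sketch: `dict.update()` and the `dict` constructor do not go through the overridden `__setitem__`, so a 'location' added that way would not get a matching 'location_idx'.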

The only thing is that I am new to defaultdict, rarely need OOP in Python, and am not sure what to even google (referencing many layers into a tree data structure made from a defaultdict in Python?). Can someone help or give me some pointers? I hate to lean on other people and have made it a long way in Python without needing to ask a question on here, but I've hit my limit when I don't even know what to look for. I hope you can tell what I'm trying to do from the examples, and I think I'm approaching this problem the right way. Any suggestions on post formatting, etiquette, or tags are welcome too. If something is unclear, please let me know, but I have tried to make the examples general and simple enough that someone who works with Python classes and defaultdicts should be able to follow them. I think. Thanks for any help!

jpp
Mike
  • "I realized using classes was not the best thing to do for several layers deep, " um, *why not*? You realize, `defaultdict` *is a class*, right? – juanpa.arrivillaga Mar 01 '18 at 20:14
  • Yes. I mean making a generic class to use as a data container. – Mike Mar 01 '18 at 20:17
  • Yes, that's what classes are *for*. Extending `defaultdict` could be a reasonable choice, but if it is easier, why not just use your own implementation? I think composition makes more sense than inheritance for a tree. A defaultdict is fundamentally a *mapping*. – juanpa.arrivillaga Mar 01 '18 at 20:19
  • I tried using my own implementation and made 3 sub classes but got tripped up in nuances of referencing variables in the highest level from the sublevels. A simple tree data structure seemed to make more sense and is more straightforward to access data in all levels. – Mike Mar 01 '18 at 20:27
  • Why do you need subclasses? Biggest mistake for beginners to OOP: using inheritance everywhere. You maybe need a base-class, and then a single subclass. But really, you probably don't need any subclasses. Also, "but got tripped up in nuances of referencing variables in the highest level from the sublevels." Not sure what that means. What are "levels" here? Levels in the *tree* or in a class hierarchy? If the latter, just don't use a class hierarchy, ditch the inheritance for now. – juanpa.arrivillaga Mar 01 '18 at 20:28
  • yes levels as in levels of subclasses that I tried to arrange like levels in a tree. Maybe later I will try to put instances of classes inside each other instead of making a different class (subclass) for each level. – Mike Mar 01 '18 at 21:20

0 Answers