0

I have a tricky python structure issue that I can't quite wrap my head around.

Reading in large files is being pulled out of our application code and into its own package. It used to be that one class, lets call it Object, would be used to access file properties, and be passed around our Application. So Application package would call Object.datetime, but now would have to call Object.file_handler.datetime. So I have a few options:

  1. For every property in FileHandler, create a property in Object that just references the file_handler object. So in Object do:

    @property
    def datetime(self):
        return self.file_handler.datetime
    

    This looks really gross because there are dozens of properties/functions, and it's super tight coupling. This is our current solution.

  2. Find a way to copy those functions/properties automatically. Something like:

    for property in self.file_handler:
        create self.property with value self.file_handler.property
    

    Due to the size of files this MUST be a reference and not a memory copy.

  3. Go through the application and replace every call to self.object.property with self.object.file_handler.property. Since Application is production level code, this is a big ask. It could break other things that we haven't touched in a while (unit tests exist but they're not great).

  4. Restructure everything. FileHandler must be in a different package, and I don't want to copy/paste the functionality into Application. I can't have Object extend FileHandler because Object represents FileHandler at one instance in time (it makes my head hurt; basically Object is an Application instance and there are multiple Applications run per FileHandler object; FileHandler objects are passed to Application through an iterator). I need Object because FileHandler can't implement Application features - they should have been two classes in the first place. If someone is interested in this let me know and I can provide more structural details, but I'm leaning towards a hacky one-off solution.

This is a software design problem that I can't seem to crack. I recognize that our code is fragile and not well designed/tested, but it is what it is (most of it predates me, and this isn't a Software Engineering team - mostly math/physics people). Does anyone have guidance? Option 2 is my favorite but I can't find code to make it work.

wjandrea
  • 28,235
  • 9
  • 60
  • 81
jay kus
  • 1
  • 1
  • 2
    ". Due to the size of files this MUST be a reference and not a memory copy." python never implicitly creates copies, everything has reference semantics, and you have to explicitly copy objects to actually create a new object. In any case, you can trivially proxy attribute access to your `file_handle` object using `__getattr__`, – juanpa.arrivillaga Jan 17 '21 at 17:32
  • 1
    I disagree with the closing of this question, it isn't multiple questions, the OP is proposing different *solutions* to their problem. – juanpa.arrivillaga Jan 17 '21 at 17:37
  • For what it's worth, option 2 is possible, but the only way I can think to do it is so kludgy that it probably wouldn't be worth it. You'd have to get all the attribute names of `self.file_handler` then filter them to exclude magic attributes like `__name__`, `__doc__`, `__class__`, etc. I like Juanpa's suggestion better. – wjandrea Jan 17 '21 at 18:54
  • Hey, your __getattr__ comment helped me google the right things and I found this: https://stackoverflow.com/questions/11637293/iterate-over-object-attributes-in-python – jay kus Jan 18 '21 at 22:37
  • I can't edit my comment above.... not used to StackOverflow juanpa and wjandrea are right, above is simple code for getting all functions/properties. The tricky part would be creating properties for Object on the fly; basically doing solution 3 but in an automated fashion. That would be the ideal. But I don't think it's possible.... I'll keep digging around. Thanks everyone! – jay kus Jan 18 '21 at 22:45
  • @jaykus Oh, I think you misunderstood juanpa's comment. I'll write you an answer in a little while, if juanpa doesn't get there first :) – wjandrea Jan 18 '21 at 22:51

1 Answers1

0

You can define Object.__getattr__() to delegate attribute access to self.file_handler, for attributes not found on self.

For example:

class FileHandler:
    def __init__(self, filename):
        self.filename = filename

class Object:
    def __init__(self, file_handler):
        self.file_handler = file_handler
        self.x = 'foobar'

    def __getattr__(self, name):
        return getattr(self.file_handler, name)

f = FileHandler('some_file')
obj = Object(f)
print(obj.x)  # -> foobar
print(obj.filename)  # -> some_file
print(obj.y)  # -> AttributeError: 'FileHandler' object has no attribute 'y'

For more details, see the documentation link above.

Thanks to juanpa.arrivillaga for suggesting this in a comment.


I might also add some handling to make the exception message clearer, like this:

    def __getattr__(self, name):
        try:
            return getattr(self.file_handler, name)
        except AttributeError:
            # Provide a clearer exception message.
            cls = type(self).__name__
            fh = type(self.file_handler).__name__
            msg = f'Both {cls!r} object and {fh!r} object have no attribute {name!r}'
            raise AttributeError(msg) from None

...

print(obj.y)
# AttributeError: Both 'Object' object and 'FileHandler' object have no attribute 'y'
wjandrea
  • 28,235
  • 9
  • 60
  • 81