I am curious about the memory consumption and performance of Python when storing whole objects as attributes versus storing only selected attributes of those objects.

Suppose I have classes called OtherClass, ClassA, ClassB, and ClassC, where OtherClass needs access to a limited set of attributes of ClassA-C. Assuming ClassA-C are large classes with many attributes, methods, and properties, which of these scenarios is more efficient?

Option 1:

class OtherClass(object):
    def __init__(self, classa, classb, classc):
        self.classa = classa
        self.classb = classb
        self.classc = classc

Option 2:

class OtherClass(object):
    def __init__(self, classa_id, classa_atr1, classa_atr2,
                 classb_id, classb_atr1, classb_atr2,
                 classc_id, classc_atr1, classc_atr2):
        self.classa_id = classa_id
        self.classb_id = classb_id
        self.classc_id = classc_id
        self.classa_atr1 = classa_atr1
        self.classb_atr1 = classb_atr1
        self.classc_atr1 = classc_atr1
        self.classa_atr2 = classa_atr2
        self.classb_atr2 = classb_atr2
        self.classc_atr2 = classc_atr2

I imagine option 1 is better, since the 3 attributes will simply link to the class instances already existing in memory, whereas option 2 adds 6 additional attributes per instance. Is this correct?

Arctelix
  • What do you mean by "efficient"? Are you asking about relative memory consumption, execution speed, or something else? – spirulence Jan 28 '15 at 18:46
  • Taking both into consideration, along with other factors not considered here, what would be the better choice? I know it's a hard question to generalize; perhaps a pro / con list for each option if there is no clear answer? – Arctelix Jan 28 '15 at 19:02

1 Answer

TL;DR

My answer is that you should prefer option 1 for its simplicity and better OOP design, and avoid premature optimization.

The Rest

I think the efficiency question here is dwarfed by how difficult it will be in the future to maintain your second option. If one object needs to use attributes of another object (your example code uses a form of composition), then it should have those objects as attributes, rather than creating extra references directly to the object attributes it needs. Your first option is the way to go. The first option supports encapsulation; option 2 very clearly violates it. (Granted, encapsulation isn't as strongly enforced in Python as in some languages, like Java, but it's still a good principle to follow.)
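One concrete maintenance hazard can be sketched as follows. This is a minimal illustration, not code from the question: the ClassA stand-in and the ByReference / ByCopy names are hypothetical. It shows that storing the object keeps only a reference (nothing is copied), while storing individual attributes can leave a stale value behind if the source object later rebinds that attribute.

```python
class ClassA(object):
    """Hypothetical stand-in for one of the large classes."""
    def __init__(self, id_, atr1):
        self.id = id_
        self.atr1 = atr1


class ByReference(object):
    """Option 1: keep a reference to the whole object."""
    def __init__(self, classa):
        self.classa = classa


class ByCopy(object):
    """Option 2: keep separate references to selected attributes."""
    def __init__(self, classa_id, classa_atr1):
        self.classa_id = classa_id
        self.classa_atr1 = classa_atr1


a = ClassA(1, "old")
by_ref = ByReference(a)
by_copy = ByCopy(a.id, a.atr1)

a.atr1 = "new"  # ClassA rebinds its attribute later on

print(by_ref.classa is a)    # True: only a reference was stored, no copy
print(by_ref.classa.atr1)    # new
print(by_copy.classa_atr1)   # old
```

Whether the stale value is a bug or a deliberate snapshot depends on the use case, but with option 2 that decision is scattered across every attribute you pass in.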

The only efficiency-related reason you should prefer number two is if you find your code is slow, you profile, and your profiling shows that these extra lookups are indeed your bottleneck. Then you could consider sacrificing things like ease of maintenance for the speed you need. It is possible that the extra layer of references (foo = self.classa.bar() vs. foo = self.bar()) could slow things down if you're using them in tight loops, but it's not likely.
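If you do want to measure the cost of the extra lookup before deciding, a rough sketch with the standard timeit module might look like the following. The Nested and Flat classes and the atr1 attribute are hypothetical stand-ins for the two options, and the numbers will vary by machine and interpreter, so treat the output as indicative only.

```python
import timeit


class ClassA(object):
    def __init__(self):
        self.atr1 = 42


class Nested(object):
    """Option 1: reach the value through the stored object."""
    def __init__(self, classa):
        self.classa = classa

    def read(self):
        return self.classa.atr1   # two attribute lookups


class Flat(object):
    """Option 2: the value is stored directly on the instance."""
    def __init__(self, classa_atr1):
        self.classa_atr1 = classa_atr1

    def read(self):
        return self.classa_atr1   # one attribute lookup


a = ClassA()
nested = Nested(a)
flat = Flat(a.atr1)

# Time 100,000 calls of each accessor
nested_time = timeit.timeit(nested.read, number=100000)
flat_time = timeit.timeit(flat.read, number=100000)
print("nested: %.4fs  flat: %.4fs" % (nested_time, flat_time))
```

A micro-benchmark like this only tells you the per-lookup cost; profiling the real program is still what shows whether that cost matters.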

In fact, I would go one step further and say you should modify your code so that OtherClass actually instantiates the objects it needs, rather than having them passed in. With Option 1, if I want to use OtherClass, I have to do this:

classa_obj = ClassA(classa_init_args)
classb_obj = ClassB(classb_init_args)
classc_obj = ClassC(classc_init_args)

otherclass_obj = OtherClass(classa_obj, classb_obj, classc_obj)

That's too much setup required just to instantiate OtherClass. Instead, change OtherClass to this:

class OtherClass(object):
    def __init__(self, classa_init_args, classb_init_args, classc_init_args):
        self.classa = ClassA(classa_init_args)
        self.classb = ClassB(classb_init_args)
        self.classc = ClassC(classc_init_args)

Now instantiating an OtherClass object is simply this:

otherclass_obj = OtherClass(classa_init_args, classb_init_args, classc_init_args)

If possible, you may also be able to restructure your classes so that you don't have to instantiate the other classes at all. Have a look at class attributes and the classmethod decorator. They allow you to do things like this:

class foo(object):

    bar = 2

    @classmethod
    def frobble(cls):
        return "I didn't even have to be instantiated!"

print(foo.bar)
print(foo.frobble())

This code prints this:

2
I didn't even have to be instantiated!

If your OtherClass uses attributes or methods of classa, classb, and classc that don't need to be tied to an instance of those classes, consider using them directly via class methods and attributes instead of instantiating the objects. That would actually save you the most memory by avoiding the creation of entire objects.
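Applied to the classes in the question, that idea might look roughly like this. It is only a sketch: default_rate and describe are hypothetical names, and it assumes the data OtherClass needs really is the same for every ClassA instance.

```python
class ClassA(object):
    # Hypothetical class-level data shared by every instance
    default_rate = 0.25

    @classmethod
    def describe(cls):
        return "rate=%s" % cls.default_rate


class OtherClass(object):
    def summary(self):
        # Goes through the class itself; no ClassA instance is created
        return ClassA.describe()


other = OtherClass()
print(other.summary())   # rate=0.25
print(vars(other))       # {} : the instance stores nothing about ClassA
```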

skrrgwasme
  • Thank you for your clear answer; it certainly sounds reasonable. I'm going to give it a little time before accepting. Is my assumption at the end of the question correct? – Arctelix Jan 28 '15 at 19:06
  • Your assumption is correct. Option 2 does create more references, but they also point *to the same underlying memory*, just like option 1, so you're only consuming the memory of 6 extra references, which is trivial on modern systems unless you're creating a very large number of these objects. So option 1 is indeed *slightly* more memory efficient, and option 2 will be *slightly* faster because there is one less lookup. But both differences are trivially small. Given all the tricks CPUs, caches, and the interpreter can play with optimizations, you probably won't notice a difference in either case. – skrrgwasme Jan 28 '15 at 19:32
  • Thank you for that, and for the extended answer. Just an FYI, to see if anything changes: OtherClass is a way of persisting temporary data as it progresses through an extremely complex process of events. ClassA-C are already instantiated by the process. As OtherClass gets passed from method to method, all the temporary data is saved, and at the end of the cycle OtherClass is returned as the result. I was thinking, why carry a whole instantiated class when all I need is the instance_id and one property from ClassA-C... – Arctelix Jan 28 '15 at 21:38
  • It seems to me that if you're passing an object from one method to the next, perhaps your class structure could be improved? It seems like you're mixing procedural and OOP approaches, and perhaps the methods that accept OtherClass as an argument should actually be methods of the OtherClass itself. This may not be the case (I'm doing the same thing in a current project and feel it's justified), but I would suggest getting your code reviewed by someone you trust, just as a sanity check. He/she may be able to suggest a new class hierarchy that avoids the issue entirely. – skrrgwasme Jan 28 '15 at 21:54
  • Yes, this structure has been a challenge. However, I am fairly certain this was the best possible approach. The temporary data must be temporary, and OtherClass needs to capture it. Resetting the state of the MasterClass and ClassA-C after each call to the API is far more complicated and would require too many callbacks and teardown procedures to justify it. Your thoughtfulness, insight, and suggestions are greatly appreciated! – Arctelix Jan 28 '15 at 22:08