
Is there a convention for when and how to store the values of len() or sum() in Python? As an example, suppose you have a class:

class MyClass:

    def __init__(self, single_number = 4, multiple_numbers = [1,2,3]):
        self.single= single_number 
        self.multiple = multiple_numbers

    def info(self):
        print(f"The length of multiple is {len(self.multiple)}")
        print(f"The length of multiple is {len(self.multiple)*4}")
        print(f"The length of multiple is longer than {len(self.multiple)-1}")

if __name__ == "__main__":
    test=MyClass()
    test.info()
    # other stuff
    test.info()

At what point would you start storing len(self.multiple) as its own value? Thankfully, Python makes len unnecessary for some tasks, such as for my_numbers in multiple_numbers:, so I wouldn't need it just for iteration. In addition, the value of len is static for the instance of the class and will (probably) be needed multiple times at different points during runtime, so it is not a temporary variable as it is here. In general, this seems to be a tradeoff between (a very small amount of) memory and computation. The same issue applies to sum().

Parts of these questions are opinion-based, and I am happy to hear what you think about it, but I am looking primarily for a convention on this.

  1. At what point, if any, should len(self.multiple) be stored as its own value?
  2. Is there a convention for the name? length_of_multiple_numbers seems bloated but would be descriptive.
Finn
  • 2
    How often does your `multiple` change size? Every time it does you *must* update `self.len_multiple`. – Jongware Mar 16 '20 at 14:56
  • 3
    If the value is static and you would otherwise have to call `len` on it often, store the value. If the value is not static, you might consider storing the value but updating it on any update to the underlying value, but that depends on how often you think `len` would be called between updates. – chepner Mar 16 '20 at 14:56
  • 6
    Only if a) storing it as a variable makes the code easier to read, b) you need to do so because the logic of the program depends on it, or c) you benchmark and determine that it is a bottleneck. – 0x5453 Mar 16 '20 at 14:56
  • @chepner. Given that nothing in Python is private, it's generally difficult to ensure staticness without some hoops. And how much does `len` take to compute anyway? – Mad Physicist Mar 16 '20 at 14:59
  • 4
    `len()` is an O(1) operation on the majority of container types - there's no advantage in storing it. `sum()` is O(n), so it would be a good idea to store this if you're likely to need it more than once. – jasonharper Mar 16 '20 at 14:59
  • @MadPhysicist Not much, though the overhead of the function call itself is something to consider. I used `len` in my comment, but the same points apply to any function that would take the container as an argument. – chepner Mar 16 '20 at 15:01
  • @jasonharper The advantage is readability. In `self.info`, they could store the length as `N`, and then the prints become clearer because you can tell the three lines print `{N}`, `{N*4}`, and `{N-1}` respectively. – Guimoute Mar 16 '20 at 15:03
  • 1
    @Guimoute in that case, local variable is more useful than storing it in the class (because that's what everyone assumes OP means by "storing as its own value") – h4z3 Mar 16 '20 at 15:06
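
A minimal sketch of the cost difference raised in the comments above: `len()` on a list reads a stored size, while `sum()` has to visit every element. `CountingList` and its `items_visited` counter are hypothetical names introduced only for this illustration; the counter is incremented whenever the list is iterated.

```python
class CountingList(list):
    """Hypothetical list subclass that counts elements handed out by iteration."""

    def __init__(self, *args):
        super().__init__(*args)
        self.items_visited = 0

    def __iter__(self):
        # Count every element that an iterating consumer (such as sum) reads.
        for item in super().__iter__():
            self.items_visited += 1
            yield item


numbers = CountingList(range(1000))

length = len(numbers)                    # reads the stored size, no iteration
visited_by_len = numbers.items_visited

total = sum(numbers)                     # walks the whole list
visited_by_sum = numbers.items_visited
```

Under these assumptions, `visited_by_len` stays at zero and `visited_by_sum` equals the number of elements, which is why repeating `len()` is cheap while repeating `sum()` on a large list may be worth avoiding.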

2 Answers

5

I would use a local variable, more for code readability than speed:

def info(self):
    n = len(self.multiple)
    print(f"The length of multiple is {n}")
    print(f"The length of multiple is {n*4}")
    print(f"The length of multiple is longer than {n-1}")

Local variable names can be short, since the assignment is on the same screen as the use. I use my own conventions, but they generally follow common informal conventions.

I wouldn't try to assign len(...) to a self attribute, much less a global.

Basically any value that's used repeatedly in a function/method is a candidate for local variable assignment.
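
The same idea carries over to sum(), which the question also asks about. This is only a sketch: the class name `Numbers` and the method `stats` are made up for illustration, and the point is simply that the O(n) sum is computed once per call and reused through a local variable.

```python
class Numbers:
    """Hypothetical container used only to illustrate local-variable reuse."""

    def __init__(self, multiple_numbers):
        self.multiple = list(multiple_numbers)  # private copy of the input

    def stats(self):
        total = sum(self.multiple)   # computed once, reused below
        n = len(self.multiple)
        mean = total / n if n else 0.0
        return {"sum": total, "double": total * 2, "mean": mean}


result = Numbers([1, 2, 3]).stats()
```

Because `total` and `n` live only inside `stats`, they cannot go stale if `self.multiple` changes between calls.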

hpaulj
0

I am not convinced that there is much to justify storage, unless the computation cost is heavy each time. See hpaulj's answer.

However, if you really really wanted to, you could use a property and even possibly cache it.

class MyList(list):

    @property
    def len_(self):
        return len(self)  # it's a list

or

class MyList(list):

    _len_ = None  # cached length; None means "not computed yet"

    @property
    def len_(self):
        if self._len_ is None:
            self._len_ = len(self)
        return self._len_

    def append(self, value):
        self._len_ = None  # invalidate the cache before the length changes
        super().append(value)

    # ...and all other len-modifying methods also need to clear the cache.

Again, if you cache it, you need to make sure you reset the cache each time the result should change. That is also the weak point of your idea of storing it in an instance variable: the additional complexity of making sure you don't have outdated data should probably only be accepted once you've profiled and confirmed that this is indeed a performance bottleneck.
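
To make the stale-data risk concrete, here is a rough sketch with two hypothetical classes (`StoredLength` and `CachedTotal` are names invented for this example). The first stores len() once and silently goes out of date; the second caches sum() but only stays correct because every mutation goes through a method that clears the cache.

```python
class StoredLength:
    """Stores len() once in __init__; nothing keeps it in sync afterwards."""

    def __init__(self, numbers):
        self.multiple = list(numbers)
        self.length = len(self.multiple)


stale = StoredLength([1, 2, 3])
stale.multiple.append(4)  # the stored length is now out of date


class CachedTotal:
    """Caches sum() and invalidates the cache on every supported mutation."""

    def __init__(self, numbers):
        self._numbers = list(numbers)
        self._total = None  # None means "not computed yet"

    def add(self, value):
        self._numbers.append(value)
        self._total = None  # reset the cache because the result changes

    @property
    def total(self):
        if self._total is None:
            self._total = sum(self._numbers)
        return self._total


cached = CachedTotal([1, 2, 3])
first_total = cached.total
cached.add(4)
second_total = cached.total
```

Even the second version relies on callers not touching `_numbers` directly, which Python does not enforce.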

(These issues are not helped by using a mutable default argument for multiple_numbers in your example, btw.) More generally, if your sum/len depends on the state of mutable items, then storing/caching the computation is an even worse idea. That is, if MyList refers to objects that themselves have a len/sum which needs to be aggregated, then MyList has no business whatsoever caching/storing it.
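
A short sketch of why the mutable default matters here. `SharedDefault` mirrors the signature from the question, and `FreshDefault` shows the usual `None` sentinel workaround; both class names are made up for this illustration.

```python
class SharedDefault:
    # The default list is created once, when the function is defined,
    # and is shared by every instance that relies on it.
    def __init__(self, multiple_numbers=[1, 2, 3]):
        self.multiple = multiple_numbers


a = SharedDefault()
b = SharedDefault()
a.multiple.append(4)  # also changes b.multiple: it is the same list object


class FreshDefault:
    # None as a sentinel: each instance builds its own list.
    def __init__(self, multiple_numbers=None):
        if multiple_numbers is None:
            multiple_numbers = [1, 2, 3]
        self.multiple = list(multiple_numbers)


c = FreshDefault()
d = FreshDefault()
c.multiple.append(4)  # d.multiple is unaffected
```

With the shared default, a length stored on `b` would be wrong after `a` is modified, even though `b` was never touched.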

Naming-wise, I'd probably go for the semi-conventional way of avoiding shadowing built-ins/conventional names, i.e. appending a _: cls -> cls_, list -> list_.

JL Peyret
  • Your syntax is strange. Decorator has extra text, or in the middle of the class body, ... – Mad Physicist Mar 16 '20 at 18:05
  • Good catch. I "freehand-ed" it on SO rather than editing a .py file and pasting in the code. You know, the really strange thing is I had to delete it several times before the changes took. – JL Peyret Mar 16 '20 at 20:21