0

It appears to me that each instance of a class has its own dictionary. This could waste a lot of space when there is a large number of identically structured objects. Is this actually the case, or is the underlying mechanism more efficient, only creating an object's dictionary when it is explicitly asked for? I am considering an application where I may have a very large number of objects, possibly millions. Should I avoid using a class and instead use a sequence with a named constant as the index?

Chris Barry
  • Millions is not a very large number, but you should check out `__slots__`. For example here: http://stackoverflow.com/questions/472000/usage-of-slots – Paul Hankin Feb 18 '17 at 14:05
  • @Paul Hankin That should have been an answer, then I could have upvoted it. It is precisely the answer I was looking for. – Chris Barry Feb 18 '17 at 14:16
  • It has the additional benefit that only elements named in the `__slots__` variable can be accessed, so typing errors are detected sooner. This really should be much more prominent in the Python documentation. – Chris Barry Feb 18 '17 at 14:33

2 Answers

1

If you want to reduce the overhead, you have two options, depending on what you actually need.

If you need a class-like structure, consider using `__slots__`. This avoids the per-instance `__dict__` but still lets you have methods, properties, and so on. You lose the ability to add attributes dynamically: instances are restricted to the names listed in `__slots__`.

If you just want storage for a fixed set of fields and don't need methods or similar, you can use `collections.namedtuple`. It gives an immutable, class-like interface to its items (access by attribute name) and, being a tuple subclass, should be pretty space-efficient.

For example, a class that just has two attributes, `lastname` and `firstname`, could be implemented as:

class Person(object):
    __slots__ = ['firstname', 'lastname']

    def __init__(self, firstname, lastname):
        self.firstname = firstname
        self.lastname = lastname

    def __repr__(self):
        return '{self.__class__.__name__}({self.firstname!r}, {self.lastname!r})'.format(self=self)

>>> p = Person('Tom', 'Riddle')
>>> p
Person('Tom', 'Riddle')
>>> p.firstname
'Tom'

or as a namedtuple:

>>> from collections import namedtuple

>>> Person = namedtuple('Person', 'firstname, lastname')

>>> p = Person('Tom', 'Riddle')
>>> p
Person(firstname='Tom', lastname='Riddle')
>>> p.firstname
'Tom'
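
To see the difference between the approaches side by side, here is a minimal sketch. The names `PlainPerson`, `SlottedPerson` and `TuplePerson` are made up for this comparison and are not part of the examples above. It checks which instances carry a `__dict__`, shows that a mistyped attribute name is rejected on a slotted instance, and prints rough sizes. The sizes depend on the Python version and implementation, so measure on your own interpreter rather than relying on any particular number.

```python
import sys
from collections import namedtuple


class PlainPerson(object):
    """Ordinary class: each instance can carry its own __dict__."""

    def __init__(self, firstname, lastname):
        self.firstname = firstname
        self.lastname = lastname


class SlottedPerson(object):
    """Same attributes, stored in fixed slots instead of a __dict__."""

    __slots__ = ['firstname', 'lastname']

    def __init__(self, firstname, lastname):
        self.firstname = firstname
        self.lastname = lastname


TuplePerson = namedtuple('TuplePerson', 'firstname, lastname')

plain = PlainPerson('Tom', 'Riddle')
slotted = SlottedPerson('Tom', 'Riddle')
tupled = TuplePerson('Tom', 'Riddle')

# Only the ordinary instance exposes a per-instance dictionary.
has_dict = {
    'plain': hasattr(plain, '__dict__'),
    'slotted': hasattr(slotted, '__dict__'),
}
print(has_dict)

# Assigning a name that is not listed in __slots__ fails immediately.
try:
    slotted.fistname = 'Thomas'  # deliberate typo
    typo_rejected = False
except AttributeError:
    typo_rejected = True

# namedtuple fields are read-only.
try:
    tupled.firstname = 'Thomas'
    tuple_is_immutable = False
except AttributeError:
    tuple_is_immutable = True

# Sizes are implementation- and version-dependent, so just print them.
print(sys.getsizeof(plain) + sys.getsizeof(plain.__dict__))
print(sys.getsizeof(slotted))
print(sys.getsizeof(tupled))
```

Note that `sys.getsizeof` only counts the object itself, not the strings it refers to, which is fine here because all three variants reference the same kind of data.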
MSeifert
  • Being prevented from dynamically adding members is, in my opinion, more often a benefit. I didn't know about `namedtuple`, and would otherwise have considered it to be a good solution, but `__slots__` seems superior. – Chris Barry Feb 18 '17 at 14:40
  • @ChrisBarry Both have their use-cases. I agree that `__slots__` is superior (it allows methods and real properties), but in some cases you just want a "class-like" immutable storage container and then `namedtuple` is a viable alternative. – MSeifert Feb 18 '17 at 14:52
-1

That depends on the data you want to store in each object, but in most cases lists should do.
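
A minimal sketch of what that could look like, using the named-constant indexing described in the question. The constants `FIRSTNAME` and `LASTNAME` are hypothetical names chosen for illustration.

```python
# Hypothetical index constants; the names are illustrative only.
FIRSTNAME = 0
LASTNAME = 1

# Each record is a plain list; fields are addressed by position.
person = ['Tom', 'Riddle']
print(person[FIRSTNAME])

# Lists stay mutable, so a field can be overwritten in place.
person[LASTNAME] = 'Marvolo'
```

The trade-off is that nothing ties the constants to the record: using the wrong constant silently reads a different field instead of raising an error.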

Daniel