33

Class Foo has a bar, and it is not loaded until it is accessed. Further accesses to bar should incur no overhead.

class Foo(object):

    def get_bar(self):
        print("initializing")
        self.bar = "12345"
        # Rebind the getter on the instance so later calls skip initialization.
        self.get_bar = self._get_bar
        return self.bar

    def _get_bar(self):
        print("accessing")
        return self.bar

Is it possible to do something like this using properties or, better yet, attributes, instead of using a getter method?

The goal is to lazy load without overhead on all subsequent accesses...

martineau
whats canasta
  • You can do that automatically with descriptors: http://jeetworks.org/node/62 – schlamar Jul 05 '13 at 10:16
  • 1
    Werkzeug has a better implementation with extensive comments: https://github.com/mitsuhiko/werkzeug/blob/10b4b8b6918a83712170fdaabd3ec61cf07f23ff/werkzeug/utils.py#L35 – schlamar Jul 05 '13 at 10:22
  • See also: [Python lazy property decorator](http://stackoverflow.com/questions/3012421/python-lazy-property-decorator). – detly Jul 05 '13 at 12:15
  • @whats canasta: Isn't "self.get_bar = self._get_bar" supposed to be "self._bar = self._get_bar" ? – Pintun Sep 23 '16 at 12:49

3 Answers

21

There are some problems with the current answers. The solution with a property requires that you specify an additional class attribute and has the overhead of checking this attribute on each lookup. The solution with __getattr__ hides the attribute until first access, which is bad for introspection, and working around that with __dir__ is inconvenient.
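To make the introspection point concrete, here is a small sketch comparing what dir() reports before and after the first access. The class names LazyViaGetattr and LazyViaProperty are made up for this illustration and are not taken from the other answers.

```python
class LazyViaGetattr(object):
    """Hypothetical class: 'bar' is created lazily through __getattr__."""

    def __getattr__(self, name):
        # Only called when normal lookup fails, i.e. before 'bar' is cached.
        if name == "bar":
            self.bar = "12345"
            return self.bar
        raise AttributeError(name)


class LazyViaProperty(object):
    """Hypothetical class: 'bar' is a property backed by a class attribute."""

    _bar = None

    @property
    def bar(self):
        # This check runs on every single access.
        if self._bar is None:
            self._bar = "12345"
        return self._bar


hidden = LazyViaGetattr()
visible = LazyViaProperty()

bar_listed_before = "bar" in dir(hidden)  # nothing named 'bar' exists yet
value = hidden.bar                        # triggers __getattr__, caches on the instance
bar_listed_after = "bar" in dir(hidden)   # now present in the instance __dict__
property_listed = "bar" in dir(visible)   # the property lives on the class

print(bar_listed_before, bar_listed_after, property_listed)
```

The __getattr__ variant only shows up in dir() once it has been accessed, while the property is visible from the start but pays for the None check every time.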

A better solution than the two proposed ones is to use descriptors directly. The Werkzeug library already has one: werkzeug.utils.cached_property. Its implementation is simple, so you can copy it and use it directly without adding Werkzeug as a dependency:

_missing = object()

class cached_property(object):
    """A decorator that converts a function into a lazy property.  The
    function wrapped is called the first time to retrieve the result
    and then that calculated result is used the next time you access
    the value::

        class Foo(object):

            @cached_property
            def foo(self):
                # calculate something important here
                return 42

    The class has to have a `__dict__` in order for this property to
    work.
    """

    # implementation detail: this property is implemented as non-data
    # descriptor.  non-data descriptors are only invoked if there is
    # no entry with the same name in the instance's __dict__.
    # this allows us to completely get rid of the access function call
    # overhead.  If one chooses to invoke __get__ by hand the property
    # will still work as expected because the lookup logic is replicated
    # in __get__ for manual invocation.

    def __init__(self, func, name=None, doc=None):
        self.__name__ = name or func.__name__
        self.__module__ = func.__module__
        self.__doc__ = doc or func.__doc__
        self.func = func

    def __get__(self, obj, type=None):
        if obj is None:
            return self
        value = obj.__dict__.get(self.__name__, _missing)
        if value is _missing:
            value = self.func(obj)
            obj.__dict__[self.__name__] = value
        return value
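
As a rough usage sketch, the following shows the caching behaviour with a trimmed-down copy of the descriptor (no name/doc arguments and no manual-invocation lookup), so that the snippet runs on its own. The calls counter on Foo is a hypothetical addition that only exists to make the single evaluation visible.

```python
class cached_property(object):
    """Trimmed-down non-data descriptor, for demonstration only."""

    def __init__(self, func):
        self.__name__ = func.__name__
        self.__doc__ = func.__doc__
        self.func = func

    def __get__(self, obj, owner=None):
        if obj is None:
            return self
        # Compute once and store under the same name in the instance dict;
        # later lookups find that entry and never reach this descriptor.
        value = obj.__dict__[self.__name__] = self.func(obj)
        return value


class Foo(object):
    calls = 0  # hypothetical counter, only here to make the caching visible

    @cached_property
    def bar(self):
        Foo.calls += 1
        return "12345"


f = Foo()
before = dict(vars(f))  # nothing cached yet
first = f.bar           # runs the wrapped function
second = f.bar          # plain instance attribute lookup
after = dict(vars(f))   # the value now lives in the instance dict

del f.bar               # dropping the cached entry forces a recompute
third = f.bar
print(before, after, Foo.calls)
```

Because the cached value is an ordinary instance attribute, deleting it with del is enough to invalidate the cache.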
schlamar
  • 6
    The problem is that outside the scope of a web framework (Werkzeug, Django, Bottle, Pyramid, et al.), this doesn't work well with threads. See https://github.com/pydanny/cached-property/issues/6 (which we closed) – pydanny Sep 10 '14 at 21:45
17

Sure, just have your property set an instance attribute that is returned on subsequent access:

class Foo(object):
    _cached_bar = None

    @property
    def bar(self):
        if self._cached_bar is None:
            self._cached_bar = self._get_expensive_bar_expression()
        return self._cached_bar

The property descriptor is a data descriptor (it implements the __get__, __set__ and __delete__ descriptor hooks), so it is invoked even if a bar attribute exists on the instance. Python effectively ignores that instance attribute, hence the need to test a separate attribute on each access.
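
A small sketch of that lookup precedence, using made-up class names (WithProperty, NonDataDescriptor, WithNonData) and writing straight into the instance __dict__ to simulate an already cached value:

```python
class WithProperty(object):
    @property
    def bar(self):
        return "from property"


class NonDataDescriptor(object):
    """Implements only __get__, so instance attributes take priority."""

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return "from descriptor"


class WithNonData(object):
    bar = NonDataDescriptor()


p = WithProperty()
p.__dict__["bar"] = "from instance"
property_result = p.bar      # the data descriptor wins over the instance dict

n = WithNonData()
uncached_result = n.bar      # no instance entry yet, so __get__ runs
n.__dict__["bar"] = "from instance"
cached_result = n.bar        # the instance dict wins over the non-data descriptor

print(property_result, uncached_result, cached_result)
```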

You can write your own descriptor that only implements __get__, at which point Python uses an attribute on the instance over the descriptor if it exists:

class CachedProperty(object):
    def __init__(self, func, name=None):
        self.func = func
        self.name = name if name is not None else func.__name__
        self.__doc__ = func.__doc__

    def __get__(self, instance, class_):
        if instance is None:
            return self
        res = self.func(instance)
        setattr(instance, self.name, res)
        return res

class Foo(object):
    @CachedProperty
    def bar(self):
        return self._get_expensive_bar_expression()
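
If you are on Python 3.8 or later, the standard library provides functools.cached_property, which works along the same lines by storing the result in the instance __dict__. A minimal sketch, where the expensive_calls counter and the body of _get_expensive_bar_expression are placeholders for illustration:

```python
from functools import cached_property


class Foo(object):
    def __init__(self):
        self.expensive_calls = 0  # hypothetical counter for illustration

    def _get_expensive_bar_expression(self):
        # Placeholder for the real expensive computation.
        self.expensive_calls += 1
        return "Spam ham & eggs"

    @cached_property
    def bar(self):
        return self._get_expensive_bar_expression()


f = Foo()
first = f.bar   # computes and caches
second = f.bar  # served from the instance __dict__
print(first, f.expensive_calls, "bar" in vars(f))
```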

If you prefer a __getattr__ approach (which has something to be said for it), that'd be:

class Foo(object):
    def __getattr__(self, name):
        if name == 'bar':
            bar = self.bar = self._get_expensive_bar_expression()
            return bar
        # object defines no __getattr__ to delegate to, so raise directly.
        raise AttributeError(name)

Subsequent access will find the bar attribute on the instance and __getattr__ won't be consulted.

Demo:

>>> class FooExpensive(object):
...     def _get_expensive_bar_expression(self):
...         print('Doing something expensive')
...         return 'Spam ham & eggs'
... 
>>> class FooProperty(FooExpensive):
...     _cached_bar = None
...     @property
...     def bar(self):
...         if self._cached_bar is None:
...             self._cached_bar = self._get_expensive_bar_expression()
...         return self._cached_bar
... 
>>> f = FooProperty()
>>> f.bar
Doing something expensive
'Spam ham & eggs'
>>> f.bar
'Spam ham & eggs'
>>> vars(f)
{'_cached_bar': 'Spam ham & eggs'}
>>> class FooDescriptor(FooExpensive):
...     bar = CachedProperty(FooExpensive._get_expensive_bar_expression, 'bar')
... 
>>> f = FooDescriptor()
>>> f.bar
Doing something expensive
'Spam ham & eggs'
>>> f.bar
'Spam ham & eggs'
>>> vars(f)
{'bar': 'Spam ham & eggs'}

>>> class FooGetAttr(FooExpensive):
...     def __getattr__(self, name):
...         if name == 'bar':
...             bar = self.bar = self._get_expensive_bar_expression()
...             return bar
...         raise AttributeError(name)
... 
>>> f = FooGetAttr()
>>> f.bar
Doing something expensive
'Spam ham & eggs'
>>> f.bar
'Spam ham & eggs'
>>> vars(f)
{'bar': 'Spam ham & eggs'}
Jon Clements
Martijn Pieters
  • This adds overhead of an extra "if" on every access. Is it possible to redefine the property the first time it's called? – whats canasta Jul 05 '13 at 10:05
  • You would need a flag somewhere anyway, something telling you whether you have already instantiated the property or not. – Stefano Sanfilippo Jul 05 '13 at 10:08
  • 3
    @whatscanasta: Not with a `property`, because Python gives data descriptors priority over instance attributes. But with `__getattr__` you *can* (see update). – Martijn Pieters Jul 05 '13 at 10:10
  • Messing with `__getattr__` is just a bad hack. There are descriptors: https://github.com/mitsuhiko/werkzeug/blob/10b4b8b6918a83712170fdaabd3ec61cf07f23ff/werkzeug/utils.py#L35 – schlamar Jul 05 '13 at 10:28
  • 2
    @schlamar: `__getattr__` is no more a hack than using a non-data descriptor. *Both* set the attribute on the instance to prevent future lookups of the descriptor or `__getattr__` method. – Martijn Pieters Jul 05 '13 at 10:31
  • 3
    @schlamar: Rather than downvote, why don't you post that as an answer yourself? My answer is not wrong or unhelpful. – Martijn Pieters Jul 05 '13 at 10:32
  • Sure, but `__getattr__` hides the attribute before first access (no introspection, ...). So **this** is a hack (or better black magic) while utilizing descriptors is not. – schlamar Jul 05 '13 at 10:34
  • 1
    @schlamar: I'll happily concede that the descriptor trick is neat and tidy, and supports introspection out of the box (I'd have voted for such an answer). But did you know your object can specify a [`__dir__()` method](http://docs.python.org/2/library/functions.html#dir) to list such dynamic attributes? – Martijn Pieters Jul 05 '13 at 10:37
  • 2
    @schlamar: but `__getattr__` has been around *for this purpose* way before descriptors came along. The hook exists *explicitly* to allow you to provide dynamic attributes on custom classes. I would not classify that as a hack, nor downvote an answer as not helpful for using it. – Martijn Pieters Jul 05 '13 at 10:40
  • 2
    @schlamar: But if you are not going to use that as an answer, hope you don't mind that I added it to mine instead. :-) – Martijn Pieters Jul 05 '13 at 11:12
5

Sure it is, try:

class Foo(object):
    def __init__(self):
        self._bar = None # Initial value

    @property
    def bar(self):
        if self._bar is None:
            self._bar = HeavyObject()
        return self._bar

Note that this is not thread-safe. CPython has a GIL, so it's a relative issue, but if you plan to use this on a truly multithreaded Python stack (say, Jython), you might want to implement some form of locking.
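
"Not thread-safe" here means that two threads can both see self._bar as None and both construct the heavy object. One possible approach is a lock with a second check inside it, sketched below. The HeavyObject stand-in, its instances counter, and the _bar_lock attribute are assumptions made for this illustration.

```python
import threading


class HeavyObject(object):
    """Stand-in for an expensive object; counts how often it is built."""

    instances = 0

    def __init__(self):
        HeavyObject.instances += 1


class Foo(object):
    def __init__(self):
        self._bar = None  # Initial value
        self._bar_lock = threading.Lock()

    @property
    def bar(self):
        # Fast path: already initialized, no lock needed.
        if self._bar is None:
            with self._bar_lock:
                # Re-check inside the lock: another thread may have
                # finished initializing while this one was waiting.
                if self._bar is None:
                    self._bar = HeavyObject()
        return self._bar


foo = Foo()
results = []


def worker():
    results.append(foo.bar)


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(HeavyObject.instances, len(results))
```

The lock is only taken while the value is still missing, so the steady-state cost stays close to that of the unlocked version.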

Stefano Sanfilippo
  • Could you illustrate a bit on what it means to be not thread-safe? Do you mean assigning a value to an attribute is not thread-safe? – GabrielChu Mar 27 '20 at 06:23