This question is based on an earlier question about lazy attributes for Python classes.

I really like the solution given there:

Here is an example implementation of a lazy property decorator:

import functools

def lazyprop(fn):
    attr_name = '_lazy_' + fn.__name__

    @property
    @functools.wraps(fn)
    def _lazyprop(self):
        if not hasattr(self, attr_name):
            setattr(self, attr_name, fn(self))
        return getattr(self, attr_name)

    return _lazyprop


class Test(object):

    @lazyprop
    def a(self):
        print('generating "a"')
        return list(range(5))

Interactive session:

>>> t = Test()
>>> t.__dict__
{}
>>> t.a
generating "a"
[0, 1, 2, 3, 4]
>>> t.__dict__
{'_lazy_a': [0, 1, 2, 3, 4]}
>>> t.a
[0, 1, 2, 3, 4]

This solution allows you to create a @lazyprop for any attribute. However, you must write a method for each attribute that you wish to be lazy. I need something that will work for attributes whose names I won't know ahead of time (of which there may be many).
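
For reference, on Python 3.8 and later the standard library's functools.cached_property provides the same compute-once behavior as the lazyprop decorator above. A minimal sketch (the calls counter is added here only to show that the method body runs once):

```python
import functools


class Test:
    def __init__(self):
        self.calls = 0  # illustration only: counts how often a() really runs

    @functools.cached_property
    def a(self):
        self.calls += 1
        return list(range(5))


t = Test()
first = t.a    # runs the method and caches the result in t.__dict__['a']
second = t.a   # served from the instance __dict__, the method is not called
```

Like lazyprop, this still needs one decorated method per attribute, so on its own it does not handle names that are unknown until runtime.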

These attributes are DataFrames read in from HDF5 files. Each file contains many different tables, the names of which I won't know. I have an excellent function, get_all_table_names(filename), that returns the names of all the tables in the file. Currently I loop through all the names and read the tables in one after another. There are, however, several tens of GB of data, which take several minutes to read in.

Is there a way to read in a table only when something first accesses it? The example given here is perfect, except that I need to know the name of the table ahead of time.

EDIT

The code to load data from an HDF5 file to a Pandas DataFrame looks like the following.

df = read_to_pandas(directory_of_files, 'table_name', number_of_files_to_read)
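
A minimal sketch of the behavior being asked for, assuming the table names are only available at runtime. It relies on `__getattr__`, which Python calls only when normal attribute lookup fails. The class name LazyTables and the fake_loader stand-in are hypothetical; in the real code the names would come from get_all_table_names(filename) and the loader would wrap read_to_pandas.

```python
class LazyTables:
    """Load each table the first time it is accessed as an attribute."""

    def __init__(self, table_names, loader):
        self._table_names = frozenset(table_names)
        self._loader = loader  # callable: table name -> loaded data

    def __getattr__(self, name):
        # Only reached when `name` is not yet stored on the instance.
        # The underscore check also prevents recursion if the private
        # attributes themselves are missing (e.g. during copying).
        if name.startswith('_') or name not in self._table_names:
            raise AttributeError(name)
        value = self._loader(name)
        setattr(self, name, value)  # cache: later lookups skip __getattr__
        return value

    def __dir__(self):
        # Make the not-yet-loaded table names visible to dir() and tab completion.
        return sorted(set(super().__dir__()) | self._table_names)


# Hypothetical stand-in for the real reader; in the question this might be
# lambda name: read_to_pandas(directory_of_files, name, number_of_files_to_read)
calls = []

def fake_loader(table_name):
    calls.append(table_name)
    return {'table': table_name, 'rows': [1, 2, 3]}


tables = LazyTables(['XYZ', 'ABC'], fake_loader)
xyz = tables.XYZ        # triggers the loader once
xyz_again = tables.XYZ  # served from the instance, loader not called again
```

Table names that are not valid Python identifiers would still work through getattr(tables, name).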
natemcintosh
  • Are you asking how to dynamically generate a class that has lazy attributes which aren't specified until runtime? If so, how do the generated getters obtain the data? Would that be part of each attribute's specification as well as its name? – martineau Mar 24 '19 at 23:14
  • "Are you asking how to dynamically generate a class that has lazy attributes which aren't specified until runtime?" Yes – natemcintosh Mar 24 '19 at 23:25
  • "If so, how do the generated getters obtain the data?" The data is obtained from a function that reads tables from HDF5 files and places them into Pandas DataFrames. Note that the reader function reads all like-named tables from a set of HDF5 files. E.g., if I have 100 HDF5 files, each containing a table named 'XYZ', this function will read the 'XYZ' tables from all the files and put them in one DataFrame. – natemcintosh Mar 24 '19 at 23:30
  • "Would that be part of each attribute's specification as well as its name?" What does specification mean in this context? – natemcintosh Mar 24 '19 at 23:31
  • I meant: would the specification of each attribute include a function (or perhaps the arguments to pass to an existing one) to obtain the attribute's data, in addition to providing its name? If you add code to your question that shows how to read an HDF5 file into a DataFrame, I or someone else here can probably show you concretely how to generate such a class at runtime. Otherwise the best you're likely to get is something fairly vague. – martineau Mar 24 '19 at 23:59
  • I think I see what you're getting at @martineau. I'll add some code showing how a file is read in – natemcintosh Mar 25 '19 at 00:40
  • Sorry, too late. The code in the answer I just added is all I have time for at the moment. Bon appetit! – martineau Mar 25 '19 at 00:46

1 Answer

Here's a generic template showing how to generate a class on-the-fly with dynamic lazy attribute(s):

import functools
import types


def lazyprop(added_value):
    """ Slightly generalized lazy attribute property decorator
        (i.e. a decorator factory). `added_value` is passed to the
        wrapped function and also makes the cache attribute name unique.
    """
    def prop(fn):
        attr_name = '_lazy_' + fn.__name__ + str(added_value)

        @property
        @functools.wraps(fn)
        def _lazyprop(self):
            if not hasattr(self, attr_name):
                setattr(self, attr_name, fn(self, added_value))
            return getattr(self, attr_name)

        return _lazyprop

    return prop


def make_class(class_name, attrs):

    # Generic methods and class __dict__.
    def __init__(self):
        print('creating instance of class', self.__class__.__name__)

    def getter(self, added_value):
        return 41 + added_value

    cls_dict = {
        '__init__': __init__,
        '__repr__': lambda self: 'class name: %s' % class_name,
    }

    # Create and add lazy attributes.
    for i, attr_name in enumerate(attrs):
        cls_dict[attr_name] = lazyprop(i)(getter)

    cls = types.new_class(class_name, (), {}, lambda ns: ns.update(cls_dict))
    cls.__module__ = __name__

    return cls


if __name__ == '__main__':

    Foobar = make_class('Foobar', ('attr1', 'attr2'))

    foobar = Foobar()    # -> creating instance of class Foobar
    print(foobar)        # -> class name: Foobar
    print(foobar.attr1)  # -> 41
    print(foobar.attr2)  # -> 42
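
As a rough sketch of how the template above might be adapted to the tables in the question, the getter below calls a loader with the table name instead of computing 41 + added_value. The names make_lazy_class and fake_loader are hypothetical; the real loader would presumably wrap read_to_pandas, and the names would come from get_all_table_names(filename).

```python
import types


def make_lazy_class(class_name, table_names, loader):
    """Build a class with one lazy, cached property per table name."""

    def make_prop(table_name):
        # A factory function gives each property its own table_name,
        # avoiding the late-binding pitfall of closures created in a loop.
        cache_name = '_lazy_' + table_name

        def getter(self):
            if cache_name not in self.__dict__:
                self.__dict__[cache_name] = loader(table_name)
            return self.__dict__[cache_name]

        getter.__name__ = table_name
        return property(getter)

    cls_dict = {name: make_prop(name) for name in table_names}
    return types.new_class(class_name, (), {}, lambda ns: ns.update(cls_dict))


# Hypothetical stand-in for the real reader function.
loaded = []

def fake_loader(table_name):
    loaded.append(table_name)
    return 'data for ' + table_name


Tables = make_lazy_class('Tables', ['XYZ', 'ABC'], fake_loader)
tables = Tables()
value = tables.XYZ        # loader runs for 'XYZ' only
value_again = tables.XYZ  # cached, loader is not called again
```

Nothing is read for 'ABC' until tables.ABC is first accessed.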
martineau