
I would like a class whose __init__ checks whether filename exists. If it does, the object should initialize itself from that file; otherwise it should run the normal (expensive) initialization. At a later point I can then call a save method that saves the entire object.

A sketch of what I want:

import os
import pickle

class data(object):
    def __init__(self, filename):
        if not os.path.exists(filename):  # create new
            # [... expensive computations]
            self.save(filename)
        else:  # load existing
            with open(filename, 'rb') as fp:
                self = pickle.load(fp)  # this is what I would like to work

    def save(self, filename):
        with open(filename, 'wb') as fp:
            pickle.dump(self, fp)

When loading I know that I can do something like

tmp = pickle.load(fp)
self.a = tmp.a
self.b = tmp.b
...

But I hope there is a better way.


I assume this question has been asked before, but couldn't find it :/

Toke Faurby
  • Why would you pickle the whole object? Pickle only the storage data you need - you'll have the structure already loaded so it makes little sense to try to save it with the data. – zwer May 22 '17 at 22:55
  • Agreed. I just don't know how to store all the data in a succinct way. I would prefer not having to write something for data object. How would I go about this? – Toke Faurby May 23 '17 at 00:50
  • You can pickle only the relevant data by defining `__getstate__` (called when pickling) and restore it with `__setstate__` (called when unpickling). This not only gives you more control over how instance data is saved and restored, it also makes pickling less error-prone (accidental class overwrites and such). See the third example from: https://docs.python.org/2/library/pickle.html#example – zwer May 23 '17 at 01:20
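
To illustrate the suggestion in the last comment, here is a minimal sketch of a class that controls what gets pickled. The attribute names `a` and `cache` are made up for the example; `__getstate__` is what pickle calls when saving and `__setstate__` is what it calls when loading.

```python
import pickle


class Data(object):
    def __init__(self):
        self.a = [1, 2, 3]          # data worth saving
        self.cache = {"tmp": 42}    # hypothetical attribute we do not want on disk

    def __getstate__(self):
        # Called when pickling: return only the data to store.
        state = self.__dict__.copy()
        del state["cache"]
        return state

    def __setstate__(self, state):
        # Called when unpickling: restore the stored data, rebuild the rest.
        self.__dict__.update(state)
        self.cache = {}


restored = pickle.loads(pickle.dumps(Data()))
```

After the round trip, `restored.a` holds the saved list while `restored.cache` starts out empty again.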

1 Answer


Assigning to self within __init__ is meaningless, since you're not modifying the object that self points to -- you're just binding the variable name self in the function to a different object.
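
A small sketch of the difference, using a made-up `Demo` class (none of these names come from the question): rebinding `self` changes nothing for the caller, whereas copying the other object's `__dict__` mutates the object in place. The second method is also one way to avoid copying attributes one by one, assuming the object stores its data in a plain `__dict__`.

```python
class Demo(object):
    def __init__(self):
        self.value = "original"

    def try_replace(self, other):
        # Rebinds the local name `self` only; the caller's object is untouched.
        self = other

    def copy_from(self, other):
        # Mutates the object that `self` refers to, so the caller sees it.
        self.__dict__.update(other.__dict__)


a = Demo()
b = Demo()
b.value = "loaded"

a.try_replace(b)
value_after_rebind = a.value   # still "original"

a.copy_from(b)
value_after_copy = a.value     # now "loaded"
```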

What you can do instead is use a staticmethod or classmethod to perform the optional loading from cache:

import os
import pickle

class Data(object):
    @classmethod
    def init_cached(cls, filename):
        if not os.path.exists(filename):  # create new
            result = cls(filename)
            result.save(filename)
            return result
        else:  # load existing
            with open(filename, 'rb') as fp:
                return pickle.load(fp)

    def __init__(self, filename):
        pass  # [... expensive computations]

    def save(self, filename):  # same as in the question
        with open(filename, 'wb') as fp:
            pickle.dump(self, fp)

Now, use Data.init_cached() instead of Data() to initialize your object.
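
As a rough usage sketch, the following runs the classmethod twice against a temporary file. The `computations` counter and the `sum(range(1000))` stand-in are assumptions added only to make the behaviour visible; they are not part of the original class.

```python
import os
import pickle
import tempfile


class Data(object):
    computations = 0  # hypothetical counter, only here to show when __init__ runs

    @classmethod
    def init_cached(cls, filename):
        if not os.path.exists(filename):  # create new
            result = cls(filename)
            result.save(filename)
            return result
        with open(filename, 'rb') as fp:  # load existing
            return pickle.load(fp)

    def __init__(self, filename):
        Data.computations += 1
        self.a = sum(range(1000))  # stand-in for the expensive computations

    def save(self, filename):
        with open(filename, 'wb') as fp:
            pickle.dump(self, fp)


path = os.path.join(tempfile.mkdtemp(), 'data.pkl')
first = Data.init_cached(path)   # computes and writes the cache file
second = Data.init_cached(path)  # reads the cache file, __init__ is not called
```

The second call returns a separate object with the same attributes, and the counter stays at 1 because unpickling does not go through `__init__`.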


A fancier approach is to override Data.__new__() so that plain Data() transparently checks whether a cached version exists:

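import os
import pickle

class Data(object):
    def __new__(cls, filename=None):
        # pickle calls __new__(cls) without arguments while unpickling
        if filename is not None and os.path.exists(filename):
            with open(filename, 'rb') as fp:
                obj = pickle.load(fp)
            obj._loaded = True
            return obj
        obj = super(Data, cls).__new__(cls)  # object.__new__ takes no extra args
        obj._loaded = False
        return obj

    def __init__(self, filename=None):
        if self._loaded:  # __init__ also runs on the cached instance, so skip
            return
        # [... expensive computations]
        self.save(filename)

    def save(self, filename):
        with open(filename, 'wb') as fp:
            pickle.dump(self, fp)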
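
One thing to keep in mind with this approach: Python calls __init__ on whatever __new__ returns, as long as it is an instance of the class, so an object that came from the cache is initialized again unless __init__ guards against it. A minimal sketch of that behaviour, using a hypothetical in-memory `Cached` class in place of the pickle file:

```python
class Cached(object):
    _instance = None   # hypothetical in-memory stand-in for the pickle file
    init_calls = 0

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super(Cached, cls).__new__(cls)
        return cls._instance  # an existing object on every later call

    def __init__(self):
        Cached.init_calls += 1  # runs on every Cached() call


first = Cached()
second = Cached()
```

Both names refer to the same object, yet `init_calls` ends up at 2, which is why any expensive work in __init__ needs a check of its own.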

Further reading: Python's use of __new__ and __init__?

Mathias Rav