
Is there a way to get the string representation of an object stored on disk without loading the object into memory? I thought of calling repr() on the file object returned by open(), but per the documentation that returns the class and mode of the file object itself, not the representation of the pickled object.

import os
import pickle
import tempfile
import datetime
from copy import copy

class Model:
    def __init__(self, identifier):
        self.identifier = identifier
        self.creation_date = datetime.datetime.now()
    def __repr__(self):
        return '{0} created on {1}'.format(self.identifier, self.creation_date)

identifier = 'identifier'
model1 = Model(identifier)
model2 = copy(model1)

with tempfile.TemporaryDirectory() as directory:
    with open(os.path.join(directory, identifier), 'wb') as f:
        # persist model and delete from RAM
        pickle.dump(model2, f)
        del model2

    with open(os.path.join(directory, identifier), 'rb') as f:
        print('is model stale: {}'.format(repr(model1) != repr(f)))
        print('Disk model: {}'.format(repr(f)))
        print('RAM model: {}'.format(repr(model1)))

I'd like to return the string representation of model2 (i.e. identifier created on <creation_date>) without actually loading model2 into memory.

Please share any other workaround you have used to accomplish a similar purpose.

Thanks.

  • MacOS
  • Python 3.6.4
sedeh
  • No, there isn't any way. – martineau Apr 07 '18 at 00:14
  • How could that possibly work? The `__repr__` method needs to access properties of the object; where is it supposed to get them from if the object isn't loaded into memory? – Barmar Apr 07 '18 at 00:17
  • Couldn't you just embed the `identifier/creation_date` info in the file-name? – ekhumoro Apr 07 '18 at 00:20
  • Good thought @ekhumoro. In our process, we'd like to avoid having the timestamp on the model name itself. – sedeh Apr 07 '18 at 00:23
  • @sedeh. Or use a non-compressed archive (e.g. [tar](https://docs.python.org/3/library/tarfile.html#module-tarfile)) which contains the pickle plus a small file containing the relevant info. – ekhumoro Apr 07 '18 at 00:26
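
The archive idea in the last comment could look roughly like the sketch below. It is only an illustration: the member names `info.json` and `model.pkl` and the helper names `write_bundle` and `read_info` are made up here, and the info is assumed to be JSON-serializable (for example, the creation date stored as a string).

```python
import io
import json
import pickle
import tarfile


def write_bundle(path, model, info):
    """Write an uncompressed tar holding a small JSON info file plus the pickle."""
    payloads = (
        ("info.json", json.dumps(info).encode("utf-8")),  # hypothetical member name
        ("model.pkl", pickle.dumps(model)),               # hypothetical member name
    )
    with tarfile.open(path, "w") as tar:
        for name, payload in payloads:
            member = tarfile.TarInfo(name=name)
            member.size = len(payload)
            tar.addfile(member, io.BytesIO(payload))


def read_info(path):
    """Read only info.json; the pickle member is never unpickled."""
    with tarfile.open(path, "r") as tar:
        with tar.extractfile("info.json") as fh:
            return json.loads(fh.read().decode("utf-8"))
```

With this layout, the staleness check would compare `read_info(path)` against the in-memory model's identifier and creation date instead of calling `repr()` on the file object.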

2 Answers


If you serialize your object as JSON rather than as a binary pickle, you can manipulate or filter the text directly without deserializing it into an object. See "How to make a class JSON serializable" for some nice examples (particularly the jsonpickle and .toJSON answers).
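
A minimal sketch of this idea, reusing the `Model` class from the question. The `to_json` method and the `repr_from_json` helper are hypothetical names added for illustration. Note that the JSON text is still parsed into a dict; what is avoided is reconstructing a `Model` instance.

```python
import datetime
import json


class Model:
    def __init__(self, identifier):
        self.identifier = identifier
        self.creation_date = datetime.datetime.now()

    def __repr__(self):
        return '{0} created on {1}'.format(self.identifier, self.creation_date)

    def to_json(self):
        # Hypothetical helper: store the date as str(), which is the same
        # text that str.format() produces for a datetime in __repr__ above.
        return json.dumps({'identifier': self.identifier,
                           'creation_date': str(self.creation_date)})


def repr_from_json(path):
    """Rebuild the repr string from the stored JSON without creating a Model."""
    with open(path) as f:
        data = json.load(f)
    return '{0} created on {1}'.format(data['identifier'], data['creation_date'])
```

After writing `model.to_json()` to a file, `repr_from_json(path) != repr(model1)` would play the role of the staleness check in the question.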

Joshua R.

I wrote a lazy pickle loader many years ago here. You could pickle a ((id, creation_date), model) and then just deserialize that first tuple.
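
The lazy loader itself is not shown here, so the following is a simpler variant using only the standard `pickle` module, not the loader referred to above. Instead of one nested tuple (which `pickle.load` would read in full), the header and the model are written as two consecutive pickles in the same file, and only the first one is read back. The helper names `dump_with_header` and `load_header` are made up for this sketch.

```python
import pickle


def dump_with_header(path, header, model):
    """Write the small header first, then the model, as two separate pickles."""
    with open(path, 'wb') as f:
        pickle.dump(header, f)
        pickle.dump(model, f)


def load_header(path):
    """Unpickle only the first object in the file; the model is never unpickled."""
    with open(path, 'rb') as f:
        return pickle.load(f)
```

For the question's example, the header could be `(model2.identifier, model2.creation_date)`, and the string to compare would be `'{0} created on {1}'.format(*load_header(path))`.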

guidoism