I want to pickle an object to be able to store/restore my session. The object holds several references to other objects.

I know that the referenced objects get pickled along with it, but after unpickling, the references point to new copies rather than to the original objects. I also know that pickling the objects together preserves the references between them, but that would be complicated in my case, since I would need to pickle a whole complex structure of objects.

Example

p1 = Point()
p2 = Point()
p1.nearest_point = p2
p2.nearest_point = p1

line = Line(p1, Point())

with open("pickled", "wb") as file:
    pickler = pickle.Pickler(file)
    pickler.dump(p1)
    pickler.dump(p2)

with open("pickled", "rb") as file:
    pickler = pickle.Unpickler(file)
    p1 = pickler.load()
    p2 = pickler.load()

# True: Reference between the two pickled objects is maintained
assert p1.nearest_point == p2

# False: the reference between the non-pickled Line and the unpickled point is broken, so p1 is now a duplicate of the original object still held by line
assert line.pointA == p1

In this case the solution could be to also pickle the Line object, but in my real-life case I am handling a much more complex structure, where pickling/unpickling every part of the structure would surely lead to oversights and bugs.

How can I correctly handle this?

  • Does this answer your question? [What is the pickling problem that persistent IDs are used to solve here?](https://stackoverflow.com/questions/56414880/what-is-the-pickling-problem-that-persistent-ids-are-used-to-solve-here) – MisterMiyagi Oct 16 '21 at 10:12
  • Put everything you want to pickle in a list and pickle the list. – BoarGules Oct 16 '21 at 10:23
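The suggestion in the last comment can be sketched as follows. This is a minimal illustration, assuming simple `Point` and `Line` classes (hypothetical stand-ins, since the question does not show its own definitions). Objects pickled in a single call share one memo, so references between them survive, but the loaded objects are still new copies of the originals.

```python
import pickle


class Point:
    """Hypothetical stand-in for the question's Point class."""

    def __init__(self):
        self.nearest_point = None


class Line:
    """Hypothetical stand-in for the question's Line class."""

    def __init__(self, point_a, point_b):
        self.pointA = point_a
        self.pointB = point_b


p1, p2 = Point(), Point()
p1.nearest_point = p2
p2.nearest_point = p1
line = Line(p1, Point())

# Pickle everything that must stay connected in one container.
data = pickle.dumps([p1, p2, line])
q1, q2, restored_line = pickle.loads(data)

# References inside the pickled group are preserved among the copies.
references_kept = q1.nearest_point is q2 and restored_line.pointA is q1
# The copies are distinct from the objects that are still in memory.
copies_are_new = q1 is not p1 and restored_line is not line
```

This only helps when every object that holds a reference is part of the pickled container; anything left out keeps pointing at the old objects.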

1 Answer

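Within a single Pickler, pickle preserves references by memoizing each object under its id() (its memory address in CPython). On unpickling, new objects are constructed, so references held by objects that were not pickled still point to the originals. Depending on the use case, three approaches can achieve your goal: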

  1. Re-reference the in-memory object on unpickling and discard the pickled state. Any changes made to the pickled data outside the program are lost.
  2. Re-reference the in-memory object on unpickling and try to merge the pickled state into it.
  3. Update every reference to the in-memory object so that it points to the unpickled object.
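The difference between the first two approaches can be sketched without pickle at all. The `Point` class and the helper names `keep_live` and `merge_into_live` below are hypothetical, chosen only for illustration. Approach 3 is not shown, because it requires finding and rewriting every holder of the old reference.

```python
class Point:
    """Hypothetical stand-in for the question's Point class."""

    def __init__(self, x=0):
        self.x = x


def keep_live(live, loaded):
    """Approach 1: keep the in-memory object and ignore the loaded state."""
    return live


def merge_into_live(live, loaded):
    """Approach 2: copy the loaded attributes onto the in-memory object."""
    live.__dict__.update(loaded.__dict__)
    return live


live = Point(x=1)
holder = [live]      # some other structure that references the live object
loaded = Point(x=2)  # pretend this instance came out of an unpickler

kept = keep_live(live, loaded)
x_after_keep = holder[0].x  # still the in-memory value

merged = merge_into_live(live, loaded)
x_after_merge = holder[0].x  # now the loaded value
```

In both cases the holder keeps pointing at the same object; only approach 2 lets the stored state reach it.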

In the following solution, I assume that we take approach 2:

The utility function di is based on a (now-deleted) comment made by "Tiran" in a weblog discussion that @Hophat Abc references in his own answer; it works in both Python 2 and 3.

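Disclaimer: The utility function is not safe. You may need to add your own safeguards on top of it, or use another way to look an object up from its id. Relying on id() is also unsafe: an id is only valid while the object is alive in the current process, and it can be reused afterwards. The code is for demonstration only, but it should convey the idea.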
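As one possible alternative to dereferencing raw ids, the sketch below keys persistent IDs on an application-level registry. Everything named here (`REGISTRY`, the `uid` attribute, `RegistryPickler`, `RegistryUnpickler`, and the `Point` and `Line` classes) is an assumption made for illustration, not part of the question's code. It follows approach 1: registered objects are written as keys only and resolved back to the live instances, so it only works while those instances are still registered in the same process.

```python
import io
import pickle
import uuid


class Point:
    """Hypothetical Point carrying a stable key instead of relying on id()."""

    def __init__(self):
        self.uid = uuid.uuid4().hex
        self.nearest_point = None


class Line:
    """Hypothetical stand-in for the question's Line class."""

    def __init__(self, point_a, point_b):
        self.pointA = point_a
        self.pointB = point_b


REGISTRY = {}  # uid -> live Point (assumed application-level registry)


class RegistryPickler(pickle.Pickler):
    def persistent_id(self, obj):
        # Registered points are stored as a key, not by value.
        if isinstance(obj, Point) and obj.uid in REGISTRY:
            return ("Point", obj.uid)
        return None  # everything else is pickled normally


class RegistryUnpickler(pickle.Unpickler):
    def persistent_load(self, pid):
        kind, uid = pid
        if kind != "Point" or uid not in REGISTRY:
            raise pickle.UnpicklingError(f"unknown persistent id: {pid!r}")
        return REGISTRY[uid]


p1, p2 = Point(), Point()
p1.nearest_point = p2
p2.nearest_point = p1
REGISTRY[p1.uid] = p1
REGISTRY[p2.uid] = p2
line = Line(p1, Point())

buffer = io.BytesIO()
RegistryPickler(buffer).dump([p1, p2])
buffer.seek(0)
r1, r2 = RegistryUnpickler(buffer).load()
```

Here `r1` is the very same object as `p1`, so `line.pointA` needs no repair. Restoring state across sessions would additionally require saving each point's attributes under its key and rebuilding the registry before loading.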

import _ctypes
import pickle

def di(obj_id):
    """ Inverse of id() function. """
    return _ctypes.PyObj_FromPtr(obj_id)

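def merge(obj1, obj2):
    """Merge the attributes of obj2 into obj1 and return obj1."""
    # Copy the attributes instead of making both objects share one __dict__.
    obj1.__dict__.update(obj2.__dict__)
    return obj1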

p1 = Point()
p2 = Point()
p1.nearest_point = p2
p2.nearest_point = p1

line = Line(p1, Point())

with open("pickled", "wb") as file:
    pickler = pickle.Pickler(file)

    seen = {}
    def persistent_id(obj):
        # Plain ints (including the ids themselves) are pickled normally
        if type(obj) is int:
            return None

        if id(obj) in seen:
            return id(obj)
        else:
            seen[id(obj)] = True
            # pickle id with the object
            return (id(obj), obj)

    pickler.persistent_id = persistent_id
    pickler.dump(p1)
    pickler.dump(p2)

with open("pickled", "rb") as file:
    unpickler = pickle.Unpickler(file)

    def persistent_load(pid):
        if type(pid) == tuple:
            # Call di for dereferencing
            return merge(di(pid[0]), pid[1])
        else:
            return di(pid)
    
    unpickler.persistent_load = persistent_load
    p1 = unpickler.load()
    p2 = unpickler.load()

# True: Reference between the two pickled objects is maintained
assert p1.nearest_point == p2

# True: the reference from the non-pickled Line is restored
assert line.pointA == p1