
An essential part of my project is being able to save and load class instances to a file. For further context, my class has both a set of attributes and a few methods.

So far, I've tried using pickle, but it's not working quite as expected. For starters, it's not loading the methods, nor is it letting me set attributes that I defined initially; in other words, it's not really giving me a copy of the instance that I can work with.

Relevant Code:

class Brick(object):

    def __init__(self, name, filename=None, areaMin=None, areaMax=None, kp=None): 
        self.name = name
        self.filename = filename
        self.areaMin = areaMin
        self.areaMax = areaMax
        self.kp = kp
        self.__kpsave = None
        if filename != None:
            self.__logfile = file(filename, 'w')
    def __getstate__(self):
        f = self.__logfile
        self.__kpsave = []
        for point in self.kp: 
            temp = (point.pt, point.size, point.angle, point.response, point.octave, point.class_id)
            self.__kpsave.append(temp)
        return (self.name, self.areaMin, self.areaMax, self.__kpsave,
                f.name, f.tell())
    def __setstate__(self, state):
        self.value, self.areaMin, self.areaMax, self.__kpsave, name, position = state
        f = file(name, 'w')
        f.seek(position)
        self.__logfile = f
        self.filename = name
        self.kp = [] 
        for point in self.__kpsave:
            temp = cv2.KeyPoint(x=point[0][0], y=point[0][1], _size=point[1], _angle=point[2], _response=point[3],
                                _octave=point[4], _class_id=point[5])
            self.kp.append(temp)
    def calculateORB(self, img):
        pass #I've omitted the actual method here

(There are a few more attributes and methods, but they're not relevant)

Now, this class definition works just fine when creating new instances: I can make a new Brick with just the name, I can then set areaMin or any other attribute, and I can use pickle(cPickle) to dump the current instance to a file just fine (I'm using those getstate and setstate because pickle won't work with OpenCV's Keypoint elements).

The problem comes, of course, when I load the instance: using pickle's load() I can load the instance from a file, and the values I set previously are there (i.e. I can access areaMin just fine if I set a value for it), but I can't access the methods or assign values to any of the other attributes whose values I never changed. I've also noticed that I don't need to import my class definition if I'm simply pickling from a completely different source file.

Since all I want to do is build a "database" of sorts from my class objects, what's the best way to approach this? I know something that should work is to write a .Save() method that writes a .py source file in which I essentially create an instance of the class, and a .Load() that does exec and eval as appropriate. However, this seems like the worst possible way to do it, so how should I actually do this?

Thanks.

jsbueno
Hawawa
  • Pickle does not "load the methods" - it saves the instance data - in this case, as you return them in `__getstate__`. The code that will unpickle that data has to be able to access the same "Brick" class declaration as the pickling code. – jsbueno Apr 12 '17 at 17:37
  • Could you show an example of code that doesn't do what you expect it to? This sentence: "I can't access either methods or add values to any of the other attributes if I never changed their values." makes no sense to me, I think an example would help. – Tadhg McDonald-Jensen Apr 12 '17 at 17:49
  • @TadhgMcDonald-Jensen for example: I create an instance where I define name and areaMin, and then save it with pickle.dump(). Then, in a different source file, I read that file with pickle.load. The original class has areaMax as well as a calculateORB method; however, I can do neither `MyBrick.areaMax = 800` nor `MyBrick.calculateORB(img)`: both return a "MyBrick has no areaMax attribute" – Hawawa Apr 12 '17 at 19:20
  • @jsbueno How do I do that then? I've tried importing that class on the source file where I'm loading the files, but it seems the code is ignoring that import statement (and PyCharm does show it greyed out like with all other "useless" lines of code). I'm not sure how to tell pickle where to look for the class declaration – Hawawa Apr 12 '17 at 19:22
  • The important thing is that the qualified name of the class - that is: package, subpackages, module and class name - is the same in the pickling and unpickling code. So, if `class Brick` is inside a brick.py file, just ensure that in both files you do `import brick` to have `brick.Brick` available. Nonetheless, I had not actually looked at the contents of your `__getstate__` and `__setstate__` - you are doing it wrong - I will fill in an answer. – jsbueno Apr 12 '17 at 19:43

1 Answer


You should not try to do I/O inside your __getstate__ and __setstate__ methods - those are called by Pickle, and the expected result is just an in-memory object that can be further pickled.
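A minimal sketch of what that can look like, assuming a trimmed-down Brick whose only unpicklable attribute is an open log file. The `_logfile` attribute name, the `area_range` method, and the use of `os.devnull` as a stand-in log are illustrative assumptions, not the asker's actual class:

```python
import os
import pickle


class Brick(object):
    def __init__(self, name, areaMin=None, areaMax=None):
        self.name = name
        self.areaMin = areaMin
        self.areaMax = areaMax
        # Hypothetical attribute: an open file handle cannot be pickled.
        self._logfile = None

    def __getstate__(self):
        # Copy the instance dict and leave the handle out; no I/O happens here.
        state = self.__dict__.copy()
        state["_logfile"] = None
        return state

    def __setstate__(self, state):
        # Restore plain attributes only; the log can be reopened elsewhere.
        self.__dict__.update(state)

    def area_range(self):
        return (self.areaMin, self.areaMax)


original = Brick("red", areaMin=100)
with open(os.devnull, "w") as log:
    original._logfile = log           # stand-in for a real log file
    data = pickle.dumps(original, -1)

restored = pickle.loads(data)
restored.areaMax = 800                # attributes left at None can still be set
print(restored.name, restored.area_range())
```

The unpickled object is an ordinary Brick instance, so its methods come from the class definition that is importable at load time, not from the pickle data.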

Moreover, if your "Point" class in the "self.kp" attribute is just a regular Python class, there is no need for you to customize pickling at all.

What you have to worry about is to deal with the I/O at the point you call Pickle. If you really need to load different instances independently, you could resort to the "shelve" module, or, better yet, use pickle.dumps and store the resulting string in a DBMS (which can be the built-in sqlite).
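As a rough sketch of the pickle.dumps plus sqlite idea, assuming a simplified Brick and a hypothetical `bricks` table keyed by name (the table layout and the `save_brick`/`load_brick` helpers are made up for illustration):

```python
import pickle
import sqlite3


class Brick(object):
    def __init__(self, name, areaMin=None, areaMax=None):
        self.name = name
        self.areaMin = areaMin
        self.areaMax = areaMax


def save_brick(conn, brick):
    # Serialize in memory, then let the database handle the actual I/O.
    blob = pickle.dumps(brick, -1)
    conn.execute(
        "INSERT OR REPLACE INTO bricks (name, data) VALUES (?, ?)",
        (brick.name, blob),
    )
    conn.commit()


def load_brick(conn, name):
    row = conn.execute(
        "SELECT data FROM bricks WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        raise KeyError(name)
    return pickle.loads(row[0])


# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bricks (name TEXT PRIMARY KEY, data BLOB)")
save_brick(conn, Brick("red", areaMin=100, areaMax=800))
save_brick(conn, Brick("blue", areaMin=50))
loaded = load_brick(conn, "blue")
print(loaded.name, loaded.areaMin, loaded.areaMax)
```

Each instance can then be loaded independently by its name, without reading every other stored object.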

All in all:

class Point(object):
    ...

class Brick(object):
    def __init__(self, point, ...):
        self.kp = point

Then, to save a single object to a file:

with open("filename.pickle", "wb") as file_:
    pickle.dump(my_brick, file_, -1)

and restore with:

with open("filename.pickle", "rb") as file_:
    my_brick = pickle.load(file_)

To store several instances and recover them all at once, you could just dump them in sequence to the same open file, and then read them back one by one until pickle.load raises EOFError at the end of the file. Or you can simply add all the objects you want to save to a list and pickle the whole list at once.
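Both variants could be sketched roughly as follows, assuming a simplified Brick and temporary file paths; the `dump_all` and `load_all` helper names are made up for this example:

```python
import os
import pickle
import tempfile


class Brick(object):
    def __init__(self, name, areaMin=None):
        self.name = name
        self.areaMin = areaMin


def dump_all(path, bricks):
    # Write each object as its own pickle, one after another.
    with open(path, "wb") as file_:
        for brick in bricks:
            pickle.dump(brick, file_, -1)


def load_all(path):
    # Keep reading pickles until the end of the file is reached.
    bricks = []
    with open(path, "rb") as file_:
        while True:
            try:
                bricks.append(pickle.load(file_))
            except EOFError:
                break
    return bricks


bricks = [Brick("red", 100), Brick("blue", 50)]
tmp_dir = tempfile.mkdtemp()

# Variant 1: several pickles in sequence in the same file.
stream_path = os.path.join(tmp_dir, "bricks_stream.pickle")
dump_all(stream_path, bricks)
names_from_stream = [b.name for b in load_all(stream_path)]

# Variant 2: a single pickle holding the whole list.
list_path = os.path.join(tmp_dir, "bricks_list.pickle")
with open(list_path, "wb") as file_:
    pickle.dump(bricks, file_, -1)
with open(list_path, "rb") as file_:
    names_from_list = [b.name for b in pickle.load(file_)]
```

The list variant is simpler to read back; the sequential variant lets you append more objects later by opening the file in "ab" mode.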

To save arbitrary objects and retrieve them by some attribute like "name" or "id", you can resort to the shelve module: https://docs.python.org/3/library/shelve.html or use a real database if you need complex queries and such. Trying to write your own ad hoc binary format to allow searching for the required instance is a horrible idea, as you'd have to implement the whole protocol for that file: reading, writing, safeguards, corner cases, and such.
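A small sketch of the shelve approach, assuming a simplified Brick keyed by its name and a shelf stored in a temporary directory (the `bricks_shelf` file name and the `area_range` method are illustrative only):

```python
import os
import shelve
import tempfile


class Brick(object):
    def __init__(self, name, areaMin=None, areaMax=None):
        self.name = name
        self.areaMin = areaMin
        self.areaMax = areaMax

    def area_range(self):
        return (self.areaMin, self.areaMax)


db_path = os.path.join(tempfile.mkdtemp(), "bricks_shelf")

# Store each instance under its name; shelve pickles the values itself.
with shelve.open(db_path) as shelf:
    for brick in (Brick("red", 100, 800), Brick("blue", 50)):
        shelf[brick.name] = brick

# Later, possibly in another program that can import the same Brick class.
with shelve.open(db_path) as shelf:
    blue = shelf["blue"]
    stored_names = sorted(shelf.keys())

print(stored_names, blue.area_range())
```

Note that a shelf returns a fresh copy on each lookup, so after changing a retrieved object you assign it back (`shelf[brick.name] = brick`) to persist the change.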

jsbueno
  • Thanks! The reason I'm doing that bit of code inside `__setstate__` and `__getstate__` is because OpenCV's "Keypoint" elements can't be pickled, I'm basically doing what's shown here http://stackoverflow.com/questions/10045363/pickling-cv2-keypoint-causes-picklingerror - by doing what I was doing with the setstate/getstate I could avoid the error and store keypoints (kp) just fine. In any case, making sure I was in fact importing the right class did work, so thank you for pointing that out! – Hawawa Apr 13 '17 at 14:15