
I have 3 processes that each create a large object, which takes about 6 seconds. However, copying them with getstate and setstate takes more than 60 seconds. Is there a way to make the pickling faster? The objects are Python classes that inherit from a C++ extension built with Boost.Python. I have tried using the highest protocol, as suggested in the post here, since dill is apparently faster than pickle partly because it uses the highest protocol (from this post), but it makes no difference.
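For reference, this is roughly how the protocol comparison can be measured in isolation. It is only a sketch: `make_payload` and `time_roundtrip` are hypothetical helper names, and the payload is a plain Python list standing in for the real object, so the numbers will not match the Boost.Python case.

```python
import pickle
import time


def make_payload(n):
    """Hypothetical stand-in for the large object: a list of (int, float) pairs."""
    return [(i, float(i)) for i in range(n)]


def time_roundtrip(obj, protocol):
    """Time one dumps/loads round trip; return (seconds, size in bytes, restored copy)."""
    start = time.perf_counter()
    data = pickle.dumps(obj, protocol=protocol)
    clone = pickle.loads(data)
    elapsed = time.perf_counter() - start
    return elapsed, len(data), clone


if __name__ == "__main__":
    payload = make_payload(50_000)
    # Compare the oldest, the default, and the highest protocol.
    for protocol in (0, pickle.DEFAULT_PROTOCOL, pickle.HIGHEST_PROTOCOL):
        elapsed, size, _ = time_roundtrip(payload, protocol)
        print(f"protocol {protocol}: {elapsed:.3f} s, {size} bytes")
```

If the timings barely change across protocols for the real object, the cost is probably not in the byte encoding itself.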

Here's the basic structure of the code. Pickle support for the deque and vector types is implemented in another file and is based on this post:

C++ base class:

class Base{
public:
    Base(int a, boost::python::object o);
    // attributes
    vector<Info*>* infos;
    deque<Foo*>* foos;
    int a;
    boost::python::object o;
    // functions
};
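Base::Base(int a, boost::python::object o){
    // assign args
    this->a = a;
    this->o = o;
    // allocate the containers before filling them
    this->infos = new vector<Info*>();
    this->foos = new deque<Foo*>();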
    for(int i = 0; i < 50; i++){
        this->infos->push_back(new Info(i,i));
    }
}
struct base_pickle_suite : boost::python::pickle_suite {
    static boost::python::tuple getinitargs(Base const& base){
        return boost::python::make_tuple(base.a, base.o);
    }
    static boost::python::tuple getstate(boost::python::object obj){
        const Base& base = boost::python::extract<Base&>(obj)();
        return boost::python::make_tuple(base.infos, base.foos, obj.attr("__dict__"));
    }
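    static void setstate(boost::python::object obj, boost::python::tuple state){
        Base& base = boost::python::extract<Base&>(obj)();
        // extract the C++ containers from the Python state tuple
        // (relies on the converters defined in the other file)
        base.infos = boost::python::extract<vector<Info*>*>(state[0]);
        base.foos = boost::python::extract<deque<Foo*>*>(state[1]);
        // restore the Python-side attributes
        boost::python::dict d = boost::python::extract<boost::python::dict>(obj.attr("__dict__"));
        d.update(state[2]);
    }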
    static bool getstate_manages_dict() { return true; }
};

In the python class:

from Base import *
class Derived(Base):
    def __init__(self, a, o):
        super().__init__(a, o)
        self.i = 1
        self.j = 2
    def __getinitargs__(self):
        return (self.a, self.o)
    def __getstate__(self):
        # note: i and j also live in self.__dict__, so they are stored twice
        return (self.infos, self.foos, self.i, self.j, self.__dict__)
    def __setstate__(self, state):
        self.infos = state[0]
        self.foos = state[1]
        self.i = state[2]
        self.j = state[3]
        self.__dict__.update(state[4])
    __getstate_manages_dict__ = True

While everything compiles and runs, the pickling is 2x slower than native Python code. Is there a way to fix this?
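One way to narrow this down is to time the state extraction separately from the serialization. The sketch below assumes the object exposes `__getstate__` as shown above; `FakeDerived` and `split_timing` are hypothetical names, and the stand-in class holds plain Python lists rather than wrapped C++ containers.

```python
import pickle
from time import perf_counter


class FakeDerived:
    """Hypothetical stand-in; the real class wraps C++ containers."""

    def __init__(self, n):
        self.infos = [(i, i) for i in range(n)]
        self.foos = list(range(n))

    def __getstate__(self):
        return (self.infos, self.foos)


def split_timing(obj, protocol=pickle.HIGHEST_PROTOCOL):
    """Time building the state and serializing it as two separate steps."""
    t0 = perf_counter()
    state = obj.__getstate__()  # cost of collecting the state
    t1 = perf_counter()
    data = pickle.dumps(state, protocol=protocol)  # cost of encoding it
    t2 = perf_counter()
    return {"getstate_s": t1 - t0, "dumps_s": t2 - t1, "bytes": len(data)}


if __name__ == "__main__":
    print(split_timing(FakeDerived(100_000)))
```

With the real extension class, the `dumps_s` part would also include whatever the vector and deque pickle support does per element, so a large value there points at those container suites rather than at `Derived` itself.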

Thanks

qwerty_99
  • Short answer: likely no. Longer answer: with self-contained code, people could help you profile/verify – sehe Jul 31 '20 at 21:42
  • would it be faster to use pickle.dumps and pickle.load or is that basically the same thing? – qwerty_99 Aug 02 '20 at 19:28
  • Sounds like the same? However, worth a try. I'll help if I can repro the slowness – sehe Aug 02 '20 at 23:06

0 Answers