
I would like to make a small change to a popular Python library so that a library I wrote can work with it.

Specifically, the scikit-learn library has the following code snippet in this file:

from pickle import whichmodule
try:
    # Python 2 compat
    from cPickle import loads
    from cPickle import dumps
except ImportError:
    from pickle import loads
    from pickle import dumps
    import copyreg

# Customizable pure Python pickler in Python 2
# customizable C-optimized pickler under Python 3.3+
from pickle import Pickler

from pickle import HIGHEST_PROTOCOL

I would like to change it to this:

from pickle import whichmodule
from dill import loads
from dill import dumps
import copyreg

# Customizable pure Python pickler in Python 2
# customizable C-optimized pickler under Python 3.3+
from dill import Pickler
from dill import HIGHEST_PROTOCOL

Currently, I am changing the file manually and it works.

Ideally, I could save this one changed file in my own repo (a library I wrote that uses scikit-learn), so that when my library imports scikit-learn, the import picks up my updated version of the file instead of the standard one. Is there a way to do that?

This question was helpful for faking out a local import of pickle, but it does not cover scikit-learn's own import of pickle.

Jason Sanchez
  • That file you link to is itself taken from [joblib](https://github.com/joblib/joblib/blob/master/joblib/pool.py). – BrenBarn Feb 26 '17 at 07:04
  • An alternative is to try monkey-patching the module: import it, then do `sklearn.externals.joblib.pool.loads = dill.loads`, etc. – BrenBarn Feb 26 '17 at 07:06
  • Yes and it is used in scikit-learn for various functions that I in turn use. I do not use this file directly. – Jason Sanchez Feb 26 '17 at 07:06
  • @BrenBarn I will try that right now. – Jason Sanchez Feb 26 '17 at 07:07
  • Tried it, but got this error: `PicklingError: Pickler.__init__() was not called by CustomizablePickler.__init__()` (I can post the full stack trace if that would help). – Jason Sanchez Feb 26 '17 at 07:15
  • Ah, it seems that `pool.py` defines its own class that inherits from the real `Pickler`. You could try copying that class to your own module (so that it inherits from `dill.Pickler`) and then monkeypatching it in as well. – BrenBarn Feb 26 '17 at 07:36
  • Incidentally, how have you installed `sklearn`? If you are installing it from source, you could use `pip install -e` to make an "editable" install. Then if you modify the source, it will work as you suggest in your question. The only issue is that you have to be careful if you upgrade (since it may wipe out your changes, or at least require you to rebase on the updated version). – BrenBarn Feb 26 '17 at 07:38
  • It is installed as part of Anaconda. The install is editable (I changed the file and it works). I just wanted my library to not require a user to manually change files. – Jason Sanchez Feb 26 '17 at 07:43
  • Ah, I didn't realize that by "my repo" you meant "my own library that uses sklearn". I think the monkeypatching route is the way to go, although it will never be a fully stable solution because you can't control what happens if in the future `sklearn` changes in a way that breaks your modifications. – BrenBarn Feb 26 '17 at 20:39
  • I updated the question to clarify that point. Thanks for letting me know it is confusing. – Jason Sanchez Feb 26 '17 at 23:52
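
The monkey-patching route discussed in the comments can be sketched roughly as follows. This is only an illustration under stated assumptions, not a tested fix for scikit-learn: `fake_pool` is a hypothetical stand-in for a module like `pool.py` that binds the pickle names at import time, `backend` is a hypothetical stand-in for `dill`, and `patch_pickle_names` is a helper name made up for this example. It shows why rebinding `loads` and `dumps` takes effect, while a subclass such as `CustomizablePickler` keeps its original base class and has to be redefined and patched in separately.

```python
import pickle
import types

# Hypothetical stand-in for a third-party module such as pool.py:
# it binds the pickle names into its own namespace at import time.
FAKE_POOL_SOURCE = """
from pickle import loads, dumps, Pickler, HIGHEST_PROTOCOL

class CustomizablePickler(Pickler):
    pass

def roundtrip(obj):
    return loads(dumps(obj, HIGHEST_PROTOCOL))
"""
fake_pool = types.ModuleType("fake_pool")
exec(FAKE_POOL_SOURCE, fake_pool.__dict__)

# Hypothetical stand-in for dill: the same four names, but dumps counts
# its calls so we can see that the patched function is really used.
calls = {"dumps": 0}


def counting_dumps(obj, protocol=None):
    calls["dumps"] += 1
    return pickle.dumps(obj, protocol)


class BackendPickler(pickle.Pickler):
    pass


backend = types.SimpleNamespace(
    loads=pickle.loads,
    dumps=counting_dumps,
    Pickler=BackendPickler,
    HIGHEST_PROTOCOL=pickle.HIGHEST_PROTOCOL,
)

PATCHED_NAMES = ("loads", "dumps", "Pickler", "HIGHEST_PROTOCOL")


def patch_pickle_names(target, source, names=PATCHED_NAMES):
    """Rebind `names` on module `target` to the objects found on `source`.

    Returns the previous objects so the patch can be undone later.
    """
    previous = {name: getattr(target, name) for name in names}
    for name in names:
        setattr(target, name, getattr(source, name))
    return previous


previous = patch_pickle_names(fake_pool, backend)

# Functions look up module globals at call time, so they see the patch.
result = fake_pool.roundtrip({"a": 1})

# A class keeps the base it was defined with; rebinding the name
# `Pickler` in the module does not change the existing subclass.
stale_bases = fake_pool.CustomizablePickler.__bases__

# So the subclass is redefined on top of the new base and patched in too.
class CustomizablePickler(backend.Pickler):
    pass


fake_pool.CustomizablePickler = CustomizablePickler
```

In the setting of the question, the target would presumably be `sklearn.externals.joblib.pool` and the source would be `dill`, with the real `CustomizablePickler` body copied into your own module as suggested above. As noted in the comments, this stays fragile: a future `sklearn` release can change the module in a way that breaks the patch.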

0 Answers