
My code is currently structured as follows.

In script1.py, I have the following:

def func():
    class MyClass():
        pass
    return MyClass

In script2.py, I have the following:

import script1
import pickle

the_class = script1.func()

f = open(FILE_PATH, "wb")
pickle.dump(the_class, f)
f.close()

However, script2.py gives an error:

AttributeError: Can't pickle local object 'func.<locals>.MyClass'

Question #1: Why is this occurring? How can I fix this?

Question #2: Can I restructure the code to achieve the same result in a neater way? I want the definition of my class to occur in a separate module. Once I "bring in" the class into my main script, I would like to save it.

Thank you!

dstivd
  • Possible duplicate of [I can "pickle local objects" if I use a derived class?](https://stackoverflow.com/questions/36994839/i-can-pickle-local-objects-if-i-use-a-derived-class) – Ashraff Ali Wahab Nov 14 '19 at 18:44
  • See [What can be pickled and unpickled](https://docs.python.org/3/library/pickle.html#what-can-be-pickled-and-unpickled). – chepner Nov 14 '19 at 18:46
  • Definitely a duplicate of the link posted by @AshraffAliWahab. I voted accordingly. – exhuma Nov 14 '19 at 18:47
  • @Carcigenicate: I tried dill (by replacing ```import pickle``` with ```import dill as pickle```). It still doesn't work. Similar error message. – dstivd Nov 14 '19 at 18:47

2 Answers


To respond to question 1: This happens because every call to the function creates a new class, so the class is not reachable by name at the module level. `pickle` requires that, because it stores a class by reference (its module and qualified name) rather than by value. I don't see a way to fix this with plain `pickle`.
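
A minimal sketch of that by-reference behaviour, using only the standard library. `OrderedDict` is just a stand-in for any class that lives at module level, and `func` mirrors the function from the question:

```python
import pickle
from collections import OrderedDict


def func():
    class MyClass:
        """Defined inside a function, so it has no importable name."""
    return MyClass


# A module-level class pickles fine: the payload holds only its
# module and name, and loading simply looks the class up again.
payload = pickle.dumps(OrderedDict)
restored = pickle.loads(payload)
print(b"OrderedDict" in payload, restored is OrderedDict)

# A local class cannot be looked up by name, so pickling it fails.
# The exact exception type may differ between Python versions.
local_error = None
try:
    pickle.dumps(func())
except (AttributeError, pickle.PicklingError) as exc:
    local_error = exc
    print(type(exc).__name__, exc)
```

The second call fails for the same reason as in the question: there is no module attribute that resolves to `func.<locals>.MyClass` when the pickle is loaded.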

To tackle the second question, it would be good to know why you want to do this. What is the problem you want to solve? Knowing that might help guide us to a solution.

exhuma
  • Thanks. The class I want to define depends on certain parameters (passed to ```func(...)```), and I don't want the class definition taking space in my main script. By calling ```script1.func(params)```, I can quickly define the class. Once the class is defined, I want to save it so that I can reuse it later without redefining it. – dstivd Nov 14 '19 at 18:52
  • That is definitely a case which is not possible with `pickle`. If you're adventurous you could convert it to an AST and save that. But I think the following answer (from the duplicate question mentioned above) may be worth a shot: https://stackoverflow.com/a/37002397/160665 If that does not solve it, it may be worth posting a new question which is more targeted at the actual issue. – exhuma Nov 14 '19 at 18:55
  • I've had a look at the link. Using ```dill``` doesn't help. The error message is ```_pickle.PicklingError: Can't pickle <class 'script1.func.<locals>.MyClass'>: it's not found as script1.func.<locals>.MyClass```. Any idea why? Shouldn't ```dill``` just be saving the class definition? – dstivd Nov 14 '19 at 19:02
  • ok. I'll take a stab at it. But I don't think it will be *exactly* like you want. But maybe it will still be useful for you. – exhuma Nov 14 '19 at 20:01
  • ... giving this some thought... your example is not sufficient to formulate an answer. This *highly* depends on exactly how dynamic your class will be and what kind of attributes you want to attach. I can see a way to solve this by pickling more basic data types instead of a whole class. But without more details it's impossible to give a good answer. – exhuma Nov 14 '19 at 20:23
  • For my particular case, I have now rewritten the code and I am passing parameters to ```__init__``` while just having one class. Instead of saving the class, I just save the parameters I pass to ```__init__```. This is fine for now, but I will eventually need to define the class more dynamically (e.g. very different methods inside the class) and it might not be possible to keep it all as one parameterized class... – dstivd Nov 14 '19 at 20:30
  • That sounds like it's not something possible using `pickle`. Another alternative might be to just save Python code and evaluate those using [compile()](https://docs.python.org/3/library/functions.html#compile) and [exec()](https://docs.python.org/3/library/functions.html#exec). But those have security considerations. – exhuma Nov 14 '19 at 20:35
  • Thanks a lot! I'll have a look at them; haven't used those before. Also, I'm not attached to using ```pickle``` in this scenario. I essentially just want to pack away the chunk of code defining my class and replace it with a short call to a function (or something similar) that returns the requested class definition. Then I want to be able to access that same class definition later. – dstivd Nov 14 '19 at 20:43
  • If you go the `compile`/`exec` route and the generated code does not come from (or include) user input, then it's actually safe to use. So for your use case that might be doable. But I still wonder whether storing **code** hints at an architectural issue in your application. That really depends on your specific case. It might still be a good decision to do this. – exhuma Nov 14 '19 at 20:48
  • My case actually has to do with statistical analysis, so there is no user input. My main concerns are efficient and readable code. So perhaps ```compile``` or ```exec``` is a good choice here, as I am not concerned about security here. – dstivd Nov 14 '19 at 20:57
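
The `compile`/`exec` idea from the comments above could be sketched roughly as follows. This is only an illustration under assumptions: `CLASS_TEMPLATE`, `make_source`, `build_class`, the `scale` parameter and the `apply` method are all hypothetical names, not part of the original code. The idea is to save the class source as a string (which always pickles) and rebuild the class from it later. As noted in the comments, this is only reasonable when the source text is trusted.

```python
import pickle

# Hypothetical source template; `scale` stands in for the real parameters.
CLASS_TEMPLATE = '''
class MyClass:
    SCALE = {scale!r}

    def apply(self, x):
        return x * self.SCALE
'''


def make_source(scale):
    """Return the class definition as text for the given parameter."""
    return CLASS_TEMPLATE.format(scale=scale)


def build_class(source, name="MyClass"):
    """Compile and run the source in a fresh namespace, then return the class."""
    namespace = {}
    exec(compile(source, "<generated>", "exec"), namespace)
    return namespace[name]


# Save the source text instead of the class object.
source = make_source(scale=3)
payload = pickle.dumps(source)

# Later: load the text back and rebuild an equivalent class.
restored_class = build_class(pickle.loads(payload))
result = restored_class().apply(2)
print(result)
```

The rebuilt class is a new object with the same definition, not the identical class, so `isinstance` checks against the original class will not match.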

I'm the dill author. I'm not sure why you say dill can't help you. First if we try to serialize your class, it works:

>>> def func():
...   class MyClass():
...     pass
...   return MyClass
... 
>>> import dill
>>> c = func()
>>> c
<class '__main__.func.<locals>.MyClass'>
>>> 
>>> dill.dumps(c)
b'\x80\x03cdill._dill\n_create_type\nq\x00(cdill._dill\n_load_type\nq\x01X\x04\x00\x00\x00typeq\x02\x85q\x03Rq\x04X\x07\x00\x00\x00MyClassq\x05h\x01X\x06\x00\x00\x00objectq\x06\x85q\x07Rq\x08\x85q\t}q\n(X\n\x00\x00\x00__module__q\x0bX\x08\x00\x00\x00__main__q\x0cX\x07\x00\x00\x00__doc__q\rNutq\x0eRq\x0f.'
>>> c_ = dill.loads(_)
>>> c_
<class '__main__.MyClass'>
>>> 

So, since we know the class serializes, we can then fetch it from your module. Here's the catch: if you try to serialize anything defined in a module that isn't installed (i.e. a script you just import from the current directory), it should fail because the serializer cannot find the locally defined module (unless it's __main__). The module is the issue. If you are opposed, for whatever reason, to installing the module, you can resort to a lesser-known feature of dill, which is to grab source code from an object. As seen below, you can then either grab the class and define it locally, or grab the function and define it locally. Then it will serialize:

>>> import dill
>>> import script1
>>> exec(dill.source.getsource(script1.func(), lstrip=True))
>>> MyClass
<class '__main__.MyClass'>
>>> # or...
>>> exec(dill.source.getsource(script1.func))
>>> c = func()
>>> 
>>> dill.dumps(c)
b'\x80\x03cdill._dill\n_create_type\nq\x00(cdill._dill\n_load_type\nq\x01X\x04\x00\x00\x00typeq\x02\x85q\x03Rq\x04X\x07\x00\x00\x00MyClassq\x05h\x01X\x06\x00\x00\x00objectq\x06\x85q\x07Rq\x08\x85q\t}q\n(X\n\x00\x00\x00__module__q\x0bX\x08\x00\x00\x00__main__q\x0cX\x07\x00\x00\x00__doc__q\rNutq\x0eRq\x0f.'
>>> dill.loads(_)
<class '__main__.MyClass'>
>>> 

The thing to do is not to try to serialize from a module that is not installed. Either install the module you will import from, or put the function that generates the class inside the same file as the script that is dumping the class.

Mike McKerns