71

I'm learning how to use pickle. I've created a namedtuple object, appended it to a list, and tried to pickle that list. However, I get the following error:

pickle.PicklingError: Can't pickle <class '__main__.P'>: it's not found as __main__.P

I found that if I run the code without wrapping it inside a function, it works perfectly. Is there an extra step required to pickle an object when the code is wrapped inside a function?

Here is my code:

from collections import namedtuple
import pickle

def pickle_test():
    P = namedtuple("P", "one two three four")
    my_list = []
    abe = P("abraham", "lincoln", "vampire", "hunter")
    my_list.append(abe)
    with open('abe.pickle', 'wb') as f:
        pickle.dump(abe, f)
    
pickle_test()
Martijn Pieters
Dirty Penguin
  • Unfortunately, pickle doesn't seem to work well with namedtuples. – Antimony May 04 '13 at 17:48
  • @Antimony: `pickle` handles namedtuple classes just fine; classes defined in a function local namespace not so much. – Martijn Pieters May 04 '13 at 23:37
  • possible duplicate of [Python: Can't pickle type X, attribute lookup failed](http://stackoverflow.com/questions/4677012/python-cant-pickle-type-x-attribute-lookup-failed) – Air Jun 17 '14 at 02:23
  • @AirThomas This question was asked/answered a year ago :) – Dirty Penguin Jun 17 '14 at 20:01
  • That doesn't affect whether it's a duplicate - and now the questions are linked to each other in the sidebar, which is useful. The comment is not meant as a criticism, it's automatically generated when flagging. – Air Jun 17 '14 at 20:18
  • None taken. I just thought it was funny. Question linking is very useful indeed :) – Dirty Penguin Jun 17 '14 at 20:23
  • There is a similar problem. If the type variable and the type string used in the constructor are not the same then pickle will also fail. e.g. `P = namedtuple("Q", "one two three four")` – Andrew Hoos Feb 04 '15 at 22:37
  • For posterity: that error also occurs if the `typename` argument to namedtuple doesn't match the class name returned by namedtuple. See: https://stackoverflow.com/a/28149627/3396951 – Minh Tran Aug 17 '18 at 19:15

5 Answers

96

Create the named tuple outside of the function:

from collections import namedtuple
import pickle

P = namedtuple("P", "one two three four")

def pickle_test():
    my_list = []
    abe = P("abraham", "lincoln", "vampire", "hunter")
    my_list.append(abe)
    with open('abe.pickle', 'wb') as f:
        pickle.dump(abe, f)

pickle_test()

Now pickle can find it; it is a module global now. When unpickling, all the pickle module has to do is locate __main__.P again. In your version, P is a local variable of the pickle_test() function, and locals are not introspectable or importable.

Note that pickle stores just the module and the class name, as taken from the class's __name__ attribute. Make sure that the first argument of the namedtuple() call matches the global variable you are assigning to; P.__name__ must be "P"!

It is important to remember that namedtuple() is a class factory; you give it parameters and it returns a class object for you to create instances from. pickle only stores the data contained in the instances, plus a string reference to the original class to reconstruct the instances again.
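To see the name-based lookup in action, here is a minimal sketch that round-trips a module-level namedtuple through pickle and shows that a function-local one is rejected. The names `make_local_instance` and `Local` are made up for illustration, and the sketch assumes it is run as a script, so that the classes live in `__main__`.

```python
import pickle
from collections import namedtuple

# Module-level class: pickle can find it again as <module>.P
P = namedtuple("P", "one two three four")


def make_local_instance():
    # Hypothetical helper: this class only exists as a local name,
    # so pickle has no module attribute to look it up under.
    Local = namedtuple("Local", "one two")
    return Local("abraham", "lincoln")


abe = P("abraham", "lincoln", "vampire", "hunter")

# Serialize to bytes and back; the class itself is stored only by reference.
restored = pickle.loads(pickle.dumps(abe))
round_trip_ok = restored == abe and type(restored) is P

try:
    pickle.dumps(make_local_instance())
    local_failed = False
except (pickle.PicklingError, AttributeError) as exc:
    # The lookup of "Local" on the module fails, so pickling is refused.
    local_failed = True
    print("local class could not be pickled:", exc)
```

The same reasoning applies at load time: whichever process unpickles the data needs a class reachable under the same module and name.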

Martijn Pieters
  • So, what if I am creating the `namedtuple` dynamically because I don't know the fields until runtime? Is there still a way to bypass this issue? I tried creating another method outside of the class but that didn't work. – Chuim Jun 26 '13 at 19:53
  • @Chuim: Assign it to your module globals (use `globals()` to get a mapping) under the *same name*, and `pickle` can find it still. – Martijn Pieters Jun 26 '13 at 21:12
  • If you are creating multiple namedtuples with dynamic fields, unpickling won't work after you override `P` in `globals`. – Davit Tovmasyan Aug 01 '20 at 19:20
  • @DavitTovmasyan: No, each class needs a separate name. – Martijn Pieters Aug 01 '20 at 19:28
  • It doesn't seem to be working on Python 3.x – facehugger Nov 29 '21 at 05:45
  • @facehugger: the technique applies just as much to Python 3 as it did to Python 2. The _test code_ was specific to Python 2, but you can trivially alter that by replacing `"w"` with `"wb"` (open the file in binary mode). I've edited the question and answer to make that change. – Martijn Pieters Jan 19 '22 at 19:08
  • @facehugger: note that the variable name of the named tuple must match the name passed to the namedtuple function. In this case they are both 'P'. If these names differ, attribute lookup will fail. – MRule Mar 27 '22 at 11:47
  • @Martijn Pieters: could you please edit this question to reflect that the global variable name that the namedtuple is assigned to *must* match the name passed as the first argument in the namedtuple function? In this case they are both `P`. Your edit queue is full, so I can't submit this as an edit. – MRule Mar 27 '22 at 11:49
  • @MRule: it is not a hard requirement, it'll just cause a bit of confusion. `Q = namedtuple('P', ...)` is just the same as `class P: ...`, then `Q = P` and `del P`. Both would make the class unfindable by pickle, of course. – Martijn Pieters Mar 27 '22 at 12:06
  • I ran some tests and I get the "cannot look up attribute" error every time the names differ. Python3? hmm.. – MRule Mar 27 '22 at 21:00
  • @MRule: yes, that's what I meant by *would make the class unfindable by pickle*. Had you added `actual_name = other_name` to your module, with `actual_name` matching the first argument to `namedtuple()`, it would work again. – Martijn Pieters Mar 28 '22 at 11:44
13

I found this answer in another thread. This is all about the naming of the named tuple. This worked for me:

group_t =            namedtuple('group_t', 'field1, field2')  # this will work
mismatched_group_t = namedtuple('group_t', 'field1, field2')  # this will throw the error
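A small sketch of the mismatch, assuming the code runs at module level in a script. It first tries to pickle an instance whose class is bound only to a non-matching variable name, then binds the class under the name that was passed to `namedtuple()` and tries again. The flag `failed_before_alias` is just an illustrative name.

```python
import pickle
from collections import namedtuple

# The class calls itself "group_t", but only "mismatched_group_t" is bound.
mismatched_group_t = namedtuple("group_t", "field1, field2")
item = mismatched_group_t(1, 2)

try:
    pickle.dumps(item)
    failed_before_alias = False
except (pickle.PicklingError, AttributeError):
    # pickle looked for a module attribute named "group_t" and found none.
    failed_before_alias = True

# Bind the same class under the name pickle looks for.
group_t = mismatched_group_t

# Now the lookup of "group_t" resolves to the very same class object.
restored = pickle.loads(pickle.dumps(item))
```

So the variable name itself is not what matters; what matters is that some module-level name equal to the class's `__name__` refers to that class.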
Michael
Ruvalcaba
11

After I added my question as a comment to the main answer I found a way to solve the problem of making a dynamically created namedtuple pickle-able. This is required in my case because I'm figuring out its fields only at runtime (after a DB query).

All I do is monkey patch the namedtuple by effectively moving it to the __main__ module:

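import collections

def _CreateNamedOnMain(*args):
    import __main__
    # Build the class, then expose it as an attribute of __main__ under its own name
    namedtupleClass = collections.namedtuple(*args)
    setattr(__main__, namedtupleClass.__name__, namedtupleClass)
    # Tell pickle to look the class up in __main__
    namedtupleClass.__module__ = "__main__"
    return namedtupleClass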

Mind that the namedtuple name (which is provided by args) might overwrite another member in __main__ if you're not careful.
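A variation on the same idea, sketched with `globals()` instead of `setattr` on `__main__`. The helper `make_record_type`, the type name `Row`, and the column list are all hypothetical; the sketch assumes the helper lives in the same module that creates and pickles the instances, and it refuses to overwrite an existing global.

```python
import pickle
from collections import namedtuple


def make_record_type(typename, field_names):
    """Hypothetical helper: create a namedtuple class at runtime and
    register it in this module's globals under its own name."""
    if typename in globals():
        # Refuse to clobber an existing global; each class needs its own name.
        raise ValueError(f"{typename!r} is already defined in this module")
    cls = namedtuple(typename, field_names)
    globals()[cls.__name__] = cls
    return cls


# Field names only known at runtime, e.g. from a DB query (made up here).
columns = ["first", "name", "job"]
row_type = make_record_type("Row", columns)

row = row_type("abraham", "lincoln", "hunter")

# pickle resolves the class through the module global "Row" set by the helper.
restored = pickle.loads(pickle.dumps(row))
```

A fresh interpreter would need to recreate and register the class under the same name before `pickle.load` can resolve it, since only the reference is stored in the pickle.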

Chuim
  • Simply set it on `globals()` instead: `globals()[namedtupleClass.__name__] = namedtupleClass`. Then there is *no need* to set the `__module__`. – Martijn Pieters Jun 26 '13 at 21:13
  • When I tried `globals()[namedtupleClass.__name__] = namedtupleClass` it did indeed allow me to pickle my object, but when I tried to unpickle it didn't have the `namedtupleClass` it needed. My advice is to **just use a dictionary** until they make pickle smart enough to do this. – Teque5 Jan 20 '17 at 21:55
  • @Teque5 it works as long as the name you pass to `namedtuple()` is unique in the module – Hubert Kario Feb 02 '22 at 13:31
6

Alternatively, you can use cloudpickle or dill for serialization:

from collections import namedtuple

import cloudpickle
import dill



def dill_test(dynamic_names):
    P = namedtuple('P', dynamic_names)
    my_list = []
    abe = P("abraham", "lincoln", "vampire", "hunter")
    my_list.append(abe)
    with open('deleteme.cloudpickle', 'wb') as f:
        cloudpickle.dump(abe, f)
    with open('deleteme.dill', 'wb') as f:
        dill.dump(abe, f)


dill_test("one two three four")
Peque
2

The issue here is that the child processes aren't able to import the class of the object (in this case, the class P). In a multi-module project, the class P should be importable anywhere the child processes are used.

A quick workaround is to make it importable by assigning it to globals():

globals()["P"] = P
rachid el kedmiri