
I am trying to dump a dictionary to a file in pickle format, using the `dump` function provided by Python's `pickle` module. The dictionary is around 150 MB when pickled, but an exception occurs when only 115 MB of the file has been written. The exception is:

Traceback (most recent call last): 
  File "C:\Python27\generate_traffic_pattern.py", line 32, in <module> 
    b.dump_data(way_id_data,'way_id_data.pickle') 
  File "C:\Python27\class_dump_load_data.py", line 8, in dump_data 
    pickle.dump(data,saved_file) 
  File "C:\Python27\lib\pickle.py", line 1370, in dump 
    Pickler(file, protocol).dump(obj) 
  File "C:\Python27\lib\pickle.py", line 224, in dump 
    self.save(obj) 
  File "C:\Python27\lib\pickle.py", line 286, in save 
    f(self, obj) # Call unbound method with explicit self 
  File "C:\Python27\lib\pickle.py", line 649, in save_dict 
    self._batch_setitems(obj.iteritems()) 
  File "C:\Python27\lib\pickle.py", line 663, in _batch_setitems 
    save(v) 
  File "C:\Python27\lib\pickle.py", line 286, in save 
    f(self, obj) # Call unbound method with explicit self 
  File "C:\Python27\lib\pickle.py", line 600, in save_list 
    self._batch_appends(iter(obj)) 
  File "C:\Python27\lib\pickle.py", line 615, in _batch_appends 
    save(x) 
  File "C:\Python27\lib\pickle.py", line 286, in save 
    f(self, obj) # Call unbound method with explicit self 
  File "C:\Python27\lib\pickle.py", line 599, in save_list 
    self.memoize(obj) 
  File "C:\Python27\lib\pickle.py", line 247, in memoize 
    self.memo[id(obj)] = memo_len, obj 
MemoryError

I am really confused, since the same code was working fine earlier.

tanzil
  • This is not specific to Pickle. Python requested more memory from the OS to store some more objects, and the OS told Python that there is no more memory available for the process. This error could have happened anywhere in your code. – Martijn Pieters May 06 '13 at 17:09
  • To test the code, I even tried to load the same pickle file (which I had dumped earlier) and then tried to dump it again, and strangely I am getting the same exception. – tanzil May 06 '13 at 17:10
  • How do I resolve this issue? And why was it working earlier? – tanzil May 06 '13 at 17:11
  • When pickling, the same loop keeps creating objects as needed, so it could be that the same location triggers the same exception, yes. Apparently, there was either more memory available earlier on your machine, or your Python session used less memory in other locations, and did not hit the limits set by the OS. – Martijn Pieters May 06 '13 at 17:11
  • If this is a systemic problem for your script, your options are to get more memory, lift the OS per-process restrictions (using `ulimit`, perhaps), or profile your application's memory use (see [Which Python memory profiler is recommended?](http://stackoverflow.com/q/110259)). – Martijn Pieters May 06 '13 at 17:14
  • Is it possible to increase the memory limit of the OS? I am using Windows XP. – tanzil May 06 '13 at 17:15
  • I don't know Windows XP all that well; I don't think there is a per-process limit by default, or anything you can tweak. That leaves you with 'get more memory' (by adding physical memory or closing other programs so more memory is available) and 'profile your application'. – Martijn Pieters May 06 '13 at 17:23
  • This is pretty much specific to pickle. It memoizes heavily in order to handle recursive object situations, as illustrated below... – Vajk Hermecz Aug 27 '13 at 08:58
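
A minimal example (not from the thread) of why that memoization exists: without the memo, pickling a self-referential object would recurse forever.

import pickle

d = {}
d['self'] = d            # a dictionary that refers to itself
data = pickle.dumps(d)   # works because the pickler memoizes d and
                         # writes a back-reference instead of recursing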

2 Answers


Are you dumping just that one object, and that's all?

If you are calling dump many times, then calling Pickler.clear_memo() between dumps will flush the internally stored back-references (which are what cause the 'leak'), and your code should work fine.
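
A minimal sketch of that approach (the `chunks` iterable and the file name are assumptions for illustration; the relevant part is the `clear_memo()` call between dumps):

import pickle

with open('way_id_data.pickle', 'wb') as saved_file:
    pickler = pickle.Pickler(saved_file, pickle.HIGHEST_PROTOCOL)
    for chunk in chunks:        # hypothetical iterable of objects to dump one by one
        pickler.dump(chunk)
        pickler.clear_memo()    # drop memoized back-references before the next dump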

Vajk Hermecz

Have you tried this?

import cPickle as pickle         # cPickle is the C implementation of pickle (Python 2)

with open("temp.p", "wb") as f:  # context manager ensures the file is closed
    p = pickle.Pickler(f)
    p.fast = True                # fast mode skips the memo table entirely
    p.dump(d)                    # d is your dictionary
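
(One caveat, from the pickle documentation rather than this thread: fast mode disables memoization entirely, so it avoids the memo growth seen in the traceback above, but it cannot handle self-referential data; pickling such a structure in fast mode will recurse without terminating.)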
richie