
I'm trying to write a function that will write pickle files from a list containing DataFrames. I want to iterate through that list and create a different pickle file, with a different file name, from each DataFrame. I've written this function:

def picklecreator(dflist):
    a=1
    for b in dflist:
        b.to_pickle('filename_' + str(a) + '.pkl')
        a=+1
    return 1

This function only creates the first pickle file, 'filename_1.pkl'. How can I make it work for all the DataFrames in my list?

MaxU - stand with Ukraine
castor

2 Answers


You can do it this way:

def picklecreator(dflist):
    for i, b in enumerate(dflist):
        b.to_pickle(r'd:/temp/filename_{:02d}.pkl'.format(i+1))
    return 1
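Applying that fix, a quick round trip (with tiny example DataFrames, writing to the current directory rather than d:/temp) shows that every DataFrame now gets its own file:

```python
import pandas as pd

def picklecreator(dflist):
    # enumerate supplies a fresh index on each iteration,
    # so every DataFrame gets a distinct file name
    for i, b in enumerate(dflist):
        b.to_pickle('filename_' + str(i + 1) + '.pkl')
    return 1

dflist = [pd.DataFrame({'x': [1, 2]}), pd.DataFrame({'x': [3, 4]})]
picklecreator(dflist)

# each file round-trips back to the original DataFrame
restored = pd.read_pickle('filename_2.pkl')
```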

But I would use an HDF store instead - it's much more flexible and convenient.

Demo:

def save_dflist_hdfs(dflist, file_name_ptrn='d:/temp/data_{:02}.h5', **kwarg):
    for i, df in enumerate(dflist):
        df.to_hdf(file_name_ptrn.format(i+1), 'df{:02d}'.format(i+1), **kwarg)
    return len(dflist)    

Then you can call it like this:

save_dflist_hdfs(dflist, r'd:/temp/data_{:02}.h5', format='t',
                 complib='blosc', complevel=5)
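As a side note, the '{:02}' format spec zero-pads the index, which keeps the generated file names in numeric order even when sorted as plain strings:

```python
# zero-padded indices sort lexicographically in the same order as numerically;
# without the padding, 'data_10' would sort before 'data_2'
names = ['data_{:02d}.h5'.format(i + 1) for i in range(12)]
print(names[0], names[9], names[11])
```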
MaxU - stand with Ukraine
  • Awesome!! Thanks MaxU! Is HDF store more for super big files? Or do you use it whatever the file size is due to its flexibility and convenience? – castor Oct 25 '16 at 20:39
  • @castor, glad i could help. I'm using HDF because it's very flexible and convenient, especially [its ability to query data conditionally](http://stackoverflow.com/a/38401560/5741205), to compress files and because of its [speed](http://stackoverflow.com/questions/37010212/what-is-the-fastest-way-to-upload-a-big-csv-file-in-notebook-to-work-with-python/37012035#37012035) – MaxU - stand with Ukraine Oct 25 '16 at 20:42

I think the problem lies in the line a=+1, which assigns positive 1 to a on every iteration instead of incrementing it, so every DataFrame is written to the same file. You should write a += 1 instead.
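A quick check makes the difference visible: a=+1 is parsed as an assignment of the unary-plus expression +1, while a += 1 is the augmented assignment that actually increments:

```python
a = 1
a = +1   # unary plus: a is reassigned the value 1, not incremented
first = a

a = 1
a += 1   # augmented assignment: a becomes 2
second = a

print(first, second)
```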

wsw