
I'm trying to write a function that will write pickle files from a list containing DataFrames. I want to iterate through that list and create a different pickle file, with a different file name, from each DataFrame. I've written this function:

def picklecreator(dflist):
    a=1
    for b in dflist:
        b.to_pickle('filename_' + str(a) + '.pkl')
        a=+1
    return 1

This function only creates the first pickle file, 'filename_1.pkl'. How can I make it work for all the DataFrames in my list?

MaxU - stand with Ukraine
castor

2 Answers


You can do it this way:

def picklecreator(dflist):
    for i, b in enumerate(dflist):
        b.to_pickle(r'd:/temp/filename_{:02d}.pkl'.format(i+1))
    return 1
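Applying that fix, a quick round trip (with tiny example DataFrames, writing to the current directory rather than d:/temp) shows that every DataFrame now gets its own file:

```python
import pandas as pd

def picklecreator(dflist):
    # enumerate supplies a fresh index on each iteration,
    # so every DataFrame gets a distinct file name
    for i, b in enumerate(dflist):
        b.to_pickle('filename_' + str(i + 1) + '.pkl')
    return 1

dflist = [pd.DataFrame({'x': [1, 2]}), pd.DataFrame({'x': [3, 4]})]
picklecreator(dflist)

# each file round-trips back to the original DataFrame
restored = pd.read_pickle('filename_2.pkl')
```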

But I would use an HDF store instead - it's much more flexible and convenient.

Demo:

def save_dflist_hdfs(dflist, file_name_ptrn='d:/temp/data_{:02}.h5', **kwarg):
    for i, df in enumerate(dflist):
        df.to_hdf(file_name_ptrn.format(i+1), 'df{:02d}'.format(i+1), **kwarg)
    return len(dflist)    

Then you can call it like this:

save_dflist_hdfs(dflist, r'd:/temp/data_{:02}.h5', format='t',
                 complib='blosc', complevel=5)
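As a side note, the '{:02}' format spec zero-pads the index, which keeps the generated file names in numeric order even when sorted as plain strings:

```python
# zero-padded indices sort lexicographically in the same order as numerically;
# without the padding, 'data_10' would sort before 'data_2'
names = ['data_{:02d}.h5'.format(i + 1) for i in range(12)]
print(names[0], names[9], names[11])
```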
MaxU - stand with Ukraine
  • Awesome!! Thanks MaxU! Is HDF store more for super big files? Or do you use it whatever the file size is due to its flexibility and convenience? – castor Oct 25 '16 at 20:39
  • @castor, glad i could help. I'm using HDF because it's very flexible and convenient, especially [its ability to query data conditionally](http://stackoverflow.com/a/38401560/5741205), to compress files and because of its [speed](http://stackoverflow.com/questions/37010212/what-is-the-fastest-way-to-upload-a-big-csv-file-in-notebook-to-work-with-python/37012035#37012035) – MaxU - stand with Ukraine Oct 25 '16 at 20:42

I think the problem lies in the line a=+1, which assigns positive 1 to a on every iteration instead of incrementing it, so every DataFrame is written to the same file. You should write a += 1 instead.
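A quick check makes the difference visible: a=+1 is parsed as an assignment of the unary-plus expression +1, while a += 1 is the augmented assignment that actually increments:

```python
a = 1
a = +1   # unary plus: a is reassigned the value 1, not incremented
first = a

a = 1
a += 1   # augmented assignment: a becomes 2
second = a

print(first, second)
```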

wsw