So, I've been pickling multiple objects to the same pickle file with this kind of code:

import pickle

for i in range(x):
    with open('file', 'ab') as f:
        pickle.dump(data, f)

So far everything works fine. You can read it back using:

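data = []
with open('file', 'rb') as f:
    while True:
        try:
            # each load() returns one dumped object; extend() assumes it is a list
            data.extend(pickle.load(f))
        except EOFError:  # no more pickled objects in the file
            break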

The problem is that if I re-run the first part of the code, I can only access what was written to the file the first time. I suppose it may have something to do with EOF, but that really shouldn't be the case if we are opening the file in 'append' mode. The pickle documentation is also beyond useless. My question is: how do I read from a pickled file if it was appended to from multiple scripts?
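For reference, here is a minimal sketch of the append-then-read-back round trip described above. The helper names `append_records` and `load_all` and the temporary `demo.pkl` path are hypothetical, and it assumes every dumped object is a list, as `data.extend(...)` implies. It catches only `EOFError`, so any other failure would surface instead of silently ending the loop.

```python
import os
import pickle
import tempfile


def append_records(path, records):
    """Append one pickled object to the file at `path`."""
    with open(path, "ab") as f:
        pickle.dump(records, f)


def load_all(path):
    """Read every pickled object in the file, one load() call per dump()."""
    items = []
    with open(path, "rb") as f:
        while True:
            try:
                items.extend(pickle.load(f))
            except EOFError:  # raised once no pickled objects are left
                break
    return items


# Hypothetical demo: two separate "runs" appending to the same file.
path = os.path.join(tempfile.mkdtemp(), "demo.pkl")
append_records(path, [1, 2])
append_records(path, [3, 4])
combined = load_all(path)
```

If a loop like this stops early in practice, narrowing the `except` clause is probably the quickest way to see which exception is actually being raised.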

  • Are you certain that pickle can read from a file with multiple "entries"? I don't know how exactly it works, but does it save the length of each entry at the beginning? – Haytam Jul 31 '20 at 11:18
  • Not sure. I think it saves the new entry each time you dump something. Maybe one can't do this. It's just a bit mind-boggling because writing works just fine. There is obviously a simple solution, saving to a new file every time, but it is annoying. – Mateusz Puto Jul 31 '20 at 11:24
  • Can I ask why you need to try to put multiple pickle calls to store your data like this? Wouldn't HDF5 be a better solution if you have data that doesn't fit in memory? [This question](https://stackoverflow.com/questions/37928794/which-is-faster-for-load-pickle-or-hdf5-in-python) might help – Magnus Berg Sletfjerding Jul 31 '20 at 11:27
  • I wouldn't use append mode but read all data into memory, add the new value and save all data again. Probably (similar to JSON) it uses a structure which can't be appended to an existing file, so you have to recreate the full file. – furas Jul 31 '20 at 11:37
  • I think pickle is fine. I'm not intending to store huge amounts of data. Thanks for the help anyway. I'll use multiple files, although I still think the question of why pickle works this way is relevant. – Mateusz Puto Jul 31 '20 at 11:41
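
The read-everything, add, and rewrite approach suggested in the comments could be sketched roughly as follows. This is only an illustration: the helper name `add_and_rewrite` and the temporary `store.pkl` path are hypothetical, and it assumes the file always holds a single pickled list.

```python
import os
import pickle
import tempfile


def add_and_rewrite(path, new_items):
    """Load the stored list (if any), extend it, and rewrite the whole file."""
    stored = []
    if os.path.exists(path):
        with open(path, "rb") as f:
            stored = pickle.load(f)
    stored.extend(new_items)
    with open(path, "wb") as f:  # "wb" replaces the file with a single pickle
        pickle.dump(stored, f)
    return stored


# Hypothetical demo: two successive updates to the same file.
store_path = os.path.join(tempfile.mkdtemp(), "store.pkl")
add_and_rewrite(store_path, ["a"])
add_and_rewrite(store_path, ["b", "c"])
with open(store_path, "rb") as f:
    final = pickle.load(f)
```

The trade-off is that every update rewrites the whole file, which is fine for small data but gets slower as the stored list grows.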
