I'm currently using pickle to save some big data containing many numpy matrices of size 10k*10k. Even though I use several similar (separate) Python files, whenever I save the data, the saved .dat file is always 4 GB in size. Is that just a coincidence, or can pickle not save more than that amount?
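
For reference, here is a stripped-down sketch of how I save the data (the arrays below are random placeholders for my real matrices, and the file name is made up):

    import pickle
    import numpy as np

    # Placeholders for the real data: each 10000x10000 float64 array
    # is ~800 MB, so a handful of them add up to several GB.
    data = [np.random.rand(10000, 10000) for _ in range(2)]

    # Pickle protocols 3 and below cannot serialize objects larger than
    # 4 GiB; protocol 4 (the default on Python 3.8) lifts that limit.
    with open("data.dat", "wb") as f:
        pickle.dump(data, f, protocol=4)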

Also, when I load the data it uses more than 90% of my memory, which is not workable for me. I have heard of cPickle and joblib; here is a comparison of them: What are the different use cases of joblib versus pickle?

I would like to reduce the memory usage. Should I switch to joblib? What would be the fastest way?
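
In case it matters, this is the kind of joblib usage I have in mind (file name made up); as far as I understand, mmap_mode makes joblib return memory-mapped arrays instead of copying everything into RAM:

    import joblib
    import numpy as np

    data = [np.random.rand(10000, 10000) for _ in range(2)]  # placeholder arrays

    # Write uncompressed, since mmap_mode has no effect on compressed files.
    joblib.dump(data, "data.joblib")

    # Arrays come back as memory-mapped views, paged in from disk on
    # demand rather than loaded into RAM all at once.
    loaded = joblib.load("data.joblib", mmap_mode="r")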

Thanks for any suggestions.

P.S. I use Python 3.8 on Ubuntu 20.04 with the Spyder IDE.

Lynx
  • Can you give more detailed information on this data, or better, an example? – max9111 Apr 10 '20 at 12:09
  • It consists of some arrays, maybe 6-7 of them, which have dimensions 10000*10000. What else do you need? – Lynx Apr 10 '20 at 15:26
  • Pickle and Joblib are convenient, but slow, and they produce quite large files. If your data is, for example, only a list of arrays, adapting an answer like this https://stackoverflow.com/a/56761075/4045774 can be a good solution (see the sketch after these comments). – max9111 Apr 10 '20 at 15:59
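
A minimal sketch of the direction max9111 suggests, assuming the data really is just a list of arrays (this uses plain numpy, not the exact code from the linked answer):

    import numpy as np

    data = [np.random.rand(10000, 10000) for _ in range(2)]  # placeholder arrays

    # Save each array in raw .npy form: np.save writes the buffer
    # directly, without pickle's per-object overhead.
    for i, arr in enumerate(data):
        np.save(f"matrix_{i}.npy", arr)

    # Load back memory-mapped; nothing is read into RAM until sliced.
    loaded = [np.load(f"matrix_{i}.npy", mmap_mode="r") for i in range(2)]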

0 Answers