I need to write and read a huge pandas DataFrame. I am currently using the pickle format:
- .to_pickle to write the DataFrame to a pickle file
- read_pickle to read the pickle file back
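For reference, a minimal sketch of the round trip I am timing. The DataFrame here is a small random stand-in for the real data, and the file name and column names are made up for illustration:

```python
import os
import tempfile
import time

import numpy as np
import pandas as pd

# Hypothetical stand-in for the real DataFrame (the real file is about 2 GB).
df = pd.DataFrame(
    np.random.rand(100_000, 10),
    columns=[f"col_{i}" for i in range(10)],
)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "data.pkl")  # hypothetical file name

    # Write the DataFrame to a pickle file.
    df.to_pickle(path)

    # Read it back and time only the read step.
    start = time.perf_counter()
    loaded = pd.read_pickle(path)
    elapsed = time.perf_counter() - start

print(f"read_pickle took {elapsed:.3f} s for {len(loaded):,} rows")
```

The timing covers only the read_pickle call, which is the step that is slow for me on the real file.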
I have a couple of issues when the pickle file is huge (2 GB in this case):
- Read speed is very slow (23 seconds to read the data)
- Increasing the RAM or the number of cores in the VM does not improve the speed
How can I read it faster? Is there another format that is much faster? Can I use parallel processing or multiple cores to speed up the read?