I want to load a pickle file of size 4.23 GB. I use the code below to load the data:

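import pickle  # in Python 3 this uses the C implementation (_pickle) when available


def read_pickle(file):
    """Load and return the object stored in a pickle file, or None on failure."""
    try:
        with open(file, "rb") as input_file:
            return pickle.load(input_file)
    except Exception as e:
        # Report the failure; the function then implicitly returns None.
        print("Error in reading data from pickle file:", e)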

System configuration: 16 cores, 32 GB RAM

Output:

%time data=read_pickle(file)

CPU times: user 5.79 s, sys: 1.21 s, total: 7 s
Wall time: 7 s

Multiple users call this code, so I want to load the file once and reuse the returned data every time. Is there any way to map this file to disk so that it is not loaded every time, or to reduce the loading time?

2 Answers


What kind of data is stored in the file? If it contains only plain data, I suggest finding an alternative to pickle. If it contains class instances or other objects, try customizing the dunder methods __getstate__ and __setstate__ so that useless data (raw data, temporary data structures and so on) is not stored.
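As a rough illustration of the __getstate__/__setstate__ idea, here is a minimal sketch. The Dataset class and its _index attribute are hypothetical and stand in for whatever derived or temporary data your own objects carry; it assumes that data can be rebuilt from what remains in the pickle.

```python
import pickle


class Dataset:
    """Hypothetical class: keeps the rows, drops a derived lookup table when pickled."""

    def __init__(self, rows):
        self.rows = rows
        self._index = self._build_index()  # derived data, can be rebuilt from rows

    def _build_index(self):
        return {row: position for position, row in enumerate(self.rows)}

    def __getstate__(self):
        # Copy the instance dict and leave out what does not need to be stored.
        state = self.__dict__.copy()
        del state["_index"]
        return state

    def __setstate__(self, state):
        # Restore the stored attributes, then rebuild the dropped one.
        self.__dict__.update(state)
        self._index = self._build_index()


original = Dataset(["a", "b", "c"])
payload = pickle.dumps(original)
restored = pickle.loads(payload)
```

Whether this shortens loading depends on how expensive the rebuild in __setstate__ is compared with reading the extra bytes, so it is worth timing both variants on the real data.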

Glauco

7 seconds is a decent time for reading a 4 GB file back into RAM and recreating the structures you have.

To serve multiple users, you might want to look at Redis (or any other in-memory key-value data structure server) to hold the data and serve the users from there, rather than recreating it from the pickle file for each user.
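If a separate server is more than you need, the same load-once idea can be sketched inside a single long-running Python process. This is not Redis, just a plain in-process cache, and it only helps if all users are served by that one process. The load_once function and the demo file are hypothetical names used for illustration.

```python
import functools
import os
import pickle
import tempfile


@functools.lru_cache(maxsize=None)
def load_once(path):
    """Unpickle `path` on the first call; later calls return the cached object."""
    with open(path, "rb") as input_file:
        return pickle.load(input_file)


# Small demo: a throwaway file stands in for the real 4 GB pickle.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo.pkl")
    with open(demo_path, "wb") as output_file:
        pickle.dump({"answer": 42}, output_file)

    first = load_once(demo_path)   # reads and unpickles the file
    second = load_once(demo_path)  # served from the cache, no disk read
```

Every caller gets the same object, so it should be treated as read-only. The cache is per process: separate worker processes would each pay the loading cost once, which is the case where a shared server such as Redis becomes the better fit.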

lllrnr101