I have a list that contains a very large numpy array, a very small numpy array, and a few fields that are very small in size. I want to save the list to a file and load it later. If I use pickle, as described in "how to save/read class wholly in Python", the file loads fast but is far too large.
Question: What is the most space-efficient way to save such a list if the file is not compressed?
An example of the described list, saved using pickle:
import pickle
import numpy as np

class MyClass:
    def __init__(self):
        self.largearray = np.random.rand(10000, 10000, 3) * 255
        self.smallarray = np.random.rand(100, 100, 3) * 255
        self.attribute1 = True
        self.attribute2 = 'Some String'
        self.attribute3 = 888
        self.list = [self.largearray, self.smallarray, self.attribute1, self.attribute2, self.attribute3]

a = MyClass()

# Save the list
with open('test.pickle', 'wb') as file:
    pickle.dump(a.list, file)

# Load it back
with open('test.pickle', 'rb') as file2:
    a_loaded = pickle.load(file2)
Edit: As a comment points out, the size comes from the numpy array itself rather than from pickle. It seems I should convert the numpy array to some other data structure that takes less space and can be quickly converted back to a numpy array when loaded. What is the best structure for this?
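To make the point from the comment concrete, here is a rough sketch of where the size comes from. It uses a small stand-in array (shape 100 x 100 x 3 instead of 10000 x 10000 x 3) so it runs quickly. The cast to uint8 is only an assumption on my side: it would apply if the values are pixel-like and integer precision in the range 0..255 is enough, which may not hold for every use case.

```python
import pickle
import numpy as np

# Small stand-in for the large array: same construction, smaller shape.
arr = np.random.rand(100, 100, 3) * 255

# np.random.rand returns float64, so each element takes 8 bytes.
raw_bytes = arr.nbytes
pickled_bytes = len(pickle.dumps(arr, protocol=pickle.HIGHEST_PROTOCOL))

# Assumption: integer precision in 0..255 is sufficient.
# astype(np.uint8) drops the fractional part and stores 1 byte per element.
arr_u8 = arr.astype(np.uint8)
u8_bytes = arr_u8.nbytes

print(arr.dtype, raw_bytes, pickled_bytes, u8_bytes)
```

The pickled size is essentially the raw buffer size plus a small header, so the file is large because float64 data is large. By the same arithmetic, the full-size array holds 10000 * 10000 * 3 * 8 bytes, about 2.4 GB, whatever uncompressed format is used.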