I have the following data structure:
One sample contains 5 lists. Within a single list, all elements belong to the same class, but each list holds a different class. These lists are really big, with thousands of elements each, and I usually have several (5-10) samples.
At the moment I have a list for every sample, which contains the per-class lists. I then store the sample lists in one outer list so I can manage all the samples at once.
I use lists because I fill the dataset with .append(). But later on I won't change the data; I will only iterate through it and analyze it.
My problem is memory: the dataset currently eats a lot of it, so some optimization would be great.
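In case it helps, this is how I would check how much the structure actually takes (just a sketch; it assumes the third-party pympler package, and all_data refers to the example structure at the end of this question; sys.getsizeof on its own only counts the outer list object):

import sys
from pympler import asizeof  # third-party: pip install pympler

print(sys.getsizeof(all_data))    # shallow size: only the outer list
print(asizeof.asizeof(all_data))  # deep size: nested lists plus the objects they hold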
Is there a better way to store this dataset?
I've heard that arrays are better if the data never changes. Is it worth converting everything to arrays (or tuples) after loading it as lists, as in the sketch after the example below? What do you recommend?
For example, here is a dataset similar to mine:
class van:
    # some data
    pass

class bus:
    # some more data
    pass

class motorcycle:
    # something else
    pass

all_data = []
for _ in range(7):  # 7 samples
    vans = [van() for _ in range(5000)]  # one list per class
    buses = [bus() for _ in range(2000)]
    mcycles = [motorcycle() for _ in range(3000)]
    dataset = [vans, buses, mcycles]  # one sample: its per-class lists
    all_data.append(dataset)  # outer list manages all samples at once
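To make the question concrete, the conversion I have in mind would look roughly like this (just a sketch; frozen_data is an illustrative name, and whether the saving is worthwhile is exactly what I am asking). As far as I understand, the array module only holds primitive numeric types, so for class instances a tuple seems to be the closest "fixed-size" container:

# Freeze every level of the nested lists into tuples after loading:
frozen_data = tuple(
    tuple(tuple(class_list) for class_list in dataset)
    for dataset in all_data
)

# Iterating afterwards works the same as with the nested lists:
for dataset in frozen_data:
    for class_group in dataset:
        for obj in class_group:
            pass  # analysis would happen here

Tuples at least drop the spare capacity a list keeps for future append() calls, though I suspect most of the memory goes to the objects themselves rather than the containers.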