I have a large HDF5 file containing 16,000 different 512x512 numpy arrays (about 40 GB in total), so reading the whole file into RAM will obviously crash the machine.
I want to load these arrays into data and then split data into x_train and x_test. The labels are stored locally.
I did this, which only creates a handle to the file without fetching any data:
import h5py

h5 = h5py.File('/file.hdf5', 'r')   # opens the file lazily; nothing is read into RAM yet
data = h5.get('data')               # an h5py Dataset backed by the file on disk
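For reference, inspecting the handle's metadata at this point does not seem to read any of the array data:

print(data.shape, data.dtype)   # metadata only; the 40 GB of arrays stay on disk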
But when I try to split data into train and test sets:
x_train = data[0:14000]
y_train = label[0:14000]
x_test = data[14000:16000]
y_test = label[14000:16000]
I get the error
MemoryError: Unable to allocate 13.42 GiB for an array with shape (14000, 256, 256) and data type float32
I want to load the data in batches and train a Keras model, but slicing the h5py dataset like this pulls the whole slice into memory as a numpy array, which is why the error above stops me before I even get to training:
model.compile(optimizer=Adam(learning_rate=0.001),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
history = model.fit(x_train, y_train,
                    validation_data=(x_test, y_test),
                    epochs=32, verbose=1)
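Would something like a keras.utils.Sequence that reads one batch at a time from the open h5py dataset work here? Below is a rough sketch of what I have in mind (HDF5Sequence and its arguments are placeholder names I made up, and I assume label is already a numpy array in memory):

import math
import numpy as np
from tensorflow import keras

class HDF5Sequence(keras.utils.Sequence):
    # serves batches by slicing the open h5py dataset on demand,
    # so only one batch of arrays is in RAM at a time
    def __init__(self, dataset, labels, start, stop, batch_size=32):
        super().__init__()
        self.dataset = dataset
        self.labels = labels
        self.start = start
        self.stop = stop
        self.batch_size = batch_size

    def __len__(self):
        return math.ceil((self.stop - self.start) / self.batch_size)

    def __getitem__(self, idx):
        i = self.start + idx * self.batch_size
        j = min(i + self.batch_size, self.stop)
        x = self.dataset[i:j]    # reads only this slice from disk
        y = self.labels[i:j]
        return np.asarray(x), np.asarray(y)

train_seq = HDF5Sequence(data, label, 0, 14000)
test_seq = HDF5Sequence(data, label, 14000, 16000)
history = model.fit(train_seq, validation_data=test_seq, epochs=32, verbose=1)

I'm not sure whether keeping the h5py handle open like this is safe with multiprocessing, so the sketch assumes the default single-process loading.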
How can I get around this issue?