Efficient way of using numpy memmap when training neural network with pytorch

Question

I'm training a neural network on a database of images. The images are full HD (1920 x 1080), but for training I use random crops of size 256x256. Since reading the full image and then cropping is inefficient, I'm using numpy memmap to load only the 256x256 crop. That is, I'm doing something like this:

import numpy

image_mmap = numpy.load(npy_image_path.as_posix(), mmap_mode=mmap_mode)  # memory-map the .npy file
cropped_image = image_mmap[y1:y2, x1:x2]  # slice out only the crop window
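
For context, here is a minimal self-contained sketch of how the crop window might be chosen and read. The helper name `random_crop_from_npy`, the `crop_size` and `rng` arguments, and the choice of `mmap_mode="r"` are assumptions made for illustration, not my actual code.

```python
import tempfile
from pathlib import Path

import numpy as np


def random_crop_from_npy(npy_image_path, crop_size=256, rng=None):
    """Hypothetical helper: read one random crop from a .npy image via memmap."""
    rng = rng or np.random.default_rng()
    # With mmap_mode set, numpy.load returns a memmap rather than reading
    # the whole array into memory.
    image_mmap = np.load(Path(npy_image_path).as_posix(), mmap_mode="r")
    height, width = image_mmap.shape[:2]
    # Pick the top-left corner so the window stays inside the image.
    y1 = int(rng.integers(0, height - crop_size + 1))
    x1 = int(rng.integers(0, width - crop_size + 1))
    # np.array(...) copies the cropped window into a regular in-memory array.
    return np.array(image_mmap[y1:y1 + crop_size, x1:x1 + crop_size])


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp_dir:
        # Dummy full HD image, stored as height x width x channels.
        path = Path(tmp_dir) / "dummy.npy"
        np.save(path, np.zeros((1080, 1920, 3), dtype=np.uint8))
        crop = random_crop_from_npy(path)
        print(crop.shape)
```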

Since the same images are loaded in every epoch, would it be better to open all the memmaps once up front and then, in every epoch, only run the second line above to get the cropped image?
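
To make the two options concrete, here is a rough sketch of both variants. The class names are hypothetical, and the classes are plain Python so the sketch runs without PyTorch; in real training code they would presumably subclass `torch.utils.data.Dataset`. `mmap_mode="r"` is likewise an assumption.

```python
import numpy as np


class ReopenEachTimeCrops:
    """Variant A (hypothetical name): call numpy.load for every sample."""

    def __init__(self, npy_paths, crop_size=256, seed=None):
        self.npy_paths = [str(p) for p in npy_paths]
        self.crop_size = crop_size
        self.rng = np.random.default_rng(seed)

    def __len__(self):
        return len(self.npy_paths)

    def _open(self, index):
        # A fresh memmap is created on every access.
        return np.load(self.npy_paths[index], mmap_mode="r")

    def __getitem__(self, index):
        image_mmap = self._open(index)
        size = self.crop_size
        y1 = int(self.rng.integers(0, image_mmap.shape[0] - size + 1))
        x1 = int(self.rng.integers(0, image_mmap.shape[1] - size + 1))
        # Copy only the crop window out of the memmap.
        return np.array(image_mmap[y1:y1 + size, x1:x1 + size])


class OpenOnceCrops(ReopenEachTimeCrops):
    """Variant B (hypothetical name): open all memmaps once in __init__."""

    def __init__(self, npy_paths, crop_size=256, seed=None):
        super().__init__(npy_paths, crop_size, seed)
        # One memmap per image is kept alive for the dataset's lifetime.
        self.mmaps = [np.load(p, mmap_mode="r") for p in self.npy_paths]

    def _open(self, index):
        # Reuse the memmap created in __init__.
        return self.mmaps[index]
```

Given the same seed and access order, both variants return identical crops; the only difference is when `numpy.load` is called.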

PS: I have tried both approaches and haven't found a big difference between them. My intuition says that opening all the memmaps in the __init__ function should be faster than opening them again and again in every epoch, but that is not what I observe. If you can explain why that might be, that would also help.

The reason I'm asking even though both approaches perform similarly for me is that I want to know what the best practice is going forward.
