I'm trying to train a (pretty big) neural network on a GPU. The network is written in PyTorch, and I'm using Python 3.6.3 on Ubuntu 16.04. The code runs, but it takes about twice as long as it should because the data-loading step on the CPU runs in series with the training step on the GPU. Essentially, I grab a mini-batch from a file using a mini-batch generator, send that mini-batch to the GPU, and then train the network on it. I've timed the two steps (grabbing a mini-batch and training on it), and they take about the same time (around 200 ms each). I'd like to do something similar to Keras' fit_generator method, which runs the data loading in parallel with the training: it builds a queue of mini-batches that can be sent to the GPU as soon as the GPU is ready to train on the next one. What is the best way to do that? For concreteness, my data generator code and training code look something like this (pseudocode):
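import h5py

# This generator opens the HDF5 file, then reads and yields one mini-batch at a time
def data_gen(PATH, batch_size=32):
    with h5py.File(PATH, 'r') as f:
        n_samples = f['X'].shape[0]
        # assumption: mini-batches are contiguous slices of the datasets
        for start in range(0, n_samples, batch_size):
            X = f['X'][start:start + batch_size]
            Y = f['Y'][start:start + batch_size]
            yield (X, Y)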
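import torch
import torch.nn.functional as F
from torch import autograd

# net, optimizer, epochs and PATH are defined elsewhere
for epoch in range(epochs):
    for mini_X, mini_Y in data_gen(PATH):
        # convert the mini-batch to tensors and send it to the GPU
        mini_X = autograd.Variable(torch.Tensor(mini_X)).cuda()
        mini_Y = autograd.Variable(torch.Tensor(mini_Y)).cuda()
        optimizer.zero_grad()  # clear gradients left over from the previous step
        out = net(mini_X)
        loss = F.binary_cross_entropy(out, mini_Y)
        loss.backward()
        optimizer.step()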
Something like that. As you can see, I use data_gen as an ordinary generator in the for-loop, so it runs sequentially with the training. I would like to run it in parallel and have it fill a queue of mini-batches that I can then feed to my network. Currently, one epoch takes more than 5 hours; I think a parallelized version could get that down to 3 hours or less. I looked into multiprocessing in Python, but the explanation in the official documentation was a bit dense for me, since I have only limited experience with parallel computing. If there are resources I could take a look at, pointers to them would be very helpful too! Thanks.
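To make the goal concrete, here is a minimal sketch of the kind of queue-based prefetching I mean, using only the standard library. The names prefetch and max_queue_size are made up for illustration and are not part of PyTorch or Keras. It uses a background thread rather than a separate process, so whether it actually overlaps with training depends on the loading code releasing the GIL while it reads from disk; that is an assumption I have not verified for h5py.

```python
import queue
import threading


def prefetch(generator, max_queue_size=4):
    """Run `generator` in a background thread and yield its items in order.

    Hypothetical helper: up to `max_queue_size` items are produced ahead of
    the consumer, so loading can overlap with whatever the consumer does.
    """
    q = queue.Queue(maxsize=max_queue_size)
    done = object()  # sentinel marking the end of the stream

    def worker():
        try:
            for item in generator:
                q.put((item, None))  # blocks while the queue is full
            q.put((done, None))
        except Exception as exc:
            q.put((done, exc))       # forward the error to the consumer

    # daemon=True so a stalled worker never keeps the interpreter alive
    threading.Thread(target=worker, daemon=True).start()

    while True:
        item, exc = q.get()          # blocks until the next item is ready
        if item is done:
            if exc is not None:
                raise exc
            return
        yield item
```

With something like this, the inner loop would become `for mini_X, mini_Y in prefetch(data_gen(PATH)):` and the rest of the training code would stay the same. One limitation of the sketch: if the loop is abandoned early, the worker thread stays blocked on a full queue until the program exits. PyTorch also ships torch.utils.data.DataLoader, which has a num_workers argument for loading in worker processes, but I don't know whether that is the better fit here.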