
I am working on video sequence classification, and I need two things:

  1. Because of limited GPU memory, I want to accumulate the gradients across several mini-batches, average them, and only then update the parameters.

  2. I need to know how to shuffle between mini-batches but not inside each mini-batch, because I want each video sequence to keep its order.

machen
1 Answer


Question 1: You can run the forward and backward passes for each mini-batch without calling optimizer.update(); after you have repeated forward & backward for the necessary number of mini-batches, call optimizer.update() once to update the parameters based on the accumulated gradients.
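Here is a minimal sketch of that loop. MyModel is a toy stand-in for your video network, and train_batches is a hypothetical generator of (x, t) mini-batch arrays; dividing each loss by n_accum makes the accumulated gradient an average rather than a sum:

    import chainer
    import chainer.functions as F
    import chainer.links as L
    from chainer import optimizers

    class MyModel(chainer.Chain):
        # Toy classifier standing in for your video model.
        def __init__(self):
            super(MyModel, self).__init__()
            with self.init_scope():
                self.fc = L.Linear(None, 10)

        def __call__(self, x, t):
            return F.softmax_cross_entropy(self.fc(x), t)

    model = MyModel()
    optimizer = optimizers.Adam()
    optimizer.setup(model)

    n_accum = 4  # mini-batches to accumulate before one update

    model.cleargrads()
    for i, (x, t) in enumerate(train_batches()):  # hypothetical data source
        loss = model(x, t)            # forward
        (loss / n_accum).backward()   # backward: grads add up across calls
        if (i + 1) % n_accum == 0:
            optimizer.update()        # update from the accumulated gradients
            model.cleargrads()        # reset before the next accumulation

This works because Variable.backward() adds new gradients onto whatever is already stored in each parameter's grad, so gradients keep accumulating until you call cleargrads().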

If you want to achieve this with the trainer module, I think you need to override StandardUpdater to define your own Updater class that does the above.
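A sketch of such an Updater, assuming a tuple dataset of (input, label) pairs and a model whose __call__ returns the loss (n_accum is a hypothetical parameter, not part of Chainer):

    from chainer.training import StandardUpdater

    class GradientAccumulationUpdater(StandardUpdater):
        # Accumulates gradients over n_accum mini-batches per update.
        def __init__(self, iterator, optimizer, n_accum=4, **kwargs):
            super(GradientAccumulationUpdater, self).__init__(
                iterator, optimizer, **kwargs)
            self.n_accum = n_accum

        def update_core(self):
            optimizer = self.get_optimizer('main')
            model = optimizer.target
            model.cleargrads()
            for _ in range(self.n_accum):
                batch = self.get_iterator('main').next()
                x, t = self.converter(batch, self.device)  # assumes (x, t) tuples
                loss = model(x, t)
                (loss / self.n_accum).backward()
            optimizer.update()  # apply the averaged gradients

Note that each trainer iteration then consumes n_accum mini-batches, so iteration-based stop triggers count update steps, not batches.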

Question 2: Are you using the trainer module? If so, you can define your own iterator to achieve this; see the sketch below for one way to write such an iterator class.
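For instance, an iterator can pre-chunk the dataset into consecutive mini-batches and shuffle only the order of those chunks each epoch (BatchShuffleIterator is a hypothetical name; serialization for snapshots is omitted):

    import numpy as np
    from chainer.dataset import Iterator

    class BatchShuffleIterator(Iterator):
        # Shuffles the order of mini-batches but keeps the sample
        # order inside every mini-batch.
        def __init__(self, dataset, batch_size):
            self.dataset = dataset
            self.batch_size = batch_size
            n = len(dataset)
            # Consecutive index blocks preserve the temporal order.
            self.batches = [list(range(i, min(i + batch_size, n)))
                            for i in range(0, n, batch_size)]
            self.epoch = 0
            self.is_new_epoch = False
            self._order = np.random.permutation(len(self.batches))
            self._pos = 0

        def __next__(self):
            indices = self.batches[self._order[self._pos]]
            batch = [self.dataset[i] for i in indices]
            self._pos += 1
            if self._pos >= len(self.batches):
                # Epoch finished: reshuffle the batch order only.
                self.epoch += 1
                self.is_new_epoch = True
                self._order = np.random.permutation(len(self.batches))
                self._pos = 0
            else:
                self.is_new_epoch = False
            return batch

        next = __next__

        @property
        def epoch_detail(self):
            return self.epoch + self._pos / len(self.batches)

If you need multiprocess prefetching, one option is to wrap each pre-built batch as a single element of a dataset and feed that to MultiprocessIterator with batch_size=1 and shuffle=True, so worker processes fetch whole ordered batches in parallel.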

corochann
  • About question 2: the difficult part is how to write such a keep-inside-batch-order iterator as a parallel iterator, i.e., using multiple processes to fetch data in parallel while keeping shuffle=False inside each mini-batch. – machen Mar 02 '18 at 03:24
  • Maybe I need to know the concrete situation in more detail to answer this. Could you provide detailed information in a new question thread? – corochann Mar 02 '18 at 09:09