
I am new to using Keras for my project. I have been working with generators in my model.

I am confused about what values I should input for:

1) In fit_generator: steps_per_epoch & validation_steps?

2) In evaluate_generator: steps?

3) In predict_generator: steps?

I have referred to the Keras documentation and a few other Stack Overflow questions (stack1, stack2), but I am still not able to understand. It is probably better if I give an example of the data shapes I am currently working with and frame my questions accordingly. Also, please correct me if my understanding is wrong.
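
To make the shapes concrete, here is a simplified sketch of how my generators behave (the dummy random arrays and the batch_gen helper are just stand-ins for my real data pipeline):

    import numpy as np

    def batch_gen(batch_size):
        # Yields dummy (X, Y) batches shaped like my real data, indefinitely
        while True:
            X = np.random.rand(batch_size, 100, 4)   # stand-in for my real inputs
            Y = np.random.rand(batch_size, 100, 2)   # stand-in for my real targets
            yield X, Y

    trainGen = batch_gen(244)   # each yield: X of shape (244, 100, 4), Y of shape (244, 100, 2)
    ValGen = batch_gen(30)      # each yield: X of shape (30, 100, 4), Y of shape (30, 100, 2)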

model.fit_generator(trainGen, steps_per_epoch=25, epochs=100, validation_data=ValGen, validation_steps=4)

Q1: For every epoch there are 25 steps. For each step, trainGen yields a tuple of arrays with shapes 244*100*4 and 244*100*2, and training is performed on that batch.

What will my batch_size and number of batches be if my steps_per_epoch is 25?

Q2: I understand that val_acc and val_loss are calculated at the end of the 25th step of an epoch. I chose validation_steps = 4, so ValGen yields a tuple of arrays with shapes 30*100*4 and 30*100*2 four times at the end of the 25th step of each epoch.

I chose validation_steps = 4 arbitrarily. How do I choose the correct number of validation_steps? And how are val_loss & val_acc calculated, as a mean over the 4 yields treated as single batches, or using batch_size?

Q3: Say, for example, that for both evaluate_generator & predict_generator my generator yields a tuple of arrays with shapes 30*100*4 and 30*100*2.

How do I choose the correct number for the steps argument in both evaluate_generator & predict_generator? In the Keras documentation it is described as the "Total number of steps (batches of samples) to yield from generator before stopping". In my case, what will the batches of samples be?

If any additional information is required, let me know.

Mari

1 Answer


Steps are not a parameter that you "choose"; you can compute them as:

steps = number of samples / batch size

So here the only parameter that you are free to choose is the batch size, which is set to a value at which the model does not run out of memory during training. Typical values are between 32 and 64.

For the training set, you take the number of samples in the training set and divide it by the training batch size; for the validation set, you divide the number of samples in the validation set by the validation batch size. Both batch sizes can be equal.

This applies to all functions that use generators.
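
For example, here is a rough sketch of this computation (model, trainGen and ValGen are the objects from your question; testGen and the sample counts are placeholders for illustration):

    import math

    batch_size = 32            # chosen so the model does not run out of memory
    num_train_samples = 800    # placeholder: size of your training set
    num_val_samples = 120      # placeholder: size of your validation set
    num_test_samples = 120     # placeholder: size of your test set

    # Use ceil so the last, possibly smaller, batch is not dropped
    steps_per_epoch = math.ceil(num_train_samples / batch_size)
    validation_steps = math.ceil(num_val_samples / batch_size)
    test_steps = math.ceil(num_test_samples / batch_size)

    model.fit_generator(trainGen, steps_per_epoch=steps_per_epoch, epochs=100,
                        validation_data=ValGen, validation_steps=validation_steps)

    results = model.evaluate_generator(testGen, steps=test_steps)
    predictions = model.predict_generator(testGen, steps=test_steps)

This way, each epoch passes over the whole training set once, and the validation metrics are computed over the whole validation set.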

Dr. Snoopy
  • Can I choose a batch_size smaller than 32? Say I choose batch_size = 15, what will the effect be (both positive and negative) during training and validation? – Mari Jul 29 '19 at 14:01
  • @Mari Small batch sizes have the effect of reducing performance (training takes longer), and the gradient estimate is more noisy, which can cause learning problems (loss decrease is noisy). – Dr. Snoopy Jul 29 '19 at 14:03
  • Thanks for your answer. – Mari Jul 29 '19 at 14:05
  • @Dr.Snoopy How about the validation_steps in Question 2? – thinkdeep Oct 12 '20 at 23:49