
I have seen in many places that people reset the test generator in Keras when predicting the output, but I am unable to find out why. Can you make it clear?

I have a custom generator like this

import numpy as np

def dual_datagen(df, clinical_features, batch_size=20):
    # data_gen, img_shape and col are defined elsewhere in my script
    eff_generator = data_gen.flow_from_dataframe(df, directory='/content/data',
                                                 target_size=(img_shape, img_shape),
                                                 x_col='img_id',
                                                 y_col=col,
                                                 class_mode='raw',
                                                 shuffle=False,
                                                 batch_size=batch_size)
    # ceiling division so a final partial batch still counts as a batch
    number_of_batches = int(np.ceil(len(clinical_features) / batch_size))
    counter = 0
    while True:
        x_1 = next(eff_generator)  # (images, labels) from the Keras iterator
        # matching slice of tabular features (shuffle=False keeps rows aligned)
        x_2 = np.array(clinical_features[batch_size * counter:batch_size * (counter + 1)]).astype('float32')
        counter += 1

        yield [x_1[0], x_2], x_1[1]
        if counter >= number_of_batches:
            counter = 0

How can I reset it?

Talha Anwar

1 Answer


See this answer

In the case of data science, though, it is even more obvious. Building a dataset generator usually involves data cleaning, piping and preprocessing. This makes generators typically very slow to initialize.

So, instead of creating a brand new generator, you just use itertools.tee to duplicate the generator. The only downside is that you now have two versions of the generator in memory. In most cases, that is not a problem: data storage is usually of less concern than time spent.

You can use itertools.tee to re-use the generator.
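As a minimal sketch of the idea (using a toy generator in place of your `dual_datagen`, since that depends on Keras and your data), `itertools.tee` hands back independent iterators over the same underlying generator, so one copy can be consumed while the other still yields from the beginning:

```python
import itertools

def numbers():
    # stand-in for a slow-to-initialize data generator
    for i in range(3):
        yield i

# tee() returns two independent iterators over the same generator
gen_a, gen_b = itertools.tee(numbers())

first_pass = list(gen_a)   # consume one copy, e.g. for model.predict
second_pass = list(gen_b)  # the other copy still starts from item 0

print(first_pass, second_pass)  # [0, 1, 2] [0, 1, 2]
```

Note that `tee` buffers every item one copy has consumed but the other has not, which is the extra memory cost mentioned above; also, once you call `tee`, you should only iterate through the returned copies, not the original generator.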

Does this answer your question?

tornikeo