I have a standard CNN for image classification, using the following generator to get the dataset:

generator = validation_image_generator.flow_from_directory(batch_size=BATCH_SIZE,
                                                           directory=val_dir,
                                                           shuffle=False,
                                                           target_size=(100,100),
                                                           class_mode='categorical')

I can easily get the predicted labels with:

predictions = model.predict(dataset)

Now I want to get the (original) true labels and images for all the predictions, in the same order as the predictions, so that I can compare them. I am sure that information is stored somewhere accessible, but I haven't been able to find it.

Pythonless
  • What do you mean by the true labels? This function gives you the labels (usually numbers) used in the training set, so you already know the actual meaning behind each label. – Mahsa Seifikar Nov 12 '19 at 06:25
  • I mean the original label for that image in the dataset, as opposed to the prediction of the model. – Pythonless Nov 12 '19 at 06:48
  • Could you post more of your code than just the one line above? E.g. the part where you define the Generator. That way we could help you much better. – Tinu Nov 12 '19 at 07:53
  • @Tinu You are right! I have added the generator code. – Pythonless Nov 12 '19 at 10:24

2 Answers

You have to pull the images from the data generator yourself and pass them to model.predict. If image_gen is the iterator returned by flow_from_directory (generator in your code), you can use:

# take one batch of images and their one-hot labels from the iterator
X, y = next(image_gen)
prediction = model.predict(X)

Now X holds your images as a batch (X[0] is the first image, X[1] is the second, and so on), y holds their corresponding labels, and prediction is your model's output for each image, all in the same order.

This gives you a single batch only. To cover a whole epoch, you have to use a for loop:

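for step in range(step_per_epoch):
    # each iteration overwrites X, y and prediction, so store them if you need all of them
    X, y = next(image_gen)
    prediction = model.predict(X)

where step_per_epoch should be dataset_size / batch_size, rounded up so that the last, smaller batch is not skipped.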
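As a quick check of that arithmetic, here is a small sketch. The function name and the sample counts are made up for illustration, they are not from the question. Rounding up matters whenever the dataset size is not a multiple of the batch size.

```python
import math

def steps_for_one_epoch(dataset_size, batch_size):
    """Number of batches needed to see every sample exactly once."""
    return math.ceil(dataset_size / batch_size)

print(steps_for_one_epoch(100, 10))  # 10
print(steps_for_one_epoch(105, 10))  # 11, the last batch holds only 5 images
```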

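But remember that the order depends on the shuffle setting. With shuffle=True the iterator serves the images in a random order that changes every epoch, so the batches will not line up with the file order on disk. With shuffle=False, as in your generator, the order is fixed. Either way the iterator never stops on its own: if you have 100 images and your batch size is 10, taking more than 10 batches means you see some images twice, and taking fewer means you won't see some other images.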
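To keep everything from a whole epoch instead of only the last batch, the loop can append each batch to a list and concatenate at the end. The following is a minimal sketch, not a definitive implementation: collect_epoch, fake_batches and fake_predict are hypothetical names, and the stand-ins exist only so the example runs without Keras or a trained model.

```python
import numpy as np

def collect_epoch(batch_iter, predict_fn, steps):
    """Pull `steps` batches and keep images, true labels and predictions aligned."""
    images, labels, preds = [], [], []
    for _ in range(steps):
        X, y = next(batch_iter)
        images.append(X)
        labels.append(y)
        preds.append(predict_fn(X))
    return np.concatenate(images), np.concatenate(labels), np.concatenate(preds)

# Hypothetical stand-in for the Keras iterator: yields batches in order, forever.
def fake_batches(X, y, batch_size):
    while True:
        for start in range(0, len(X), batch_size):
            yield X[start:start + batch_size], y[start:start + batch_size]

# Hypothetical stand-in for model.predict: uniform probabilities over 3 classes.
def fake_predict(X):
    return np.ones((len(X), 3)) / 3

X_all = np.arange(7 * 4, dtype="float32").reshape(7, 4)  # 7 tiny "images"
y_all = np.eye(3)[[0, 1, 2, 0, 1, 2, 0]]                 # one-hot true labels

steps = -(-len(X_all) // 3)  # ceil(7 / 3) = 3 batches
images, labels, preds = collect_epoch(fake_batches(X_all, y_all, 3), fake_predict, steps)
print(images.shape, labels.shape, preds.shape)  # (7, 4) (7, 3) (7, 3)
```

With the real objects this would presumably be called as collect_epoch(generator, model.predict, len(generator)), assuming the iterator supports next() and len() as the Keras directory iterator does. Row i of images, labels and preds then refers to the same sample.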

mqod
  • Note that this should be run in a for loop to get a whole epoch of batches – Dr. Snoopy Nov 12 '19 at 08:03
  • That is great, I have only one problem, using this method I only get as many elements as there are in a batch for the generator, that is, I am limited to batch size. I would like to get all the images. Thank you very much for your help! – Pythonless Nov 12 '19 at 09:56
  • @Pythonless Excuse me, I thought that part was easy. I updated my answer. – mqod Nov 14 '19 at 08:46

You want to have the true labels, images and predictions at once. This can be done similarly to the example from the documentation:

# here's a more "manual" example
for e in range(epochs):
    print('Epoch', e)
    batches = 0
    for x_batch, y_batch in datagen.flow(x_train, y_train, batch_size=32):
        model.fit(x_batch, y_batch)
        batches += 1
        if batches >= len(x_train) / 32:
            # we need to break the loop by hand because
            # the generator loops indefinitely
            break

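The above is a training loop. You can easily adapt it into a test evaluation loop, e.g.:

val_iter = validation_image_generator.flow_from_directory(...)
for step, (x_val, y_val) in enumerate(val_iter):
    if step >= len(val_iter):
        break  # stop by hand after one pass, the iterator loops indefinitely
    y_pred = model.predict(x_val)
    score = your_score_func(y_pred, y_val)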
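The your_score_func above is only a placeholder. One possible version, sketched here under the assumption that the labels are one-hot (class_mode='categorical') and the model outputs one probability per class, converts both to class indices and reports the accuracy plus the positions of the misclassified images. The name compare_predictions and the sample arrays are made up for illustration.

```python
import numpy as np

def compare_predictions(y_true_onehot, y_pred_probs):
    """Return the accuracy and the indices of the samples the model got wrong."""
    true_idx = np.argmax(y_true_onehot, axis=1)  # one-hot label -> class index
    pred_idx = np.argmax(y_pred_probs, axis=1)   # highest probability -> class index
    wrong = np.flatnonzero(true_idx != pred_idx)
    accuracy = float(np.mean(true_idx == pred_idx))
    return accuracy, wrong

# Made-up data: 4 samples, 3 classes; the third sample is misclassified.
y_true = np.eye(3)[[0, 1, 2, 1]]
y_pred = np.array([[0.8, 0.1, 0.1],
                   [0.2, 0.7, 0.1],
                   [0.6, 0.3, 0.1],
                   [0.1, 0.8, 0.1]])

accuracy, wrong = compare_predictions(y_true, y_pred)
print(accuracy, wrong)  # 0.75 [2]
```

As far as I know, with shuffle=False the directory iterator also exposes its integer labels and file names as generator.classes and generator.filenames, in the same order in which model.predict(generator) returns its rows, so the indices in wrong can be mapped back to files.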
Tinu