
I'm trying to do transfer learning, using a pretrained Xception model with a newly added classifier.

This is the model:

from tensorflow import keras

# pretrained Xception backbone without its ImageNet classifier head
base_model = keras.applications.Xception(
    weights="imagenet",
    input_shape=(224, 224, 3),
    include_top=False
)

The dataset I'm using is oxford_flowers102, taken directly from TensorFlow Datasets. Here is the dataset page.

I'm having trouble choosing some of the parameters: either the training accuracy shows suspiciously low values, or I get an error.

I need help specifying the following for this (oxford_flowers102) dataset:

  1. The newly added dense layer for the classifier. I tried outputs = keras.layers.Dense(102, activation='softmax')(x), and I'm not sure whether I should specify the activation function here or not.
  2. The loss function for the model.
  3. The metrics.

I tried:

model.compile(
    optimizer=keras.optimizers.Adam(),
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=[keras.metrics.Accuracy()],
)

I'm not sure whether it should be SparseCategoricalCrossentropy or CategoricalCrossentropy, and what about from_logits parameter?

I'm also not sure whether I should choose keras.metrics.Accuracy() or keras.metrics.CategoricalAccuracy() for the metrics.

I am definitely lacking some theoretical knowledge, but right now I just need this to work. Looking forward to your answers!


1 Answer


About the data set: oxford_flowers102

The dataset is divided into a training set, a validation set, and a test set. The training set and validation set each consist of 10 images per class (totaling 1020 images each). The test set consists of the remaining 6149 images (minimum 20 per class).

'test'        6,149
'train'       1,020
'validation'  1,020

If we check, we'll see

import tensorflow_datasets as tfds
tfds.disable_progress_bar()

data, ds_info = tfds.load('oxford_flowers102', 
                          with_info=True, as_supervised=True)
train_ds, valid_ds, test_ds = data['train'], data['validation'], data['test']

for i, data in enumerate(train_ds.take(3)):
    print(i+1, data[0].shape, data[1])
# 1 (500, 667, 3) tf.Tensor(72, shape=(), dtype=int64)
# 2 (500, 666, 3) tf.Tensor(84, shape=(), dtype=int64)
# 3 (670, 500, 3) tf.Tensor(70, shape=(), dtype=int64)

ds_info.features["label"].num_classes
# 102

So, the dataset has 102 categories (classes), the labels are integers, and the input images come in varying shapes.

Clarification

First, if you keep the integer targets (labels), you should use sparse_categorical_accuracy for accuracy and sparse_categorical_crossentropy for the loss function. But if you transform the integer labels into one-hot encoded vectors, then you should use categorical_accuracy for accuracy and categorical_crossentropy for the loss function. As this dataset has integer labels, you can choose sparse_categorical, or you can transform the labels to one-hot in order to use categorical.
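To make this concrete, here is a tiny standalone sketch (toy numbers, not from the dataset) showing that the two pairings compute the same loss, as long as the label format matches the loss function:

import tensorflow as tf

y_true_int = tf.constant([2])                       # integer label
y_true_onehot = tf.one_hot(y_true_int, depth=5)     # same label, one-hot encoded
y_pred = tf.constant([[0.1, 0.1, 0.6, 0.1, 0.1]])   # probabilities (softmax output)

# sparse_* expects integer labels, plain categorical_* expects one-hot labels
sparse_loss = tf.keras.losses.sparse_categorical_crossentropy(y_true_int, y_pred)
cat_loss = tf.keras.losses.categorical_crossentropy(y_true_onehot, y_pred)
print(sparse_loss.numpy(), cat_loss.numpy())  # both ~0.51, i.e. -log(0.6)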

Second, if you set outputs = keras.layers.Dense(102, activation='softmax')(x) as the last layer, you will get probability scores. But if you set outputs = keras.layers.Dense(102)(x), then you will get logits. So, if you set activation='softmax', then you should not use from_logits=True. For example, in your code above you should do one of the following (here's some theory for you):

...
(a)
# use softmax activation (the output will be probabilities, not logits)
outputs = keras.layers.Dense(102, activation='softmax')(x)
...
model.compile(
    optimizer=keras.optimizers.Adam(),
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=False),
    metrics=[keras.metrics.SparseCategoricalAccuracy()],  # matches integer labels
)

or,

(b)
# no activation, the output will be logits
outputs = keras.layers.Dense(102)(x)
...
model.compile(
    optimizer=keras.optimizers.Adam(),
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=[keras.metrics.SparseCategoricalAccuracy()],  # matches integer labels
)

Third, Keras also accepts string identifiers such as metrics=['acc'] and optimizer='adam'. But in your case you need to be more specific, because the metric has to match the label format just like the loss function does. So, instead of keras.metrics.Accuracy(), you should choose keras.metrics.SparseCategoricalAccuracy() if your targets are integers, or keras.metrics.CategoricalAccuracy() if your targets are one-hot encoded vectors.
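For a quick standalone check of which metric accepts which label format (again toy tensors, just for illustration):

from tensorflow import keras
import tensorflow as tf

y_true_int = tf.constant([2, 0])                 # integer labels
y_true_onehot = tf.one_hot(y_true_int, depth=3)  # one-hot labels
y_pred = tf.constant([[0.1, 0.2, 0.7],           # softmax probabilities
                      [0.8, 0.1, 0.1]])

sca = keras.metrics.SparseCategoricalAccuracy()
sca.update_state(y_true_int, y_pred)             # integer labels + probabilities
print(sca.result().numpy())                      # 1.0

ca = keras.metrics.CategoricalAccuracy()
ca.update_state(y_true_onehot, y_pred)           # one-hot labels + probabilities
print(ca.result().numpy())                       # 1.0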

Code Examples

Here is an end-to-end example. Note that I will transform the integer labels into one-hot encoded vectors (here, that's simply a matter of preference). Also, I want probabilities (not logits) from the last layer, which means from_logits=False. For all of that, I need to choose the following parameters in my training:

# use softmax to get probabilities
outputs = keras.layers.Dense(102, activation='softmax')(x)

# no logits, so set it to False (FYI, it is already False by default)
loss = keras.losses.CategoricalCrossentropy(from_logits=False),

# specify the metric that matches one-hot labels
metrics = keras.metrics.CategoricalAccuracy(),

Let's complete the whole code.

import tensorflow_datasets as tfds
tfds.disable_progress_bar()

data, ds_info = tfds.load('oxford_flowers102', 
                         with_info=True, as_supervised=True)
train_ds, valid_ds, test_ds = data['train'], data['validation'], data['test']


NUM_CLASSES = ds_info.features["label"].num_classes
train_size =  len(data['train'])

batch_size = 64
img_size = 120 

Preprocessing and Augmentation

import tensorflow as tf 

# pre-process functions 
def normalize_resize(image, label):
    image = tf.cast(image, tf.float32)
    image = tf.divide(image, 255)
    image = tf.image.resize(image, (img_size, img_size))
    label = tf.one_hot(label, depth=NUM_CLASSES)  # integer label to one-hot
    return image, label

# augmentation 
def augment(image, label):
    image = tf.image.random_flip_left_right(image)
    return image, label 


train = train_ds.map(normalize_resize).cache().map(augment).shuffle(100).\
                          batch(batch_size).repeat()
valid = valid_ds.map(normalize_resize).cache().batch(batch_size)
test = test_ds.map(normalize_resize).cache().batch(batch_size)
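As a quick sanity check (optional, not strictly needed), we can pull one batch and confirm the pipeline produces what the loss and metric expect, i.e. resized images and one-hot labels of length 102:

# images: (batch, 120, 120, 3), labels: (batch, 102) one-hot vectors
for images, labels in train.take(1):
    print(images.shape, labels.shape)
# (64, 120, 120, 3) (64, 102)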

Model

from tensorflow import keras 

base_model = keras.applications.Xception(
    weights='imagenet',  
    input_shape=(img_size, img_size, 3),
    include_top=False)  

base_model.trainable = False
inputs = keras.Input(shape=(img_size, img_size, 3))
x = base_model(inputs, training=False)
x = keras.layers.GlobalAveragePooling2D()(x)
outputs = keras.layers.Dense(NUM_CLASSES, activation='softmax')(x)
model = keras.Model(inputs, outputs)

Additionally, here I like to use two metrics, to compute both top-1 and top-3 accuracy.

model.compile(optimizer=keras.optimizers.Adam(),
              loss=keras.losses.CategoricalCrossentropy(),
              metrics=[
                       keras.metrics.TopKCategoricalAccuracy(k=3, name='acc_top3'),
                       keras.metrics.TopKCategoricalAccuracy(k=1, name='acc_top1')
                    ])
model.fit(train, steps_per_epoch=train_size // batch_size,
          epochs=20, validation_data=valid, verbose=2)
...
Epoch 19/20
15/15 - 2s - loss: 0.2808 - acc_top3: 0.9979 - acc_top1: 0.9917 - 
val_loss: 1.5025 - val_acc_top3: 0.8147 - val_acc_top1: 0.6186

Epoch 20/20
15/15 - 2s - loss: 0.2743 - acc_top3: 0.9990 - acc_top1: 0.9885 - 
val_loss: 1.4948 - val_acc_top3: 0.8147 - val_acc_top1: 0.6255

Evaluate

# evaluate on test set 
model.evaluate(test, verbose=2)
97/97 - 18s - loss: 1.6482 - acc_top3: 0.7733 - acc_top1: 0.5994
[1.648208498954773, 0.7732964754104614, 0.5994470715522766]
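Finally, since the last layer uses softmax, the model outputs per-class probabilities; to get actual class predictions you just take the argmax over the last axis. A small sketch (not part of the training code above), shown on one test batch:

import tensorflow as tf

for images, labels in test.take(1):
    probs = model.predict(images)                # shape (batch, 102), probabilities
    pred_classes = tf.argmax(probs, axis=-1)     # predicted class indices
    true_classes = tf.argmax(labels, axis=-1)    # labels were one-hot encoded
    print(pred_classes[:5].numpy(), true_classes[:5].numpy())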