
I tried implementing AlexNet as explained in this video. Pardon me if I have implemented it wrong; this is the code for my implementation in Keras.

Edit: the CIFAR-10 ImageDataGenerator:

from keras.preprocessing.image import ImageDataGenerator

input_size = (227, 227)  # images are resized to the model's expected input

cifar_generator = ImageDataGenerator()

cifar_data = cifar_generator.flow_from_directory('datasets/cifar-10/train', 
                                                 batch_size=32, 
                                                 target_size=input_size, 
                                                 class_mode='categorical')

The model, described in Keras:

from keras.models import Sequential
from keras.layers import Convolution2D, MaxPool2D, Flatten, Dense

model = Sequential()

model.add(Convolution2D(filters=96, kernel_size=(11, 11), input_shape=(227, 227, 3), strides=4, activation='relu'))
model.add(MaxPool2D(pool_size=(3, 3), strides=2))

model.add(Convolution2D(filters=256, kernel_size=(5, 5), strides=1, padding='same', activation='relu'))
model.add(MaxPool2D(pool_size=(3, 3), strides=2))

model.add(Convolution2D(filters=384, kernel_size=(3, 3), strides=1, padding='same', activation='relu'))
model.add(Convolution2D(filters=384, kernel_size=(3, 3), strides=1, padding='same', activation='relu'))
model.add(Convolution2D(filters=256, kernel_size=(3, 3), strides=1, padding='same', activation='relu'))

model.add(MaxPool2D(pool_size=(3, 3), strides=2))

model.add(Flatten())
model.add(Dense(units=4096))
model.add(Dense(units=4096))
model.add(Dense(units=10, activation='softmax'))

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

I have used an ImageDataGenerator to train this network on the CIFAR-10 dataset. However, I am only able to get an accuracy of about 0.20. I cannot figure out what I am doing wrong.

  • Could you please add the `ImageDataGenerator` part to your post as well? – today Jul 18 '18 at 14:09
  • Pass the `rescale=1/255.` argument to `ImageDataGenerator` (a sketch of this change follows these comments) and then report the accuracy you get after making the changes suggested by @desertnaut as well. – today Jul 18 '18 at 14:42
  • CIFAR images are 32x32 and you are using an initial kernel of 11x11. You are losing a lot of information. Resizing 32x32 to 227x227 is not a good idea. – dgumo Jul 18 '18 at 14:50
  • @dgumo The situation did not change even after implementing both changes; I guess resizing the images to such a large value is the culprit. – Nevin Baiju Jul 18 '18 at 15:03
  • @NevinBaiju I was pointing out the problems in your approach - those are not the solutions :-) – dgumo Jul 18 '18 at 15:12
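A minimal sketch of the rescaling suggested in the comments above (not from the original post; the generator mirrors the one in the question, and `rescale=1/255.` is a standard ImageDataGenerator argument that maps pixel values from [0, 255] into [0, 1]):

from keras.preprocessing.image import ImageDataGenerator

# Sketch only: same generator as in the question, with the suggested rescaling added
cifar_generator = ImageDataGenerator(rescale=1/255.)

cifar_data = cifar_generator.flow_from_directory('datasets/cifar-10/train',
                                                 batch_size=32,
                                                 target_size=(227, 227),  # matches the model's input_shape
                                                 class_mode='categorical')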

1 Answer


For starters, you need to extend the relu activation to your two intermediate dense layers, too; as they are now:

model.add(Dense(units=4096))
model.add(Dense(units=4096))

i.e. with linear activation (the default), it can be shown that they are each equivalent to a simple linear unit (Andrew Ng devotes a whole lecture in the first course of his DL specialization to explaining this). Change them to:

model.add(Dense(units=4096, activation='relu'))
model.add(Dense(units=4096, activation='relu'))

Check the SO thread Why must a nonlinear activation function be used in a backpropagation neural network?, as well as the AlexNet implementations here and here to confirm this.
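To make the "equivalent to a simple linear unit" point concrete, here is a small numpy sketch (not part of the original answer) showing that two stacked activation-free Dense layers collapse into a single affine map:

import numpy as np

# Two Dense layers without an activation are just two affine maps composed:
# W2 @ (W1 @ x + b1) + b2 == (W2 @ W1) @ x + (W2 @ b1 + b2)
rng = np.random.default_rng(0)
x = rng.normal(size=(5,))                                # arbitrary input vector
W1, b1 = rng.normal(size=(4, 5)), rng.normal(size=(4,))  # first linear "Dense" layer
W2, b2 = rng.normal(size=(3, 4)), rng.normal(size=(3,))  # second linear "Dense" layer

stacked = W2 @ (W1 @ x + b1) + b2                        # two linear layers in sequence
collapsed = (W2 @ W1) @ x + (W2 @ b1 + b2)               # one equivalent single linear layer

print(np.allclose(stacked, collapsed))                   # True

Without a relu in between, the two 4096-unit layers therefore add parameters but no extra representational power.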

  • I applied that and there was no improvement in the accuracy. I think resizing the 32x32 images to 227x227 could be the reason why this model performs poorly. – Nevin Baiju Jul 18 '18 at 15:02
  • @NevinBaiju It should be clear by now that the modification proposed is absolutely *necessary* (try leaving it out while modifying other things), but not of course sufficient, as there may be other things wrong with your implementation. As such, arguably it deserves at least an upvote (i.e. "helpful, but did not resolve the problem completely on its own")... – desertnaut Jul 18 '18 at 15:07