
I'm doing binary classification with a CNN using MatConvNet in MATLAB, and now I'm trying to reproduce it with Keras in Python. The network is not complex at all, and I achieved 96% accuracy in MatConvNet. However, with Keras, even though I tried my best to make every setting exactly the same as before, I can't get the same result. Even worse, the model doesn't work at all.

Here are some details about the setting. Any ideas or help is appreciated!

  • Input

    The images are 20*20. The training set has 400 samples, the test set 100, and the validation set 132.

    • Matconvnet: images stored as a 20*20*sample_size array
    • Keras: images stored as a sample_size*20*20*1 array
  • CNN structure: (3*3)*3 conv - (2*2) max pooling - fully connected - softmax - log loss

    • Matconvnet: Uses a convolutionized layer instead of a fully connected one. Here is the code:

      function net = initializeCNNA()
      f=1/100 ;
      net.layers = {} ;
      net.layers{end+1} = struct('type', 'conv', ...
                 'weights', {{f*randn(3,3,1,3, 'single'), zeros(1,3, 'single')}}, ...
                 'stride', 1, ...
                 'pad', 0) ;    
      net.layers{end+1} = struct('type', 'pool', ...
                 'method', 'max', ...
                 'pool', [2 2], ...
                 'stride', 2, ...
                 'pad', 0) ;
      net.layers{end+1} = struct('type', 'conv', ...
                 'weights', {{f*randn(9,9,3,2, 'single'), zeros(1,2, 'single')}}, ...
                 'stride', 1, ...
                 'pad', 0) ;
      net.layers{end+1} = struct('type', 'softmaxloss') ;    
      net = vl_simplenn_tidy(net) ;
      
    • Keras:

        model = Sequential()
        model.add(Conv2D(3, (3, 3),
                  kernel_initializer=keras.initializers.RandomNormal(mean=0.0, stddev=0.1, seed=None),
                  input_shape=input_shape))
        model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
        model.add(Flatten())
        model.add(Dense(2, activation='softmax',
                  kernel_initializer=keras.initializers.RandomNormal(mean=0.0, stddev=0.1, seed=None)))
  • Loss Function
    • Matconvnet: softmaxloss
    • Keras: binary_crossentropy
  • Optimizer

    • Matconvnet: SGD

      trainOpts.batchSize = 50;
      trainOpts.numEpochs = 20 ;
      trainOpts.learningRate = 0.001 ;
      trainOpts.weightDecay = 0.0005 ;
      trainOpts.momentum = 0.9 ;
      
    • Keras: SGD

      sgd = optimizers.SGD(lr=0.001, momentum=0.9, decay=0.0005)
      model.compile(loss='binary_crossentropy',
                    optimizer=sgd,
                    metrics=['accuracy'])
      
  • Initialization: filters: N(0, 0.1); biases: 0
  • Normalization: no batch normalization; the only normalization is scaling the input images to zero mean and unit standard deviation.
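One thing worth double-checking in the setup above is the data layout conversion between the two frameworks, since MatConvNet stores images as 20*20*sample_size while Keras expects sample_size*20*20*1. A minimal sketch of the reordering, assuming the MatConvNet-side array is named `images_matconvnet` (a hypothetical name for illustration) and holds 400 grayscale 20*20 images:

```python
import numpy as np

# Assumed stand-in for the data as MatConvNet stores it:
# one 20x20 grayscale image per slice along the last axis (H, W, N).
images_matconvnet = np.random.rand(20, 20, 400).astype("float32")

# Move the sample axis to the front and append a channel axis to get
# the (N, H, W, C) layout that Keras' Conv2D expects with channels_last.
images_keras = np.transpose(images_matconvnet, (2, 0, 1))[..., np.newaxis]

print(images_keras.shape)  # (400, 20, 20, 1)
```

If the reshape is done with `reshape` instead of `transpose`, the pixels of each image get scrambled even though the shape looks right, which alone can make the model fail to learn.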

Above are the aspects I reviewed to make sure I replicated everything correctly. Yet I don't understand why it doesn't work in Keras. Here are some guesses:

  • Matconvnet uses a convolutionized layer instead of a fully connected layer, which may imply some fancy way of updating the parameters.
  • The two frameworks may implement SGD differently, so parameters with the same name have different meanings.

I also tried other things:

  • Changing the optimizer in Keras to Adadelta(): no improvement.
  • Changing the network structure and making it deeper: it works!

    But I still want to know why Matconvnet can achieve such a good result with a much simpler network.

Yujia Deng

1 Answer


"Matconvnet uses a convolutionized layer instead of fully connected layer and may imply some fancy way to update the parameters."

No. Technically, there should be no difference between convolutional and fully connected layers, and I'm pretty sure there is no fancy way to update the parameters.
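This equivalence is easy to check numerically: a convolution whose kernel covers the entire input, like the 9*9 kernel over the 9*9*3 feature map in the MatConvNet code above, reduces to one dot product per output channel, which is exactly what a fully connected layer computes on the flattened input. A minimal numpy sketch of that algebraic identity:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((9, 9, 3))          # 9x9 feature map with 3 channels
kernel = rng.standard_normal((9, 9, 3, 2))  # full-size conv filters, 2 outputs

# A "convolution" with a kernel the size of the input (valid padding,
# stride 1) degenerates to a single dot product per output channel.
conv_out = np.tensordot(x, kernel, axes=([0, 1, 2], [0, 1, 2]))

# A fully connected layer on the flattened input gives the same numbers
# when its weight matrix is the flattened kernel.
dense_out = x.reshape(-1) @ kernel.reshape(-1, 2)

print(np.allclose(conv_out, dense_out))  # True
```

So the convolutionized final layer and Keras' Flatten + Dense compute the same function; any accuracy gap has to come from somewhere else in the setup.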

More comments coming..

Some of the discussion in this post may help: Can't replicate a matconvnet CNN architecture in Keras

DataHungry