I build a model with 4 VGG16 bases (not including the top), concatenate their 4 outputs, and feed the result into a dense layer followed by a softmax layer, so the model has 4 inputs (4 images) and 1 output (4 classes).
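Roughly, the architecture looks like this (a minimal sketch; the 224x224 input size, the dense width, and the `branch` layer-name prefixes are my own assumptions, not the actual code):

```python
# Sketch of the 4-branch VGG16 model, assuming standalone Keras
# with a TensorFlow backend and 224x224 RGB inputs.
from keras.applications.vgg16 import VGG16
from keras.layers import Input, Flatten, Concatenate, Dense
from keras.models import Model

inputs, features = [], []
for i in range(4):
    inp = Input(shape=(224, 224, 3), name='image_%d' % i)
    # Each branch is its own VGG16 instance (no weight sharing).
    vgg = VGG16(weights='imagenet', include_top=False, input_tensor=inp)
    # Prefix the layer names so the four copies do not collide when
    # combined into one model (tf.keras may require layer._name instead).
    for layer in vgg.layers:
        layer.name = 'branch%d_%s' % (i, layer.name)
    inputs.append(inp)
    features.append(Flatten()(vgg.output))

x = Concatenate()(features)                 # merge the four feature vectors
x = Dense(256, activation='relu')(x)        # hypothetical dense width
output = Dense(4, activation='softmax')(x)  # 4 classes
model = Model(inputs=inputs, outputs=output)
```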
I first do transfer learning by training only the dense layers and freezing the VGG16 layers, and that works fine.
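Roughly how the two training phases look (again a sketch; `model`, the `branch` prefix, and the optimizer/loss are the hypothetical names from the snippet above):

```python
for layer in model.layers:
    if layer.name.startswith('branch'):  # the VGG16 layers
        layer.trainable = False          # phase 1: train only the dense head
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])
# model.fit([img1, img2, img3, img4], labels, ...)

for layer in model.layers:
    layer.trainable = True               # phase 2: fine-tune everything
# Recompile so the new trainable flags take effect.
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])
```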
However, after unfreezing the VGG16 layers by setting `layer.trainable = True`, I get the following errors:
tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2018 23:12:28.501894: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Found device 0 with properties:
name: GeForce GTX TITAN X major: 5 minor: 2 memoryClockRate(GHz): 1.076
pciBusID: 0000:0a:00.0
totalMemory: 11.93GiB freeMemory: 11.71GiB
2018 23:12:28.744990: I tensorflow/stream_executor/cuda/cuda_dnn.cc:444] could not convert BatchDescriptor {count: 0 feature_map_count: 512 spatial: 14 14 value_min: 0.000000 value_max: 0.000000 layout: BatchDepthYX} to cudnn tensor descriptor: CUDNN_STATUS_BAD_PARAM
Then I follow the solution on this page and set `os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'`. The first message above is gone, but I still get the second error:
tensorflow/stream_executor/cuda/cuda_dnn.cc:444] could not convert BatchDescriptor to cudnn tensor descriptor: CUDNN_STATUS_BAD_PARAM
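For reference, this is how I apply the suppression (a minimal sketch; the variable must be set before TensorFlow is imported):

```python
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'  # hide INFO and WARNING messages

import tensorflow as tf  # import only after the variable is set
```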
If I freeze the VGG16 layers again, the code works fine. In other words, these errors occur only when I make the VGG16 layers trainable.
I also build a model with only a single VGG16, and that model works fine as well.
So, in summary, I only get these errors when I unfreeze the VGG16 layers in the model with 4 VGG16 branches.
Any ideas how to fix this?