I am working on manually converting a pretrained matconvnet model to a tensorflow model. I have pulled the weights/biases from the matconvnet model mat file using scipy.io and obtained numpy matrices for the weights and biases.
Code snippets, where data is the dictionary returned by scipy.io:
for i in data['net2']['layers']:
    if i.type == 'conv':
        model.append({'weights': i.weights[0], 'bias': i.weights[1], 'stride': i.stride, 'padding': i.pad, 'momentum': i.momentum, 'lr': i.learningRate, 'weight_decay': i.weightDecay})
...
weights = {
    'wc1': tf.Variable(model[0]['weights']),
    'wc2': tf.Variable(model[2]['weights']),
    'wc3': tf.Variable(model[4]['weights']),
    'wc4': tf.Variable(model[6]['weights'])
}
...
Here model[0]['weights'] is, for example, the 4x4x60 numpy array pulled from the matconvnet model for that layer. This is how I define the placeholder for the 9x9 inputs:
X = tf.placeholder(tf.float32, [None, 9, 9]) #also tried with [None, 81] with a tf.reshape, [None, 9, 9, 1]
Current Issue: I cannot get the ranks to match up. I consistently get a ValueError:
ValueError: Shape must be rank 4 but is rank 3 for 'Conv2D' (op: 'Conv2D') with input shapes: [?,9,9], [4,4,60]
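For reference, the rank rule behind this error can be sketched in plain Python. TensorFlow's 2-D convolution expects a rank-4 input laid out as [batch, height, width, in_channels] and a rank-4 filter laid out as [filter_height, filter_width, in_channels, out_channels]. The helper name conv2d_output_shape below is hypothetical, and the sketch assumes VALID padding, a single input channel, and that the 60 in the 4x4x60 array counts output filters; none of these is confirmed by the model file.

```python
def conv2d_output_shape(input_shape, filter_shape, stride=1):
    """Check conv2d-style shapes and return the VALID-padding output shape.

    input_shape:  (batch, height, width, in_channels); batch may be None
    filter_shape: (filter_height, filter_width, in_channels, out_channels)
    """
    if len(input_shape) != 4 or len(filter_shape) != 4:
        raise ValueError(
            "Shape must be rank 4 but got ranks %d and %d"
            % (len(input_shape), len(filter_shape)))
    batch, height, width, in_ch = input_shape
    f_h, f_w, f_in, f_out = filter_shape
    if in_ch != f_in:
        raise ValueError("in_channels mismatch: %d vs %d" % (in_ch, f_in))
    # VALID padding: the filter must fit entirely inside the input.
    out_h = (height - f_h) // stride + 1
    out_w = (width - f_w) // stride + 1
    return (batch, out_h, out_w, f_out)


# The shapes from the error message fail the rank check ...
try:
    conv2d_output_shape((None, 9, 9), (4, 4, 60))
except ValueError as err:
    print(err)

# ... while rank-4 versions pass (assuming 1 input channel, 60 filters).
print(conv2d_output_shape((None, 9, 9, 1), (4, 4, 1, 60)))
```

Under these assumptions the second call returns (None, 6, 6, 60), which suggests that both the placeholder and the weight array need one extra axis rather than a rotation.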
Summary
- Is it possible to explicitly define a tensorflow model's weights from numpy arrays?
- Why is the rank for my weight matrices 4? Should my numpy array be something more like [?, 4, 4, 60], and can I make it that way?
Unsuccessfully Attempted:
- Rotating numpy matrices: I know that matlab and python index differently (1-based vs. 0-based, and column-major vs. row-major). Even though I believe I have converted everything appropriately, I have still experimented with functions like np.rot90(), changing the 4x4x60 array to 60x4x4.
- Using tf.reshape: I have attempted to use tf.reshape on the weights after wrapping them in tf.Variable, but I get "Variable has no attribute 'reshape'".
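One way to sidestep the reshape error is to fix the shape on the numpy side, before the array is wrapped in tf.Variable. The following is a minimal sketch, not a confirmed fix: add_in_channel_axis is a hypothetical helper, and it assumes the missing axis is a singleton in_channels dimension. That would be the case if, for instance, the file was loaded with squeeze_me=True, which makes scipy.io.loadmat drop unit dimensions so that a 4x4x1x60 filter bank arrives as 4x4x60; the loading call is not shown here, so this is only a guess.

```python
import numpy as np


def add_in_channel_axis(w):
    """Turn a (height, width, out_channels) array into
    (height, width, 1, out_channels) without changing any values."""
    w = np.asarray(w, dtype=np.float32)
    if w.ndim != 3:
        raise ValueError("expected a rank-3 array, got rank %d" % w.ndim)
    height, width, out_channels = w.shape
    return w.reshape(height, width, 1, out_channels)


# Stand-in for model[0]['weights']: a 4x4x60 array with distinct values.
w = np.arange(4 * 4 * 60, dtype=np.float32).reshape(4, 4, 60)
w4 = add_in_channel_axis(w)
print(w4.shape)

# w4 can then be wrapped as before, e.g. tf.Variable(w4).
```

The input side would need the matching change, for example a placeholder of shape [None, 9, 9, 1], so that its last axis agrees with the filter's in_channels axis.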
NOTE: I am aware that there are a number of scripts to go from matconvnet to caffe, and from caffe to tensorflow (as described here, for example: https://github.com/vlfeat/matconvnet/issues/1021). My question is related to tensorflow weight initialization options: