Considering this tutorial and this question: if we try to calculate the number of parameters that the Caffe framework uses, we can see the layer-wise parameters with:
for layer_name, param in net.params.iteritems():
    print layer_name + '\t' + str(param[0].data.shape), str(param[1].data.shape)
From the tutorial:
The param shapes typically have the form (output_channels, input_channels, filter_height, filter_width) (for the weights) and the 1-dimensional shape (output_channels,) (for the biases).
The output is:
conv1 (96, 3, 11, 11) (96,)
conv2 (256, 48, 5, 5) (256,)
conv3 (384, 256, 3, 3) (384,)
conv4 (384, 192, 3, 3) (384,)
conv5 (256, 192, 3, 3) (256,)
fc6 (4096, 9216) (4096,)
fc7 (4096, 4096) (4096,)
fc8 (1000, 4096) (1000,)
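To make the shape convention concrete, here is a minimal sketch that turns the shapes printed above into per-layer element counts. It does not need Caffe: the `SHAPES` list and the `count` helper are hypothetical names introduced for illustration, and the shapes are simply copied from the output above.

```python
from functools import reduce
from operator import mul

# (layer name, weight shape, bias shape), copied from the printed output.
# SHAPES and count() are illustrative names, not part of Caffe.
SHAPES = [
    ("conv1", (96, 3, 11, 11), (96,)),
    ("conv2", (256, 48, 5, 5), (256,)),
    ("conv3", (384, 256, 3, 3), (384,)),
    ("conv4", (384, 192, 3, 3), (384,)),
    ("conv5", (256, 192, 3, 3), (256,)),
    ("fc6", (4096, 9216), (4096,)),
    ("fc7", (4096, 4096), (4096,)),
    ("fc8", (1000, 4096), (1000,)),
]


def count(shape):
    """Number of elements in an array with the given shape."""
    return reduce(mul, shape, 1)


# Map each layer to (number of weight elements, number of bias elements).
per_layer = {name: (count(w), count(b)) for name, w, b in SHAPES}

for name, _, _ in SHAPES:
    n_weights, n_biases = per_layer[name]
    print("%s\tweights=%d\tbiases=%d" % (name, n_weights, n_biases))
```

For example, conv1 has 96 * 3 * 11 * 11 = 34848 weight elements and 96 bias elements.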
And according to the question, the total number of parameters is calculated with:
sum([prod(v[0].data.shape) for k, v in net.params.items()])
What happens to the bias? Shouldn't we add param[1] to the sum? What does Caffe put into the bias parameters (0, 1, or something else)? The fifth parameter is the bias, isn't it? Am I understanding this correctly?
Edit: If I multiply them with this code:
weight_param = 0
for k, v in net.params.items():
    weight_param = weight_param + prod(v[0].data.shape) * prod(v[1].data.shape)
It returns a huge number: 228224076800. Are those the real parameters used by the system?
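For comparison, the three ways of counting discussed here can be reproduced from the printed shapes alone. This is only a sketch: `SHAPES` and `count` are hypothetical names, and treating each bias element as one additional value to add to the sum is an assumption made for this comparison, not something the tutorial states.

```python
from functools import reduce
from operator import mul

# (weight shape, bias shape) per layer, copied from the printed output.
# SHAPES and count() are illustrative names, not part of Caffe.
SHAPES = [
    ((96, 3, 11, 11), (96,)),
    ((256, 48, 5, 5), (256,)),
    ((384, 256, 3, 3), (384,)),
    ((384, 192, 3, 3), (384,)),
    ((256, 192, 3, 3), (256,)),
    ((4096, 9216), (4096,)),
    ((4096, 4096), (4096,)),
    ((1000, 4096), (1000,)),
]


def count(shape):
    """Number of elements in an array with the given shape."""
    return reduce(mul, shape, 1)


# Weights only, as in the expression from the linked question.
weights_only = sum(count(w) for w, b in SHAPES)

# Assumption: each bias element is one extra value, so it is added.
weights_plus_biases = sum(count(w) + count(b) for w, b in SHAPES)

# Weight count multiplied by bias count, as in the Edit above.
multiplied = sum(count(w) * count(b) for w, b in SHAPES)

print("weights only:        %d" % weights_only)
print("weights plus biases: %d" % weights_plus_biases)
print("multiplied:          %d" % multiplied)
```

With these shapes, the weights-only sum is 60954656, adding the biases gives 60965224, and the multiplied version reproduces the 228224076800 figure from the Edit.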