This method is a utility function of Faster R-CNN, so I assume you already understand what the "anchor" proposed in Faster R-CNN is.
`base_size` and `anchor_scales` determine the size of the anchor. For example, when `base_size=16` and `anchor_scales=[8, 16, 32]` (and `ratio=1.0`), the height and width of the anchors will be 16 * [8, 16, 32] = (128, 256, 512), as you expected.
`ratio` determines the height-to-width aspect ratio.
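To make the size arithmetic concrete, here is a minimal NumPy sketch of how such an anchor base could be computed. It is my own re-implementation for illustration, not the library's code: the function name `anchor_base_sketch` is hypothetical, and I am assuming that boxes are stored as (y_min, x_min, y_max, x_max), that the center is at `base_size / 2`, and that the ratio is applied as a square-root factor so the anchor area stays the same across ratios.

```python
import numpy as np


def anchor_base_sketch(base_size=16, ratios=(0.5, 1.0, 2.0),
                       anchor_scales=(8, 16, 32)):
    """Return relative anchors as (y_min, x_min, y_max, x_max) rows.

    Hypothetical helper for illustration only; the layout and the
    sqrt-based ratio handling are assumptions, not confirmed API.
    """
    # Center of the base cell (assumed to be the middle of base_size).
    py = base_size / 2.0
    px = base_size / 2.0

    anchors = np.zeros((len(ratios) * len(anchor_scales), 4),
                       dtype=np.float32)
    for i, ratio in enumerate(ratios):
        for j, scale in enumerate(anchor_scales):
            # ratio = h / w, with the area (base_size * scale)^2 kept fixed.
            h = base_size * scale * np.sqrt(ratio)
            w = base_size * scale * np.sqrt(1.0 / ratio)
            index = i * len(anchor_scales) + j
            anchors[index] = (py - h / 2.0, px - w / 2.0,
                              py + h / 2.0, px + w / 2.0)
    return anchors


square = anchor_base_sketch(16, ratios=(1.0,), anchor_scales=(8, 16, 32))
heights = square[:, 2] - square[:, 0]  # 128, 256, 512 for ratio 1.0
widths = square[:, 3] - square[:, 1]   # same as heights for ratio 1.0
```

Note that the first row is (-56, -56, 72, 72): the coordinates are relative to the base cell, so negative values are expected here.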
(I might be wrong in the paragraph below; please correct me if so.)
I think `base_size` needs to be set to the scale of the current hidden layer relative to the input image. In the `chainercv` Faster R-CNN implementation, the feature from `extractor` is fed into `rpn` (the region proposal network), and `generate_anchor_base` is used in `rpn`. So you need to pay attention to what the output feature of `extractor` is. `chainercv` uses VGG16 as the feature extractor, and the `conv5_3` layer is used as the extracted feature (see here). By this layer, `max_pooling_2d` has been applied 4 times, which makes the feature map 2^4 = 16 times smaller than the input.
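The stride arithmetic can be written out as a tiny sketch. The helper name `feature_stride` is hypothetical, and I am assuming every pooling layer halves the resolution (stride 2) while the convolutions in between keep it unchanged.

```python
def feature_stride(num_poolings, pool_stride=2):
    """Downsampling factor after `num_poolings` stride-2 poolings.

    Hypothetical helper; assumes the convolutions in between
    preserve the spatial size.
    """
    return pool_stride ** num_poolings


# Four poolings precede conv5_3 in VGG16, so one feature-map cell
# corresponds to a 16 x 16 pixel region of the input image.
stride = feature_stride(4)
```

Under this assumption, the resulting value is what I would pass as `base_size`.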
As for the other question, I think your understanding is correct: `py - h / 2` will be a negative value. But this `anchor_base` value is just a relative value. Once `anchor_base` is prepared at model initialization (here), the actual (absolute) anchors are created in each forward call (here) by the `_enumerate_shifted_anchor` method.
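Here is a rough NumPy sketch of how relative anchors can be turned into absolute ones by shifting them over the feature-map grid. This is my own illustration of the idea, not the library's method: the function name `enumerate_shifted_anchors_sketch` and its parameters are hypothetical, and I am assuming the (y_min, x_min, y_max, x_max) layout and a stride of 16.

```python
import numpy as np


def enumerate_shifted_anchors_sketch(anchor_base, feat_stride, height, width):
    """Place every relative anchor at every feature-map cell.

    Hypothetical helper for illustration. `height` and `width` are the
    feature-map size; `feat_stride` is the input pixels per cell.
    """
    # Top-left corner of each cell, in input-image pixels.
    shift_y = np.arange(0, height * feat_stride, feat_stride)
    shift_x = np.arange(0, width * feat_stride, feat_stride)
    shift_x, shift_y = np.meshgrid(shift_x, shift_y)

    # One (y, x, y, x) offset per cell, in row-major cell order.
    shift = np.stack((shift_y.ravel(), shift_x.ravel(),
                      shift_y.ravel(), shift_x.ravel()), axis=1)

    # Broadcast: (K cells, 1, 4) + (1, A anchors, 4) -> (K, A, 4).
    anchors = shift[:, None, :] + anchor_base[None, :, :]
    return anchors.reshape(-1, 4).astype(np.float32)


# A single relative anchor: base_size=16, scale=8, ratio=1.0.
anchor_base = np.array([[-56.0, -56.0, 72.0, 72.0]], dtype=np.float32)
anchors = enumerate_shifted_anchors_sketch(anchor_base, feat_stride=16,
                                           height=2, width=3)
```

Anchors near the image border can still have negative coordinates after shifting; they simply extend outside the image.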