I'm currently reading the YOLOv2 paper, and I couldn't understand why YOLO and YOLOv2 downsample the input by a factor of 32 in multi-scale training.
Can someone explain why the width and height must be a multiple of 32?
I know that YOLO takes images of size 320×320, 352×352, … up to 608×608 (with a step of 32), but that alone doesn't answer the question for me.
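To make concrete what I mean, here is a small sketch (my own illustration, not from the paper) of the multi-scale training sizes and the output grid each one would produce, assuming the network downsamples the input by an overall factor of 32:

```python
# My own illustration: the multi-scale sizes used in YOLOv2 training
# and the final feature-map grid each would give, assuming an overall
# downsampling factor of 32.
DOWNSAMPLE = 32
sizes = range(320, 608 + 1, DOWNSAMPLE)  # 320, 352, ..., 608

for s in sizes:
    grid = s // DOWNSAMPLE
    print(f"input {s}x{s} -> {grid}x{grid} grid")
```

So I can see that each size divides evenly by 32 (giving 10×10 up to 19×19 grids), but I don't understand why the factor is 32 in the first place, or what would break with a size that isn't a multiple of 32.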