Training Yolov3-tiny on Google Colab, but it stopped after 4000 iterations. How do I continue training?

Question

(I am a beginner.) I trained the model with yolov3-tiny.cfg and darknet53.conv.74 because I had trouble loading yolov3-tiny.weights (not sure if this matters). The model trained in Colab for 4000 iterations (a couple of hours) before it stopped. When I use these weights, the model performs poorly (I know tiny YOLO is less precise, but this is extremely inaccurate). I'm pretty sure this is too few iterations, but when I try to continue training from the last weights saved on my Drive, I run this command:

!./darknet detector train data/obj.data cfg/yolov3-tiny_training.cfg /mydrive/yolov3/yolov3-tiny_training_last.weights -dont_show

Running it produces this output:

 CUDA-version: 10010 (10010), cuDNN: 7.6.5, GPU count: 1  
 OpenCV version: 3.2.0
yolov3-tiny_training
 0 : compute_capability = 370, cudnn_half = 0, GPU: Tesla K80 
net.optimized_memory = 0 
mini_batch = 4, batch = 64, time_steps = 1, train = 1 
   layer   filters  size/strd(dil)      input                output
   0 conv     16       3 x 3/ 1    416 x 416 x   3 ->  416 x 416 x  16 0.150 BF
   1 max                2x 2/ 2    416 x 416 x  16 ->  208 x 208 x  16 0.003 BF
   2 conv     32       3 x 3/ 1    208 x 208 x  16 ->  208 x 208 x  32 0.399 BF
   3 max                2x 2/ 2    208 x 208 x  32 ->  104 x 104 x  32 0.001 BF
   4 conv     64       3 x 3/ 1    104 x 104 x  32 ->  104 x 104 x  64 0.399 BF
   5 max                2x 2/ 2    104 x 104 x  64 ->   52 x  52 x  64 0.001 BF
   6 conv    128       3 x 3/ 1     52 x  52 x  64 ->   52 x  52 x 128 0.399 BF
   7 max                2x 2/ 2     52 x  52 x 128 ->   26 x  26 x 128 0.000 BF
   8 conv    256       3 x 3/ 1     26 x  26 x 128 ->   26 x  26 x 256 0.399 BF
   9 max                2x 2/ 2     26 x  26 x 256 ->   13 x  13 x 256 0.000 BF
  10 conv    512       3 x 3/ 1     13 x  13 x 256 ->   13 x  13 x 512 0.399 BF
  11 max                2x 2/ 1     13 x  13 x 512 ->   13 x  13 x 512 0.000 BF
  12 conv   1024       3 x 3/ 1     13 x  13 x 512 ->   13 x  13 x1024 1.595 BF
  13 conv    256       1 x 1/ 1     13 x  13 x1024 ->   13 x  13 x 256 0.089 BF
  14 conv    512       3 x 3/ 1     13 x  13 x 256 ->   13 x  13 x 512 0.399 BF
  15 conv     21       1 x 1/ 1     13 x  13 x 512 ->   13 x  13 x  21 0.004 BF
  16 yolo
[yolo] params: iou loss: mse (2), iou_norm: 0.75, cls_norm: 1.00, scale_x_y: 1.00
  17 route  13                                 ->   13 x  13 x 256 
  18 conv    128       1 x 1/ 1     13 x  13 x 256 ->   13 x  13 x 128 0.011 BF
  19 upsample                 2x    13 x  13 x 128 ->   26 x  26 x 128
  20 route  19 8                               ->   26 x  26 x 384 
  21 conv    256       3 x 3/ 1     26 x  26 x 384 ->   26 x  26 x 256 1.196 BF
  22 conv     21       1 x 1/ 1     26 x  26 x 256 ->   26 x  26 x  21 0.007 BF
  23 yolo
[yolo] params: iou loss: mse (2), iou_norm: 0.75, cls_norm: 1.00, scale_x_y: 1.00
Total BFLOPS 5.449 
avg_outputs = 325057 
 Allocate additional workspace_size = 12.46 MB 
Loading weights from /mydrive/yolov3/yolov3-tiny_training_last.weights...
 seen 64, trained: 256 K-images (4 Kilo-batches_64) 
Done! Loaded 24 layers from weights-file 
Learning Rate: 0.001, Momentum: 0.9, Decay: 0.0005
 Detection layer: 16 - type = 27 
 Detection layer: 23 - type = 27 
Saving weights to /mydrive/yolov3/yolov3-tiny_training_final.weights
 Create 6 permanent cpu-threads

Does anyone know how to load the last weights in so it continues training?

score 0 · Answer 1 · answered Jul 09 '20 at 15:46

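To solve this, I added -clear 1 to the end of the train command. The weights file stores a count of the images the model has already been trained on, and this flag resets that count, so darknet starts counting iterations from zero again instead of treating the run as already complete, as discussed in this post.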
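To check whether a saved run has already reached its iteration limit before resuming, something like the following could be run in a Colab cell. It is only a sketch under assumptions: it assumes the AlexeyAB darknet weights layout (three 32-bit version integers followed by the image counter, 64-bit for file versions 0.2 and later), and it assumes training stops once the iteration count reaches max_batches in the cfg. The helper names read_seen_images, read_cfg_int and training_already_finished are made up for illustration and are not part of darknet.

```python
import re
import struct


def read_seen_images(weights_path):
    """Return the 'seen images' counter from a darknet .weights header.

    Assumed layout: int32 major, int32 minor, int32 revision, then the
    counter (uint64 if major*10 + minor >= 2, otherwise uint32).
    """
    with open(weights_path, "rb") as f:
        major, minor, _revision = struct.unpack("<3i", f.read(12))
        if major * 10 + minor >= 2:
            (seen,) = struct.unpack("<Q", f.read(8))
        else:
            (seen,) = struct.unpack("<I", f.read(4))
    return seen


def read_cfg_int(cfg_path, key):
    """Return the first uncommented integer value of `key` in a .cfg file."""
    pattern = re.compile(rf"^\s*{re.escape(key)}\s*=\s*(\d+)")
    with open(cfg_path) as f:
        for line in f:
            match = pattern.match(line)
            if match:
                return int(match.group(1))
    raise KeyError(key)


def training_already_finished(weights_path, cfg_path):
    """Compare iterations stored in the weights with max_batches in the cfg.

    Returns (finished, iterations, max_batches).
    """
    batch = read_cfg_int(cfg_path, "batch")
    max_batches = read_cfg_int(cfg_path, "max_batches")
    iterations = read_seen_images(weights_path) // batch
    return iterations >= max_batches, iterations, max_batches
```

For the run in the question, the log line "trained: 256 K-images (4 Kilo-batches_64)" corresponds to about 4000 iterations at batch = 64, so if max_batches in the cfg is 4000, this check would report the run as finished. In that case, either resetting the counter as described above or raising max_batches in the cfg should let training continue.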

Jordan
