Why is the Average Precision of my TensorFlow model -1.000?

Question

I've been training an object_detection TensorFlow model for a while now, but I've noticed that the evaluation table always shows a value of -1.000, whereas I've seen other peoples' outputs display various values between 0 and 1. Is there something wrong with my TFRecords or something? I've evaluated the output and after a few thousand steps the model is definitely improving, so I'm not sure why this always happens. Thank you!

My output:

 Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = -1.000
 Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = -1.000
 Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = -1.000
 Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
 Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
 Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = -1.000
 Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = -1.000
 Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = -1.000
 Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = -1.000
 Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
 Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
 Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = -1.000

My TensorBoard Output

My pipeline.config file:

model {
  ssd {
    num_classes: 2
    image_resizer {
      fixed_shape_resizer {
        height: 300
        width: 300
      }
    }
    feature_extractor {
      type: "ssd_mobilenet_v1"
      depth_multiplier: 1.0
      min_depth: 16
      conv_hyperparams {
        regularizer {
          l2_regularizer {
            weight: 3.99999989895e-05
          }
        }
        initializer {
          truncated_normal_initializer {
            mean: 0.0
            stddev: 0.0299999993294
          }
        }
        activation: RELU_6
        batch_norm {
          decay: 0.999700009823
          center: true
          scale: true
          epsilon: 0.0010000000475
          train: true
        }
      }
    }
    box_coder {
      faster_rcnn_box_coder {
        y_scale: 10.0
        x_scale: 10.0
        height_scale: 5.0
        width_scale: 5.0
      }
    }
    matcher {
      argmax_matcher {
        matched_threshold: 0.5
        unmatched_threshold: 0.5
        ignore_thresholds: false
        negatives_lower_than_unmatched: true
        force_match_for_each_row: true
      }
    }
    similarity_calculator {
      iou_similarity {
      }
    }
    box_predictor {
      convolutional_box_predictor {
        conv_hyperparams {
          regularizer {
            l2_regularizer {
              weight: 3.99999989895e-05
            }
          }
          initializer {
            truncated_normal_initializer {
              mean: 0.0
              stddev: 0.0299999993294
            }
          }
          activation: RELU_6
          batch_norm {
            decay: 0.999700009823
            center: true
            scale: true
            epsilon: 0.0010000000475
            train: true
          }
        }
        min_depth: 0
        max_depth: 0
        num_layers_before_predictor: 0
        use_dropout: false
        dropout_keep_probability: 0.800000011921
        kernel_size: 1
        box_code_size: 4
        apply_sigmoid_to_scores: false
      }
    }
    anchor_generator {
      ssd_anchor_generator {
        num_layers: 6
        min_scale: 0.20000000298
        max_scale: 0.949999988079
        aspect_ratios: 1.0
        aspect_ratios: 2.0
        aspect_ratios: 0.5
        aspect_ratios: 3.0
        aspect_ratios: 0.333299994469
      }
    }
    post_processing {
      batch_non_max_suppression {
        score_threshold: 0.300000011921
        iou_threshold: 0.600000023842
        max_detections_per_class: 100
        max_total_detections: 100
      }
      score_converter: SIGMOID
    }
    normalize_loss_by_num_matches: true
    loss {
      localization_loss {
        weighted_smooth_l1 {
        }
      }
      classification_loss {
        weighted_sigmoid {
        }
      }
      hard_example_miner {
        num_hard_examples: 3000
        iou_threshold: 0.990000009537
        loss_type: CLASSIFICATION
        max_negatives_per_positive: 3
        min_negatives_per_image: 0
      }
      classification_weight: 1.0
      localization_weight: 1.0
    }
  }
}
train_config {
  batch_size: 24
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
  data_augmentation_options {
    ssd_random_crop {
    }
  }
  optimizer {
    rms_prop_optimizer {
      learning_rate {
        exponential_decay_learning_rate {
          initial_learning_rate: 0.00400000018999
          decay_steps: 800720
          decay_factor: 0.949999988079
        }
      }
      momentum_optimizer_value: 0.899999976158
      decay: 0.899999976158
      epsilon: 1.0
    }
  }
  fine_tune_checkpoint: "pretrained_model/model.ckpt"
  from_detection_checkpoint: true
  num_steps: 300
}
train_input_reader {
  label_map_path: "./CMFCD/pascal_label_map.pbtxt"
  tf_record_input_reader {
    input_path: "./CMFCD/data/annotations/train.record"
  }
}
eval_config {
  num_examples: 47
  max_evals: 10
  use_moving_averages: false
}
eval_input_reader {
  label_map_path: "./CMFCD/pascal_label_map.pbtxt"
  shuffle: false
  num_readers: 1
  tf_record_input_reader {
    input_path: "./CMFCD/data/annotations/test.record"
  }
}

EDIT - SOLUTION: It's because the script I was using to generate the TFRecords was outdated and created empty records with no bounding boxes, only images. I found the updated script and used that, it works fine now.

-1 is the default value if such category is absent. See [here](https://stackoverflow.com/a/55196020/1621414) for better explanations. — Danny Fang, Jun 26 '19 at 14:07
@danyfang by category do you mean class of object to be detected? — Charles Averill, Jun 26 '19 at 14:13
Here category means the condition within the brackets, e.g. [ IoU=0.50:0.95 | area= all | maxDets=100 ], this one means with maximum 100 detections, and detection boxes of all sizes and detection boxes have IoU bigger than 0.5. If one detection box that satisfies the condition, it will be considered in this category and then calculate the precision/recall. — Danny Fang, Jun 26 '19 at 14:18
BTW, this evaluation metrics is of COCO format. This [page](http://cocodataset.org/#detection-eval) has more info — Danny Fang, Jun 26 '19 at 14:19
@danyfang Okay, that makes sense to me. However it still confuses me because my output images have detection boxes with confidence values typically over 70%. Did I mess something up in my config file or something? Thank you for the help by the way. — Charles Averill, Jun 26 '19 at 14:20
It is uncommon to see all categories to be -1. You can paste your config file here. — Danny Fang, Jun 26 '19 at 14:26
I didn't see anything particular within the config file except the num_steps is really small. Typically you should train the model until the training loss goes below 1. — Danny Fang, Jun 26 '19 at 14:53
@danyfang My loss starts at 0.34541082 and hovers around there, loss = 0.345314, step = 100, loss = 0.3452035, step = 200, loss = 0.34443077, step = 900, Loss for final step (3000): 0.3443216. — Charles Averill, Jun 26 '19 at 15:13

score 0 · Accepted Answer · answered Jun 26 '19 at 13:57

0

In general this means that the detector did not produce any detection with a meaningful confidence score (all detections with zero confidence), so when evaluating AP there is nothing to evaluate, and the COCO API evaluation code returns -1.0

answered Jun 26 '19 at 13:57

Dr. Snoopy

55,122
7
121
140

Is this common while the model is still training at a low step count, or is there an error in my config file or something like that? – Charles Averill Jun 26 '19 at 14:11
@CharlesAverill no it is not common. There should be something wrong your settings. – Hadi GhahremanNezhad Jun 27 '19 at 21:38

Why is the Average Precision of my TensorFlow model -1.000?

1 Answers1