Multi-class object detection with tensorflow: weird behavior on evaluation

Question

I am using Tensorflow Object detection, with faster_rcnn_inception_v2_coco as pretrained model. I'm on Windows 10, with tensorflow-gpu 1.6 on NVIDIA GeForce GTX 1080, CUDA 9.0 and CUDNN 7.0.

I'm trying to training a multi-class object detection with a custom dataset, but I had some weird behavior. I have 2 classes: Pistol and Knife (with respectively 876 and 664 images, all with similar size from 360x200 to 640x360, and similar ratio). So, I think that the dataset is balanced. I splitted it into Train set (1386 images: 594 knife, 792 pistol) and Test set (154 images: 70 knife, 84 pistol)

The CNN seems that can detect only one of the two object with good accuracy, and which object can detect (of the two classes) changes randomly during the training steps and in the same image (example: step 10000 it detect only pistol, step 20000 only knife, step 30000 knife, step 40000 pistol, step 50000 knife, etc..), as showen below:

]

Moreover, the Loss looks weird, and the accuracy during the evaluation are never high for both classes together.

During the training phase, the loss seems to oscillate at every training step.

Loss:

Total Loss:

From the mAp (image below) you can see that the two objects are never identified together at the same step:

If I trained these two classes separately, I can achieve a good 50-60% accuracy. If I train these two classes together, the results is what you have seen.

Here you can find the generate_tfrecord.py and the model configuration file (that I changed to made it multi-class). The label map isthe following:

item {
  id: 1
  name: 'knife'
}

item {
  id: 2
  name: 'pistola'
}

Any suggest are welcome.

UPDATES After 600k iterations, loss is still oscillating. The scenario is the following: Loss, Total Loss, and mAp.

Did you randomize the order of the 1386 images: 594 knife, 792 pistol before feeding into the network for training? — Suleiman, Jan 30 '19 at 13:53
@Suleiman no, it's a "staggered" order (200 pistols then 50 knife, 50 pistols, 200 knife and so on..), and the batch is 1 (faster rcnn default). Is it important? There's a smart/easy way to do it or I need to do it manually, renaming the dataset randomly? — darkdrake, Jan 30 '19 at 13:58
It might be best to random shuffle them if not your network may just learn and predict the last class. With tensorflow you can easily implement it using python code like [this](https://stackoverflow.com/questions/976882/shuffling-a-list-of-objects) — Suleiman, Jan 30 '19 at 14:17
@Suleiman yes thanks, but i'm using the object detection tool of tensorflow ([link](https://github.com/tensorflow/models/tree/master/research/object_detection). Do you know where can I shuffle my train inputs there? i don't find it inside train.py — darkdrake, Jan 30 '19 at 14:21
I don't know about the object detection tool, but it looks like you can manually shuffle the images yourself but it will be quite tedious — Suleiman, Jan 30 '19 at 14:33
yes. however, I'll try and let you know. Thanks for your advice. — darkdrake, Jan 30 '19 at 14:57
@Suleiman just for completeness: shuffling my dataset changed nothing, loss and mAp still oscillating after 300k iterations. — darkdrake, Jan 31 '19 at 07:24
@Suleiman you were right, now it works, as I explained in the answer below. Thanks for the help. — darkdrake, Jan 31 '19 at 13:41

score 0 · Accepted Answer · answered Jan 31 '19 at 13:26

Finally, I solved my issue.

I follow the advice of @Suleiman, but at first time I shuffled only the test.csv and train.csv. I saw that inside my generate_tfrecords.py the items will be reordered by filename, so the shuffle from before was useless. I shuffled the dataset inside generate_tfrecords.py by changing

    examples = pd.read_csv(FLAGS.csv_input)
    grouped = split(examples, 'filename')
    for group in grouped:
        tf_example = create_tf_example(group, path)

to this:

    examples = pd.read_csv(FLAGS.csv_input)
    grouped = split(examples, 'filename')
    shuffle(grouped) // shuffling list of entries
    for group in grouped:
        tf_example = create_tf_example(group, path)

adding the shuffle of the list of entries. The results improved a lot, as you can see in the plots of Loss, Total Loss and mAp:

Loss and Total Loss:

mAp:

Now there's only a peak in the loss, maybe for some faults in the dataset that I'll will clean. Obviously, also the evaluation and the detection are now quite good.

SO REMEMBER: the order of the images in your TFRecords are very important (expecially when batch size is 1)!

Thanks Suleiman for the hint.

Multi-class object detection with tensorflow: weird behavior on evaluation

1 Answers1