5

Are the labels used for training and the ones used for validation the same? I thought they should be, but there seems to be a discrepancy in the labels that are available online. When I downloaded the ImageNet 2012 labels for the validation data from the official website, I got labels that start with kit_fox as the first label, which matches the 2012 validation images I downloaded from the official website. Here is an example of these labels: https://gist.github.com/aaronpolhamus/964a4411c0906315deb9f4a3723aac57

However, almost all pretrained models, including those trained by Google, use ImageNet labels that start with tench, Tinca tinca instead. See here: https://gist.github.com/yrevar/942d3a0ac09ec9e5eb3a

Why is there such a huge discrepancy? Where did the 'tinca tinca' kind of labels come from?

If we use the first label mapping, which corresponds to the actual validation images, we face another problem: 2 class names ("crane" and "maillot") are duplicated, i.e. two different classes share the same name (for "crane", the mechanical crane and the animal crane). Grouping images by name therefore yields 100 images in 2 of the classes instead of the expected 50. If we do not use the first mapping, where is a reliable source of validation images that correspond to the second label mapping?
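To illustrate the duplication problem, here is a minimal sketch that finds class names shared by more than one synset. The `wnid_to_name` dictionary and the helper `duplicated_names` are hypothetical, and the synset IDs are only illustrative; a real mapping would be loaded from whichever label file is in use.

```python
from collections import Counter

# Hypothetical excerpt of a synset-ID -> class-name mapping (IDs are illustrative).
wnid_to_name = {
    "n02012849": "crane",    # the bird
    "n03126707": "crane",    # the machine
    "n03710637": "maillot",
    "n03710721": "maillot",
    "n02119789": "kit_fox",
}

def duplicated_names(mapping):
    """Return the class names that are attached to more than one synset ID."""
    counts = Counter(mapping.values())
    return sorted(name for name, n in counts.items() if n > 1)

dups = duplicated_names(wnid_to_name)
print(dups)
```

Keying classes by synset ID rather than by name keeps such classes separate, so each one retains its own 50 validation images.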

kwotsin
I also realised that 'maillot' appears twice in the dataset, and it means the same thing both times. 'crane' also appears twice, but with different meanings: the bird and the object. – anushka Feb 23 '20 at 18:52

1 Answer

0

I had the same problem when fine-tuning. You can solve it by replacing the class names (such as tench, tinca tinca) with their synset numbers. You can find the mapping here
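A minimal sketch of that idea: use the synset number as the common key to convert label indices from one ordering to the other. The two lists below are hypothetical stand-ins for the two label files, and the assumption that the model's ordering is the alphabetically sorted list of synset IDs should be checked against the label file that ships with your model.

```python
def build_index_remap(source_wnids, target_wnids):
    """For each index in the source ordering, give the index of the same synset in the target ordering."""
    target_index = {wnid: i for i, wnid in enumerate(target_wnids)}
    return [target_index[wnid] for wnid in source_wnids]

# Hypothetical excerpt of the ordering used by the validation labels (illustrative IDs).
devkit_wnids = ["n02119789", "n03126707", "n01440764", "n02012849"]

# Assumption: the pretrained model orders its classes by sorted synset ID.
model_wnids = sorted(devkit_wnids)

remap = build_index_remap(devkit_wnids, model_wnids)

# Convert validation labels (indices into devkit_wnids) to the model's indices.
val_labels = [0, 2, 3, 1]
model_labels = [remap[label] for label in val_labels]
print(remap, model_labels)
```

Because the conversion never looks at class names, duplicated names such as "crane" do not cause collisions.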