I want to apply attention-ocr to detect all the digits on car number plates. I've read the attention_ocr README.md on GitHub (https://github.com/tensorflow/models/tree/master/research/attention_ocr) and the Stack Overflow answer on how to train the model with my own image data (https://stackoverflow.com/a/44461910/743658). However, I couldn't find any information on how to store the annotation or label of each picture, or on the format to use. For an object detection model, I was able to build my dataset with LabelImg, convert it to a csv file, and finally produce a .tfrecord file. I want to make a .tfrecord file in the FSNS dataset format.

Can you advise me on how to proceed with these training steps?

2 Answers

Please reread the answer you mentioned; it has a section explaining how to store the annotation. It is stored in three features: image/text, image/class and image/unpadded_class. The image/text field is used for visualization. Some models support unpadded sequences and use image/unpadded_class, while the default version relies on image/class, which holds the text padded with null characters to a fixed length. Here is the excerpt that stores the text annotation:

char_ids_padded, char_ids_unpadded = encode_utf8_string(
   text, charset, length, null_char_id)
example = tf.train.Example(features=tf.train.Features(
  feature={
    'image/class': _int64_feature(char_ids_padded),
    'image/unpadded_class': _int64_feature(char_ids_unpadded),
    'image/text': _bytes_feature(text)
    ...
  }
))
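
To make the padded and unpadded ids concrete, here is a minimal sketch of what a helper like encode_utf8_string could look like. It is not the code from the linked answer. The charset layout (a dict from character to integer id), the digits-only charset, and the null id of 10 are all assumptions chosen for this example.

```python
def encode_utf8_string(text, charset, length, null_char_id):
    """Return (padded_ids, unpadded_ids) for a text label.

    charset is assumed to be a dict mapping each character to an integer id.
    """
    char_ids_unpadded = [charset[c] for c in text]
    if len(char_ids_unpadded) > length:
        raise ValueError("text is longer than the fixed sequence length")
    # Pad with the null character id so every label has the same length.
    padding = [null_char_id] * (length - len(char_ids_unpadded))
    return char_ids_unpadded + padding, char_ids_unpadded


# Hypothetical charset for number plates that contain digits only.
charset = {str(d): d for d in range(10)}
null_char_id = 10  # assumed id, outside the digit range

char_ids_padded, char_ids_unpadded = encode_utf8_string(
    "7421", charset, 8, null_char_id)
print(char_ids_padded)    # [7, 4, 2, 1, 10, 10, 10, 10]
print(char_ids_unpadded)  # [7, 4, 2, 1]
```

The padded list would go into image/class and the unpadded list into image/unpadded_class.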
Alexander Gorban

If you have worked with TensorFlow object detection, then the approach should be much easier for you.

  1. You can create the annotation file (in .csv format) using LabelImg or any other annotation tool.

However, before converting it into the TensorFlow format (.tfrecord), keep the annotation format in mind (the FSNS format in this case).

The format is: files text xmin ymin xmax ymax

So while annotating, don't worry much about the class as you would in object detection; any placeholder name should suffice.

  2. Convert it into .tfrecord.

  3. Finally, the label map is the list of characters that you have annotated.
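
As a rough sketch of steps 1 and 3, the snippet below reads annotation rows and derives a label map from the annotated text. The column names come from the format line above; the comma delimiter, the sample rows, and the helper names read_annotations and build_label_map are assumptions made up for illustration.

```python
import csv
import io

# Hypothetical annotation rows in the "files text xmin ymin xmax ymax" layout.
SAMPLE_CSV = """files,text,xmin,ymin,xmax,ymax
car_001.jpg,7421,34,50,180,92
car_002.jpg,0935,12,44,160,88
"""


def read_annotations(csv_file):
    """Parse annotation rows into dicts with file name, text and box."""
    rows = []
    for row in csv.DictReader(csv_file):
        rows.append({
            "file": row["files"],
            "text": row["text"],
            "box": tuple(int(row[k]) for k in ("xmin", "ymin", "xmax", "ymax")),
        })
    return rows


def build_label_map(rows):
    """Assign an integer id to every distinct character seen in the labels."""
    chars = sorted({c for row in rows for c in row["text"]})
    return {c: i for i, c in enumerate(chars)}


annotations = read_annotations(io.StringIO(SAMPLE_CSV))
label_map = build_label_map(annotations)
print(label_map)
```

With a real file you would pass an open file handle instead of the in-memory sample.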

Hope it helps!

Ebin Zacharias