This is more of an algorithmy question - I am not very mathematical so was looking for an engineery solution... If this is off topic for SO let me know and I will delete the question.
I created a mashup of open source goodness to do Optical Character Recognition on difficult backgrounds: https://github.com/metalaureate/tesseract-docker-ocr
I want to use it to scan labels with a pre-defined ID code, e.g., 2826672. The accuracy is about 70% for digits.
Question: how do I add redundancy programmatically to my code to increase accuracy to 99%, and how do I decode it? I can imagine some really kludgy ways, like doubling and inverting the digits, but I don't know how to do this in a way that honors information theory without my having to translate a lot of math.
How do I add and decode digits to correct for OCR errors?