I'm using Tesseract with Python to read license plates using the function image_to_string(). The license plates contain only uppercase letters and digits. Occasionally, Tesseract misreads a digit or an uppercase letter as a lowercase letter.
I know that I can specify a whitelist of characters containing only uppercase letters and digits. What I really want to know is how the whitelist works: does it make the OCR algorithm skip non-whitelisted characters and keep trying to match the symbol against whitelisted ones, or does it simply make image_to_string() discard any recognized characters that are not on the whitelist?
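
For reference, here is a minimal sketch of how such a whitelist might be passed, assuming the pytesseract wrapper and Pillow are the Python bindings in use (the question does not say which). The helper names build_config and read_plate and the choice of page segmentation mode 7 (single text line) are illustrative assumptions; tessedit_char_whitelist is the Tesseract config variable that holds the whitelist.

```python
import string

# Characters a plate may contain: uppercase letters and digits.
PLATE_CHARS = string.ascii_uppercase + string.digits


def build_config(whitelist=PLATE_CHARS, psm=7):
    """Build a Tesseract config string restricting output to `whitelist`.

    psm=7 (treat the image as a single text line) is an assumption that
    suits a cropped plate; adjust it for other layouts.
    """
    return f"--psm {psm} -c tessedit_char_whitelist={whitelist}"


def read_plate(image_path):
    """Hypothetical helper: OCR one plate image with the whitelist applied."""
    # Imported here so build_config() stays usable without the OCR stack.
    import pytesseract
    from PIL import Image

    text = pytesseract.image_to_string(
        Image.open(image_path), config=build_config()
    )
    return text.strip()
```

Calling read_plate("plate.png") (a placeholder path) would then return the recognized text with surrounding whitespace removed. The sketch only shows how the whitelist is supplied; it does not reveal whether Tesseract applies it during recognition or afterwards.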