The Tesseract documentation on improving OCR accuracy (https://code.google.com/p/tesseract-ocr/wiki/ImproveQuality#Noise) says that noise reduction on the bitmap is very important, so I started from the code in this question:
image processing to improve tesseract OCR accuracy
I have modified and debugged that code, and it now looks like this:
    public Bitmap RemoveNoise(Bitmap bmap)
    {
        for (int x = 0; x < bmap.getWidth(); x++)
        {
            for (int y = 0; y < bmap.getHeight(); y++)
            {
                int pixel = bmap.getPixel(x, y);
                if (pixel.R < 162 && pixel.G < 162 && pixel.B < 162)
                    bmap.setPixel(x, y, Color.BLACK);
            }
        }
        for (int x = 0; x < bmap.getWidth(); x++)
        {
            for (int y = 0; y < bmap.getHeight(); y++)
            {
                int pixel = bmap.getPixel(x, y);
                if (pixel.R > 162 && pixel.G > 162 && pixel.B > 162)
                    bmap.setPixel(x, y, Color.WHITE);
            }
        }
        return bmap;
    }
My problem is that I get errors on pixel.R, pixel.G and pixel.B, and that is where I am stuck right now. Also, is there a better algorithm or approach for removing noise from an image? Thanks.
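For context on the error: on Android, `Bitmap.getPixel` returns a packed ARGB `int`, and an `int` has no `.R`, `.G` or `.B` fields (those come from the C# `Color` struct that the referenced code was written for). The channels have to be extracted from the `int`, either with `Color.red(pixel)`, `Color.green(pixel)` and `Color.blue(pixel)`, or with bit shifts. Below is a minimal sketch of the same two threshold rules in plain Java, assuming the pixels have already been copied into an `int[]`. The class name `NoiseThreshold` and the method `threshold` are hypothetical names chosen for illustration, and the cutoff of 162 is simply the value used in the code above.

```java
// Hypothetical helper, not part of Android or Tesseract.
final class NoiseThreshold {
    private NoiseThreshold() {}

    // Same values as android.graphics.Color.BLACK and Color.WHITE.
    static final int BLACK = 0xFF000000;
    static final int WHITE = 0xFFFFFFFF;

    /**
     * Applies the two threshold rules from the question to packed ARGB
     * pixels, in place: pixels whose R, G and B are all below the cutoff
     * become black, pixels whose R, G and B are all above it become white,
     * and everything else is left unchanged.
     */
    static void threshold(int[] pixels, int cutoff) {
        for (int i = 0; i < pixels.length; i++) {
            int p = pixels[i];
            int r = (p >> 16) & 0xFF; // what Color.red(p) returns
            int g = (p >> 8) & 0xFF;  // what Color.green(p) returns
            int b = p & 0xFF;         // what Color.blue(p) returns

            if (r < cutoff && g < cutoff && b < cutoff) {
                pixels[i] = BLACK;
            } else if (r > cutoff && g > cutoff && b > cutoff) {
                pixels[i] = WHITE;
            }
        }
    }
}
```

On Android this could be wired up by reading the bitmap into an array with `bmap.getPixels(pixels, 0, w, 0, 0, w, h)`, calling `NoiseThreshold.threshold(pixels, 162)`, and writing the result back with `bmap.setPixels(pixels, 0, w, 0, 0, w, h)`; the bitmap has to be mutable for the write to succeed. Working on an array in one pass is also likely to be faster than calling `getPixel` and `setPixel` for every pixel twice. Note that this is plain thresholding rather than noise removal in the strict sense, so whether it helps OCR will depend on the images.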