I am trying to remove background color so as to improve the accuracy of OCR against images. A sample would look like below:
I'd keep all letters in the post-processed image while just removing the light purple color textured background. Is it possible to use some open source software such as Imagemagick to convert it to a binary image (black/white) to achieve this goal? What if the background has more than one color? Would the solution be the same?
Further, what if I also want to remove the purple letters (theater name) and the line so as to only keep the black color letters? Simple cropping might not work because the purple letters could appear at other places as well.
I am looking for a solution in programming, rather than via tools like Photoshop.