My approach would be as follows:
Keep some structure that records which pixels have already been handled.
Write a routine bool belongs_to_black_text(const PixelCoordinates& coords);
that is applied to every pixel in the image (replace the argument type with whatever you use for coordinates). It performs the following steps:
- Check whether the pixel was already handled; if so, return (this is what keeps the run time reasonable).
- Check whether the pixel is black; if not, return.
- Find the connected component of black pixels. Mark all its pixels as handled.
- Find the union of the connected components of all white pixels adjacent to the black component. Mark all its pixels as handled.
- Check whether the bounding box (called Frustrum in the pseudocode) of the white component is bigger than that of the black connected component. If it is, the white component surrounds the black component, so the black component is text.
- Check whether the bounding box of the white component has a certain minimal size, in order to rule out that it is a white letter with a black hole.
- Act accordingly (invert both components).
As step 7 also inverts the white component, we need to work on two images, the original and the result: the original is needed to retain the white component.
An improvement would be to mark white components that were already found to surround text, so that we do not have to find the component anew for each new letter.
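One possible way to realise this improvement is to label every white connected component once, up front, and store its bounding box, so that each letter only needs a label lookup. The following is a minimal sketch under assumptions: the types Image, Labels, Box and WhiteComponents and the function label_white_components are hypothetical names, the image is a rectangular greyscale grid with values in 0..1, white means a value of at least 0.5, and components are 4-connected.

```cpp
#include <algorithm>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical types for this sketch; none of them come from a real library.
using Image = std::vector<std::vector<double>>;  // image[y][x], greyscale in 0..1
using Labels = std::vector<std::vector<int>>;
struct Box { int min_x, min_y, max_x, max_y; };

struct WhiteComponents {
    Labels label;          // -1 for black pixels, otherwise the component id
    std::vector<Box> box;  // bounding box, indexed by component id
};

// Labels all 4-connected white components in one pass over the image.
WhiteComponents label_white_components(const Image& img) {
    const int h = static_cast<int>(img.size());
    const int w = h > 0 ? static_cast<int>(img[0].size()) : 0;
    WhiteComponents out{Labels(h, std::vector<int>(w, -1)), {}};
    const int dx[] = {1, -1, 0, 0};
    const int dy[] = {0, 0, 1, -1};
    for (int sy = 0; sy < h; ++sy) {
        for (int sx = 0; sx < w; ++sx) {
            // Skip black pixels and white pixels that already carry a label.
            if (img[sy][sx] < 0.5 || out.label[sy][sx] != -1) continue;
            const int id = static_cast<int>(out.box.size());
            Box b{sx, sy, sx, sy};
            std::queue<std::pair<int, int>> frontier;
            frontier.push({sx, sy});
            out.label[sy][sx] = id;
            while (!frontier.empty()) {
                const std::pair<int, int> p = frontier.front();
                frontier.pop();
                b.min_x = std::min(b.min_x, p.first);
                b.max_x = std::max(b.max_x, p.first);
                b.min_y = std::min(b.min_y, p.second);
                b.max_y = std::max(b.max_y, p.second);
                for (int i = 0; i < 4; ++i) {
                    const int nx = p.first + dx[i];
                    const int ny = p.second + dy[i];
                    if (nx < 0 || ny < 0 || nx >= w || ny >= h) continue;
                    if (img[ny][nx] < 0.5 || out.label[ny][nx] != -1) continue;
                    out.label[ny][nx] = id;
                    frontier.push({nx, ny});
                }
            }
            out.box.push_back(b);
        }
    }
    return out;
}
```

With such a label map, the white components adjacent to a black component are simply the distinct labels of its white neighbours, and the union of their bounding boxes can be taken without another flood fill.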
Edit: Changed the single white connected component to the union of the connected components of all white pixels adjacent to the black component. Otherwise, the white component could have been something like the eye (the enclosed hole) within some letter.
Note: Maybe a better approach is to start with white pixels, create the white connected component, then find every connected component enclosed by that white component and invert the whole thing.
Edit: added another step, which is now step 6.
Pseudocode:
Given:
Pixel: both coordinate and content
PixelList: stores a list of Pixel
Greyscale: a pixel can be evaluated to its greyscale value within 0..1
is black: true iff Greyscale < 0.5
is white: not is black
Image: 2D-array of Pixel, accessed via image(pixel)
minimal_size: the minimal size of a white component's Frustrum for it to be considered too big to be a letter
Supporting algorithms:
connected_component(pixel, condition): creates the connected component wrt the condition, using something like BFS,
example: "find me all black pixels connected to this pixel over other black pixels"
adjacent(pixel_list, condition): all pixels that are adjacent to any pixel in the given list that fulfill the condition
Frustrum(pixel_list): finds the minimal and maximal coordinates of the list, both in x- and y-direction (i.e. its bounding box)
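Two of the supporting algorithms above could look roughly like this in C++. This is only a sketch under assumptions: Pixel, PixelList, Image, Box, is_black and bounding_box are hypothetical names, the image is a rectangular grid of greyscale values in 0..1, neighbours are taken as 4-connected, and the start pixel must lie inside the image.

```cpp
#include <algorithm>
#include <functional>
#include <queue>
#include <vector>

// Hypothetical types for this sketch; none of them come from a real library.
struct Pixel { int x, y; };
struct Box { int min_x, min_y, max_x, max_y; };
using Image = std::vector<std::vector<double>>;  // image[y][x], greyscale in 0..1
using PixelList = std::vector<Pixel>;
using Condition = std::function<bool(const Image&, Pixel)>;

bool is_black(const Image& img, Pixel p) { return img[p.y][p.x] < 0.5; }

// BFS over 4-connected neighbours that fulfil `condition`, starting at `start`.
// Returns an empty list if the start pixel itself does not fulfil the condition.
PixelList connected_component(const Image& img, Pixel start, const Condition& condition) {
    PixelList component;
    if (!condition(img, start)) return component;
    const int h = static_cast<int>(img.size());
    const int w = static_cast<int>(img[0].size());
    std::vector<std::vector<bool>> seen(h, std::vector<bool>(w, false));
    std::queue<Pixel> frontier;
    frontier.push(start);
    seen[start.y][start.x] = true;
    const int dx[] = {1, -1, 0, 0};
    const int dy[] = {0, 0, 1, -1};
    while (!frontier.empty()) {
        const Pixel p = frontier.front();
        frontier.pop();
        component.push_back(p);
        for (int i = 0; i < 4; ++i) {
            const Pixel q{p.x + dx[i], p.y + dy[i]};
            if (q.x < 0 || q.y < 0 || q.x >= w || q.y >= h) continue;
            if (seen[q.y][q.x] || !condition(img, q)) continue;
            seen[q.y][q.x] = true;  // mark when queued so no pixel is queued twice
            frontier.push(q);
        }
    }
    return component;
}

// Counterpart of Frustrum(pixel_list): minimal and maximal x and y coordinates.
// Requires a non-empty list.
Box bounding_box(const PixelList& pixels) {
    Box b{pixels[0].x, pixels[0].y, pixels[0].x, pixels[0].y};
    for (const Pixel& p : pixels) {
        b.min_x = std::min(b.min_x, p.x);
        b.max_x = std::max(b.max_x, p.x);
        b.min_y = std::min(b.min_y, p.y);
        b.max_y = std::max(b.max_y, p.y);
    }
    return b;
}
```

Passing the condition as a callable keeps one BFS routine usable for both colours, e.g. connected_component(img, p, is_black) for the letter and a negated condition for the white surroundings.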
Algorithm:
instantiate 2D-array of bool that is named handled, same size as image, default value false
instantiate a copy of image named result
for each pixel in image:
    if handled(pixel) continue
    if pixel is white continue
    PixelList black_component = connected_component(pixel, is black)
    PixelList white_seed = adjacent(black_component, is white)
    PixelList white_component
    for each white_pixel in white_seed:
        merge connected_component(white_pixel, is white) into white_component
    handled(white_component) = true
    handled(black_component) = true
    if Frustrum(white_component) < minimal_size continue
    if Frustrum(black_component) is within Frustrum(white_component)
        invert white_component and black_component in result
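For illustration, the pseudocode above might translate to C++ roughly as follows. Treat it as a sketch under assumptions, not a finished implementation: all type and function names are hypothetical, the image is a rectangular grid of greyscale values in 0..1, connectivity is 4-neighbour, the "size" compared against minimal_size is taken to be the bounding-box area, and "within" is taken to mean strictly inside on all four sides. The merge loop is replaced by a single BFS started from all white seeds at once, which yields the same union.

```cpp
#include <algorithm>
#include <queue>
#include <vector>

// Hypothetical types for this sketch; none of them come from a real library.
struct Pixel { int x, y; };
struct Box { int min_x, min_y, max_x, max_y; };
using Image = std::vector<std::vector<double>>;  // image[y][x], greyscale in 0..1
using PixelList = std::vector<Pixel>;

const int DX[] = {1, -1, 0, 0};
const int DY[] = {0, 0, 1, -1};

bool is_black(double v) { return v < 0.5; }
bool inside(const Image& img, Pixel p) {
    return p.y >= 0 && p.y < static_cast<int>(img.size()) &&
           p.x >= 0 && p.x < static_cast<int>(img[p.y].size());
}

// BFS from all seeds at once over 4-connected pixels of the wanted colour.
// With several seeds this yields the union of their connected components.
PixelList flood(const Image& img, const PixelList& seeds, bool black) {
    std::vector<std::vector<bool>> seen(img.size(), std::vector<bool>(img[0].size(), false));
    std::queue<Pixel> frontier;
    PixelList component;
    auto visit = [&](Pixel p) {
        if (!inside(img, p) || seen[p.y][p.x] || is_black(img[p.y][p.x]) != black) return;
        seen[p.y][p.x] = true;
        frontier.push(p);
    };
    for (Pixel s : seeds) visit(s);
    while (!frontier.empty()) {
        const Pixel p = frontier.front();
        frontier.pop();
        component.push_back(p);
        for (int i = 0; i < 4; ++i) visit(Pixel{p.x + DX[i], p.y + DY[i]});
    }
    return component;
}

// All white pixels 4-adjacent to a pixel of the list (duplicates are possible).
PixelList adjacent_white(const Image& img, const PixelList& pixels) {
    PixelList out;
    for (Pixel p : pixels)
        for (int i = 0; i < 4; ++i) {
            const Pixel q{p.x + DX[i], p.y + DY[i]};
            if (inside(img, q) && !is_black(img[q.y][q.x])) out.push_back(q);
        }
    return out;
}

Box bounding_box(const PixelList& pixels) {  // requires a non-empty list
    Box b{pixels[0].x, pixels[0].y, pixels[0].x, pixels[0].y};
    for (Pixel p : pixels) {
        b.min_x = std::min(b.min_x, p.x);  b.max_x = std::max(b.max_x, p.x);
        b.min_y = std::min(b.min_y, p.y);  b.max_y = std::max(b.max_y, p.y);
    }
    return b;
}

Image invert_black_text(const Image& image, int minimal_size) {
    Image result = image;  // inversions are written here, reads use the original
    if (image.empty() || image[0].empty()) return result;
    std::vector<std::vector<bool>> handled(image.size(),
                                           std::vector<bool>(image[0].size(), false));
    for (int y = 0; y < static_cast<int>(image.size()); ++y)
        for (int x = 0; x < static_cast<int>(image[y].size()); ++x) {
            if (handled[y][x] || !is_black(image[y][x])) continue;
            const PixelList black = flood(image, {Pixel{x, y}}, true);
            const PixelList white = flood(image, adjacent_white(image, black), false);
            for (Pixel p : black) handled[p.y][p.x] = true;
            for (Pixel p : white) handled[p.y][p.x] = true;
            if (white.empty()) continue;
            const Box bb = bounding_box(black), wb = bounding_box(white);
            // Assumption: "size" means the area of the white bounding box.
            const int area = (wb.max_x - wb.min_x + 1) * (wb.max_y - wb.min_y + 1);
            if (area < minimal_size) continue;
            const bool within = bb.min_x > wb.min_x && bb.max_x < wb.max_x &&
                                bb.min_y > wb.min_y && bb.max_y < wb.max_y;
            if (!within) continue;
            for (Pixel p : black) result[p.y][p.x] = 1.0 - image[p.y][p.x];
            for (Pixel p : white) result[p.y][p.x] = 1.0 - image[p.y][p.x];
        }
    return result;
}
```

Because every inverted value is computed from the original image, a white background shared by several letters is inverted to the same value each time instead of being flipped back and forth. Each call to flood allocates a full-size visited mask, so this version re-scans the white component for every letter and is only meant to show the control flow.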