I have an image in which the letters are perfectly aligned and straight, and I think the color of the letters is brighter than the color of the noisy background.

If I can replace the noisy color with white using a Java program, then I can use OCR to get the text.

My question is: if I have the RGB value of the brightest pixels (the ones that form the text), say (124,140,192), what would be the range of RGB values for colors lighter than (124,140,192)?

Gaurav Sharma
  • 3
    *"If I can replace the noisy color by white using a java program, Then I can use OCR to get the text."* Let us hope you can't, given those texts are *explicitly intended* for human consumption only. – Andrew Thompson Jul 18 '13 at 17:59
  • 3
    A captcha that easy, deserves to get "hacked". – Michael Petrotta Jul 18 '13 at 18:00
  • 4
    This question appears to be off-topic because it is about hacking a security device intended to ensure 'people only'. – Andrew Thompson Jul 18 '13 at 18:00
  • I edited the question, deleted the captcha tag (since this has nothing to do with captchas), and removed the hacking part... – jsedano Jul 18 '13 at 18:01
  • If you run OCR on it even with the background noise, you wouldn't really have a problem, considering there is currently a significant threshold difference in color between the background noise and the actual text. A simple edge detection algorithm would do the trick. – Rajeshwar Jul 18 '13 at 18:03
  • 2
    @AndrewThompson Have you considered that OP may have this system on his site, but is having difficulty convincing management that it needs to be upgraded? – corsiKa Jul 18 '13 at 18:03
  • 3
    I don't see the need for a downvote on this. Upvoting. – Rajeshwar Jul 18 '13 at 18:04
  • @Rajeshwar I believe the downvote is automatic when a close vote is issued. Presumably because anything worthy of a close vote would make the question fall under "not useful", one of the downvote criteria. – corsiKa Jul 18 '13 at 18:09
  • @GauravSharma - Can you elaborate on why you are trying to OCR a captcha image? – Leigh Jul 18 '13 at 18:13
  • @Leigh I just want to check the vulnerability of the website, and I also want to extract around 100 records from that site, but each time I have to pass through the captcha. – Gaurav Sharma Jul 18 '13 at 18:18
  • 1
    Given the monotony of the colors in the image, would it not be easier to grab a set of all the colors in the image and sort by each of the components (looping 3 times for r, g, b)? Replacing the colors from highest to lowest and attempting to read the text each time would probably give you an answer faster than replacing each color that may be "brighter", as that would be a huge set. – Andrew Jul 18 '13 at 18:29

1 Answer

This answer has formulas for determining brightness of colors: https://stackoverflow.com/a/596243/2479481

You should be able to use perceived brightness (0.299*R + 0.587*G + 0.114*B) to identify lighter colors: a color counts as lighter than the text color if its perceived brightness is higher.
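As a rough sketch of how that could look in Java: with the formula above, (124,140,192) has a perceived brightness of about 141.1, so anything that scores lower can be treated as background and painted white. The class name `BrightnessFilter` and the method `whitenBelow` are made up for illustration, and the sketch assumes the text pixels really are at least as bright as the reference color; text pixels are left unchanged.

```java
import java.awt.image.BufferedImage;

// Hypothetical helper, not part of any library.
final class BrightnessFilter {
    static final int WHITE = 0xFFFFFF;

    // Perceived brightness from the formula above; result is in 0..255.
    static double brightness(int r, int g, int b) {
        return 0.299 * r + 0.587 * g + 0.114 * b;
    }

    // Returns a copy of src in which every pixel whose perceived brightness
    // is below the threshold is replaced by white; other pixels are kept.
    static BufferedImage whitenBelow(BufferedImage src, double threshold) {
        BufferedImage out = new BufferedImage(
                src.getWidth(), src.getHeight(), BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < src.getHeight(); y++) {
            for (int x = 0; x < src.getWidth(); x++) {
                int rgb = src.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF;
                int g = (rgb >> 8) & 0xFF;
                int b = rgb & 0xFF;
                boolean darker = brightness(r, g, b) < threshold;
                out.setRGB(x, y, darker ? WHITE : (rgb & 0xFFFFFF));
            }
        }
        return out;
    }
}
```

You would load the image with `ImageIO.read`, call `BrightnessFilter.whitenBelow(image, BrightnessFilter.brightness(124, 140, 192))`, and pass the result to the OCR step. If some text pixels turn out slightly darker than the reference color, lowering the threshold a little may work better.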

Community
Luke Willis