We have a database filled with OCRed data and manually typed data.

When we search with the CONTAINS command, not all of the expected results appear. For example, a search for "monkey man" doesn't return records that contain "m0nkey man" or "momkey man" in the data.

Is there a way to allow for these issues in the data?

I've had a cursory glance at Lucene.NET and Soundex but can't see these being of much use.

Thanks for any ideas

jimmy
  • I'm not sure you will find freeware that does this. My company has used Informatica with Data Quality, which has this capability. The tool cleans up the data so that later queries don't miss words that were misspelled. –  Jan 17 '14 at 18:04

1 Answer

I believe you are looking for something called fuzzy matching.
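To give a rough idea of what fuzzy matching means here, below is a minimal Python sketch based on Levenshtein (edit) distance, which is one common way to do it and not necessarily what the linked posts use. The names `levenshtein` and `fuzzy_contains`, the `max_distance` threshold, and the sample records are all made up for illustration; in practice the same idea would need to run inside or alongside the database.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the number of single-character edits needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def fuzzy_contains(text: str, phrase: str, max_distance: int = 1) -> bool:
    """True if some run of words in text is within max_distance edits of phrase."""
    words = text.lower().split()
    target = phrase.lower()
    n = len(target.split())
    for start in range(len(words) - n + 1):
        window = " ".join(words[start:start + n])
        if levenshtein(window, target) <= max_distance:
            return True
    return False


# Hypothetical sample rows, including the two misreads from the question.
records = ["the m0nkey man arrived", "a momkey man was seen", "donkey cart"]
matches = [r for r in records if fuzzy_contains(r, "monkey man")]
```

With a threshold of one edit, `matches` keeps the "m0nkey man" and "momkey man" rows and drops the unrelated one. A larger `max_distance` tolerates more OCR noise but also lets in more false positives, so the threshold would need tuning against real data.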

Similar post:

SQL Fuzzy Matching

Maybe useful:

http://web.archive.org/web/20100209050309/http://anastasiosyal.com/archive/2009/01/11/18.aspx

Zzyrk