I need to develop an application that will index several texts, and I need to search for people’s names inside these texts. The problem is that, while a person’s correct name is “Gregory Jackson Junior”, inside the text the name might be written as:
- Greg Jackson Jr
- Gegory Jackson Jr
- Gregory Jackson
- Gregory J. Junior
I plan to index the texts on a nightly basis and build a database index to speed up the search. I would like recommendations for good books and/or articles on the subject.
Thanks
-
Your question is incorrectly phrased. The examples do not indicate misspelling but a change in the form of writing a full name. And, I am curious, would your search expect to match on words like 'son' with reference to the example? – nik Jun 25 '09 at 14:23
-
Actually, one of the names might be misspelled as well. I don't need synonym matches like junior and son. Thanks – Pascal Jun 25 '09 at 14:30
-
Did you ever find anything to accomplish this? – Garrett Lancaster Feb 16 '12 at 01:20
3 Answers
2
Check these related questions.
-
Thanks for the references. I did check them out prior to posting the question. The first one was focused on articles and real-time search. In the second, the best answers referred to a particular database engine but had little algorithm content. – Pascal Jun 25 '09 at 14:39
2
Your question is incorrectly phrased. The examples do not indicate misspelling but change in the form of writing a full name.
And,
- would your search expect to match on words like son with reference to the example?
- would it expect to match bob when looking for a name called Robert?
Ok, reading your comment suggests you do not want to venture into that.
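To illustrate the distinction between a change of form and a misspelling, here is a minimal sketch, not a definitive implementation. It assumes Python's standard `difflib` module for the misspelling case and a small hand-made abbreviation table for the change-of-form case. The names `ABBREVIATIONS`, `normalize`, `token_score`, and `name_score`, and the scores 0.8 and 0.9, are hypothetical choices made for this example only.

```python
from difflib import SequenceMatcher

# Hypothetical abbreviation table; extend it for your own data.
ABBREVIATIONS = {"jr": "junior", "sr": "senior"}


def normalize(name):
    """Lowercase, drop periods and commas, and expand known abbreviations."""
    tokens = name.lower().replace(".", " ").replace(",", " ").split()
    return [ABBREVIATIONS.get(t, t) for t in tokens]


def token_score(a, b):
    """Similarity in [0, 1] between two name tokens."""
    if a == b:
        return 1.0
    # A single letter is treated as an initial ("j" vs "jackson").
    if len(a) == 1 or len(b) == 1:
        return 0.8 if a[0] == b[0] else 0.0
    # A short form that is a prefix of the long form ("greg" vs "gregory").
    if a.startswith(b) or b.startswith(a):
        return 0.9
    # Otherwise fall back to character-level similarity for misspellings.
    return SequenceMatcher(None, a, b).ratio()


def name_score(canonical, candidate):
    """Average, over candidate tokens, of the best match among canonical tokens."""
    canon = normalize(canonical)
    cand = normalize(candidate)
    if not canon or not cand:
        return 0.0
    best = [max(token_score(c, t) for t in canon) for c in cand]
    return sum(best) / len(best)


if __name__ == "__main__":
    target = "Gregory Jackson Junior"
    for text_name in ["Greg Jackson Jr", "Gegory Jackson Jr",
                      "Gregory Jackson", "Gregory J. Junior", "Mary Smith"]:
        print(text_name, round(name_score(target, text_name), 3))
```

Note that the score is deliberately one-directional: it only asks whether every token of the candidate is explained by the canonical name, so a bare "Jackson" would also score highly. It also does nothing for nicknames such as bob for Robert; that would need a lookup table of its own.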

nik
1
For the record: use a Bayesian filter. You may use Mechanical Turk to initialize your algorithm.

MatthieuBizien