2

I need to develop an application that will index several texts, and I need to search for people’s names inside these texts. The problem is that, while a person’s correct name is “Gregory Jackson Junior”, inside the text the name might be written as:
- Greg Jackson Jr
- Gegory Jackson Jr
- Gregory Jackson
- Gregory J. Junior
I plan to index the texts on a nightly basis and build a database index to speed up the search. I would like recommendations for good books and/or articles on the subject.
Thanks

unmounted
Pascal
  • Your question is incorrectly phrased. The examples do not indicate misspelling but a change in the form of writing a full name. And, I am curious, would your search expect to match on words like 'son' with reference to the example? – nik Jun 25 '09 at 14:23
  • Actually, one of the names might be misspelled as well. I don't need synonym matches like junior and son. Thanks – Pascal Jun 25 '09 at 14:30
  • Did you ever find anything to accomplish this? – Garrett Lancaster Feb 16 '12 at 01:20

3 Answers

2

Check these related questions.

Algorithm to find articles with similar text

How to search for a person's name in a text? (heuristic)

Community
Shoban
  • Thanks for the references. I did check them out before posting the question. The first one focused on articles and real-time search. In the second, the best answers referred to a particular database engine but had little algorithmic content. – Pascal Jun 25 '09 at 14:39
2

Your question is incorrectly phrased. The examples do not indicate misspelling but change in the form of writing a full name.

And,

Ok, reading your comment suggests you do not want to venture into that.
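To make the distinction between a misspelling and a change in written form concrete, here is a minimal sketch in Python. It compares a written name to the canonical one token by token and reports whether each token is an exact match, an initial, a short form, or a likely misspelling. The function names, the `SUFFIXES` table, and the 0.8 similarity threshold are all assumptions made for illustration, not something taken from the question. Nicknames that are not prefixes of the full name (such as "Bob" for "Robert") are not handled.

```python
from difflib import SequenceMatcher

# Hypothetical table of suffix abbreviations; extend it for real data.
SUFFIXES = {"jr": "junior", "sr": "senior"}


def normalize(name):
    """Lowercase, drop periods, and expand known suffix abbreviations."""
    tokens = name.lower().replace(".", "").split()
    return [SUFFIXES.get(t, t) for t in tokens]


def token_match(candidate, canonical, threshold=0.8):
    """Classify how one written token relates to one canonical token."""
    if candidate == canonical:
        return "exact"
    if len(candidate) == 1 and canonical.startswith(candidate):
        return "initial"
    if canonical.startswith(candidate):
        return "short form"
    # Assumed cutoff: treat sufficiently similar strings as typos.
    if SequenceMatcher(None, candidate, canonical).ratio() >= threshold:
        return "misspelling"
    return None


def match_name(written, canonical):
    """Return the match kind of each written token, in order.

    Canonical tokens may be skipped (e.g. a dropped "Junior"), but every
    written token must match some later canonical token; otherwise None.
    """
    canon = normalize(canonical)
    kinds = []
    pos = 0
    for tok in normalize(written):
        for i in range(pos, len(canon)):
            kind = token_match(tok, canon[i])
            if kind:
                kinds.append(kind)
                pos = i + 1
                break
        else:
            return None
    return kinds


if __name__ == "__main__":
    for variant in ["Greg Jackson Jr", "Gegory Jackson Jr",
                    "Gregory Jackson", "Gregory J. Junior", "Mary Smith"]:
        print(variant, "->", match_name(variant, "Gregory Jackson Junior"))
```

With the examples from the question, only "Gegory" is classified as a misspelling; the other variants are short forms, initials, or omissions.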

nik
1

For the record: use a Bayesian filter. You may use Amazon Mechanical Turk to initialize your algorithm (for example, to label the initial training examples).
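The answer does not say which kind of Bayesian filter is meant. One possible reading is a naive Bayes classifier, like a spam filter, trained on strings that human labelers have marked as referring or not referring to the target person. The sketch below uses character trigrams as features with add-one smoothing. The class name, the feature choice, and the tiny training set are all hypothetical stand-ins for real crowd-labeled data.

```python
import math
from collections import Counter


def trigrams(text):
    """Character trigrams of the lowercased text, padded with spaces."""
    s = f"  {text.lower()} "
    return [s[i:i + 3] for i in range(len(s) - 2)]


class NaiveBayesNameFilter:
    """Two-class naive Bayes over character trigrams (illustrative only)."""

    def __init__(self):
        self.counts = {True: Counter(), False: Counter()}
        self.docs = {True: 0, False: 0}

    def train(self, text, is_match):
        self.counts[is_match].update(trigrams(text))
        self.docs[is_match] += 1

    def log_odds(self, text):
        """Log-odds that `text` is a match; positive favors a match."""
        vocab = len(set(self.counts[True]) | set(self.counts[False]))
        # Smoothed class prior, so an empty class does not divide by zero.
        score = math.log((self.docs[True] + 1) / (self.docs[False] + 1))
        for label, sign in ((True, 1), (False, -1)):
            total = sum(self.counts[label].values())
            for gram in trigrams(text):
                # Add-one (Laplace) smoothing for unseen trigrams.
                p = (self.counts[label][gram] + 1) / (total + vocab)
                score += sign * math.log(p)
        return score


# Hypothetical labels, standing in for examples labeled by human workers.
nb = NaiveBayesNameFilter()
for text in ["Greg Jackson Jr", "Gregory Jackson",
             "Gregory J. Junior", "Gegory Jackson Jr"]:
    nb.train(text, True)
for text in ["George Johnson", "Mary Smith",
             "Peter Jackman Sr", "Gregor Samsa"]:
    nb.train(text, False)

if __name__ == "__main__":
    for candidate in ["Gregory Jackson Jr", "Mary Johnson"]:
        print(candidate, round(nb.log_odds(candidate), 2))
```

A filter like this only scores candidate strings; extracting the candidates from the indexed text is a separate step that this sketch does not cover.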

MatthieuBizien