2

My client has a database of real estate records spanning 400 years. They are interested to have alternate spelling suggestions made to users based on their data.

I'm assuming in a case like this it would be down to a table that has a rows for "Martin" and suggestions of "Martyn" and "Martine" etc.

Does anyone know of a third party solution which would index their data?

EDIT

Per magicmike's answer SOUNDEX() rocks this!

SELECT `Last` FROM `Names` 
WHERE SOUNDEX(`Last`) RLIKE SOUNDEX ('Martin')
GROUP BY `Last`

yields this nice little list:

Martens
Martenstein
Martin
Martina
Martindale
Martine
Martineau
Martines
Martinet
Martinez
Martini
Martino
Martinstein
Mertens
Miradona
Moret and Marchand
Mortimer
Morton
jerrygarciuh
  • 21,158
  • 26
  • 82
  • 139

2 Answers2

2

Google's "did you mean" is pretty interesting: How does the Google "Did you mean?" Algorithm work?

In the past, I've managed to implement something similar with SOUNDEX which can approximate that functionality.

Martin, Martyn, and Martine yield the same output from SOUNDEX.

You could include all the results from the SOUNDEX match, or select the results they asked for and then select distinct name from table where SOUNDEX(name) = SOUNDEX(search_var) as your 'suggestions'.

As an optimization, you can pre-calculate SOUNDEX on search fields and persist that as an indexed column to avoid table scans.

It's not as sophisticated as Google's Did You Mean, but you can get reasonably close very quickly.

Community
  • 1
  • 1
Code Magician
  • 23,217
  • 7
  • 60
  • 77
1

Index the data? You can take a look at Apache Lucene

Renato Dinhani
  • 35,057
  • 55
  • 139
  • 199