3

I'm trying to make a simple partial address search utility in PHP. The table in question simply has an "address" column. My goal is to be able to have the user enter a partial address into a form and have my script retrieve the 25 closest matches in that table.

The obvious and, in my opinion, sloppy way to do this would be to select every address in the table then have a PHP loop go through each one of them, calculate percentage similarity to the search term, order them, and output. This seems like a big waste of resources when you consider that the table has tens of thousands of rows and I'm looking for at most 25.

I would like to do something like this:

SELECT id, firstname, lastname, PERCENTMATCH(address, $searchterm) AS matchpercent
FROM accounts
WHERE matchpercent > 85
ORDER BY matchpercent
LIMIT 25

I haven't been able to find any way to do that from within my query, however. Is this possible or do I have to go the sloppy route?

DWilliams
  • 451
  • 8
  • 22
  • 2
    See http://stackoverflow.com/questions/634995/implementation-of-levenshtein-distance-for-mysql-fuzzy-search – jmz Oct 14 '10 at 19:02
  • I previously suggested http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_soundex but I'm not sure this applies to you. So I will leave this as a comment instead. – Alex Oct 14 '10 at 19:05
  • http://stackoverflow.com/questions/16413/parse-usable-street-address-city-state-zip-from-a-string You could parse out the address and then work from there – Phill Pafford Oct 14 '10 at 19:30
  • Detect's answer seems like the usual way to go. But, for the sake of clarity, what do you mean by "percentage string match"? Do you mean the length of the longest common subsequence as a percentage of the *search* string or of each particular *record* string? – user359996 Oct 14 '10 at 19:31
  • @jmz - That seems like it might help but unfortunately the link in the accepted answer is broken. – DWilliams Oct 14 '10 at 20:49
  • @Phill Pafford - I don't think that's what I'm trying to do. I don't need to extract city/state/zip information, it's already separated for me. There are address/city/state/zip columns in the database. If the address was 123 Main Street, New York NY 12345, only 123 Main Street would be stored in address and that's what I'm trying to search on – DWilliams Oct 14 '10 at 20:51
  • @user359996 - Sorry my wording was a bit vague, I'm looking for something that performs like the similar_text() function in PHP – DWilliams Oct 14 '10 at 20:52
  • @DWilliams: The site might have been down when the answer was writen. http://codejanitor.com/wp/2007/02/10/levenshtein-distance-as-a-mysql-stored-function/ – jmz Oct 15 '10 at 08:18

1 Answers1

3

Full-Text search can score query relevancy: http://dev.mysql.com/doc/refman/5.5/en/fulltext-search.html

Detect
  • 2,049
  • 1
  • 12
  • 21
  • Hmm the problem I see with that is that it requires modification to an existing table. The table I'm attempting to search was created by our production software, not software I wrote myself. I'm very leery about making any structural modifications to it, even if they seem potentially insignificant. – DWilliams Oct 14 '10 at 20:47