3

I have the following requirement.

I have a table with a column that contains the city names. I am going to implement a search option by City.

But the user may not enter the city name correctly.

Examples: the city "Matara" is sometimes spelled "Mathara", and "Nuwara Eliya" is sometimes written as "Nuwaraeliya".

I can keep the database column consistent, but I want to return hits even when the end user enters an alternative spelling.

What is the approach I need to use to implement this effectively?

Chathuranga Chandrasekara
  • Why not create a dropdown list and let the user choose from that? – Reinard Mar 22 '12 at 09:07
  • mmm... It is not a "simple" search. The user may combine the city with many other keywords. – Chathuranga Chandrasekara Mar 22 '12 at 09:09
  • Stackoverflow careers uses Yahoo for this type of requirement. [Results for Nuwaraeliya look like this](http://where.yahooapis.com/geocode?q=Nuwaraeliya) (Edit though seems to match Narwaliya,Narala,Naral,Naruala,Narhela,Nurwala,Narayola,Kareli,Narhaoli,Norrala but **not** Nuwara Eliya!) – Martin Smith Mar 22 '12 at 09:10
  • @Martin Smith : I checked the feasibility. The problem is those words are in Sinhalese and the Sri Lankan cities are not properly indexed with Google or Yahoo. As an example I am not seeing the result I expect from the above query. So I think I cannot rely on any such services. :( – Chathuranga Chandrasekara Mar 22 '12 at 09:14
  • @ChathurangaChandrasekara - Yes I've checked both your examples now and it doesn't even list your desired result among the options. Probably quite US centric. – Martin Smith Mar 22 '12 at 09:15

3 Answers

2

You should probably implement a string distance check such as Levenshtein distance.

More approaches can be found here: How do you implement a "Did you mean"?
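As a rough illustration of the Levenshtein idea, here is a minimal sketch in Python. The city list, the `normalize` helper, the `find_cities` function and the `max_distance=2` threshold are all assumptions made for this example, not part of the original question; in practice the names would come from the table and the threshold would need tuning.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def normalize(name: str) -> str:
    """Lower-case and drop whitespace so 'Nuwara Eliya' equals 'Nuwaraeliya'."""
    return "".join(name.lower().split())


# Hypothetical sample data; the real values would come from the city column.
CITIES = ["Matara", "Nuwara Eliya", "Colombo", "Kandy", "Galle"]


def find_cities(query, cities=CITIES, max_distance=2):
    """Return cities within max_distance edits of the query, closest first."""
    q = normalize(query)
    scored = [(levenshtein(q, normalize(city)), city) for city in cities]
    return [city for distance, city in sorted(scored) if distance <= max_distance]


print(find_cities("Mathara"))
print(find_cities("Nuwaraeliya"))
```

Stripping whitespace before comparing handles the "Nuwara Eliya" case without spending any of the edit budget on the missing space.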

Chen Harel
  • Some cities might have two completely different names, for which the Levenshtein distance is very high. For "Matara" and "Mathara" this works, but for others it won't. Also, some cities might sound alike but be totally different cities. – Christian Mar 22 '12 at 09:19
1

I think the above problem can be solved well enough using Levenshtein distance, PHP's similar_text, or Jaro-Winkler similarity. All of these approaches gave me sufficiently accurate results.
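For comparison, a ratio-based similarity score can be sketched with Python's standard library. Note that `difflib.SequenceMatcher` is a different measure from the three named above; it is used here only because it needs no extra packages. The `best_match` helper, the sample city list and the 0.8 threshold are assumptions for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Similarity ratio between 0.0 and 1.0, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_match(query, cities, threshold=0.8):
    """Return the most similar city, or None if nothing reaches the threshold."""
    score, city = max((similarity(query, c), c) for c in cities)
    return city if score >= threshold else None


# Hypothetical sample data standing in for the city column.
cities = ["Matara", "Nuwara Eliya", "Colombo", "Kandy", "Galle"]

print(best_match("Mathara", cities))
print(best_match("Nuwaraeliya", cities))
```

A score in a fixed range makes it easier to pick one threshold for both short and long names than a raw edit count does.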

Edit Distance Tool


Chathuranga Chandrasekara
0

You want something like a phonetic search. Several algorithms exist. You can get an overview here.

The idea is to add a column to your table holding the phonetic equivalent of each city name, and to search against that column after applying the same function to the search term.

Some RDBMSs, such as Oracle, provide a built-in SOUNDEX function, which could let you perform the search without the added column.
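To make the phonetic-column idea concrete, here is a minimal sketch using the classic American Soundex rules in Python. The dictionary `index` stands in for the extra column, and the city list and `build_index` helper are assumptions for this example. A database's own SOUNDEX may differ in edge cases, so the same function should be applied to both the stored names and the search term.

```python
# Letter-to-digit table for American Soundex; vowels, H, W and Y have no code.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for letter in letters:
        CODES[letter] = digit


def soundex(name: str) -> str:
    """Four-character Soundex key, e.g. 'Robert' gives 'R163'."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    key = letters[0]
    previous = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = CODES.get(ch, "")
        if code and code != previous:
            key += code
        previous = code  # a vowel resets this, so a repeated code counts again
    return (key + "000")[:4]


def build_index(cities):
    """Map each phonetic key to the cities sharing it (the 'extra column')."""
    index = {}
    for city in cities:
        index.setdefault(soundex(city), []).append(city)
    return index


# Hypothetical sample data standing in for the city column.
index = build_index(["Matara", "Nuwara Eliya", "Colombo", "Kandy", "Galle"])

print(index.get(soundex("Mathara"), []))
print(index.get(soundex("Nuwaraeliya"), []))
```

Because the lookup is an exact match on the key, the stored column can be indexed normally, which is the main advantage over computing a distance against every row.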

PATRY Guillaume