1

I am trying to automate an application, where it is as like google engine. When user enters "société" (Accented character), it gives details which has either "société"(Accented) or "societe" (English) in it.

So my job is to validate is details contains given keyword. I dont have problem on comparing accented characters. EX: "société" = "société". But the case "société" == "societe" fails. Python code below:

 >>> "société".find("societe")
 -1 #Fail
 >>> "société".find("société")
 0  #Success
 >>> 

So how to match equivalent english word in Accented characters.

Any help will be highly apreciated. Thanks

Guruswamy B M
  • 317
  • 1
  • 5
  • 13
  • I don't think there is an easy way to compare them directly, I would look into removing the accents before searching. http://stackoverflow.com/questions/517923/what-is-the-best-way-to-remove-accents-in-a-python-unicode-string – Astrogat Jul 20 '15 at 11:18
  • Should Gödel map to Godel or Goedel? The latter is standard, but the former is what you would get by simply stripping off accents. It seems like you need a dictionary of equivalents. – John Coleman Jul 20 '15 at 11:27

1 Answers1

1

You can use unidecode:

>>> from unidecode import unidecode
>>> unidecode(u"société")
societe
enrico.bacis
  • 30,497
  • 10
  • 86
  • 115