This question is related to my earlier question, "Accent insensitive search django sqlite".
As mentioned in the response there, there is no direct way to do this. I have come up with a solution, but I am not sure if it is a good one:
Use Case: Assume that the database has a table NewsArticles with one of the columns being ArticleText. As the name implies, ArticleText contains the text of the news articles, which includes several words with accented characters. Let's say one such word present in the ArticleText for an article with primary key aid123 is Puerto Aisén. Now, a user can search for either Puerto Aisén or Puerto Aisen and should get the article with PK aid123 back, with the found accented word in bold (<b>Puerto Aisén</b>).
Solution: I add one more column to the table, normalizedArticleText, and make it contain the unicodedata.normalize (accent-removed) version of the text. Now whenever a search query comes in, I first determine whether the query contains accented characters by using s.decode('ascii'), and then search in the corresponding column.
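To make the approach concrete, here is a minimal sketch of what I mean, assuming Python 3 strings (so the ASCII check uses encode rather than decode). The helper names strip_accents, has_non_ascii and pick_column are made up for illustration; the column names are the ones from the use case above.

```python
import unicodedata

def strip_accents(text):
    """Return text with combining marks removed, e.g. 'Aisén' -> 'Aisen'."""
    # NFD splits an accented letter into its base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def has_non_ascii(query):
    """True if the query contains any character outside ASCII."""
    try:
        query.encode("ascii")
    except UnicodeEncodeError:
        return True
    return False

def pick_column(query):
    # Hypothetical routing: accented queries search the original column,
    # plain ASCII queries search the accent-stripped copy.
    return "ArticleText" if has_non_ascii(query) else "normalizedArticleText"
```

The value stored in normalizedArticleText would be strip_accents(ArticleText), computed whenever an article is saved.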
Problem: I am duplicating the whole data. Also, there is no way for me to bold the accented keyword if the search query was the non-accented version of the keyword.
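For the bolding part, one idea I have been toying with is to fold the accents in Python while remembering where each folded character came from in the original text, so a match in the folded text can be mapped back. This is only a rough sketch under the same Python 3 assumption; fold_with_offsets and bold_match are hypothetical names, the match is case-sensitive, and only the first occurrence is wrapped.

```python
import unicodedata

def fold_with_offsets(text):
    """Strip accents and record, for each folded char, its index in text."""
    folded, offsets = [], []
    for i, ch in enumerate(text):
        for base in unicodedata.normalize("NFD", ch):
            if not unicodedata.combining(base):
                folded.append(base)
                offsets.append(i)
    return "".join(folded), offsets

def bold_match(text, query):
    """Wrap the first accent-insensitive match of query in <b> tags."""
    folded_text, offsets = fold_with_offsets(text)
    folded_query, _ = fold_with_offsets(query)
    if not folded_query:
        return text
    pos = folded_text.find(folded_query)
    if pos < 0:
        return text
    start = offsets[pos]
    end = offsets[pos + len(folded_query) - 1] + 1
    # Keep any trailing combining marks that belong to the last matched letter.
    while end < len(text) and unicodedata.combining(text[end]):
        end += 1
    return text[:start] + "<b>" + text[start:end] + "</b>" + text[end:]
```

With this, bold_match applied to the original ArticleText gives the same highlighted result whether the query is Puerto Aisén or Puerto Aisen, but it does nothing about the duplicated column, which is still needed for the database lookup itself.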
Any brilliant suggestions? I am using Django with SQLite.