I have an application where I need to search in various text-based fields. The application is developed using NHibernate as an ORM.
I would like to implement Porter stemming in searches so that relevant results are returned even when the keyword only matches a related word form, for example when the description of a product contains `memories` while the search keyword is `memory`.
Can anyone suggest best practices for this type of search? The first idea that comes to mind is to store two versions of the same field in the database, for example:
Description
Description_Search
The `Description` column would hold the text as entered by the website administrator, which is the text visible on the frontend. The `Description_Search` column would hold the same text after it has been passed through a Porter stemming algorithm. Search queries would then run against the `Description_Search` field rather than `Description`, with the search keyword stemmed the same way before matching.
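To make the idea concrete, here is a rough sketch in Python (not .NET, just to show the flow). Note the assumptions: `toy_stem` is a hypothetical stand-in that applies only steps 1a and 1c of the Porter algorithm with a simplified vowel check, so it is not a complete Porter stemmer. `to_search_text` and `matches` are also names made up for illustration.

```python
import re

VOWELS = set("aeiou")


def toy_stem(word):
    """Toy stand-in for a Porter stemmer (steps 1a and 1c only)."""
    w = word.lower()
    # Step 1a: strip plural endings.
    if w.endswith("sses"):
        w = w[:-2]          # "caresses" -> "caress"
    elif w.endswith("ies"):
        w = w[:-2]          # "memories" -> "memori"
    elif w.endswith("ss"):
        pass                # "caress" stays as it is
    elif w.endswith("s"):
        w = w[:-1]          # "laptops" -> "laptop"
    # Step 1c: a final "y" becomes "i" if the rest of the word has a vowel.
    if w.endswith("y") and any(c in VOWELS for c in w[:-1]):
        w = w[:-1] + "i"    # "memory" -> "memori"
    return w


def to_search_text(text):
    """Build the value that would be stored in Description_Search."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return " ".join(toy_stem(t) for t in tokens)


def matches(search_text, keyword):
    """Stem the keyword the same way, then compare against stored stems."""
    return toy_stem(keyword) in search_text.split()


description = "Fast memories for laptops"
description_search = to_search_text(description)
print(description_search)
print(matches(description_search, "memory"))
```

The key point of the sketch is that the stored text and the search keyword both go through the same stemmer, so `memory` and `memories` end up as the same token before they are compared.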
Does this make sense? Is it a waste of space to store two versions of almost the same text?
Also, would Lucene.Net help in such a case? I am also looking into integrating Lucene.Net for full-text search, but I haven't yet looked into it in detail.
Thanks in advance!