
I need to index a series of key phrases assigned to articles. The phrases are stored as a single string delimited by \r\n, and one phrase may contain another. For example:

This is a key phrase
This is a key phrase too
This is also a key phrase

These would be stored as:

keywords: "This is a key phrase\r\nThis is a key phrase too\r\nThis is also a key phrase"

An article that has only the phrase "This is a key phrase too" should not be matched when a search for "This is a key phrase" is performed.
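To make the intended semantics concrete, here is a minimal sketch in plain C# with no Examine or Lucene types involved. The names KeyPhraseMatcher, SplitPhrases and HasPhrase are hypothetical, and the case-insensitive comparison is an assumption rather than a stated requirement. It only shows the matching behaviour wanted from the index, not how to achieve it in Lucene.

```csharp
using System;
using System.Linq;

// Hypothetical helper illustrating the desired whole-phrase matching.
public static class KeyPhraseMatcher
{
    // Splits the stored value on the \r\n delimiter and drops empty entries.
    public static string[] SplitPhrases(string keywords)
    {
        return keywords.Split(new[] { "\r\n" }, StringSplitOptions.RemoveEmptyEntries);
    }

    // True only when one stored phrase equals the searched phrase as a whole.
    // Case-insensitive comparison is an assumption made for this sketch.
    public static bool HasPhrase(string keywords, string phrase)
    {
        return SplitPhrases(keywords)
            .Any(p => string.Equals(p.Trim(), phrase.Trim(), StringComparison.OrdinalIgnoreCase));
    }
}

public static class Program
{
    public static void Main()
    {
        var all = "This is a key phrase\r\nThis is a key phrase too\r\nThis is also a key phrase";
        var onlyToo = "This is a key phrase too";

        Console.WriteLine(KeyPhraseMatcher.HasPhrase(all, "This is a key phrase"));     // True
        Console.WriteLine(KeyPhraseMatcher.HasPhrase(onlyToo, "This is a key phrase")); // False
    }
}
```

The second call returns False because the stored phrase merely contains the searched phrase, which is exactly the case that must not match.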

I have a custom indexer implementing ISimpleDataService which works fine and indexes the content, but I can't work out how to get a query such as "This is a key phrase" to return results.

From what I've read, I thought the default QueryParser should split on delimiters and see each entry as a separate value, but it doesn't seem to work that way.

Although I've tried various implementations, my current search code looks like this:

var searcher = ExamineManager.Instance.SearchProviderCollection["KeywordsSearcher"];
var searchCriteria = searcher.CreateSearchCriteria(BooleanOperation.Or);
var query = searchCriteria.Field("keywords", keyword).Compile();
var searchResults = searcher.Search(query).OrderByDescending(x => x.Score).ToList();

The 'simple' approach I considered was to add each phrase as a separate 'keyword' field, but the SimpleDataSet provided as part of the .NET implementation uses a Dictionary<string, string>, which prevents me from having more than one key with the same name.
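For illustration, one way around the unique-key constraint is to give each phrase its own numbered field name (Keyword1, Keyword2, and so on). The sketch below uses only a plain Dictionary<string, string>. The KeywordFields class and its Build method are hypothetical, and wiring the result into a SimpleDataSet is left out.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: one uniquely named field per key phrase.
public static class KeywordFields
{
    public static Dictionary<string, string> Build(string keywords)
    {
        var rowData = new Dictionary<string, string>();
        var phrases = keywords.Split(new[] { "\r\n" }, StringSplitOptions.RemoveEmptyEntries);

        for (var i = 0; i < phrases.Length; i++)
        {
            // Keys are Keyword1, Keyword2, ... so they never collide.
            rowData["Keyword" + (i + 1)] = phrases[i].Trim();
        }

        return rowData;
    }
}

public static class KeywordFieldsDemo
{
    public static void Main()
    {
        var rowData = KeywordFields.Build(
            "This is a key phrase\r\nThis is a key phrase too\r\nThis is also a key phrase");

        foreach (var pair in rowData)
        {
            Console.WriteLine(pair.Key + " = " + pair.Value);
        }
    }
}
```

The drawback is that a search then has to target every numbered field, and the number of fields varies per article.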

I'm new to Lucene and Umbraco, so any advice would be gratefully received.

Town
So far, I've ended up splitting the keywords and creating a field for each (Keyword1, Keyword2, etc). This isn't ideal, so I'm hoping there's a better way... – Town Feb 22 '18 at 14:54
