2

I have a file that contains a lot of names of people and some meta for each person.

I'm trying to do a search for the people, given some search query.

So I'm first going to do an exact search (easy, implemented as a Dictionary with a Key Lookup).

If that fails (or fails to reach the max number of records specified), then I was hoping to try and go back over all the keys again and do a StartsWith check.

Now this works fine. But I was curious if there's a better way to do this. Looking around the interwebs it looks like there's a data structure called a Patricia Trie alg.

so ..

  1. Is the Patricia Trie the recommended way of doing a StartsWith check over a large set of string-keys?
  2. If yes, is there a nuget package for this?
  3. No - i do no want to do a Contains check. nope. nada. zilch.

Edit 1:

Also, my data will be loaded in once at app startup so I'm happy to incur a slower 'build' step while creating/populating the data structure. Especially if runtime search-perf is better.

Also, I understand that the word 'performance' depends on heaps of factors, like the number of items stored and the size of the search queries etc. So this is more an academic question which I wish to do some code-comparison vs a microptimisation anti-pattern.

Edit 2:

Also, this previous SO question sorta talks about a similar problem I'm having but the answers seem to be talking about a substring search and not a prefix (i.e. StartsWith) search.

halfer
  • 19,824
  • 17
  • 99
  • 186
Pure.Krome
  • 84,693
  • 113
  • 396
  • 647
  • 1) not necessarily. a performance tradeoff. the patricia trie will waste less (additional) storage because it is a compact structure, but algorithmically, it will still require you to apply a startswith on the deepest node that partially matches (users will input single letters, trie may store longer chunks) – Cee McSharpface Jul 28 '17 at 15:08
  • You might also consider [Suffix Array](https://en.wikipedia.org/wiki/Suffix_array). See also https://stackoverflow.com/questions/2487576/trie-vs-suffix-tree-vs-suffix-array – hatchet - done with SOverflow Jul 28 '17 at 15:09
  • @hatchet I thought a Patricia Trie was a (speciailized?) form of a suffix Array? – Pure.Krome Jul 29 '17 at 00:00
  • Also @hatchet I thought suffix tree/suffix array was good for SUBSTRING searches ... vs StartsWith/Prefix searches. – Pure.Krome Jul 29 '17 at 00:06
  • If you searched for ' Rob' (note leading space) it should find Robert. You'd prepend the source strings with a space so the first word in the string would also match such a search term. How you use it depends on how you construct it. Are you tokenizing names? For example see this post, and my answer where OP wanted to search an array of strings: [Fastest way to search a list of names in C#](https://stackoverflow.com/questions/10606728/fastest-way-to-search-a-list-of-names-in-c-sharp). If you included an example of what you're searching, it might be more clear how to apply different structures. – hatchet - done with SOverflow Jul 29 '17 at 00:18

0 Answers0