I have the following situation: I have a large collection of strings (let's say 250,000+) with an average length of about 30 characters. I need to run many searches against them, mostly of the StartsWith and Contains kind.
The collection is static at runtime: it is read and filled only once, so the cost of building the data structure does not matter. Memory is not a problem either, so I don't mind keeping two collections with the same data if needed (one for StartsWith and another for Contains). The only thing that matters is the performance of the searches, which should return all elements that match the search condition.
For StartsWith I came across a trie or radix tree, but maybe there are even better choices?
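To make clear what I mean by the trie approach, here is a rough sketch of the idea, written in Python only for illustration. The names `TrieNode`, `PrefixTrie` and `starts_with` are made up for this example, and storing the match indices at every node is just one possible layout that trades memory for lookup speed (acceptable here, since memory is not a concern).

```python
class TrieNode:
    """One node per character; remembers every string that passes through it."""

    def __init__(self):
        self.children = {}  # next character -> TrieNode
        self.ids = []       # indices of all strings that have this node's prefix


class PrefixTrie:
    """Hypothetical prefix index built once over a static list of strings."""

    def __init__(self, strings):
        self.strings = list(strings)
        self.root = TrieNode()
        for index, text in enumerate(self.strings):
            node = self.root
            for ch in text:
                node = node.children.setdefault(ch, TrieNode())
                node.ids.append(index)

    def starts_with(self, prefix):
        """Return all strings beginning with prefix, in insertion order."""
        if not prefix:
            # Every string starts with the empty prefix.
            return list(self.strings)
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        return [self.strings[i] for i in node.ids]
```

With this layout a lookup walks one node per prefix character and then copies out the stored matches, so the cost depends on the prefix length and the number of results rather than on the size of the collection.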
For Contains I have no good idea yet (besides running a LINQ query over a list, which won't be very fast with that amount of data).
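For reference, the brute-force baseline I would like to beat looks roughly like the following. It is sketched in Python rather than LINQ just to keep it short, and the function names `starts_with_scan` and `contains_scan` are made up for this example.

```python
def starts_with_scan(strings, prefix):
    """Linear scan: every string is tested against the prefix."""
    return [s for s in strings if s.startswith(prefix)]


def contains_scan(strings, needle):
    """Linear scan: every string is searched for the substring."""
    return [s for s in strings if needle in s]
```

Both functions touch every element on every query, which is the part I want to avoid with 250,000+ strings and many searches.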
Thanks in advance everyone!
Update: I forgot an important part: by Contains I don't mean exact matches in the collection; I want to find all strings in the collection that contain the given search string as a substring.