0

I have a huge collection of strings. I will find out all the strings which starts with the given character more frequently. What would be a best collection to do this. I will initialize the collection in sorted order.

Thanks

Kumaran
  • 65
  • 9

3 Answers3

1

If you want a map from a character to all strings starting with that character, you might find ILookup<TKey, TElement> suitable. It's very similar to a Dictionary<TKey, TValue>, with two main differences:

  1. Instead of a 1:1 mapping, it performs a 1:n mapping (i.e. there can be more than one value per key).

  2. You cannot instantiate (new) nor populate it (.Add(…)) yourself; instead, you let .NET derive a fully populated instance from another collection by calling .ToLookup(…) on the latter.

Here's an example how to build such a 1:n map:

using System.Collections.Generic;  // for List<T>
using System.Linq;                 // for ILookup<TKey, TValue> and .ToLookup(…)

// This represents the source of your strings. It doesn't have to be sorted:
var strings = new List<string>() { "Foo", "Bar", "Baz", "Quux", … };

// This is how you would build a 1:n lookup table mapping from first characters
// to all strings starting with that character. Empty strings are excluded:
ILookup<char, string> stringsByFirstCharacter =
    strings.Where(str => !string.IsNullOrEmpty(str))  // exclude empty strings
           .ToLookup(str => str[0]);                  // key := first character

// This is how you would look up all strings starting with B.
// The output will be Bar and Baz:
foreach (string str in stringsByFirstCharacter['B'])
{
    Console.WriteLine(str);
}

P.S.: The above hyperlink for ILookup<…> (the interface) refers you to the help page for Lookup<…> (the implementation class). This is on purpose, as I find the documentation for the class easier to read. I would however recommend to use the interface in your code.

stakx - no longer contributing
  • 83,039
  • 20
  • 168
  • 268
  • This does not work for my situation. Those words are not english words. Single letter will be combination of multiple alphabets. so i will be in situation to return all words if combined with others.. – Kumaran Feb 17 '14 at 05:41
  • @Kumaran: Well, you don't *have to* map from `char` to `string`s. You could also map from `string` to `string`s, allowing you to choose arbitrarily long prefixes instead of only single characters (Unicode code points). You'd have to change the type to `ILookup`, and you'd need to replace the `.ToLookup(…)` lambda that extracts the first character with one that extracts a string prefix. – stakx - no longer contributing Feb 17 '14 at 10:24
0

If you need to search regularly with a huge collection of strings, then use a Hash table. Remember to distribute the table evenly to speed up the look-up operation.

Toan Nguyen
  • 11,263
  • 5
  • 43
  • 59
0

Well so you need to create an index on function from string.

For this Id suggest using

Dictionary<string,List<string>> data structure.

ToLookup isn't so good cause it limits your ability to maniuplate the data structure.

Valentin Kuzub
  • 11,703
  • 7
  • 56
  • 93