0

I have a large set of pre-computed values stored in a C# dictionary.

Given a search string, I need to find all entries in the dictionary where the search string is part of the string value. For example, say I have:

Dictionary<PersonId, PersonDescription> values;

If the search string is "office", I need to find all entries in the dictionary whose PersonDescription string value contains the term "office".

What is the fastest way to do this search?

Robert
theOne
  • Try to look for custom comparer, look at this answer http://stackoverflow.com/a/8400865/335905 – celerno Mar 28 '14 at 18:55
  • LINQ's Where method? http://msdn.microsoft.com/en-us/library/system.linq.enumerable.where.aspx –  Mar 28 '14 at 18:57

2 Answers

2

Dictionaries are optimised for access by key, so you're going to have to iterate over every single element anyway to see if the value contains the search term. Here are two potential ways:

Iterate

List<PersonDescription> found = new List<PersonDescription>();
foreach(var pair in values)
{
   if(pair.Value.SomeField.Contains("office"))
       found.Add(pair.Value);
}

Note that the above adds the PersonDescription; you might want to add the PersonId instead.

Use a KeyedCollection class (and then iterate)

You'll have to derive a class from this, but it's simple. This also assumes PersonId is a member of PersonDescription:

using System.Collections.ObjectModel;

public class PersonDictionary : KeyedCollection<int, PersonDescription>
{
    // Tells the collection which member of each item serves as its key.
    protected override int GetKeyForItem(PersonDescription description)
    {
        return description.PersonId; // hope it's on this class!
    }
}

Now you can iterate over this, and also look items up by key:

List<PersonDescription> found = new List<PersonDescription>();
// Use foreach rather than values[i]: with an int key, values[i] is a lookup by key, not by position.
foreach (var description in values)
{
    if (description.SomeField.Contains("office"))
        found.Add(description);
}

You can replace the manual loops above with LINQ if you prefer.
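As a minimal sketch of what the LINQ form might look like: this assumes PersonDescription exposes a string property called SomeField (a placeholder name, as in the first loop) and that the key is an int. The class and method names here are hypothetical. It also swaps the case-sensitive Contains for a case-insensitive IndexOf, which is an assumption about what a search box usually wants, not something the question states.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of the value type; SomeField stands in for whichever
// string member actually holds the description text.
public class PersonDescription
{
    public int PersonId { get; set; }
    public string SomeField { get; set; }
}

public static class DescriptionSearch
{
    // Returns every description whose text contains the search term,
    // ignoring case. This is still a linear scan over all values,
    // exactly like the manual loops; LINQ only changes the syntax.
    public static List<PersonDescription> FindByTerm(
        IDictionary<int, PersonDescription> values, string term)
    {
        return values
            .Where(pair => pair.Value.SomeField != null &&
                           pair.Value.SomeField.IndexOf(
                               term, StringComparison.OrdinalIgnoreCase) >= 0)
            .Select(pair => pair.Value)
            .ToList();
    }
}
```

If you want the ids rather than the descriptions, select pair.Key instead of pair.Value.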

Dictionaries are generally used because you want a fast lookup by key. I am not saying this is an XY Problem, but it smells like it. Alternatively, you might actually need to access a dictionary both by key and by searching all items.

Community
Moo-Juice
  • Hi Moo-Juice...you may be right...its been a long day. Basically, I have a long list of PersonId, PersonDesc string values in a csv file and I need to perform sort of a fast search for items in the file that match the search string as part of PersonDesc, Ideally I would need an index I believe but as a first pass, thought Id use a dictionary to load up the values. – theOne Mar 28 '14 at 19:08
  • (unless I'm mistaken) neither of these will make a partial string match (like `Field.Contains("office")` instead of `Field == "office"`) any faster than iterating through a `List<>`. `Dictionary` and its hash-based brethren are great for *exact* matches, not *inexact* ones. Maybe you could use an appropriate dictionary/lookup for a quick exact search before the longer inexact one... – Tim S. Mar 28 '14 at 19:10
  • @TimS., I am assuming that *in general* he wants to access using an id, but also wants to perform searches. I believe `KeyedCollection` might help here, but you're right - I doubt it'll be any more different than a `List<>`. However, I don't have all his use-cases in front of me :) – Moo-Juice Mar 28 '14 at 19:12
  • Thanks @TimS. Any suggestions on any other datastructures that I could use...I would prefer ready-made ones part of C# .Net 4.0 – theOne Mar 28 '14 at 19:12
  • I will definitely look into that @TimS. That actually looks like something I should be looking at. Thank you all for your suggestions. I will mark this as the answer...Again thanks. – theOne Mar 28 '14 at 19:29
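One way to read the suggestion in the comments above (a quick exact lookup alongside the slower inexact scan) is sketched below. It is only an illustration under stated assumptions: descriptions are non-null plain strings keyed by an int id, words are split on whitespace and a few punctuation characters, and the names DescriptionIndex, FindWholeWord and FindSubstring are hypothetical. The word index only answers whole-word matches; anything else still needs the full scan.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class DescriptionIndex
{
    // Assumed word boundaries; adjust to suit the real data.
    private static readonly char[] Separators = { ' ', '\t', ',', '.', ';', ':' };

    private readonly Dictionary<int, string> _descriptions;

    // Maps each lower-cased word to the ids of the descriptions containing it.
    private readonly Dictionary<string, HashSet<int>> _idsByWord =
        new Dictionary<string, HashSet<int>>();

    public DescriptionIndex(IDictionary<int, string> descriptions)
    {
        _descriptions = new Dictionary<int, string>(descriptions);
        foreach (var pair in _descriptions)
        {
            var words = pair.Value.ToLowerInvariant()
                .Split(Separators, StringSplitOptions.RemoveEmptyEntries);
            foreach (var word in words)
            {
                HashSet<int> ids;
                if (!_idsByWord.TryGetValue(word, out ids))
                {
                    ids = new HashSet<int>();
                    _idsByWord[word] = ids;
                }
                ids.Add(pair.Key);
            }
        }
    }

    // Fast path: ids whose description contains the term as a whole word.
    public IEnumerable<int> FindWholeWord(string term)
    {
        HashSet<int> ids;
        return _idsByWord.TryGetValue(term.ToLowerInvariant(), out ids)
            ? ids
            : Enumerable.Empty<int>();
    }

    // Slow path: ids whose description contains the term anywhere.
    // This is a linear scan, no faster than iterating a List<>.
    public IEnumerable<int> FindSubstring(string term)
    {
        return _descriptions
            .Where(pair => pair.Value.IndexOf(
                term, StringComparison.OrdinalIgnoreCase) >= 0)
            .Select(pair => pair.Key);
    }
}
```

The two methods can return different results for the same term: a description containing "post-office" matches a substring search for "office" but not a whole-word lookup, because the hyphen is not treated as a separator here.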
1
string searchTerm = "office";
var PeopleIDs = dict.Where(person => person.Value.Contains(searchTerm))
                    .Select(item => item.Key);

Do you want something more complex than that?

Jonesopolis