0

I'm searching through a generic list (or IQueryable) which contains 3 columns. I'm trying to find the value of the 3 column, based on 1 and 2, but the search is really slow. For a single search, the speed isn't noticeable, but I'm performing this search on a loop, and for 700 iterations, it takes a combined time of over 2 minutes, which isn't any use. Columns 1 and 2 are int and column 3 is a double. Here is the linq I'm using:

public static Distance FindByStartAndEnd(int start, int end, IQueryable<Distance> distanceList)
{
    Distance item = distanceList.Where(h => h.Start == start && h.End == end).FirstOrDefault();
    return item ;
}

There could be up do 60,000 entries in the IQueryable list. I know that is quite a lot, but I didn't think it would pose any problem for searching.

So my question is, is there a better way to search through a collection when needing to match 2 columns to get value of a third? I guess I need all 700 searches to be almost instant, but it takes about 300ms for each which soon mounts up.

UPDATE - Final Solution #######################

I've now created a dictionary using Tuple with start and end as the key. I think this could be the right solution.

var dictionary = new Dictionary<Tuple<int, int>, double>();

var key = new Tuple<int, int>(Convert.ToInt32(reader[0]), Convert.ToInt32(reader[1]));
var value = Convert.ToDouble(reader[2]);

if (value <= distance)
{
    dictionary.Add(key, value);
}
var key = new Tuple<int, int>(5, 20);

Works fine - much faster

e-on
  • 1,547
  • 8
  • 32
  • 65
  • Seems like the obvious problem of indexing. What if the values where in a dictionary and were sorted wouldn't that be faster? – awright18 Oct 24 '12 at 09:06
  • Yes I think the 2 column key in a dictionary could be a good way to do it. I'll need to look into it – e-on Oct 24 '12 at 09:30

5 Answers5

5

Create a dictionary where columns 1 and 2 create the key. You create the dictionary once and then your searches will be almost instant.

empi
  • 15,755
  • 8
  • 62
  • 78
  • Ok, I've got this up and running now and it populates the dictionary fine. I'll add the code to the original post as an update. However, it doesn't recognise the key and can't find the right entry. Do you know how to create an equality comparer for this? – e-on Oct 24 '12 at 12:01
  • @e-on: try the simplest solution - create the key as a string col1-col2. If this starts working try experimenting with e.g. Tuple. – empi Oct 24 '12 at 12:20
  • Thanks empi - I got that working. As it turned, the problem wasn't with Tuple, but with some of my rows of data. It's a really fast solution now :) – e-on Oct 25 '12 at 09:43
0

If you have control over your collection and model classes, there is a library which allows you to index the properties of the class, which can greatly speed up searching.

http://i4o.codeplex.com/

marknuzz
  • 2,847
  • 1
  • 26
  • 29
0

Your problem is that LINQ has to execute the expression tree everytime you return the item. Just call this method with multiple start and end values

public static IEnumerable<Distance> FindByStartAndEnd
    (IEnumerable<KeyValuePair<int, int>> startAndEnd,
    IQueryable<Distance> distanceList)
{

    return
        from item in distanceList
        where 
            startAndEnd.Select(s => s.Key).Contains(item.Start)
            && startAndEnd.Select(s => s.Value).Contains(item.End)
        select item;
}
Jan P.
  • 3,261
  • 19
  • 26
  • Hi Jan, thanks for this - daft question, but how would I pass a KeyValuePair to the method? I was trying this, but getting an error: `FindByStartAndEnd(new IEnumerable>, distanceList)`? – e-on Oct 24 '12 at 09:28
  • You create a List of KeyValuePair with all your start and end integers and pass it to the method. – Jan P. Oct 24 '12 at 09:37
0

I'd give a hashSet a try. This should speed up things ;)

Community
  • 1
  • 1
mkowalik
  • 133
  • 1
  • 9
0

Create a single value out of the first two columns, for example by concatenating them into a long, and use that as a key in a dictionary:

public long Combine(int start, int end) {
  return ((long)start << 32) | end;
}

Dictionary<long, Distance> lookup = distanceList.ToDictionary(h => Combine(h.Start, h.End));

Then you can look up the value:

public static Distance FindByStartAndEnd(int start, int end, IQueryable<Distance> distanceList) {
  Distance item;
  if (!lookup.TryGetValue(Combine(start, end), out item) {
    item = null;
  }
  return item;
}

Getting an item from a dictionary is close to an O(1) operaton, which should make a dramatic difference from the O(n) operaton to loop through the items to find one.

Guffa
  • 687,336
  • 108
  • 737
  • 1,005