1

I have two lists and I'm trying to return the items from one list that are not in the other. Here is my code:

var Results = ListOne.Where(x => ListTwo.All(a => a.EmployeeNum != x.EmployeeNum && a.Sched != x.Sched));

This takes about 9-10 seconds to complete. ListOne has about 1200 records and ListTwo has about 33000 records.

Johnny
Kane
  • 4
    That's going to be difficult to speed up using lists. One approach could be to turn `ListTwo` into a `HashSet`, then you could just use the set's `Contains` method. – Etienne de Martel Apr 03 '19 at 20:49
  • I won't be able to change it into a set because there are duplicate employee numbers, but they have different sched. I believe a Set needs to be unique for every element. – Kane Apr 03 '19 at 20:54
  • @Kane: `HashSet<Tuple<...>>`: with the help of `Tuple` you can combine the two properties into one – Dmitry Bychenko Apr 03 '19 at 20:55
  • 2
    Use the Except method with a custom equality comparer, i.e. `var results = ListOne.Except(ListTwo, ProjectionEqualityComparer.Create(x => (x.EmployeeNum, x.Sched)));` using Jon Skeet's [ProjectionEqualityComparer](https://stackoverflow.com/questions/188120/can-i-specify-my-explicit-type-comparator-inline). – ckuri Apr 03 '19 at 20:57
  • 1
    MoreLinq's `ExceptBy` - https://github.com/morelinq/MoreLINQ/blob/master/MoreLinq/ExceptBy.cs https://www.nuget.org/packages/morelinq/ – mjwills Apr 03 '19 at 21:33
  • I suspect that if we knew where the two lists came from (queries), there would be better ways to do this too, without instantiating two lists and then comparing them. – Austin T French Apr 03 '19 at 21:37

3 Answers

4

Using a HashSet<T>, which has O(1) average lookup time, could improve performance, e.g.:

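// Build the set of (EmployeeNum, Sched) keys once; ToHashSet() requires .NET Framework 4.7.2+ or .NET Core 2.0+.
var hashSet = ListTwo.Select(x => Tuple.Create(x.EmployeeNum, x.Sched)).ToHashSet();
// Keep only the ListOne items whose key is not in the set.
var results = ListOne.Where(x => !hashSet.Contains(Tuple.Create(x.EmployeeNum, x.Sched)));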
Johnny
  • I've tried it this way, but for some reason the HashSet is coming back with more than what was in the list. Shouldn't it have the same number of items? – Kane Apr 03 '19 at 21:35
  • @Kane it shouldn't be more, only fewer (if you have duplicates) or equal – Johnny Apr 03 '19 at 21:38
  • Sorry, I misread the numbers. So if there's a duplicate EmployeeNum, it will omit that? Even though an EmployeeNum can have different/multiple Sched values that were originally in the List. – Kane Apr 03 '19 at 21:51
  • @Kane duplicates are entries with the same `EmployeeNum` and `Sched`... – Johnny Apr 04 '19 at 04:39
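
The point about duplicates in the comments above can be illustrated with a small, self-contained sketch. The `Employee` class, its property types (`int` and `string`), and the `NotIn` helper are assumptions made for illustration; the question does not show the real types. Only entries where both `EmployeeNum` and `Sched` match collapse into one key.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical record type; the real class and its property types are not shown in the question.
public class Employee
{
    public int EmployeeNum { get; set; }
    public string Sched { get; set; }
}

public static class HashSetDemo
{
    // Hypothetical helper: returns the items of listOne whose
    // (EmployeeNum, Sched) pair does not occur anywhere in listTwo.
    public static List<Employee> NotIn(List<Employee> listOne, List<Employee> listTwo)
    {
        // Tuple<,> implements structural equality, so it works as a composite key.
        var keys = new HashSet<Tuple<int, string>>(
            listTwo.Select(x => Tuple.Create(x.EmployeeNum, x.Sched)));

        return listOne
            .Where(x => !keys.Contains(Tuple.Create(x.EmployeeNum, x.Sched)))
            .ToList();
    }

    public static void Main()
    {
        var listTwo = new List<Employee>
        {
            new Employee { EmployeeNum = 1, Sched = "A" },
            new Employee { EmployeeNum = 1, Sched = "B" }, // same number, different Sched: a separate key
            new Employee { EmployeeNum = 1, Sched = "A" }, // exact duplicate pair: collapsed by the set
        };
        var listOne = new List<Employee>
        {
            new Employee { EmployeeNum = 1, Sched = "B" }, // present in listTwo
            new Employee { EmployeeNum = 1, Sched = "C" }, // not present in listTwo
        };

        foreach (var e in NotIn(listOne, listTwo))
        {
            Console.WriteLine($"{e.EmployeeNum} {e.Sched}"); // prints "1 C"
        }
    }
}
```

Here the set ends up with two keys for three input rows, which matches the "fewer or equal" behavior described above.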
2

You can also create your own IEqualityComparer (assumes you have a class called Employee):

var results = ListOne.Except(ListTwo, new EmployeeComparer());

IEqualityComparer Implementation:

public class EmployeeComparer : IEqualityComparer<Employee>
{
    public int GetHashCode(Employee co)
    {
        if (co == null)
        {
            return 0;
        }

        return co.EmployeeNum.GetHashCode();
    }

    public bool Equals(Employee x1, Employee x2)
    {
        if (object.ReferenceEquals(x1, x2))
        {
            return true;
        }

        if (object.ReferenceEquals(x1, null) || object.ReferenceEquals(x2, null))
        {
            return false;
        }

        return x1.EmployeeNum == x2.EmployeeNum && x1.Sched == x2.Sched;
    }
}
Stemado
  • ['Except'](https://referencesource.microsoft.com/system.core/system/linq/Enumerable.cs.html#64071682ee3bf309) uses a hashset internally, so your implementation will perform poorly in case there are multiple items with the same EmployeeNum and different Sched. You should create a combined hashcode of these two properties. `EqualityComparer` classes ought to have consistent implementations of `GetHashCode` and `Equals`. – Theodor Zoulias Apr 18 '19 at 08:54
  • @TheodorZoulias do you have any sources/reading material? I would like to learn more about this. – Stemado Apr 18 '19 at 12:38
  • 1
    Sure. Microsoft [recommends](https://learn.microsoft.com/en-us/dotnet/api/system.collections.generic.equalitycomparer-1?view=netframework-4.7.2) that we derive from the EqualityComparer class instead of implementing the IEqualityComparer interface. Some [guidelines for GetHashCode](https://blogs.msdn.microsoft.com/ericlippert/2011/02/28/guidelines-and-rules-for-gethashcode/). About combining hashcodes: [Quick and Simple Hash Code Combinations](https://stackoverflow.com/questions/1646807/quick-and-simple-hash-code-combinations) – Theodor Zoulias Apr 18 '19 at 12:58
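
The suggestions in these comments could be combined roughly as follows. This is a sketch, not the answerer's code: the `Employee` property types, the `EmployeeKeyComparer` name, and the 17/31 hash-combining constants are assumptions chosen for illustration. The comparer derives from `EqualityComparer<Employee>` and hashes both properties, so items that share an `EmployeeNum` but differ in `Sched` usually land in different buckets.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical record type; property types are assumed.
public class Employee
{
    public int EmployeeNum { get; set; }
    public string Sched { get; set; }
}

// Hypothetical comparer name. Equals and GetHashCode use the same two properties,
// so equal items always produce equal hash codes.
public sealed class EmployeeKeyComparer : EqualityComparer<Employee>
{
    public override bool Equals(Employee x, Employee y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.EmployeeNum == y.EmployeeNum && x.Sched == y.Sched;
    }

    public override int GetHashCode(Employee obj)
    {
        if (obj is null) return 0;
        unchecked // let the arithmetic wrap instead of throwing on overflow
        {
            int hash = 17;
            hash = hash * 31 + obj.EmployeeNum.GetHashCode();
            hash = hash * 31 + (obj.Sched?.GetHashCode() ?? 0);
            return hash;
        }
    }
}

public static class ExceptDemo
{
    public static void Main()
    {
        var listOne = new List<Employee>
        {
            new Employee { EmployeeNum = 1, Sched = "B" },
            new Employee { EmployeeNum = 1, Sched = "C" },
        };
        var listTwo = new List<Employee>
        {
            new Employee { EmployeeNum = 1, Sched = "A" },
            new Employee { EmployeeNum = 1, Sched = "B" },
        };

        // Items of listOne with no (EmployeeNum, Sched) match in listTwo.
        foreach (var e in listOne.Except(listTwo, new EmployeeKeyComparer()))
        {
            Console.WriteLine($"{e.EmployeeNum} {e.Sched}"); // prints "1 C"
        }
    }
}
```

Note that `Except` is a set operation, so it also removes duplicates from its result; if repeated rows in the first list must be preserved, a `Where` filter against a `HashSet` is the safer choice.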
0

Try parallelizing the query with AsParallel():

var Results = ListOne.AsParallel().Where(x => ListTwo.All(a => a.EmployeeNum != x.EmployeeNum && a.Sched != x.Sched)).ToList();
Khaled Sameer