0

I have two List<Guid> instances and I want to find the GUID values in the first list that are not in the second list.

How do I do this using LINQ? I think LINQ would be a more efficient approach than a foreach loop.

David Brossard
Sam
  • Possible duplicate of https://stackoverflow.com/questions/3944803/use-linq-to-get-items-in-one-list-that-are-not-in-another-list – David Brossard Mar 21 '18 at 00:07
  • Possible duplicate of [Use LINQ to get items in one List<>, that are not in another List<>](https://stackoverflow.com/questions/3944803/use-linq-to-get-items-in-one-list-that-are-not-in-another-list) – Keith Nicholas Mar 21 '18 at 00:19

2 Answers

4

For that you can use the LINQ Except() extension method:

var result = list1.Except(list2);
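A minimal, self-contained sketch of that call, assuming both lists are `List<Guid>`. The names `ExceptDemo` and `NotInSecond` are made up for illustration. Keep in mind that `Except` is a set operation, so each distinct value appears at most once in the result, and that the query is only evaluated when it is enumerated (here by `ToList()`).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ExceptDemo
{
    // Hypothetical helper: values of 'first' that do not occur in 'second'.
    public static List<Guid> NotInSecond(List<Guid> first, List<Guid> second)
    {
        // Except compares with Guid's Equals/GetHashCode; ToList() forces evaluation.
        return first.Except(second).ToList();
    }

    public static void Main()
    {
        Guid a = Guid.NewGuid(), b = Guid.NewGuid(), c = Guid.NewGuid();
        var list1 = new List<Guid> { a, b, c };
        var list2 = new List<Guid> { b };

        List<Guid> result = NotInSecond(list1, list2);
        Console.WriteLine(result.Count);       // how many values are only in list1
        Console.WriteLine(result.Contains(b)); // whether the shared value slipped through
    }
}
```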
Peter B
  • Don’t you need custom comparer for that since Guid’s are not value types, so even if the string values are the same this would not yield desired results if Guid instances are different? – Vidmantas Blazevicius Mar 21 '18 at 00:12
  • Would this be more efficient/faster than the other approach posted? – Sam Mar 21 '18 at 00:12
  • @VidmantasBlazevicius Guids are structs so this will work just fine (https://dotnetfiddle.net/Widget/3eNTFU) – Manfred Radlwimmer Mar 21 '18 at 07:47
  • Sorry for the delay; I tested the code and it works. It is irrelevant if Guid is a value type or not (although [it *is* a value type](https://stackoverflow.com/q/2344213/1220550)). Guid supports hashing and .Equals() which are used for the equality check inside `Except()`, and that makes it work with `.Except()`. I don't know about the performance, Linq has a reputation of being allocation-heavy so that may make it not the fastest way to get results, then again that won't matter too much unless your lists are 10000+ long. – Peter B Mar 21 '18 at 10:17
1

I ran a test to compare how much time different methods take to complete this task.

For the test, I used two List<Guid> of 200 items each.
Roughly 1/10 of the elements of the second list, placed at pseudo-random positions, are also in the first one.

I measured the time each method required to complete using a Stopwatch.
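The measuring code itself is not shown, so the following is only a sketch of how such a timing could look. `TimingSketch` and `MeasureTicks` are hypothetical names, and the second list is built in a simplified way (every 10th element of the first list) rather than with the generator shown further down. The query is materialized with `ToList()` inside the timed delegate because `Except` and `Where` are deferred and would otherwise do no work while the stopwatch is running.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public static class TimingSketch
{
    // Hypothetical helper: runs 'query' once and returns the elapsed Stopwatch ticks.
    public static long MeasureTicks(Func<List<Guid>> query, out List<Guid> result)
    {
        Stopwatch sw = Stopwatch.StartNew();
        result = query();
        sw.Stop();
        return sw.ElapsedTicks; // tick length depends on Stopwatch.Frequency
    }

    public static void Main()
    {
        var guid1 = Enumerable.Range(0, 200).Select(_ => Guid.NewGuid()).ToList();
        // Assumption for this sketch: guid2 holds only the shared elements.
        var guid2 = guid1.Where((g, i) => i % 10 == 0).ToList();

        long ticks = MeasureTicks(() => guid1.Except(guid2).ToList(), out List<Guid> result);
        Console.WriteLine($"EXCEPT => {result.Count} items, {ticks} ticks");
    }
}
```

The first call in a process also pays for JIT compilation, so a single measurement like this one tends to overstate the steady-state cost.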

Since Except, Where and LookUp are cached, the test was restarted for each measurement. It can, however, be useful to know that the cached functions take only a few ticks (1 ~ 7) to complete once initialized.
If the same query must be repeated multiple times, this behavior can make a real difference.

This is how the two Lists are created:

static Random random = new Random();
// [...]

List<Guid> guid1 = new List<Guid>(200);
List<Guid> guid2 = new List<Guid>(200);
// Index at which guid2 receives a copy of the element just added to guid1
int insertPoint = random.Next(0, 10);
for (int x = 0; x < 200; x++)
{
    Guid current = Guid.NewGuid();
    guid1.Add(current);
    guid2.Add((x == insertPoint) ? current : Guid.NewGuid());

    // At the start of each later block of 10, pick the next shared index
    if (x > 9 && x % 10 == 0)
        insertPoint = random.Next(x, x + 10);
}

These are the Functions tested:

List1 Except List2:

var result1 = guid1.Except(guid2);

List1.Item Where != List2.Item

var result2 = guid1.Where(g1 => guid2.All(g2 => g2 != g1));

List1.Items FindAll != List2.Items

var result3 = guid1.FindAll(g1 => guid2.All(g2 => g2 != g1));

List1.Item LookUp Contains (List2.Item)

var lookUpresult = guid1.ToLookup(g1 => guid2.Contains(g1));
var result4 = lookUpresult[false].ToList();

List1 Hashset GroupBy Contains (List2 Hashset)

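var guidHS2 = new HashSet<Guid>(guid2);
var hsGroups = guid1.GroupBy(g => guidHS2.Contains(g));
// Select the group keyed 'false' explicitly; First() would return whichever
// group the first element of guid1 happens to belong to.
var result5 = hsGroups.Where(grp => !grp.Key).SelectMany(grp => grp).ToList();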

ForEach List1->Item ForEach List2->Item (Item1 == Item2) => List3

List<Guid> guid3 = new List<Guid>();
bool found;
foreach (Guid guidtest in guid1) {
    found = false;
    foreach (Guid guidcompare in guid2) {
        if (guidtest == guidcompare) {
            found = true;
            break;
        }
    }
    if (!found) guid3.Add(guidtest);
}
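Before comparing timings it can help to confirm that the approaches return the same elements. The sketch below does that for three of them; the class and method names are hypothetical, and the second list is built in a simplified way (every 10th element of the first list plus unrelated values) rather than with the generator above. The comparison uses `HashSet<Guid>.SetEquals`, so it ignores ordering.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AgreementCheck
{
    public static List<Guid> ViaExcept(List<Guid> a, List<Guid> b) =>
        a.Except(b).ToList();

    public static List<Guid> ViaWhereAll(List<Guid> a, List<Guid> b) =>
        a.Where(g1 => b.All(g2 => g2 != g1)).ToList();

    public static List<Guid> ViaForeach(List<Guid> a, List<Guid> b)
    {
        var missing = new List<Guid>();
        foreach (Guid g1 in a)
        {
            bool found = false;
            foreach (Guid g2 in b)
            {
                if (g1 == g2) { found = true; break; }
            }
            if (!found) missing.Add(g1);
        }
        return missing;
    }

    public static void Main()
    {
        var guid1 = Enumerable.Range(0, 200).Select(_ => Guid.NewGuid()).ToList();
        // Assumption for this sketch: every 10th element of guid1 is shared.
        var guid2 = guid1.Where((g, i) => i % 10 == 0)
                         .Concat(Enumerable.Range(0, 180).Select(_ => Guid.NewGuid()))
                         .ToList();

        var expected = new HashSet<Guid>(ViaExcept(guid1, guid2));
        Console.WriteLine(expected.SetEquals(ViaWhereAll(guid1, guid2)));
        Console.WriteLine(expected.SetEquals(ViaForeach(guid1, guid2)));
    }
}
```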

These are the results of this test: (20 rounds)

Number of equal elements found: 181~184

EXCEPT          => Time: 1724 ~ 4356 ticks
WHERE           => Time: 3651 ~ 7360 ticks
FINDALL         => Time: 3037 ~ 6472 ticks
LOOKUP          => Time: 9406 ~ 16502 ticks
HASHSET GROUPBY => Time: 1773 ~ 3597 ticks
FOREACH         => Time: 650  ~ 1529 ticks
Jimi
  • Wow! This is very interesting. So the `foreach` beats `LINQ`! – Sam Mar 21 '18 at 15:47
  • Well, what Linq offers goes beyond the pure speed. Also, as noted, if you repeat the operation, even with different Lists, `Except` will take 1 to 7 (in this test) **TICKS** to complete. – Jimi Mar 21 '18 at 16:00