Generic list comparison in C#

Question

I have a method that finds differences between two lists of ints using a dictionary. Essentially the code loops the first list, adding each int to the dictionary and setting (to 1 where not already present)/incrementing the value. It then loops the second list setting (to -1 where not already present)/decrementing the value.

Once it has looped both lists you end up with a dictionary where keys with values = 0 indicate a match, keys with values >=1 indicate presence only in the first list and values <=-1 indicate presence only in the second list.

Firstly, is this a sensible implementation?

Secondly, I would like to make it more generic, at the moment it can only handle int based lists. I'd like something that could handle any object where the caller could potentially define the comparison logic...

    public static Dictionary<int, int> CompareLists(List<int> listA, List<int> listB)
    {
        // 0        Match
        // <= -1    listB only
        // >= 1     listA only
        var recTable = new Dictionary<int, int>();

        foreach (int value in listA)
        {
            if (recTable.ContainsKey(value))
                recTable[value]++;
            else
                recTable[value] = 1;
        }

        foreach (int value in listB)
        {
            if (recTable.ContainsKey(value))
                recTable[value]--;
            else
                recTable[value] = -1;
        }

        return recTable;

    }

Thanks in advance!

In response to: "It won't work properly if to example you have same value appears twice in listA and once in listB, result will be positive, which say "listA only" in your comments."

Let me clarify; if a value appears twice in listA it should also appear twice in listB - So if a value is in listA twice and once in listB, I don't care which one from listA it picks to match, as long as the one non-reconciling item is reported correctly.

Imagine the use-case where you are trying to reconcile lots of payment amounts between two files, it's entirely feasible to have repeating amounts but it doesn't really matter which of the duplicates are matched as long as the non-reconciling values are reported.

`Firstly, is this a sensible implementation?` Might be better for the CodeReview site, since it's working code. `I'd like something that could handle any object where the caller could potentially define the comparison logic` Have you checked out [`IComparable`](https://msdn.microsoft.com/en-us/library/4d7sx9hd%28v=vs.110%29.aspx)? (Edit: or actually [`IEquatable`](https://msdn.microsoft.com/en-us/library/ms131187%28v=vs.110%29.aspx)?) — 31eee384, Dec 08 '15 at 16:43
See here http://stackoverflow.com/questions/1674742/intersection-of-multiple-lists-with-ienumerable-intersect — skalinkin, Dec 08 '15 at 16:44
It won't work properly if to example you have same value appears twice in `listA` and once in `listB`, result will be positive, which say "listA only" in your comments. — Sinatr, Dec 08 '15 at 16:46
A sensible implementation of what? If you explain the purpose behind your function, people can suggest a better way of achieving it. — , Dec 08 '15 at 16:50

score 1 · Answer 1 · answered Dec 08 '15 at 16:53

To answer your second question, here's how to make it more generic:

public static Dictionary<T, int> CompareLists<T>(IEnumerable<T> listA, 
    IEnumerable<T> listB, IEqualityComparer<T> comp)
{
    var recTable = new Dictionary<T, int>(comp);

    foreach (var value in listA)
    {
        if (recTable.ContainsKey(value))
            recTable[value]++;
        else
            recTable[value] = 1;
    }

    foreach (var value in listB)
    {
        if (recTable.ContainsKey(value))
            recTable[value]--;
        else
            recTable[value] = -1;
    }

    return recTable;
}

This is more generic because:

I pass in the type T instead of an int.
I use IEnumerables instead of Lists.
I pass in an IEqualityComparer and pass it to the Dictionary constructor which needs to use it.
I use var in the foreach loops instead of int. You can also use T.

You call this code like this:

static void Main()
{
    int[] arr1 = { 1, 2, 3 };
    int[] arr2 = { 3, 2, 1 };

    var obj = CompareLists(arr1, arr2, EqualityComparer<int>.Default);

    Console.ReadLine();
}

Here's an example of implementing IEqualityComparer. This treats all odd ints as equal and all even ints as equal:

public class MyEq : IEqualityComparer<int>
{
    public bool Equals(int x, int y)
    {
        return (x % 2) == (y % 2);
    }

    public int GetHashCode(int obj)
    {
        return (obj % 2).GetHashCode();
    }
}

Thanks for this, it works perfectly for what I want to do. – Greg Dec 08 '15 at 17:19 — Greg, Dec 08 '15 at 17:19

score 1 · Answer 2 · edited May 23 '17 at 12:15

1

FullOuterJoin as found here: LINQ - Full Outer Join

public static Dictionary<int, int> CompareLists(List<int> listA, List<int> listB)
{
  return listA.FullOuterJoin(listB,
    a=>a, // What to compare from ListA
    b=>b, // What to compare from ListB
    (a,b,key)=>
      new {key=key,value=0}, // What to return if found in both
      new {key=key,value=-1},// What to return if found only in A
      new {key=key,value=1}) // What to return if found only in B
    .ToDictionary(a=>a.key,a=>a.value); // Only because you want a dictionary
}

edited May 23 '17 at 12:15

Community

1
1

answered Dec 08 '15 at 16:57

Robert McKee

21,305
1
43
57

I struggle to see how this is more succinct/better that the solution posted by user2023861? Happy to be re-educated, just don't see it? – Greg Dec 08 '15 at 17:21
Well, this method works with all types of collections, lists, is generic in that it allows you to determine what to compare, and what to keep. Sure, in your case all you want is 0,-1,or 1 back, but what if you want the entire record? or you want to compare say the properties of a list of objects (Or multiple properties)? – Robert McKee Dec 08 '15 at 17:49
It is also a well defined, and known method (although the implementation is fairly new) that's been around for 30 years: https://en.wikipedia.org/wiki/Join_(SQL)#Full_outer_join – Robert McKee Dec 08 '15 at 17:59
Using FullOuterJoin, the list types don't have to be of the same type. They can also be full blown objects, and you don't have to write a custom IEqualityComparer, just list what properties of the objects you want to compare. It fits well with the other LINQ methods, and can be translated to SQL very easily (FULL OUTER JOIN). You can also return partial objects, or comine parts of each. It's about as generic as you can get, and everyone should understand what a full outer join is doing from the name. – Robert McKee Dec 08 '15 at 18:12
Thanks for the additional detail. It was more the new implementation that I was struggling with as opposed to the concept of a full outer join. It's benefits make more sense now. – Greg Dec 08 '15 at 19:09

score 0 · Answer 3 · answered Dec 08 '15 at 16:52

You can do this using Generics:

public static Dictionary<T, int> CompareLists<T>(List<T> listA, List<T> listB)
{
    // 0        Match
    // <= -1    listB only
    // >= 1     listA only
    var recTable = new Dictionary<T, int>();

    foreach (T value in listA)
    {
        if (recTable.ContainsKey(value))
            recTable[value]++;
        else
            recTable[value] = 1;
    }

    foreach (T value in listB)
    {
        if (recTable.ContainsKey(value))
            recTable[value]--;
        else
            recTable[value] = -1;
    }

    return recTable;

}

score 0 · Answer 4 · answered Dec 08 '15 at 17:11

These are my two cents:

public static Dictionary<T, int> CompareLists<T>(List<T> left, List<T> right, IEqualityComparer<T> comparer)
{
    Dictionary<T, int> result = left.ToDictionary(l => l, l => right.Any(r => comparer.Equals(l, r)) ? 0 : -1);
    foreach (T r in right.Where(t => result.Keys.All(k => !comparer.Equals(k, t))))
        result[r] = 1;
    return result;
}

The method takes Lists of any type T and an IEqualityComparer for that type T. It then at first generates a dictionary of those elements contained in the "left" List, thereby checking if they are also in the "right" List and setting the value accordingly.

The second step adds the elements that are only contained in the "right" List with value 1.

If this is a sensible implementation depends on what you are trying to achieve with it. I think it's a short but still readable one, relying on proper implementation of the LINQ methods. Though there might be faster possibilities one could think about if this is for really big lists or an very often called method.

This of course won't work for your added requirement of multiple occurrences of the same key in one or both of the lists. — René Vogt, Dec 08 '15 at 20:47

Generic list comparison in C#

4 Answers4