0

I have an example class

public class Item 
{
    public int Id;
    public string Name;
    public int ItemParentId;
}

Then I put many Items into the database, where each one has an Id, Name and ItemParentId. I also create a list of new Items, which have a Name and ItemParentId, but Id = 0.

I select all items from the database into list1, and I create list2 with the new Items.

I want to make something like this:

list1.Union(list2); // need to combine only with different ItemParentId

The problem is that I need to combine only those items whose ItemParentId values are not equal. LINQ's Union only accepts an IEqualityComparer, which does not seem suitable here. I also tried IComparer, but Union doesn't accept one. Any help would be appreciated.

Example of lists and what result I want:

var list1 = new List<Item>{ 
       new Item { Id = 1, Name = "item1", ItemParentId = 100 },
       new Item { Id = 2, Name = "item2", ItemParentId = 200 },
       new Item { Id = 3, Name = "item3", ItemParentId = 300 },
       new Item { Id = 4, Name = "item4", ItemParentId = 400 }
  };

var list2 = new List<Item>{ 
       new Item { Id = 0, Name = "item5", ItemParentId = 500 },
       new Item { Id = 0, Name = "item6", ItemParentId = 300 },
       new Item { Id = 0, Name = "item7", ItemParentId = 400 },
  };

The result list should contain 5 items, whose names are "item1", "item2", "item3", "item4" and "item5".

UPDATE:

Thanks guys, with your help I managed to compare items by a single property, but now I have to do that by two of them. My class actually has 10 properties now, but I have to compare by only two. The comparer looks fine; the only thing I want to know is what the HashCode is used for.

K V
  • Why is IEqualityComparer "not suitable"? – Jon Hanna Aug 07 '15 at 12:51
  • because it compares two objects, but I need to compare only object parameter, the Id's will never be equal between list1 and list2 – K V Aug 07 '15 at 12:54
  • @GrandaS You can write an `IEqualityComparer` to do whatever you want, including just comparing the `ItemParentId`. But note that `Union` will still return one of the duplicates. – juharr Aug 07 '15 at 12:55
  • It compares two objects in the way it's coded to compare them, you need to compare two Item objects according to their ItemParentId, so create an equality comparer that does that. – Jon Hanna Aug 07 '15 at 12:59
  • @GrandaS Hm. Why should the result _not_ include "item3"? Seems like it should. – Alex Booker Aug 07 '15 at 12:59
  • @Petrichor The OP seems to want a full outer join. item3 is not included because its `ItemParentId` is 300 and that value is present in the other list as well. – juharr Aug 07 '15 at 13:01
  • woops, my mistake, yes, it should be there too – K V Aug 07 '15 at 13:02
  • @juharr Apparently not :-P – Alex Booker Aug 07 '15 at 13:03
  • @GrandaS I updated my answer to touch a little on `GetHashCode`. Hopefully that helps. – Alex Booker Aug 07 '15 at 14:31

3 Answers

4

Seems like an IEqualityComparer is what you want after all:

public class Comparer : IEqualityComparer<Item>
{
    public bool Equals(Item x, Item y)
    {
        return x.ItemParentId == y.ItemParentId;
    }

    public int GetHashCode(Item obj)
    { 
        return obj.ItemParentId;
    }
}

Calling code:

var result = list1.Union(list2, new Comparer());

Update: If you want to compare multiple properties, you can alter the comparer:

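public class Comparer : IEqualityComparer<Item>
{
    public bool Equals(Item x, Item y)
    {
        // Items are equal only when both properties match, so that
        // Equals agrees with the combined hash code below.
        return x.ItemParentId == y.ItemParentId 
            && x.Name == y.Name;
    }

    public int GetHashCode(Item obj)
    {
        unchecked 
        {
            int hash = 17;
            hash = hash * 23 + obj.ItemParentId.GetHashCode();
            // Name may be null, so guard the call.
            hash = hash * 23 + (obj.Name == null ? 0 : obj.Name.GetHashCode());
            return hash;
        }
    }
}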

To learn more about the GetHashCode implementation see this answer.

And to learn more about the GetHashCode in general see these answers.

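You probably noticed that if you just return 1 (or any other constant) from the GetHashCode method, your code still works. Union uses the hash code first to narrow down which already-seen items could be equal, and only calls Equals on items whose hash codes match. With a constant hash code every item is a candidate, so Equals has to run against every item seen so far, which gives the right result but is slow. A GetHashCode that returns different values for different items increases performance.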
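To make the effect of the hash code visible, here is a minimal sketch that counts how often Equals is called during a Union. The `CountingComparer` and `HashDemo` names and the `useConstantHash` switch are made up for illustration; they are not part of the question or of LINQ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Item
{
    public int Id;
    public string Name;
    public int ItemParentId;
}

// Hypothetical comparer for illustration: compares by ItemParentId
// and records how many times Equals was invoked.
public class CountingComparer : IEqualityComparer<Item>
{
    private readonly bool _useConstantHash;
    public int EqualsCalls;

    public CountingComparer(bool useConstantHash)
    {
        _useConstantHash = useConstantHash;
    }

    public bool Equals(Item x, Item y)
    {
        EqualsCalls++;
        return x.ItemParentId == y.ItemParentId;
    }

    public int GetHashCode(Item obj)
    {
        // A constant hash is legal (equal items still get equal hashes),
        // but it forces Equals to do all of the work.
        return _useConstantHash ? 1 : obj.ItemParentId;
    }
}

public static class HashDemo
{
    // Runs Union over n items with distinct ItemParentId values and
    // returns the number of Equals calls the comparer recorded.
    public static int CountEqualsCalls(int n, bool useConstantHash)
    {
        var items = Enumerable.Range(0, n)
            .Select(i => new Item { Name = "item" + i, ItemParentId = i })
            .ToList();
        var comparer = new CountingComparer(useConstantHash);
        items.Union(Enumerable.Empty<Item>(), comparer).ToList();
        return comparer.EqualsCalls;
    }
}
```

Comparing `HashDemo.CountEqualsCalls(100, true)` with `HashDemo.CountEqualsCalls(100, false)` should show far more Equals calls for the constant hash, while the resulting sequence is the same in both cases.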

Alex Booker
  • The `GetHashCode` and `Equals` should absolutely be based off the same thing. Basically if two items are equal they should have the same hash code. – juharr Aug 07 '15 at 13:00
  • How would I go about doing that if I do not have access to `x` and `y`? – Alex Booker Aug 07 '15 at 13:01
  • @juharr I am looking [here](https://stackoverflow.com/questions/6694508/how-to-use-the-iequalitycomparer) and [here](http://www.blackwasp.co.uk/IEqualityComparer.aspx) and they seem to have the same approach as me. – Alex Booker Aug 07 '15 at 13:02
  • You have access to `obj` and your `GetHashCode` should be based off of the `ItemParentId` of `obj`, not the `Id` because you compare the `ItemParentId` of `x` and `y`. Both of your examples use the same property in both the `Equals` and `GetHashCode` methods. – juharr Aug 07 '15 at 13:03
  • @juharr Whoops. I just confused the properties is all. Thanks for pointing that out. I updated my answer. – Alex Booker Aug 07 '15 at 13:05
  • @juharr Yeah, I noticed that, too. I wrote this before the OP updated his question. Working on a fix now. – Alex Booker Aug 07 '15 at 13:07
  • Actually based on the change your answer is correct. – juharr Aug 07 '15 at 13:09
  • @juharr not yet, they're using `!=` where they should use `==`. – Jon Hanna Aug 07 '15 at 13:13
  • @JonHanna That would not produce the desired output. – Alex Booker Aug 07 '15 at 13:15
  • @JonHanna Oh yeah, I missed that. Petrichor you do have to change it. `Union` will remove items that are equal and you are inverting that. – juharr Aug 07 '15 at 13:19
  • Yes it will. They want to find unique cases, which `Union` will give them by filtering out those equal (by the given criteria) to those already seen. You have your equality wrong so that won't happen (it'll likely not work well at all because your hashcode will be equal where your equality is not). – Jon Hanna Aug 07 '15 at 13:20
  • @JonHanna Sorry about the confusion. – Alex Booker Aug 07 '15 at 13:22
  • Also, you're incorrect to suggest they should move that hashcode into `Item.GetHashCode` (unless they wanted to make this the default way that items are seen as equal, which is unlikely and if so they should just move all of it there). – Jon Hanna Aug 07 '15 at 13:23
  • @JonHanna Very true. Updated my answer. – Alex Booker Aug 07 '15 at 13:24
  • @JonHanna There was also no need for me to `*37`. Geesh. Not my most elegant answer. – Alex Booker Aug 07 '15 at 13:25
  • Yeah, the `*37` there just increased your collision count slightly for no gain. I'm guessing you started with a mult-and-add of two or more properties and then simplified? – Jon Hanna Aug 07 '15 at 13:26
  • @GrandaS No problem mate. – Alex Booker Aug 08 '15 at 07:40
2

If it's linq-to-objects then you can indeed use an IEqualityComparer:

public class ByParentIdComparer : IEqualityComparer<Item>
{
  public bool Equals(Item x, Item y)
  {
    return x.ItemParentId == y.ItemParentId;
  }
  public int GetHashCode(Item obj)
  { 
    return obj.ItemParentId;
  }
}

Then:

list1.Union(list2, new ByParentIdComparer())

Will work.

This though won't translate well into SQL. If you might be doing the unioning on a database then you're better off with:

list1.Concat(list2).GroupBy(item => item.ItemParentId).Select(grp => grp.First())

Which takes both lists (not yet filtering out duplicates), then groups them by the ItemParentId and then takes the first element from each group, and as such gives the equivalent results.

This will also work in linq-to-objects, but the version using an equality comparer will be faster.
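If the items need to be distinct by two properties rather than one, the same Concat/GroupBy idea can group on a composite key. The following is only a sketch under assumptions: it uses `ItemParentId` and `Name` as the two properties, and `GroupByDemo.MergeDistinct` is a made-up helper name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Item
{
    public int Id;
    public string Name;
    public int ItemParentId;
}

public static class GroupByDemo
{
    // Hypothetical helper: keeps the first item seen for every
    // (ItemParentId, Name) pair, taking list1 before list2.
    public static List<Item> MergeDistinct(IEnumerable<Item> list1, IEnumerable<Item> list2)
    {
        return list1.Concat(list2)
            // Anonymous types compare by value, so two items land in the
            // same group only when both properties are equal.
            .GroupBy(item => new { item.ItemParentId, item.Name })
            .Select(grp => grp.First())
            .ToList();
    }
}
```

Because list1 comes first in the Concat, an item from the database (with a real Id) is kept in preference to a new item with the same key when run in LINQ to Objects.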

Jon Hanna
  • what is that HashCode for? At the moment I have to compare by two properties, but HashCode returns only single, could this be the problem why I don't get what I want? – K V Aug 07 '15 at 13:15
  • Hashcode is used in maintaining a hash-based set of the elements already processed so that duplicates can be identified and discarded. Why do you compare by two properties, your question says you want to be distinct only by one (`ItemParentId`)? – Jon Hanna Aug 07 '15 at 13:22
  • yes, but it worked fine with single, now I had to add one more property, ItemChildId, and do the same, only make union of all items, where ItemChildId && ItemParentId are not equal, so in the method Equal I can do the job, but how about HashCode, is it important? – K V Aug 07 '15 at 13:32
  • If you don't change the `GetHashCode` it'll work but be sub-optimal (`GetHashCode()` **must** return equal results for objects we are considering equal, but **should** return as many different results for different objects as possible, for more efficient hash-based stores). Try `return unchecked(ItemChildId * 31 + ItemParentId);` for one that should be reasonable enough. – Jon Hanna Aug 07 '15 at 13:37
0

Maybe this helps:

        var list1 = new List<Item>
        {
            new Item { Id = 1, Name = "item1", ItemParentId = 100 },
            new Item { Id = 2, Name = "item2", ItemParentId = 200 },
            new Item { Id = 3, Name = "item3", ItemParentId = 300 },
            new Item { Id = 4, Name = "item4", ItemParentId = 400 }
        };

        var list2 = new List<Item>
        {
            new Item { Id = 0, Name = "item5", ItemParentId = 500 },
            new Item { Id = 0, Name = "item6", ItemParentId = 300 },
            new Item { Id = 0, Name = "item7", ItemParentId = 400 },
        };

        var listMerge = list1.Union(list2.Where(l2 => !list1.Select(l1 => l1.ItemParentId).Contains(l2.ItemParentId))).ToList();

I would personally split that expression in two parts:

        var list2new = list2.Where(l2 => !list1.Select(l1 => l1.ItemParentId).Contains(l2.ItemParentId));
        var listMerge = list1.Union(list2new).ToList();
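
One possible refinement, offered as a sketch rather than a tested replacement: the expression above rebuilds and scans the list of parent IDs from list1 for every element of list2. Collecting those IDs once into a `HashSet<int>` avoids that. `MergeDemo.MergeByParentId` is a made-up name, and `Concat` is used here instead of `Union` on the assumption that list1 itself contains no duplicate references.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Item
{
    public int Id;
    public string Name;
    public int ItemParentId;
}

public static class MergeDemo
{
    // Hypothetical helper: all of list1, plus the items of list2 whose
    // ItemParentId does not occur anywhere in list1.
    public static List<Item> MergeByParentId(List<Item> list1, List<Item> list2)
    {
        // Build the lookup once; Contains on a HashSet is a hash lookup.
        var seenParentIds = new HashSet<int>(list1.Select(l1 => l1.ItemParentId));
        var list2new = list2.Where(l2 => !seenParentIds.Contains(l2.ItemParentId));
        return list1.Concat(list2new).ToList();
    }
}
```

Like the original expression, this does not remove duplicates inside list2 itself; two new items sharing a parent ID that is absent from list1 would both be kept.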
Qutory