I need to compare two HashSets of the same type whose elements differ in only some properties. I essentially need a more specific ExceptWith.

I've tried using ExceptWith but that doesn't allow you to specify a property to compare on, as far as I know.

Assume that I cannot add or remove any properties on the Person class.

class Program
{
    private class Person
    {
        public string Name { get; set; }
        public string Id { get; set; }
    }

    static void Main(string[] args)
    {
        var people1 = new[]
        {
            new Person
            {
                Name = "Amos",
                Id = "123"
            },
            new Person
            {
                Name = "Brian",
                Id = "234"
            },
            new Person
            {
                Name = "Chris",
                Id = "345"
            },
            new Person
            {
                Name = "Dan",
                Id = "456"
            }
        };

        var people2 = new[]
        {
            new Person
            {
                Name = "Amos",
                Id = "098"
            },
            new Person
            {
                Name = "Dan",
                Id = "987"
            }
        };

        var hash1 = new HashSet<Person>(people1);

        var hash2 = new HashSet<Person>(people2);

        var hash3 = new HashSet<Person>(); // where hash3 is hash1 without the objects from hash2 solely based on matching names, not caring about Id matching

        foreach (var item in hash3) // should print out Brian, Chris
        {
            Console.WriteLine($"{item.Name}");
        }
    }
}
CBeams

3 Answers

In your Person class, you should define your own GetHashCode method, so that it only uses the person's name and not the ID.

If you do that, you also have to define your own Equals method; see "Why is it important to override GetHashCode when Equals method is overridden?"
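
A minimal sketch of what such overrides could look like, assuming the class can be modified (the asker notes in the comment below that it cannot in their real scenario). `NamedPerson` is a hypothetical stand-in for `Person`, and the null handling in `GetHashCode` is an added assumption.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for Person; equality is based on Name only.
public class NamedPerson
{
    public string Name { get; set; }
    public string Id { get; set; }

    public override bool Equals(object obj)
    {
        // Two instances are equal when their names match; Id is ignored.
        return obj is NamedPerson other && Name == other.Name;
    }

    public override int GetHashCode()
    {
        // Must agree with Equals: equal names produce equal hash codes.
        return Name == null ? 0 : Name.GetHashCode();
    }
}

public static class OverrideDemo
{
    public static void Main()
    {
        var hash1 = new HashSet<NamedPerson>
        {
            new NamedPerson { Name = "Amos", Id = "123" },
            new NamedPerson { Name = "Brian", Id = "234" }
        };
        var hash2 = new HashSet<NamedPerson>
        {
            new NamedPerson { Name = "Amos", Id = "098" }
        };

        // ExceptWith now relies on the name-based equality defined above.
        hash1.ExceptWith(hash2);

        foreach (var person in hash1)
        {
            Console.WriteLine(person.Name); // only Brian remains
        }
    }
}
```

With these overrides in place, the plain `ExceptWith` call removes entries by name without any extra comparer.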

miara
  • This seems like the best way for this use case, but in my actual scenario I cannot edit the class. I'll add that to the question. Thanks. – CBeams Aug 01 '19 at 21:25

You can do this with LINQ:

var hash3 = hash1.Where(x=>!hash2.Select(y=>y.Name).Contains(x.Name)).ToHashSet();

This creates a collection of just the names from hash2, then takes all the Persons from hash1 where the Name is not in that collection.

iakobski
  • `hash3` is an `IEnumerable` here and not a `HashSet`. – juharr Aug 01 '19 at 18:39
  • Ah, you didn't say that in the question, you just said "should print out Brian, Chris". All that's needed is `.ToHashSet()` on the end, I'll edit the answer. – iakobski Aug 01 '19 at 18:45
  • Not my question. I was just pointing out that you had a variable named `hash3` that isn't actually a `HashSet`. – juharr Aug 01 '19 at 18:46
  • I think juharr means that in your solution `hash3` is an `IEnumerable` while they want it to be a `HashSet`. This is clearly indicated in their example code, `var hash3 = new HashSet<Person>()` – miara Aug 01 '19 at 18:46
  • Should be noted that this is a polynomial algorithm, as it will have to iterate `hash2` for every item in `hash1`. For small sets like in the example this will not be an issue, but it will not scale very well. – juharr Aug 02 '19 at 14:54

You could just hash the names from the second array and use them in a LINQ filter to create the final HashSet:

var excludeName = new HashSet<string>(people2.Select(x => x.Name));
var hash3 = new HashSet<Person>(people1.Where(x => !excludeName.Contains(x.Name)));

This can be especially useful if that list of values to exclude is very large as it will make the entire process run in linear time.
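
As a rough, self-contained sketch of the name-set approach just described, using the sample data from the question. The helper name `ExceptByName` and the class name `NameFilterDemo` are hypothetical, introduced only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Person
{
    public string Name { get; set; }
    public string Id { get; set; }
}

public static class NameFilterDemo
{
    // Hypothetical helper: keeps the items of source whose Name does not occur in excluded.
    public static HashSet<Person> ExceptByName(IEnumerable<Person> source, IEnumerable<Person> excluded)
    {
        // Building the name set once makes each later lookup a constant-time hash check on average.
        var excludedNames = new HashSet<string>(excluded.Select(p => p.Name));
        return new HashSet<Person>(source.Where(p => !excludedNames.Contains(p.Name)));
    }

    public static void Main()
    {
        var people1 = new[]
        {
            new Person { Name = "Amos", Id = "123" },
            new Person { Name = "Brian", Id = "234" },
            new Person { Name = "Chris", Id = "345" },
            new Person { Name = "Dan", Id = "456" }
        };
        var people2 = new[]
        {
            new Person { Name = "Amos", Id = "098" },
            new Person { Name = "Dan", Id = "987" }
        };

        var hash3 = ExceptByName(people1, people2);

        // Sorted for a stable display order; HashSet itself guarantees no ordering.
        foreach (var name in hash3.Select(p => p.Name).OrderBy(n => n))
        {
            Console.WriteLine(name); // Brian, then Chris
        }
    }
}
```

The Person class is left untouched here, which fits the constraint stated in the question.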

Or here's how you can set up the HashSets with IEqualityComparer<T>.

public class PersonByNameComparer : IEqualityComparer<Person>
{
    public bool Equals(Person p1, Person p2)
    {
        return p1.Name == p2.Name;
    }

    public int GetHashCode(Person p)
    {
        return p.Name.GetHashCode();
    }
}

Note: This means that the HashSets cannot contain two items with the same Name, even if the Id is different. It also means they cannot contain two different objects with the same values, which your current setup allows.

And then use it like this.

var comparer = new PersonByNameComparer();

var hash1 = new HashSet<Person>(people1, comparer);
var hash2 = new HashSet<Person>(people2, comparer);

// Note that ExceptWith will mutate the existing hash.  
hash1.ExceptWith(hash2); 

// Alternatively you can create the new hash like this
var hash3 = new HashSet<Person>(hash1.Where(p => !hash2.Contains(p)));
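
The note about duplicate names can be checked with a small sketch like the one below. It repeats the comparer with null handling added as an assumption (the version above does not guard against null), and `ComparerDemo` is a hypothetical class name.

```csharp
using System;
using System.Collections.Generic;

public class Person
{
    public string Name { get; set; }
    public string Id { get; set; }
}

// Same idea as the comparer above; the null checks are an added assumption.
public class PersonByNameComparer : IEqualityComparer<Person>
{
    public bool Equals(Person p1, Person p2)
    {
        if (ReferenceEquals(p1, p2)) return true;
        if (p1 is null || p2 is null) return false;
        return p1.Name == p2.Name;
    }

    public int GetHashCode(Person p)
    {
        // Falls back to 0 when the person or the name is null.
        return p?.Name?.GetHashCode() ?? 0;
    }
}

public static class ComparerDemo
{
    public static void Main()
    {
        var people = new[]
        {
            new Person { Name = "Amos", Id = "123" },
            new Person { Name = "Amos", Id = "098" }
        };

        // Default equality for a class is reference equality, so both objects are kept.
        var byReference = new HashSet<Person>(people);

        // With the name comparer the second Amos counts as a duplicate and is dropped.
        var byName = new HashSet<Person>(people, new PersonByNameComparer());

        Console.WriteLine(byReference.Count); // 2
        Console.WriteLine(byName.Count);      // 1
    }
}
```

Whether dropping same-name entries is acceptable depends on the data; if both sets must keep every object, filtering on a separate set of names avoids changing how the sets themselves compare items.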
juharr