
This is my class:

public class Report
{
    public string Name { get; set; }
    public string City { get; set; }
    public string Country { get; set; }
    public string Lat { get; set; }
    public string Long { get; set; }
    public string Type { get; set; }
    public DateTime CreateDate { get; set; }

    public override bool Equals(object obj)
    {
        if (obj == null)
            return false;
        var y = obj as Report;

        if (y == null)
            return false;

        return
            this.Name == y.Name &&
            this.Long == y.Long &&
            this.Lat == y.Lat;
    }

    public override int GetHashCode()
    {
        return (this.Name + this.Long + this.Lat).GetHashCode();
    }
}
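One thing worth checking (a hedged guess at the symptom, not a confirmed diagnosis): GetHashCode here is computed from mutable properties. If a Report's Name, Lat, or Long changes after the object has been added to the set, it is stranded in the bucket for its old hash, and an object that compares equal can then be added as an apparent duplicate. A minimal sketch, assuming the Report class above:

    var set = new HashSet<Report>();
    var r = new Report { Name = "A", Lat = "1", Long = "2" };
    set.Add(r);

    // Mutating a key field after insertion changes the hash code,
    // so the set can no longer locate the original entry...
    r.Name = "B";

    // ...and an object that now compares equal to r is added anyway.
    var dup = new Report { Name = "B", Lat = "1", Long = "2" };
    Console.WriteLine(set.Add(dup)); // True: a "duplicate" sneaks in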

So this is my code, and somehow non-unique values sneak in when I create a new HashSet. Any ideas? It looks like my post is mostly code, so I need to add some more details.

I am creating the objects passed into the HashSet with this method (it is a console app made just for testing purposes, nothing fancy):

static Report CreateReport(dynamic report)
{
    var result = new Report();

    result.City = report.city.ToString();
    result.Name = report.name.ToString();
    result.Country = report.country.ToString();
    result.Long = report.@long.ToString();
    result.Lat = report.lat.ToString();
    result.Type = report.type.ToString();
    result.CreateDate = DateTime.Now;

    return result;
}
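As the comments note, building the hash by string concatenation both allocates a new string on every call and lets different field combinations collide (Name "aa", Long "bb", Lat "cc" hashes the same as Name "a", Long "ab", Lat "bcc"). A minimal alternative sketch, assuming a runtime where System.HashCode is available (.NET Core 2.1 or later); on older frameworks the concatenation version still satisfies the equal-objects-equal-hashes contract, just less efficiently:

    public override int GetHashCode()
    {
        // Combines the same fields Equals compares, without allocating.
        return HashCode.Combine(Name, Long, Lat);
    }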
marto
    Please show an example for some data where you don't get a unique hash set – Gilad Green Jun 25 '19 at 16:18
  • As a side note, no need for `if(obj == null)` condition and there are some questions about best practices for the `GetHashCode` implementation that you might want to have a look at :) – Gilad Green Jun 25 '19 at 16:24
  • 1
    This GetHashMethod function of yours isn't exactly safe. Imagine Name: aa, Long: bb and Lat: cc. It would be the same as Name: a, Long: ab, Lat: bcc. See [this question](https://stackoverflow.com/questions/263400/what-is-the-best-algorithm-for-an-overridden-system-object-gethashcode) if you want to improve that. – Joelius Jun 25 '19 at 16:26
  • What does "it would still work" mean? The function will fail its sole purpose because of not being implemented correctly and allowing instances with different values to generate the same hash code. If OP wants the behaviour they coded, I don't want to stop them, but I assume they didn't purposefully implement it like that. – Joelius Jun 25 '19 at 16:29
  • Ok so I am creating the objects from method that uses a dynamic param. – marto Jun 25 '19 at 16:30
  • 4
    @Joelius: A hash computation is not required to have no collisions; collisions just decrease performance. A hash computation is required to have the property that equal objects have equal hash codes, and so this appears to be a safe hash function. I agree that there are easy ways to improve it; the real problem with this hash function is that it allocates memory! – Eric Lippert Jun 25 '19 at 17:07
  • 3
    Without a reproducer that demonstrates the problem, we can't tell you what you've done wrong. Write a *small, complete program that we can run* that demonstrates the problem, and post that. By writing such a program either you will find the problem yourself, or you'll post something that we can actually debug. – Eric Lippert Jun 25 '19 at 17:09
  • Thanks @EricLippert, I did not know that :) Could you point me to how collisions are handled then? For example in a Hashset. I tested it and it does add it and I assume that's because the Equals function returns false on two instances with properties described in my previous comment. The GetHashCode function however returns the same on both instances. When I change the Equals function to also work with concatenated strings, the instance won't be added since both Equals returns true and the HashCodes are equal. Is there anything else involved other than the Equals and GetHashCode function? – Joelius Jul 01 '19 at 21:56
  • @Joelius: Suppose you have a hash table and it contains n items that collide; what happens if you do a search of the table for one of those items? You call `Equals` on each of those n items until you find one that matches, or you run out of items. So the hash set goes from being O(1) lookup to O(n). I once destroyed the performance of www.microsoft.com by creating a hash function that had way too many collisions. But the hash set does not become *wrong*, it just becomes *slow*. – Eric Lippert Jul 01 '19 at 22:16
  • @Joelius: Typically there is no need to have a mechanism more complex than bucket-by-hash, and then equality-in-bucket because typically hash collisions are rare. This article may be helpful to you: https://ericlippert.com/2011/02/28/guidelines-and-rules-for-gethashcode/ – Eric Lippert Jul 01 '19 at 22:17
  • @EricLippert Ahh I see. I did not know about these buckets. Also I did not know that the HashCode should be constructed from immutable fields. Thanks a lot! – Joelius Jul 01 '19 at 23:00

1 Answer


Have you tried using an IEqualityComparer<T> where T is Report?

public class ReportComparer : IEqualityComparer<Report>
{
    public bool Equals(Report x, Report y)
    {
        if (ReferenceEquals(x, y))
            return true;

        if (x == null || y == null)
            return false;

        return
            x.Name == y.Name &&
            x.Long == y.Long &&
            x.Lat == y.Lat;
    }

    public int GetHashCode(Report obj)
    {
        return (obj.Name + obj.Long + obj.Lat).GetHashCode();
    }
}

And then instantiate your HashSet with it:

HashSet<Report> reports = new HashSet<Report>(new ReportComparer());
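A quick sanity check (assuming the Report class from the question and a correctly implemented comparer): adding a second report with the same Name, Lat, and Long should now be rejected:

    var reports = new HashSet<Report>(new ReportComparer());
    var a = new Report { Name = "X", Lat = "1", Long = "2" };
    var b = new Report { Name = "X", Lat = "1", Long = "2" };

    Console.WriteLine(reports.Add(a)); // first insert succeeds
    Console.WriteLine(reports.Add(b)); // should print False: b is a duplicate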

Documentation:

MSDN EqualityComparer

JEV