Unit Testing - What do you do when your code is pretty much just a calculation (GetHashCode for example)?

Question

public class Foo
{
    public int X { get; set; }
    public int Y { get; set; }
    public int Z { get; set; }

    public override int GetHashCode()
    {
        var hash = 17;
        hash *= 23 + x.GetHashCode();
        hash *= 23 + y.GetHashCode();
        hash *= 23 + z.GetHashCode();
    }
}

When you go to Unit Test the GetHashCode, I'm torn between calculating the original components and repeating the function or using a predetermined value:

[TestMethod]
public void Test1
{
    var x = 1; y = 2; z = 3;
    var foo = new Foo() { X = x, Y = y, Z = z };

    var hash = 17;
    hash *= 23 + x.GetHashCode();
    hash *= 23 + y.GetHashCode();
    hash *= 23 + z.GetHashCode();

    var expected = hash;
    var actual = foo.GetHashCode();

    Assert.AreEqual(expected, actual);
}

[TestMethod]
public void Test2
{
    var x = 1; y = 2; z = 3;
    var foo = new Foo() { X = x, Y = y, Z = z };

    var expected = ? //Some predetermined value (calculated by hand?)
    var actual = foo.GetHashCode();

    Assert.AreEqual(expected, actual);
}

Or is there some other way?

score 5 · Accepted Answer · edited Jun 20 '20 at 09:12

5

Unit testing is for testing logic. Is the GetHashCode calculation itself logic? Not really, although opinions may vary.

The relevant logic here is that two objects that are equal have the same hash code, i.e. Equals and HashCode are compatible. Or, quoting from the docs:

If two objects compare as equal, the GetHashCode method for each object must return the same value. However, if two objects do not compare as equal, the GetHashCode methods for the two object do not have to return different values.

The GetHashCode method for an object must consistently return the same hash code as long as there is no modification to the object state that determines the return value of the object's Equals method. Note that this is true only for the current execution of an application, and that a different hash code can be returned if the application is run again.

So I would write unit tests that guarantee that those conditions are met, not that the internal implementation of GetHashCode matches a predetermined procedure.

Both of your example tests are highly brittle, and perfectly valid GetHashCode implementations would fail them. Recall that one of the major purposes of unit tests is to allow refactoring without fear. But, nobody can refactor GetHashCode without your unit tests breaking.

edited Jun 20 '20 at 09:12

Community

1
1

answered Jul 13 '11 at 18:54

Domenic

110,262
41
219
271

Ah, so just test that 2 **equal** Objects return the same hash code (per the rule of implementing GetHashCode). – myermian Jul 13 '11 at 18:56
I would strongly disagree. GetHashCode is indeed logic; even if how it gets there seems simple, it's still logic that deserves valid unit testing. In particular, consider the case where the hash of an object is being persisted, or as a test against some persistence technique (think Hibernate); identity equivalency isn't enough to confirm non-mutability. – Paul Sonier Jul 13 '11 at 19:06
@Paul: the docs **specifically** say you should not use `GetHashCode` for those purposes. If you use it for those purposes on BCL types, you will shoot yourself in the foot. A hash code is NOT equivalent to an object identifier. – Domenic Jul 13 '11 at 19:08
@domenic: the point is that by overriding GetHashCode, the OP is potentially breaking that contract; they may have a distinct need to do that, even if it's explicitly forbidden. If not, why would the default behaviour not be enough? – Paul Sonier Jul 13 '11 at 19:10
1

@Paul: Your suggestion violates the Liskov Substitution Principle. And, the default behavior would not be enough usually because you have custom equality-comparison semantics and you need to change `GetHashCode` to conform to the usual contract I quoted in my answer. – Domenic Jul 13 '11 at 19:23
@Domenic: After stepping back and thinking about it you are 100% correct. I overrode the GetHashCode function and I should (and have) kept it in a design to produce pseudo-unique values (not guaranteed). So I agree, I should really just be testing for the 2 original contract requirements. – myermian Jul 13 '11 at 19:40

score 1 · Answer 2 · answered Jul 13 '11 at 19:02

1

I compare the result against pre-computed values with lots of commenting to explain why I chose the values being tested.

For pure calculations, I tend to write unit tests so that they ensure the function keeps the conditions that are required/expected of the code that will use it.

Example conditions:

Are there any constraints on the generated hash value? min value, max value, odd, even, etc.
Does it properly hash positive and negative numbers?
Does it create unique values under the appropriate conditions?

answered Jul 13 '11 at 19:02

Manfre

1,235
10
10

Completely agreed. I agree that testing against the validity of the hashing algorithm (hash uniqueness) is very important, although I think it's perhaps outside of the scope of the OP's question. – Paul Sonier Jul 13 '11 at 19:08
1

The question used GetHashCode as an example and my main point was that the function's contract should be tested. "Does it create unique values under the appropriate conditions?" is just another example to convey that point. – Manfre Jul 13 '11 at 19:25

Unit Testing - What do you do when your code is pretty much just a calculation (GetHashCode for example)?

2 Answers2