For a .NET Core Web API GET method I need to calculate the ETag for a returned List<T>. T is a DTO in the form of a record that holds only primitive types.

I wanted to calculate a hash of the list. I searched for information about how GetHashCode() is implemented for lists, but couldn't find any. The documentation of object.GetHashCode() doesn't say anything about lists or collections. From the results of my code I observed that the same list data produces a different hash code on each run. I concluded that GetHashCode() uses the pointer values for reference type items.

GetHashCode() of record calculates the hash code per member value. Therefore I created the list hash code by looping over the list items:

List<GetGroupsDTO> dtoList = commandResult.Value;
int hash = 17;
foreach(GetGroupsDTO dto in dtoList)
{
   hash = hash * 23 + dto.GetHashCode();
}
string eTagPayload = hash.ToString().SurroundWithDoubleQuotes();

I don't want to do this for every List<T>, of course. I thought to override GetHashCode(), but I'm struggling with it. I don't know how to override it for the generic List. I could derive a new class DTOList where I can override GetHashCode(). But this leads to more complexity in other places. Since the result of an EFCore Set query fills the List I would need a custom converter and then a custom serializer to return the List in Web API.

Therefore I wonder whether I should rather create an extension method for List<T> or just a function that takes a List<T> as an argument. Is there any other option to calculate the ETag? How can I calculate the ETag for a list of DTO objects efficiently?

M. Koch
  • You can use `HashCode` to easily cough up hashes for compound objects (e.g. `static int getCombinedHashCode<T>(IEnumerable<T> source) => source.Aggregate(typeof(T).GetHashCode(), (hash, t) => HashCode.Combine(hash, t));`; could be made an extension method as well). I wouldn't derive from `List<T>` if all you're doing it for is this calculation, since it's not likely you'll need to use your type as a dictionary key or similar (and even if you did, those accept custom `IEqualityComparer<T>`s). – Jeroen Mostert Feb 22 '22 at 13:11

1 Answer

A little extension method and HashCode could help with this:

using System;
using System.Collections.Generic;
using System.Linq;

internal static class EnumerableExtensions {
    public static int GetCombinedHashCode<T>(this IEnumerable<T> source) =>
        source.Aggregate(typeof(T).GetHashCode(), (hash, t) => HashCode.Combine(hash, t));
}

Seeding the hash with typeof(T).GetHashCode() is rather arbitrary, but it ensures that empty collections of different types do not all "look equal", since they would not normally compare equal either. Whether this matters or is even desirable will depend on your scenario.

Of course the result of this is only usable if T has a meaningful GetHashCode implementation, but that's true of hashes in general. For extra peace of mind a where T : IEquatable<T> constraint could be added, although that's not the standard approach for methods involving hashes. Adding the ability to use a custom IEqualityComparer<T> for the hash is left as an exercise.
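For readers who want the comparer variant, here is one possible sketch. The class name `EnumerableHashExtensions` and the optional `comparer` parameter are assumptions made for illustration. It uses the instance API of `HashCode` (`Add` and `ToHashCode`) rather than `Aggregate`, so its results will not match the numbers produced by the method above, and like any `HashCode`-based value they are only meaningful within a single process.

```csharp
using System;
using System.Collections.Generic;

internal static class EnumerableHashExtensions
{
    // Hypothetical overload: folds the element hashes together in order,
    // using the supplied comparer for each element's hash code.
    public static int GetCombinedHashCode<T>(
        this IEnumerable<T> source,
        IEqualityComparer<T>? comparer = null)
    {
        var hash = new HashCode();
        hash.Add(typeof(T));           // same arbitrary seed idea: the element type
        foreach (T item in source)
        {
            hash.Add(item, comparer);  // a null comparer falls back to the default one
        }
        return hash.ToHashCode();
    }
}
```

With this in place, a call such as `names.GetCombinedHashCode(StringComparer.OrdinalIgnoreCase)` treats `"a"` and `"A"` as the same element, while `names.GetCombinedHashCode()` behaves like the default-comparer version.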

Jeroen Mostert
  • Great solution. In my case, T is a record of primitive types that automatically provides a meaningful `GetHashCode`. I created an empty marker interface IDTO that the records implement: `public record GetGroupsDTO (...) : IDTO;` The extension method then restricts on that interface with `where T: IDTO`. – M. Koch Feb 22 '22 at 14:49
  • @M.Koch: if you're going to be that specific you probably only need it in one spot, and then it's good enough to keep this method a `private` member of the controller (base class) that needs it, instead of an extension method. Just making it take an `IEnumerable` is another option, since the specific type isn't needed except to give differently typed empty collections different hashes, which you probably don't need either. – Jeroen Mostert Feb 22 '22 at 14:54
  • Thanks for that hint. I have several API methods in different projects of the solution that share a common controller base. Yes, with the interface I could just use a private method in the controller base class. I added the interface later to restrict the extension method. I will see if I need the HashCode calculation for other types of lists. – M. Koch Feb 22 '22 at 15:08
  • Be aware also that if running on modern .NET, some types (e.g. string) aim to ensure that they generate different hash codes each time a program restarts. So these ETags won't even be stable across an application restart if any strings are involved. – Damien_The_Unbeliever Feb 22 '22 at 15:10
  • @Damien_The_Unbeliever: that's a very good point, and in fact even if the types involved produced reproducible hash codes, `HashCode` is explicitly documented (and implemented!) to not do so. An ETag changing even if the resource does not is normally not fatal (as opposed to it not changing if the resource does) but it's worth keeping in mind as invalidating all caching could have a lot of performance impact on a restart. If this is a concern neither `HashCode` nor any `GetHashCode` method should be used and instead ETags should be generated separately and explicitly. – Jeroen Mostert Feb 22 '22 at 15:15
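
As an illustration of generating the ETag "separately and explicitly", one option is to hash the serialized response body instead of relying on `GetHashCode`. The sketch below is only one way to do that and is not taken from the thread. The names `ETagHelper` and `ComputeStableETag` are hypothetical, and it assumes .NET 5 or later (for `SHA256.HashData` and `Convert.ToHexString`) and that `System.Text.Json` produces the same bytes for the same data under the same serializer options.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text.Json;

internal static class ETagHelper
{
    // Hypothetical helper: serializes the items to UTF-8 JSON and hashes those
    // bytes with SHA-256, so the result does not depend on per-process
    // hash randomization.
    public static string ComputeStableETag<T>(IEnumerable<T> items)
    {
        byte[] json = JsonSerializer.SerializeToUtf8Bytes(items);
        byte[] digest = SHA256.HashData(json);
        // ETag values are quoted strings, so wrap the hex digest in double quotes.
        return "\"" + Convert.ToHexString(digest) + "\"";
    }
}
```

This costs more than combining `int` hash codes because the list is serialized once for hashing. If the API serializes the same list for the response anyway, hashing those response bytes would avoid doing the work twice.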