I'm trying to create a hash code method. I have code like the following:

    private static object GetValue<T>(object item, string propertyName)
    {
        ParameterExpression arg = Expression.Parameter(item.GetType(), "x");
        Expression expr = Expression.Property(arg, propertyName);
        UnaryExpression unaryExpression = Expression.Convert(expr, typeof(object));
        var propertyResolver = Expression.Lambda<Func<T, object>>(unaryExpression, arg).Compile();
        return propertyResolver((T)item);
    }


    private static int GetHashCode<T>(T obj, List<string> columns)
    {
        unchecked
        {
            int hashCode = 17;

            for (var i = 0; i < columns.Count; i++)
            {
                object value = GetValue<T>(obj, columns[i]);
                var tempHashCode = value == null ? 0 : value.GetHashCode();
                hashCode = (hashCode * 23) + tempHashCode;
            }

            return hashCode;
        }
    }

    private static void TestHashCode()
    {
        var t1 = new { ID = (long)2044716, Type = "AE", Method = (short)1022, Index = 3 };
        var t2 = new { ID = (long)12114825, Type = "MEDAPE", Method = (short)1700, Index = 2 };

        var e1 = t1.GetHashCode();
        var e2 = t2.GetHashCode();

        var columns = new[] { "ID", "Type", "Method", "Index" }.ToList();
        var k1 = GetHashCode(t1, columns);
        var k2 = GetHashCode(t2, columns);
    }

The value of e1 is -410666035 and the value of e2 is 101205027. Both k1 and k2 are 491329214.

Hash code steps for t1:

hashCode = 17
tempHashCode = 2044716
hashcode = 2045107
tempHashCode = 1591023428
hashcode = 1638060889
tempHashCode = 66978814
hashcode = -912326403
tempHashCode = 3
hashcode = 491329214

How can k1 and k2 have the same value, when the default .NET GetHashCode method gives two different values? I want to create a hash code method that takes a list of columns, so that the hash code is built from particular properties. I'm trying to get a unique value for an object based on those properties.

How can I identify an object by particular properties if GetHashCode doesn't guarantee a unique value?

Sinan AKYAZICI
  • Possible duplicate of [What is hashCode used for? Is it unique?](https://stackoverflow.com/questions/7425142/what-is-hashcode-used-for-is-it-unique) – Jagadeesh Govindaraj Jan 05 '19 at 08:28
  • When you debugged through the code, for each of the inputs please share the value of `hashCode` at the end of each for loop iteration. – mjwills Jan 05 '19 at 10:34
  • What type and value is returned by your `GetValue()` call in the `for` loop of your `GetHashCode()` method? Please edit your question to include the types (`GetType()`) and the `ToString()` result of the `value` variable for each loop iteration. – Progman Jan 05 '19 at 10:51
  • @Progman Types and values are clear. They were defined in objects. – Sinan AKYAZICI Jan 05 '19 at 10:57
  • @sinanakyazici What are the actual types and values of your `value` variable inside the `for` loop? Please edit your question to include the types and values for each loop iteration as well. Also include the result for all the values (`hashcode`, `tempHashCode`, type and value of `value`) not only for the object `t1` but for the object `t2` as well to compare the result/output for the different objects. – Progman Jan 05 '19 at 11:05
  • You need to use IEquatable, which has an Equals() method, so that objects can still be told apart when their hashes are duplicates. See: https://learn.microsoft.com/en-us/dotnet/api/system.iequatable-1.equals?view=netframework-4.7.2 – jdweng Jan 05 '19 at 14:06
  • @sinanakyazici Any luck with my suggestion? – mjwills Jan 06 '19 at 01:42

3 Answers

I suspect the problem is caused by value.GetHashCode() in your GetHashCode<T> method. The value variable is typed as object there, so I think GetHashCode() may not be returning what you expect. Try debugging to find out what is happening.

You may want to try to keep your code, but instead of Object.GetHashCode(), use RuntimeHelpers.GetHashCode() (from namespace System.Runtime.CompilerServices).

Full reference here: https://learn.microsoft.com/en-us/dotnet/api/system.runtime.compilerservices.runtimehelpers.gethashcode?redirectedfrom=MSDN&view=netframework-4.7.2#System_Runtime_CompilerServices_RuntimeHelpers_GetHashCode_System_Object_

Good luck!

johey

GetHashCode returns a value that is implementation dependent. Its particular design is suitable for the "standard" use and is meaningful only during the life of an application. The default algorithm is not designed to avoid collisions.

The GetHashCode method is not designed to be unique for each instance.
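
As a small illustration of why uniqueness is impossible in general: a long has 2^64 possible values, while GetHashCode returns a 32-bit int, so distinct values must share hash codes. The sketch below assumes the current Int64.GetHashCode implementation, which XORs the upper and lower 32 bits; that is an implementation detail and not a documented contract. The class and method names are made up for this example.

```csharp
using System;

// Hypothetical helper, not part of the question's code.
public static class HashCollisionDemo
{
    // True when a and b are different values that still share a hash code.
    public static bool SameHashButDifferent(long a, long b)
    {
        return a != b && a.GetHashCode() == b.GetHashCode();
    }

    public static void Run()
    {
        // 0x100000001 has identical upper and lower 32-bit halves.
        // Assumption: the runtime XORs the two halves, so they cancel out
        // and the result matches the hash code of 0.
        Console.WriteLine(SameHashButDifferent(0L, 0x100000001L));
    }
}
```

On a runtime that uses that XOR scheme, Run should report a collision between two clearly different values, which is why a hash code alone cannot identify an object.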

Your approach relies on combining the hash codes of the individual columns. A hash code has to satisfy certain requirements, for example a good distribution over its domain. However, it is not guaranteed that the combination preserves such properties: the more columns you add, the "stranger" the collisions can be.

Also, you are invoking value.GetHashCode() on a boxed value. johey suggested RuntimeHelpers.GetHashCode(), but note that it hashes by object reference and ignores any GetHashCode() override, so two separately boxed copies of the same value can produce different results. The virtual value.GetHashCode() call is the one that returns the value-based hash.

The .NET data structures are designed to handle collisions internally. For example, a Dictionary uses the hash to select a bucket and then scans the bucket sequentially, comparing keys with Equals.
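
To make the bucket behaviour concrete, here is a minimal sketch in which every key deliberately gets the same hash code. ConstantHashComparer and CollisionHandlingDemo are hypothetical names invented for this example; only Dictionary and IEqualityComparer come from the framework.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical comparer that forces every key into the same bucket.
public sealed class ConstantHashComparer : IEqualityComparer<string>
{
    // Counts how often the dictionary had to fall back to Equals.
    public int EqualsCalls { get; private set; }

    public bool Equals(string x, string y)
    {
        EqualsCalls++;
        return string.Equals(x, y, StringComparison.Ordinal);
    }

    // Worst case on purpose: all keys collide.
    public int GetHashCode(string obj)
    {
        return 42;
    }
}

public static class CollisionHandlingDemo
{
    public static Dictionary<string, int> Build(ConstantHashComparer comparer)
    {
        var map = new Dictionary<string, int>(comparer);
        map["AE"] = 1;
        map["MEDAPE"] = 2; // collides with "AE", but Equals tells them apart
        map["AE"] = 3;     // same key according to Equals, so the value is overwritten
        return map;
    }
}
```

The dictionary still ends up with two separate entries even though all hash codes are identical. The collisions only cost extra Equals calls, they do not break correctness.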

Yennefer

I want to post my solution here. What has been said is true, but it is not the whole picture, so I want to summarize the topic.

GetHashCode always returns the same value for objects that are equal. The reverse does not hold: two different objects may have the same GetHashCode value.

So the GetHashCode values are compared first to improve performance, and the objects themselves are compared in a second step only if their GetHashCode values are the same.

I created an IEqualityComparer:

    private class CustomEqualityComparer<T> : IEqualityComparer<T>
    {

        private readonly List<string> _columns;
        private readonly bool _enableHashCode;
        private readonly ConcurrentDictionary<string, Func<T, object>> _cache;
        public CustomEqualityComparer(List<string> columns, ConcurrentDictionary<string, Func<T, object>> cache, bool enableHashCode = false)
        {
            _columns = columns;
            _enableHashCode = enableHashCode;
            _cache = cache;
        }

        public bool Equals(T x, T y)
        {
            for (var i = 0; i < _columns.Count; i++)
            {
                object value1 = GetValue(x, _columns[i], _cache);
                object value2 = GetValue(y, _columns[i], _cache);
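                // The static object.Equals is null-safe; value1.Equals(value2) throws when value1 is null.
                if (!object.Equals(value1, value2)) return false;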
            }

            return true;
        }

        public int GetHashCode(T obj)
        {
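            // Returning 0 puts every item in the same bucket, so callers always fall back to Equals (correct, but slower).
            return _enableHashCode ? GetHashCode(obj, _columns, _cache) : 0;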
        }

        private object GetValue(object item, string propertyName, ConcurrentDictionary<string, Func<T, object>> cache)
        {
            if (!cache.TryGetValue(propertyName, out Func<T, object> propertyResolver))
            {
                // Use typeof(T) so the parameter always matches Func<T, object>, even when item is a subclass of T.
                ParameterExpression arg = Expression.Parameter(typeof(T), "x");
                Expression expr = Expression.Property(arg, propertyName);
                UnaryExpression unaryExpression = Expression.Convert(expr, typeof(object));
                propertyResolver = Expression.Lambda<Func<T, object>>(unaryExpression, arg).Compile();
                cache.TryAdd(propertyName, propertyResolver);
            }

            return propertyResolver((T)item);
        }

        private int GetHashCode(T obj, List<string> columns, ConcurrentDictionary<string, Func<T, object>> cache)
        {
            unchecked
            {
                var hashCode = 17;

                for (var i = 0; i < columns.Count; i++)
                {
                    object value = GetValue(obj, columns[i], cache);
                    var tempHashCode = value == null ? 0 : value.GetHashCode();
                    hashCode = hashCode * 23 + tempHashCode;
                }

                return hashCode;
            }
        }
    }
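
For illustration only, here is a condensed sketch of the same idea that uses plain delegates in place of compiled expression trees. KeyComparer<T>, Row and KeyComparerDemo are hypothetical names that are not part of the code above, and the sketch assumes the compared items themselves are never null.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical comparer: equality and hashing use only the selected values.
public sealed class KeyComparer<T> : IEqualityComparer<T>
{
    private readonly Func<T, object>[] _selectors;

    public KeyComparer(params Func<T, object>[] selectors)
    {
        _selectors = selectors;
    }

    public bool Equals(T x, T y)
    {
        // object.Equals handles null property values on either side.
        return _selectors.All(s => object.Equals(s(x), s(y)));
    }

    public int GetHashCode(T obj)
    {
        unchecked
        {
            var hash = 17;
            foreach (var selector in _selectors)
            {
                var value = selector(obj);
                hash = hash * 23 + (value == null ? 0 : value.GetHashCode());
            }
            return hash;
        }
    }
}

// Hypothetical row type loosely modelled on the question's test data.
public sealed class Row
{
    public long ID { get; set; }
    public string Type { get; set; }
    public int Index { get; set; }
}

public static class KeyComparerDemo
{
    // Counts rows that differ in ID or Type; Index is ignored.
    public static int CountDistinctByIdAndType(IEnumerable<Row> rows)
    {
        var comparer = new KeyComparer<Row>(r => r.ID, r => r.Type);
        return rows.Distinct(comparer).Count();
    }
}
```

Distinct uses the hash code only to narrow down candidates and relies on Equals for the final decision, so a collision between two different rows does not merge them.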
Sinan AKYAZICI