I want to calculate a unique hash code for a `JObject`. Based on an SO post, I am using the `JTokenEqualityComparer.GetHashCode(JToken token)` method to calculate it:

    [Fact]
    public void GetHashcode()
    {
        JTokenEqualityComparer comp = new JTokenEqualityComparer();

        // arrange
        var obj1 = new JObject();
        obj1.Add("FirstName", "foo");
        obj1.Add("LastName", "bar");

        var hashCode = comp.GetHashCode(obj1);
    }

However, every time I run this unit test it produces a different hash code. So it looks like, in addition to the property names and values, it uses something else to calculate the hash code.

I am using this code in an ASP.NET Core application. It produces the same hash code for JObjects with the same properties and values, but as soon as the app pool is recycled it produces a new hash code for the same JObject.

How do I create a stable hash code for JObjects that have the same properties and values?

So if there are 3 instances of `JObject` with the same properties and values, their hash codes should be the same, regardless of machine, time, and the class that calculates the hash code.

LP13
  • I am not sure whether this is possible without writing your own hash code implementation. The question is, why do you need the same hash code as in a _previous run_? In .NET Core, the hash code algorithm is seeded with a random value, which is why you get different hash codes (even for plain strings); see [this post](https://andrewlock.net/why-is-string-gethashcode-different-each-time-i-run-my-program-in-net-core/). That blog post also has tips for implementing a deterministic hash code that is the same on every run. – Stano Peťko Jul 11 '19 at 19:38
  • I have a list of JObjects and I want to compare them to find the percentage of JObjects that are the same. So basically I need a unique value (preferably an integer or long) for each JObject – LP13 Jul 11 '19 at 20:14
  • Not sure you should be assuming this. See [Guidelines and rules for GetHashCode](https://blogs.msdn.microsoft.com/ericlippert/2011/02/28/guidelines-and-rules-for-gethashcode/) by Eric Lippert which states, **Rule: Consumers of GetHashCode cannot rely upon it being stable over time or across appdomains**. – dbc Jul 11 '19 at 20:19
  • Actually I can't reproduce this on .Net full framework; see https://dotnetfiddle.net/s0ZE40 which always returns -1347780462 every time I run. What about string hash codes? Is `"foo".GetHashCode()` stable across app domain and machine for you?... Ah I see the [previously linked article](https://andrewlock.net/why-is-string-gethashcode-different-each-time-i-run-my-program-in-net-core/) says that hash codes are randomly seeded in .Net core release builds. (I think that was always true in .Net full framework debug builds.) – dbc Jul 11 '19 at 20:24
  • Hmm, if `string.GetHashCode()` is indeed unstable you would need to create your own (recursive) version of `GetDeepHashCode()`. Note that JSON objects are considered to be *unordered*, so the hash code for `JObject` needs to be invariant with respect to permutation of property order. [`JContainer.ContentsHashCode()`](https://github.com/JamesNK/Newtonsoft.Json/blob/master/Src/Newtonsoft.Json/Linq/JContainer.cs#L890) does this by simply XOR-ing the item hash codes together. – dbc Jul 11 '19 at 20:40
  • @dbc so you are saying that if I have multiple JObjects with the same properties and values in the same order, the `ToString()` method is not guaranteed to return the same string? I was thinking of calculating a deterministic hash code using the approach described in the article – LP13 Jul 11 '19 at 21:03
  • Just the opposite: I'm saying that if you have multiple JObjects with the same properties and values but *in different order*, then they should be considered equal according to the [JSON standard](https://json.org/). Currently `JTokenEqualityComparer.Equals()` does this, so you don't want the hash codes of objects that are considered equal to differ. Using `ToString()` and taking the invariant hash code will produce different hash codes for objects with permuted properties, which is not desirable. – dbc Jul 11 '19 at 21:09
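
For the use case mentioned in the comments (finding what share of a list of JObjects are the same), a stable hash code may not be needed as long as the comparison happens inside one process, because hash codes are consistent within a single run. A minimal sketch, assuming that "percentage that are the same" means the share of items that have at least one equal twin in the list; `DuplicateStats` and `DuplicatePercentage` are hypothetical names, not part of Json.NET:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Newtonsoft.Json.Linq;

public static class DuplicateStats
{
    // Hypothetical helper: percentage of items that are equal to at least
    // one other item in the collection, compared with JTokenEqualityComparer.
    public static double DuplicatePercentage(IReadOnlyCollection<JObject> items)
    {
        if (items.Count == 0)
        {
            return 0.0;
        }

        // Group equal objects together; the comparer's hash codes only need
        // to be consistent for the lifetime of this process.
        int duplicated = items
            .GroupBy(o => (JToken)o, new JTokenEqualityComparer())
            .Where(g => g.Count() > 1)
            .Sum(g => g.Count());

        return 100.0 * duplicated / items.Count;
    }
}
```

This does not help if the hash values have to be stored or compared across runs or machines; in that case a deterministic hash is still required.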

1 Answer

You can use something like this. I am not sure it is complete from the JSON perspective, though; I did not try arrays, for example. But I believe you will be able to adapt it. :)

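    using System;
    using System.Collections.Generic;
    using Newtonsoft.Json.Linq;
    using Xunit;

    [Fact]
    public void GetHashcode()
    {
        var o1 = new JObject();
        o1.Add("FirstName", "foo");
        o1.Add("LastName", "bar");

        var o2 = new JObject();
        o2.Add("Lorem", 123);
        // Note: DateTime.Now.Date makes the resulting hash change from day to day.
        o2.Add("Ipsum", DateTime.Now.Date);

        o1.Add("inner", o2);

        var hashCode = ComputeHashCode(o1);
    }

    // Walks the properties breadth-first and combines the leaf names and values.
    // Limitations: every container is assumed to be a JObject (a JArray child makes
    // the JProperty cast in foreach throw InvalidCastException), the result depends
    // on property order, and the name of a property holding a non-empty nested
    // object is not hashed.
    private static int ComputeHashCode(JToken value)
    {
        if (value is null)
        {
            return 0;
        }

        var queue = new Queue<JProperty>();
        foreach (JProperty prop in value)
        {
            queue.Enqueue(prop);
        }
        if (queue.Count == 0)
        {
            return 0;
        }

        int hash = 17;
        while (queue.Count > 0)
        {
            JProperty item = queue.Dequeue();
            if (item.Value.HasValues)
            {
                // Nested object: queue its properties instead of hashing it directly.
                foreach (JProperty prop in item.Value)
                {
                    queue.Enqueue(prop);
                }
            }
            else
            {
                // Hash code combination taken from here: https://stackoverflow.com/a/263416/8088324
                unchecked
                {
                    hash = hash * 23 + ComputeHashCodeCore(item.Name);
                    hash = hash * 23 + ComputeHashCodeCore(item.Value.ToString());
                }
            }
        }
        return hash;
    }

    // Stable hash code for string taken from here: https://stackoverflow.com/a/36845864/8088324
    // It does not use the randomly seeded string.GetHashCode().
    private static int ComputeHashCodeCore(string str)
    {
        unchecked
        {
            int hash1 = 5381;
            int hash2 = hash1;

            // Even-indexed characters feed hash1, odd-indexed characters feed hash2.
            for (int i = 0; i < str.Length && str[i] != '\0'; i += 2)
            {
                hash1 = ((hash1 << 5) + hash1) ^ str[i];
                if (i == str.Length - 1 || str[i + 1] == '\0')
                    break;
                hash2 = ((hash2 << 5) + hash2) ^ str[i + 1];
            }

            return hash1 + (hash2 * 1566083941);
        }
    }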
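
The comments on the question point out that JSON objects are unordered, and the code above has not been tried with arrays. A recursive variant that covers both could look roughly like the sketch below. `StableJsonHash`, `Compute`, `Combine` and `StableStringHash` are hypothetical names, not part of Json.NET, and the string hash is an FNV-1a style loop over UTF-16 code units chosen only for illustration. Primitive values are hashed via their JSON text, so the result may not agree with `JTokenEqualityComparer.Equals()` in every edge case (for example, numbers that compare equal but serialize differently).

```csharp
using System;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

public static class StableJsonHash
{
    // Hypothetical helper: deterministic hash of a JToken tree that does not
    // rely on the randomly seeded string.GetHashCode().
    public static int Compute(JToken token)
    {
        unchecked
        {
            switch (token)
            {
                case null:
                    return 0;

                case JObject obj:
                    // Addition is commutative, so property order does not matter.
                    int objectHash = 17;
                    foreach (JProperty prop in obj.Properties())
                    {
                        objectHash += Combine(StableStringHash(prop.Name), Compute(prop.Value));
                    }
                    return objectHash;

                case JArray array:
                    // Array order is significant in JSON, so combine sequentially.
                    int arrayHash = 19;
                    foreach (JToken item in array)
                    {
                        arrayHash = Combine(arrayHash, Compute(item));
                    }
                    return arrayHash;

                default:
                    // Any other token is hashed through its compact JSON text.
                    return StableStringHash(token.ToString(Formatting.None));
            }
        }
    }

    private static int Combine(int a, int b)
    {
        unchecked
        {
            return a * 31 + b;
        }
    }

    // FNV-1a style hash over UTF-16 code units (an illustrative choice).
    private static int StableStringHash(string text)
    {
        unchecked
        {
            uint hash = 2166136261;
            foreach (char c in text)
            {
                hash = (hash ^ c) * 16777619;
            }
            return (int)hash;
        }
    }
}
```

With this sketch, `{"FirstName":"foo","LastName":"bar"}` and the same object with the two properties swapped produce the same value, while `[1,2]` and `[2,1]` are combined in order and will normally differ. As with any 32-bit hash, collisions are possible, so equal hashes suggest equality but do not prove it.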
Stano Peťko