
In service A I have a string that gets hashed like this:

fun String.toHash(): Long {
    var hashCode = this.hashCode().toLong()
    if (hashCode < 0L) {
        hashCode *= -1
    }
    return hashCode
}

I want to replicate this code in service B, written in Go, so that the same word produces exactly the same hash. From what I understand of Kotlin's documentation, the hash applied returns a 64-bit integer. So in Go I am doing this:

func hash(s string) int64 {
    h := fnv.New64()
    h.Write([]byte(s))
    v := h.Sum64()
    return int64(v)
}

But when I unit test this, I do not get the same value:

func Test_hash(t *testing.T) {
    tests := []struct {
        input  string
        output int64
    }{
        {input: "papafritas", output: 1079370635},
    }
    for _, test := range tests {
        got := hash(test.input)
        assert.Equal(t, test.output, got)
    }
}

Result:

7841672725449611742

Am I doing something wrong?

Matias Barrios
  • If you don't want to use a standard hash, you can implement the java version yourself: https://stackoverflow.com/questions/15518418/whats-behind-the-hashcode-method-for-string-in-java – JimB Feb 06 '23 at 20:34
  • @JimB Thanks so much! That was the impulse I needed and a constructive answer. – Matias Barrios Feb 06 '23 at 21:16

1 Answer


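Java, and therefore Kotlin on the JVM, uses a different hash function than the FNV hash in your Go code. Note also that String.hashCode() returns a 32-bit Int, not a 64-bit value; your toLong() call only widens it.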

Possible options are:

  1. Use a standard hash function in both services.
  2. Reimplement Java's String.hashCode in Go.
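
For option 2, here is a minimal sketch, assuming the Kotlin service runs on the JVM and so uses the documented java.lang.String.hashCode (h = 31*h + c over each UTF-16 code unit, in wrapping 32-bit arithmetic). The names javaStringHashCode and toHash are mine, not from either service, and I have only checked it against the value in the question:

```go
package main

import (
	"fmt"
	"unicode/utf16"
)

// javaStringHashCode follows the documented java.lang.String.hashCode():
// h = 31*h + c for each UTF-16 code unit, with 32-bit signed wraparound.
// Assumption: s is valid UTF-8; invalid bytes become U+FFFD when converted to runes.
func javaStringHashCode(s string) int32 {
	var h int32
	for _, unit := range utf16.Encode([]rune(s)) {
		h = 31*h + int32(unit) // int32 overflow wraps in Go, like Java's int
	}
	return h
}

// toHash mirrors the Kotlin extension from the question:
// widen to 64 bits first, then flip the sign if negative.
func toHash(s string) int64 {
	h := int64(javaStringHashCode(s))
	if h < 0 {
		h = -h
	}
	return h
}

func main() {
	fmt.Println(toHash("papafritas")) // prints 1079370635
}
```

Widening to int64 before negating matters: it matches toLong() in the Kotlin code and keeps the one awkward case, a hash code equal to Int.MIN_VALUE, from overflowing.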
Grzegorz Żur
  • Yes, I can see that. Is there any way I can replicate "Kotlin's" way in Go? I really do not want to change anything on the Kotlin service – Matias Barrios Feb 06 '23 at 19:55
  • The Java/Kotlin hash algorithm isn't given in the specification, is it? So you can't rely on it; it could change between JVM implementations (or even different versions of the same JVM). If you need a specific algorithm, I think it's far safer to implement it yourself. – gidds Feb 07 '23 at 08:21
  • @gidds: I'm not sure where Java draws the documented versus specified line, but the [algorithm is documented](https://docs.oracle.com/javase/8/docs/api/java/lang/String.html#hashCode--) (but I do agree using a standardized hash would make more sense) – JimB Feb 07 '23 at 18:09
  • @JimB Ah, so it _is_ documented! Thanks for the link. (I wasn't sure, hence the question :-) So it would be safe to re-implement it. Though I guess if you have to implement the hash in Go anyway, it's probably clearer (and easier to debug) if you pick your own algorithm and implement it in both languages. – gidds Feb 07 '23 at 22:20