
I'm not able to get the same djb2 hash in JavaScript that I'm getting in Swift.

extension String {
    public func djbHash() -> Int {
        return self.utf8
            .map {return $0}
            .reduce(5381) {
                let h = ($0 << 5) &+ $0 &+ Int($1)
                print("h", h)
                return h
            }
    }
}
var djbHash = function (string) {
    var h = 5381; // our hash
    var i = 0; // our iterator

    for (i = 0; i < string.length; i++) {
        var ascii = string.charCodeAt(i); // grab ASCII integer
        h = (h << 5) + h + ascii; // bitwise operations
    }
    return h;
}

I tried using BigInt, but for the string "QHChLUHDMNh5UTBUcgtLmlPziN42" I'm getting 17760568308754997342052348842020823769412069976n, compared to 357350748206983768 in Swift.

  • It would be helpful to post some reference results (string → hash) from your Swift code, so that posters can test their JS solutions. – gog Oct 31 '22 at 07:55
  • Test string: "QHChLUHDMNh5UTBUcgtLmlPziN42" – j.doe Oct 31 '22 at 07:56

2 Answers


The Swift &+ operator is an “overflow operator”: it truncates the result of the addition to the number of bits available in the integer type used.

A Swift Int is a 64-bit (signed) integer on all 64-bit platforms, and adding two integers would crash with a runtime exception if the result does not fit into an Int:

let a: Int = 0x7ffffffffffffff0
let b: Int = 0x7ffffffffffffff0
print(a + b) //  Swift runtime failure: arithmetic overflow

With &+ the result is truncated to 64-bit:

let a: Int = 0x7ffffffffffffff0
let b: Int = 0x7ffffffffffffff0
print(a &+ b) // -32

In order to get the same result with JavaScript and BigInt one can use the BigInt.asIntN() function:

var a = 0x7ffffffffffffff0n
var b = 0x7ffffffffffffff0n
console.log(a + b) // 18446744073709551584n
console.log(BigInt.asIntN(64, a+b)) // -32n

With that change, the JavaScript function gives the same result as your Swift code:

var djbHash = function (string) {
    var h = 5381n; // our hash
    var i = 0; // our iterator

    for (i = 0; i < string.length; i++) {
        var code = string.charCodeAt(i); // grab UTF-16 code unit
        h = BigInt.asIntN(64, (h << 5n) + h + BigInt(code)); // h * 33 + code, truncated to a signed 64-bit value
    }
    return h;
}

console.log(djbHash("QHChLUHDMNh5UTBUcgtLmlPziN42")) // 357350748206983768n

As mentioned in the comments to the other answer, charCodeAt() returns UTF-16 code units, whereas your Swift function works with the UTF-8 representation of the string. So this will still give different results for strings containing any non-ASCII characters.
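To see the difference, one can compare what each side actually feeds into the hash. A minimal sketch (TextEncoder is a standard API in modern browsers and Node.js; "ä€" is just an example string):

var s = "ä€";

// UTF-16 code units, which is what charCodeAt() returns:
var units = [];
for (var i = 0; i < s.length; i++) {
    units.push(s.charCodeAt(i));
}
console.log(units); // [ 228, 8364 ]

// UTF-8 bytes, which is what Swift's String.utf8 iterates over:
console.log(Array.from(new TextEncoder().encode(s))); // [ 195, 164, 226, 130, 172 ]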

For identical results with arbitrary strings (umlauts, emoji, flags, ...) it's best to work with the Unicode code points. In Swift that would be

extension String {
    public func djbHash() -> Int {
        return self.unicodeScalars
            .reduce(5381) { ($0 << 5) &+ $0 &+ Int($1.value) }
    }
}

print("äöü€".djbHash()) // 6958626281456

(You may also consider using Int64 instead of Int for platform-independent code, or Int32 if a 32-bit hash is sufficient; see the 32-bit sketch after the JavaScript version below.)

The corresponding JavaScript code is

var djbHash = function (string) {
    var h = 5381n; // our hash
    
    for (const codePoint of string) {
        h = BigInt.asIntN(64, (h << 5n) + h + BigInt(codePoint.codePointAt(0))); // h * 33 + code point, truncated to 64 bits
    }
    return h;
}

console.log(djbHash("äöü€")) // 6958626281456n
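(For the ASCII-only test string "QHChLUHDMNh5UTBUcgtLmlPziN42" from the question, this version returns the same 357350748206983768n, since for ASCII characters the UTF-8 bytes, UTF-16 code units, and Unicode code points all coincide.)

And if a 32-bit hash is sufficient, the only change needed on the JavaScript side is the truncation width. A minimal sketch, assuming the Swift counterpart is rewritten with Int32 and &+ (the name djbHash32 is just for illustration):

var djbHash32 = function (string) {
    var h = 5381n; // our hash

    for (const codePoint of string) {
        // same computation as above, truncated to a signed 32-bit value
        h = BigInt.asIntN(32, (h << 5n) + h + BigInt(codePoint.codePointAt(0)));
    }
    return h;
}

Truncating once per step is equivalent to Swift's wrapping arithmetic, because reducing modulo 2^32 at the end of each step gives the same result as wrapping every intermediate operation.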
Martin R

I've had a similar issue, in which I used & in combination with the + operator. I think the code below should work. It's still under review, but you can check out my post.

var djbHash = function (string) {
    var h = 5381; // our hash
    var i = 0; // our iterator

    for (i = 0; i < string.length; i++) {
        var ascii = string.charCodeAt(i); // grab ASCII integer
        h = (h << 5) + h &+ ascii; // bitwise operations
    }
    return h;
}
  • I am getting 0 for this – j.doe Oct 31 '22 at 07:54
  • It's returning 0, do you have any idea why that is? @Wouter – j.doe Oct 31 '22 at 08:20
  • Well, not entirely sure. But I believe it has to do with the type of encoding you're using. In JavaScript, charCodeAt uses UTF-16 by default; I've tried your code in Playgrounds and it returns the same result. charCodeAt doesn't return ASCII. The problem is I don't really understand why you would need such a value, so I'm having trouble reproducing and understanding your issue completely :( – Wouter Dijks Oct 31 '22 at 08:24
  • Well, my project uses a Firebase Realtime DB and a Node.js server. I need to create that hash to query Realtime from both Swift and Node, so the hash has to be the same on both sides. Does this make sense? I'm basically looking for table entries with that hash. – j.doe Oct 31 '22 at 08:26
  • Does it return the same hash for "QHChLUHDMNh5UTBUcgtLmlPziN42" on both Swift and Node for you? – j.doe Oct 31 '22 at 08:27
  • There's no such thing as `&+` in JavaScript. When you write `a &+ b` it actually means `a & (+b)`, that is, just `a & b` (see the sketch after this comment thread). – gog Oct 31 '22 at 08:30
  • ```swift
    print(gethash(string: "QHChLUHDMNh5UTBUcgtLmlPziN42")) // returns "357350748206983768"

    func gethash(string: String) -> Int {
        var hash = 5381 // our hash
        let count = string.count
        let nameAsUnicode = string.utf16
        for unicode in nameAsUnicode {
            let bitShift = (hash << 5) &+ hash
            hash = Int(unicode) + bitShift
        }
        return hash
    }
    ```
    Maybe I've read your article wrong, but I keep getting this as a result. – Wouter Dijks Oct 31 '22 at 08:49
  • Any idea @gog how I would get the same hash in JS? – j.doe Oct 31 '22 at 09:25
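As an aside, here is a minimal sketch illustrating gog's point about how JavaScript parses `&+` (the values are arbitrary; "H" is just an example character):

var a = 5381;              // the djb2 seed
var b = "H".charCodeAt(0); // 72, an arbitrary example character

// "&+" is not a JavaScript operator: "a &+ b" parses as "a & (+b)",
// i.e. a bitwise AND with unary plus applied to b.
console.log(a &+ b); // 0 — identical to a & b
console.log(a & b);  // 0

Since the hash is ANDed with each character code instead of added, it quickly collapses, and once it reaches 0 it stays 0, which matches the behavior reported in the comments above.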