1

I want to port this simple hash algorithm to VB6.

I have come up with:

Public Function simpleHash(hashString As String) As Long
Dim hash As Long
Dim c As Long
Dim i As Integer

    On Local Error Resume Next

    hash = 5381
    For i = 1 To Len(hashString)
        c = AscW(Mid$(hashString, i, 1))
        hash = hash * 33 + c
    Next i

    simpleHash = hash
End Function

The problem is, despite my On Error statement which suppresses the Error 6: Overflow exceptions, the variable hash is not updated any more if an overflow occurs.

How can I fix that or implement this algorithm differently?

Community
  • 1
  • 1
Felix Dombek
  • 13,664
  • 17
  • 79
  • 131
  • Did you try DOUBLE instead of long for the variable hash? I'm not sure of the accuracy though - it's been a long time since I used VB6 :) – itsols Mar 13 '14 at 16:13
  • The problem is not the overflow, because the C implementation actually depends on unsigned numerical overflow. I just wonder how I can mimic this behaviour (by having 2's complement like overflow) in VB6. – Felix Dombek Mar 13 '14 at 16:15
  • You mean you want to use the result as an unsigned int? – itsols Mar 13 '14 at 16:24
  • Yes. When *interpreted* as an unsigned int, the result must be the same as the C function's. – Felix Dombek Mar 13 '14 at 16:26
  • Hmmm... I'm a little out of touch there. Perhaps you're looking at using the ABS function. BTW, how is this return value different to C's value? – itsols Mar 13 '14 at 16:30
  • In line `hash = hash * 33 + c`, when the expression `hash * 33 + c` produces an overflow, the assignment is skipped. So the VB6 version only calculates the hash for the first few letters of the argument string. – Felix Dombek Mar 13 '14 at 16:35
  • If DOUBLE can't solve this, maybe a struct should hold two longs - the high and low byte pairs. This however may require you to modify the algorithm to work on each part separately. – itsols Mar 13 '14 at 16:50
  • I did something similar, see my answer. – Felix Dombek Mar 13 '14 at 16:59
  • Do you have a known-working program written in C or something else that you can use to validate the results of any VB6 implementation? – Bob77 Mar 13 '14 at 21:49
  • I checked this against the C implementation linked in my question. – Felix Dombek Mar 13 '14 at 23:27

2 Answers2

3

I would question whether it makes sense to hash UTF-16LE ("Unicode") this way. It might make a lot more sense to convert your VB String to UTF-8 and then hash that.

While I can't find any test vectors for djb2 to validate my own implementation it seems to run quite quickly:

Private Type CURRENCY_CURRENCY
    Value As Currency
End Type

Private Type CURRENCY_2LONGS
    ValueLo As Long
    ValueHi As Long
End Type

Public Function djb2(ByVal StringToHash As String) As Long
    Dim C2L As CURRENCY_2LONGS
    Dim CC As CURRENCY_CURRENCY
    Dim I As Long

    C2L.ValueLo = 5381
    For I = 1 To Len(StringToHash)
        LSet CC = C2L
        CC.Value = CC.Value * 33@ + CCur(AscW(Mid$(StringToHash, I, 1))) / 10000@
        LSet C2L = CC
        C2L.ValueHi = 0
    Next I

    djb2 = C2L.ValueLo
End Function
Bob77
  • 13,167
  • 1
  • 29
  • 37
  • Never thought of *VB6 string to UTF8*. A good one and if it's a mere couple of hashes, I don't see any performance issues. – itsols Mar 14 '14 at 01:35
  • Performance is a consideration. What may outweigh all of this is whether the hash just needs to be a good hash or must actually match the hash done by another program or platform. – Bob77 Mar 14 '14 at 14:25
  • +1 I didn't know about the `LSet` statement and I like your idea of simply setting the high bits to 0 as a fast `Mod` workaround. – Felix Dombek Mar 14 '14 at 15:11
2

I found a workaround using Currency and doing the overflow math myself (and it turns out that VB6 Mod operator also won't work on Currency):

Const pow2_32 As Currency = 2@ ^ 32
Const signed32Max As Currency = 2@ ^ 31 - 1

Public Function simpleHash(hashString As String) As Long
Dim hash As Currency
Dim i As Long

    hash = 5381
    For i = 1 To Len(hashString)
        hash = hash * 33@ + CCur(AscW(Mid$(hashString, i, 1)))
        While hash >= pow2_32
            hash = hash - pow2_32
        Wend
    Next i

    If hash > signed32Max Then
        hash = hash - pow2_32
    End If

    simpleHash = CLng(hash)
End Function

However, see Bob77's answer for a solution that is almost twice as fast as mine.

Community
  • 1
  • 1
Felix Dombek
  • 13,664
  • 17
  • 79
  • 131
  • That should work too but how about the accuracy when using currency? Did you verify it? – itsols Mar 13 '14 at 17:04
  • Yes, mod may not work unless you first use INT but I've never tried this with currency either. – itsols Mar 13 '14 at 17:07
  • Currency is internally an integral data type, so the accuracy is not an issue. It works like the C version, it was just more work than expected. – Felix Dombek Mar 13 '14 at 17:29