
I need a VBA routine to calculate the MD5 hash of a file's contents. I located some examples (e.g., here) but I found that they crashed when the filename contained certain Unicode characters, so I am trying to tweak the code to avoid that.

This code does not result in an error, but it also doesn't return the correct MD5 hash. What's wrong?

Public Function FileToMD5Hex(sFileName As String) As String
    Dim enc
    Dim bytes
    Dim outstr As String
    Dim pos As Integer
    Set enc = CreateObject("System.Security.Cryptography.MD5CryptoServiceProvider")
    'Convert the string to a byte array and hash it
    bytes = GetFileBytes(sFileName)
    bytes = enc.ComputeHash_2((bytes))
    'Convert the byte array to a hex string
    For pos = 1 To LenB(bytes)
        outstr = outstr & LCase(Right("0" & Hex(AscB(MidB(bytes, pos, 1))), 2))
    Next
    FileToMD5Hex = outstr
    Set enc = Nothing
End Function

Private Function GetFileBytes(path As String) As Byte()
    Dim fso As Object
    Set fso = CreateObject("scripting.FileSystemObject")

    Dim fil As Object
    Set fil = fso.GetFile(path)

'    Dim fpga As Variant
    GetFileBytes = fil.OpenAsTextStream().Read(fil.Size)

    Set fil = Nothing
    Set fso = Nothing
End Function
user3791372

1 Answer


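There are some character sequences that Scripting.FileSystemObject can't process properly when a file is read as a TextStream. TextStream.Read also returns a String rather than raw bytes, so the data handed to the hash function is decoded text, not the file's actual contents.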

Use the ADODB.Stream ActiveX object to retrieve the file's contents as an array of bytes. It works with both text and binary data, and it also lets you change the charset of a string (FSO only works with ASCII and Unicode, and only with files).

Function GetFileBytes(strPath As String) As Byte()
    With CreateObject("ADODB.Stream")
        .Type = 1 ' adTypeBinary: read raw bytes, no text decoding
        .Open
        .LoadFromFile strPath
        GetFileBytes = .Read() ' with no argument, reads to the end of the stream
        .Close
    End With
End Function

Another ActiveX object that processes binary data is SAPI.spFileStream. One of its most significant advantages is that it can load only part of a file into memory (when comparing large files, computing the MD5 in chunks can drastically improve performance).

Function GetFileBytes(strPath As String) As Byte()
    Dim arrContent As Variant
    With CreateObject("SAPI.spFileStream")
        .Open strPath, 0
        .Read arrContent, CreateObject("Scripting.FileSystemObject").GetFile(strPath).Size
        .Close
    End With
    GetFileBytes = arrContent
End Function
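
The chunked approach mentioned above might look roughly like the following. This is a minimal, untested sketch rather than a definitive implementation: the function name `FileToMD5HexChunked` and the `CHUNK_SIZE` value are made up for illustration, and it assumes that `SAPI.spFileStream.Read` returns the number of bytes actually read (0 at end of file) and that the late-bound .NET `TransformBlock`, `TransformFinalBlock` and `Hash` members are reachable through COM in the same way as `ComputeHash_2` in the question.

```vba
' Sketch only: hashes a file chunk by chunk instead of loading it whole.
' FileToMD5HexChunked and CHUNK_SIZE are hypothetical names for illustration.
Public Function FileToMD5HexChunked(ByVal strPath As String) As String
    Const CHUNK_SIZE As Long = 65536
    Dim md5 As Object, stream As Object
    Dim chunk As Variant
    Dim buf() As Byte, hash() As Byte
    Dim emptyBuf(0) As Byte
    Dim bytesRead As Long, i As Long
    Dim outstr As String

    Set md5 = CreateObject("System.Security.Cryptography.MD5CryptoServiceProvider")
    Set stream = CreateObject("SAPI.spFileStream")
    stream.Open strPath, 0 ' 0 = open for reading, as in the answer above

    Do
        ' Assumption: Read returns the count of bytes read, 0 at end of file
        bytesRead = stream.Read(chunk, CHUNK_SIZE)
        If bytesRead <= 0 Then Exit Do
        buf = chunk
        ' Feed this chunk into the running hash
        md5.TransformBlock buf, 0, bytesRead, buf, 0
    Loop
    stream.Close

    ' Finish the hash with a zero-length final block
    md5.TransformFinalBlock emptyBuf, 0, 0
    hash = md5.Hash

    ' Convert the hash bytes to a lowercase hex string
    For i = LBound(hash) To UBound(hash)
        outstr = outstr & LCase$(Right$("0" & Hex$(hash(i)), 2))
    Next
    FileToMD5HexChunked = outstr
End Function
```

A quick sanity check would be to compare its result against `FileToMD5Hex` on a small file; the two should agree if the assumptions above hold.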
omegastripes
  • Nice code (I will be using the `ADODB.Stream` instead of `Open #F For Binary Read` and a `Get` from now on). **However, there's a missing component:** do you have an example of a System.Security.Cryptography function loading successive chunks of bytes into a single hash calculation? – Nigel Heffernan Dec 29 '17 at 11:31
  • Update: user Florent B. posted an answer with data passed in chunks to the MD5 hashing service in [this StackOverflow answer](https://stackoverflow.com/a/36331066/362712) - This would work very nicely with your `SAPI.spFileStream` implementation. – Nigel Heffernan Dec 29 '17 at 15:40