
I would like to know the exact bytes that make up a string.
Is that possible in VBA?

Something like:

> Debug.Print toHex("@E")
0x40, 0x45


Reason for this question:
I have some problems with encodings while using ServerXMLHTTP
(I am not sure at which point the data gets interpreted incorrectly).
For debugging purposes, I want to see the actual bytes in the strings, so I can narrow down the source of the problem.

industryworker3595112
  • Can you show us the code for `toHex("@E")` the description is not clear, could you give another shot? – PaulFrancis Sep 15 '14 at 13:07
  • You might find this helpful: Convert string to unicode (http://stackoverflow.com/questions/23810324/vba-convert-string-to-unicode) – Andy Brazil Sep 15 '14 at 15:35
  • Strings in VBA are natively Unicode, not ASCII. Do you want to see the actual bytes of the Unicode or do you want to convert it to ASCII and see one byte per character? – Blackhawk Sep 15 '14 at 16:34
  • @Blackhawk I would like to see internal representation of the string. If it is unicode, then I would like to see bytes making up the unicode encoded string. – industryworker3595112 Sep 16 '14 at 05:35
  • @AndyBrazil Thanks. That actually solved my real problem (I was using responseText instead of responseBody). – industryworker3595112 Sep 16 '14 at 07:01

1 Answer


I see that you found the answer to your actual problem in the comments, but just to answer your specific question:

You can convert a string to raw bytes using the toHex method below. I've included an example of usage in Main and the comments should explain what's going on:

Public Sub Main()
    Dim str As String
    str = "This is a String"

    Debug.Print toHex(str)
End Sub

Public Function toHex(str As String) As String
    'Declare a dynamic Byte array
    Dim arrBytes() As Byte

    'When you assign the string to the undimensioned Byte array,
    'VBA automatically resizes it and makes a copy of the individual
    'bytes of the String. Each character is two bytes
    '(VBA strings are UTF-16, low byte first).
    arrBytes = str

    'Build the output in the format described in the question.
    'An empty string gives an empty array (UBound = -1), so it is skipped.
    Dim strOut As String
    Dim i As Long
    If UBound(arrBytes) > 0 Then
        strOut = "0x" & Hex(arrBytes(0))
        For i = 1 To UBound(arrBytes)
            strOut = strOut & ", 0x" & Hex(arrBytes(i))
        Next
    End If
    toHex = strOut
End Function
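
As a possible refinement, here is a minimal sketch of a variant that pads every byte to two hex digits, so a null byte prints as 0x00 rather than 0x0. The name toHexPadded is hypothetical and not part of the answer above; it relies on the same String-to-Byte-array assignment.

```vba
'Hypothetical helper (an illustration, not from the original answer):
'same idea as toHex, but each byte is padded to two hex digits.
Public Function toHexPadded(ByVal str As String) As String
    Dim arrBytes() As Byte
    Dim strOut As String
    Dim i As Long

    'Direct copy of the string's bytes (two bytes per character)
    arrBytes = str

    'For an empty string UBound is -1, so the loop body never runs
    For i = 0 To UBound(arrBytes)
        If i > 0 Then strOut = strOut & ", "
        'Prefix a "0" and keep the last two characters to pad single digits
        strOut = strOut & "0x" & Right$("0" & Hex$(arrBytes(i)), 2)
    Next i

    toHexPadded = strOut
End Function
```

For the example in the question, Debug.Print toHexPadded("@E") prints 0x40, 0x00, 0x45, 0x00: the extra null bytes are the high bytes of the two UTF-16 characters.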

EDIT:

Assigning a String to a Byte array copies the bytes exactly, with no conversion. Natively, VBA strings are UTF-16. However, data pulled in from another source may be ASCII, ANSI, or UTF-8. If those raw bytes end up in a String unconverted, VBA will still treat the string as UTF-16; that is, it will attempt to display every 2 bytes (16 bits) as a single character. You can see this behavior by using StrConv to pack single-byte characters into a String and then attempting to display it:

Public Sub Main()
    Dim strMessage As String

    strMessage = "Hello World!"
    Debug.Print strMessage 'displays "Hello World!" in the immediate window
    Debug.Print toHex(strMessage) 'displays:
    '0x48, 0x0, 0x65, 0x0, 0x6C, 0x0, 0x6C, 0x0, 0x6F, 0x0, 0x20, 0x0, 0x57, 0x0, 0x6F, 0x0, 0x72, 0x0, 0x6C, 0x0, 0x64, 0x0, 0x21, 0x0
    'Note the null bytes: each character is one 2-byte UTF-16 code unit, low byte first

    strMessage = StrConv("Hello World!", vbFromUnicode) 'Converts the string literal to single-byte ANSI (identical to ASCII for these characters) and stores the bytes in the VBA String variable
    Debug.Print strMessage 'displays "??????" in the immediate window - 6 unprintable characters because it interprets each two single-byte characters as one UTF-16 character
    Debug.Print toHex(strMessage) 'displays:
    '0x48, 0x65, 0x6C, 0x6C, 0x6F, 0x20, 0x57, 0x6F, 0x72, 0x6C, 0x64, 0x21
    'Note that these are the ASCII bytes of the individual letters

End Sub
Blackhawk
  • Just to be sure: during `arrBytes = str`, are the bytes copied from `str` without *any* changes, or might some adjustments still be made (e.g. maybe strings can sometimes be UTF-8 and are converted to UTF-16 when assigned to a `Byte()`)? – industryworker3595112 Sep 17 '14 at 05:55
  • @industryworker3595112 the answer is that it copies the bytes directly without any conversion. See my EDIT above. – Blackhawk Sep 18 '14 at 15:36
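
The direct-copy behavior discussed in the last two comments can be checked with a small round trip. This is only a sketch: RoundTripCheck is a hypothetical name, and the ANSI part assumes the system code page maps these plain ASCII characters to single bytes, which is the usual case.

```vba
'Hypothetical demo (not from the thread): copy a String into a Byte
'array and back, then compare the result with the original.
Public Sub RoundTripCheck()
    Dim strIn As String
    Dim strBack As String
    Dim strAnsi As String
    Dim arrBytes() As Byte

    strIn = "Hello World!"

    arrBytes = strIn     'String -> bytes, no conversion
    strBack = arrBytes   'bytes -> String, no conversion

    Debug.Print UBound(arrBytes) + 1                      '24 bytes for 12 characters
    Debug.Print StrComp(strIn, strBack, vbBinaryCompare) = 0  'True

    'Single-byte data has to be converted explicitly in both directions
    strAnsi = StrConv(strIn, vbFromUnicode)
    Debug.Print LenB(strAnsi)                 '12 bytes
    Debug.Print StrConv(strAnsi, vbUnicode)   'Hello World!
End Sub
```

The binary comparison coming back True is consistent with the bytes being copied unchanged; only the StrConv calls actually change the encoding.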