If I understand the question correctly, what you ultimately need to do is determine if a character falls into the ANSI character range (can be represented by one byte - 0 to 255). Excel doesn't make this easy as noted in the comments. Neither does VBA, which represents all strings internally as UTF-16. This is compounded by the problem that the behaviour of the VBA Len
and LenB
changed after VB4. Prior to that change, they would have returned different results for Unicode or ANSI input. Now, they will both return the same result because LenB
returns the in-memory length of the string which is always 2 bytes per character. What distinguishes the ANSI range is that the 2nd byte will always be zero.
The StrConv
function does provide a way to check whether a string contains non-ANSI characters - you can convert to a Byte
array and check to see if any of the high bytes are set. For example, the string "ABCD" is stored in VBA's memory as:
65 0 66 0 67 0 68 0
You can use a quirk of VBA's "Unicode conversion" to expand these to 2 bytes again with StrConv("ABCD", vbUnicode)
, which results in this:
65 0 0 0 66 0 0 0 67 0 0 0 68 0 0 0
For comparison, if you picked up the string "ΑΒΓΔ" from somewhere (which you would have to because there's no way to type it in the IDE), it could result in this depending on the encoding:
24 32 3 0 25 32 3 0 28 32 3 0 29 32 3 0
So, once you have a byte array, all you need to do is check every other byte - if you find a non-zero value, it can't be narrowed to ANSI:
Private Function IsANSI(test As String) As Boolean
Dim bytes() As Byte, i As Long
bytes = StrConv(test, vbUnicode)
For i = 1 To UBound(bytes) Step 2
If bytes(i) <> 0 Then
IsANSI = False
Exit Function
End If
Next i
IsANSI = True
End Function
If all you care about is UTF-16 v ANSI, you can grab the "byte length" easily after you determine it can be narrowed to 8 bits:
Private Function ByteLength(test As String) As Long
If IsANSI(test) Then
ByteLength = Len(test)
Else
ByteLength = LenB(test)
End If
End Function
Note that byte length always depends on the encoding. If you need the length of the string in a specific encoding, VBA isn't going to help you much natively unless it's a fixed width encoding (i.e. UTF-32, which VBA likely already butchered), you'll have to reach into the Windows API and explicitly convert it with WideCharToMultiByte and see what you get back. You can find a VBA example here.