2

I'm working on writing a script for generating reports which require 80 byte line format.
Currently, my script formats all the fields correctly, concatenates them to a single column, and deletes the rest. This concatenated column has spaces separating the fields which can't be dropped when saved. All this is being done on a 64 bit version of Excel 2016 on Windows 10.
How can I make the file save as a UTF-8 encoded text file?

Community
  • 1
  • 1
hmnd
  • 51
  • 1
  • 6
  • Make sure "ScriptUtils.ByteArray" is in reference list goto Tools-->References and look for ScriptUtils.ByteArray – Techie Aug 26 '16 at 04:45
  • 1
    @HA560 thanks for suggesting that. I got the script from [here](http://www.motobit.com/help/scptutl/cm119.htm) and just realized that I'm supposed to install their [ScriptUtilities](http://www.motobit.com/help/scptutl/default.htm) software. EDIT: Shit. It's costs money. – hmnd Aug 26 '16 at 04:58
  • You can create byte arrays in vba without a purchased utility. Lots of examples using StrConv, vbUnicode and arrays of Byte type. – dbmitch Aug 26 '16 at 05:03
  • @dbmitch how would I go about saving a txt file using UTF-8 encoding instead of default UTF-16? – hmnd Aug 26 '16 at 05:04
  • http://stackoverflow.com/questions/2524703/save-text-file-utf-8-encoded-with-vba – Tim Williams Aug 26 '16 at 05:08
  • Sure. Check out pages at di-mgt.com.au for examples of using win api to convert to UTF-8 – dbmitch Aug 26 '16 at 05:10
  • Sorry. Don't have copy paste functionality on tablet. Fixed my website typo – dbmitch Aug 26 '16 at 05:14
  • @dbmitch Found it by Google Search, thanks. I assume you're referring to [this](http://www.di-mgt.com.au/howto-convert-vba-unicode-to-utf8.html)? I see it doesn't work with 64 bit Excel; any alternative way? – hmnd Aug 26 '16 at 05:18
  • That's important factor to have not included in your question. 64 bit Office has many issues in VBA but most can be overcome. Update your question with specific details and example usage – dbmitch Aug 26 '16 at 05:21
  • Where does it mention it doesn't work in 64 bit Excel? If it mentions that, it should mention the simple changes that need to be applied to api declares and variables. What version of office and o/s are you using? – dbmitch Aug 26 '16 at 05:29
  • See updated question. I'm getting this error when I run it: http://d.pr/14P5u – hmnd Aug 26 '16 at 05:34
  • Can you post the concatenation code? You just want to export the resulting column? – dbmitch Aug 26 '16 at 14:34

2 Answers2

0

I ended up writing an AHK script that opens Notepad++, changes the encoding, saves the file, and closes it. Not as elegant as I would've hoped, but it gets the job done.

hmnd
  • 51
  • 1
  • 6
0

Here's code from that sample spreadsheet modified for Office 64 bit

UTFTest.bas

' Converting a VBA string to an array of bytes in UTF-8 encoding

' $Date: 2015-06-30 10:05Z $
' $Original Author: David Ireland $

' Copyright (C) 2015 DI Management Services Pty Limited
' <http://www.di-mgt.com.au> <http://www.cryptosys.net>

Option Explicit
Option Base 0

''' Extract a set of VBA "Unicode" strings from Excel sheet, encode in UTF-8 and display details
Public Sub ShowStuff()
    Dim strData As String

    ' Plain ASCII
    ' "abc123"
    ' U+0061, U+0062, U+0063, U+0031, U+0032, U+0033
    ' EXCEL: Get value from cell A1
    strData = Worksheets("Sheet1").Cells(1, 1)
    Debug.Print vbCrLf & Worksheets("Sheet1").Cells(1, 2)
    ProcessString (strData)

    ' Spanish
    ' LATIN SMALL LETTER[s] [AEIO] WITH ACUTE and SMALL LETTER N WITH TILDE
    ' U+00E1, U+00E9, U+00ED, U+00F3, U+00F1
    ' EXCEL: Get value from cell A3
    strData = Worksheets("Sheet1").Cells(3, 1)
    Debug.Print vbCrLf & Worksheets("Sheet1").Cells(3, 2)
    ProcessString (strData)

    ' Japanese
    ' "Hello" in Hiragana characters is KO-N-NI-TI-HA (Kon'nichiwa)
    ' U+3053 (hiragana letter ko), U+3093 (hiragana letter n),
    ' U+306B (hiragana letter ni), U+3061 (hiragana letter ti),
    ' and U+306F (hiragana letter ha)
    ' EXCEL: Get value from cell A5
    strData = Worksheets("Sheet1").Cells(5, 1)
    Debug.Print vbCrLf & Worksheets("Sheet1").Cells(5, 2)
    ProcessString (strData)

    ' Chinese
    ' CN=ben (U+672C), C= zhong guo (U+4E2D, U+570B), OU=zong ju (U+7E3D, U+5C40)
    ' EXCEL: Get value from cell A7
    strData = Worksheets("Sheet1").Cells(7, 1)
    Debug.Print vbCrLf & Worksheets("Sheet1").Cells(7, 2)
    ProcessString (strData)

    ' Hebrew
    ' "abc" U+0061, U+0062, U+0063
    ' SPACE U+0020
    ' [NB right-to-left order]
    ' U+05DB HEBREW LETTER KAF
    ' U+05E9 HEBREW LETTER SHIN
    ' U+05E8 HEBREW LETTER RESH
    ' SPACE "f123" U+0066 U+0031 U+0032 U+0033
    ' EXCEL: Get value from cell A9
    strData = Worksheets("Sheet1").Cells(9, 1)
    Debug.Print vbCrLf & Worksheets("Sheet1").Cells(9, 2)
    ProcessString (strData)

End Sub

Public Function ProcessString(strData As String)
    Dim abData() As Byte
    Dim strOutput As String

    Debug.Print strData ' This should show "?" for non-ANSI characters

    strOutput = Utf8BytesFromString(strData)

    abData = strOutput
    ' Reset array width to Actual Number of Bytes
    ReDim Preserve abData(Len(strOutput) - 1)

    Debug.Print bv_HexFromBytesSp(abData)

    Debug.Print "Strlen=" & Len(strData) & " chars; utf8len=" & Len(strOutput) & " bytes"

End Function

''' Returns hex-encoded string from array of bytes (with spaces)
''' E.g. aBytes(&HFE, &HDC, &H80) will return "FE DC 80"
Public Function bv_HexFromBytesSp(aBytes() As Byte) As String
    Dim i As Long

    If Not IsArray(aBytes) Then
        Exit Function
    End If

    For i = LBound(aBytes) To UBound(aBytes)
        If (i > 0) Then bv_HexFromBytesSp = bv_HexFromBytesSp & " "
        If aBytes(i) < 16 Then
            bv_HexFromBytesSp = bv_HexFromBytesSp & "0" & Hex(aBytes(i))
        Else
            bv_HexFromBytesSp = bv_HexFromBytesSp & Hex(aBytes(i))
        End If
    Next

End Function

And Win64 converted API calls

' basUtf8FromString

' Written by David Ireland DI Management Services Pty Limited 2015
' <http://www.di-mgt.com.au> <http://www.cryptosys.net>

Option Explicit

' CodePage constant for UTF-8
Private Const CP_UTF8 = 65001

#If Win64 Then

    Private Declare PtrSafe Function GetACP Lib "Kernel32" () As LongPtr

    Private Declare PtrSafe Function MultiByteToWideChar Lib "Kernel32" (ByVal CodePage As LongPtr, _
        ByVal dwflags As LongPtr, ByVal lpMultiByteStr As LongPtr, ByVal cchMultiByte As LongPtr, _
        ByVal lpWideCharStr As LongPtr, ByVal cchWideChar As LongPtr) As LongPtr

    Private Declare PtrSafe Function WideCharToMultiByte Lib "Kernel32" (ByVal CodePage As LongPtr, _
        ByVal dwflags As LongPtr, ByVal lpWideCharStr As LongPtr, ByVal cchWideChar As LongPtr, _
        ByVal lpMultiByteStr As LongPtr, ByVal cchMultiByte As LongPtr, ByVal lpDefaultChar As LongPtr, _
        lpUsedDefaultChar As LongPtr) As LongPtr

#Else

    Private Declare PtrSafe Function GetACP Lib "Kernel32" () As Long

    Private Declare PtrSafe Function MultiByteToWideChar Lib "Kernel32" (ByVal CodePage As Long, _
        ByVal dwflags As Long, ByVal lpMultiByteStr As Long, ByVal cchMultiByte As Long, _
        ByVal lpWideCharStr As Long, ByVal cchWideChar As Long) As Long

    Private Declare PtrSafe Function WideCharToMultiByte Lib "Kernel32" (ByVal CodePage As Long, _
        ByVal dwflags As Long, ByVal lpWideCharStr As Long, ByVal cchWideChar As Long, _
        ByVal lpMultiByteStr As Long, ByVal cchMultiByte As Long, ByVal lpDefaultChar As Long, _
        lpUsedDefaultChar As Long) As Long

#End If

''' Return byte array with VBA "Unicode" string encoded in UTF-8
Public Function Utf8BytesFromString(strInput As String) As String
    Dim nBytes      As LongPtr
    Dim pwz         As LongPtr
    Dim pwzBuffer   As LongPtr

    Dim sBuffer     As String

    ' Get length in bytes *including* terminating null
    pwz = StrPtr(strInput)
    nBytes = WideCharToMultiByte(CP_UTF8, 0&, pwz, -1, 0&, 0&, ByVal 0&, ByVal 0&)

    sBuffer = String$(nBytes + 1, vbNullChar)
    pwzBuffer = StrPtr(sBuffer)

    nBytes = WideCharToMultiByte(CP_UTF8, 0&, pwz, -1, pwzBuffer, Len(sBuffer), ByVal 0&, ByVal 0&)
    Utf8BytesFromString = Left$(sBuffer, nBytes - 1)
End Function

As extracted from http://www.di-mgt.com.au/howto-convert-vba-unicode-to-utf8.html

dbmitch
  • 5,361
  • 4
  • 24
  • 38