My system is Windows 10 (English-US). I need to write some non-printable ASCII characters to a text file. For example, for the ASCII value 28, I want \u001Cw to be written to the file. In Java I don't have to do anything special to get this. Below is my code in VBS:

Dim objStream
Set objStream = CreateObject("ADODB.Stream")
objStream.Open
objStream.Type = 2
objStream.Position = 0
objStream.CharSet = "utf-16"

objStream.WriteText ChrW(28)  'Need this to appear as \u001Cw in the output file

objStream.SaveToFile "C:\temp\test.txt", 2
objStream.Close
Melwyn Dsouza
  • What's wrong with it? – Cid Sep 14 '18 at 21:38
  • It puts the unprintable ASCII character for 28 in the file instead of \u001Cw. If I do the same thing from Java, I get \u001Cw in the file. Below is my Java code, where I'm not doing any special encoding: `FileWriter log = new FileWriter(file); log.write(text); log.close();` – Melwyn Dsouza Sep 16 '18 at 15:31

1 Answer

You need a read-write stream so that writing to it and saving it to file both work.

Const adModeReadWrite = 3
Const adTypeText = 2
Const adSaveCreateOverWrite = 2

Sub SaveToFile(text, filename)
  With CreateObject("ADODB.Stream")
    .Mode = adModeReadWrite
    .Type = adTypeText
    .Charset = "UTF-16"
    .Open
    .WriteText text
    .SaveToFile filename, adSaveCreateOverWrite
    .Close
  End With
End Sub


text = Chr(28) & "Hello" & Chr(28)
SaveToFile text, "C:\temp\test.txt"

Other notes:

  • I like to explicitly define with Const all the constants in the code. Makes reading so much easier.
  • A With block saves quite a bit of typing here.
  • Setting the stream type to adTypeText is not really necessary, since that's the default anyway. But explicit is better than implicit, I guess.
  • Setting the Position to 0 on a new stream is superfluous.
  • It's unnecessary to use ChrW() for ASCII-range characters. The stream's Charset decides the byte width when you save the stream to file. In RAM, everything is Unicode anyway (yes, even in VBScript).
  • There are two UTF-16 encodings supported by ADODB.Stream: little-endian UTF-16LE (which is the default and synonymous with UTF-16) and big-endian UTF-16BE, with the byte order reversed.

You can achieve the same result with the FileSystemObject and its CreateTextFile() method:

Set FSO = CreateObject("Scripting.FileSystemObject")

Sub SaveToFile(text, filename)
  ' CreateTextFile(filename [, Overwrite [, Unicode]])
  With FSO.CreateTextFile(filename, True, True)
    .Write text
    .Close
  End With
End Sub


text = Chr(28) & "Hello" & Chr(28)
SaveToFile text, "C:\temp\test.txt"

This is a little bit simpler, but it only offers a Boolean Unicode parameter, which switches between UTF-16 and ANSI (not ASCII, as the documentation incorrectly claims!). The solution with ADODB.Stream gives you fine-grained encoding choices, for example UTF-8, which is impossible with the FileSystemObject.
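
As a rough sketch of that flexibility, the ADODB.Stream sample could be generalized to take the charset name as a parameter. The helper name SaveTextAs and the output path are made up for illustration, and "UTF-8" is simply the charset already mentioned above:

```vbscript
Const adModeReadWrite = 3
Const adTypeText = 2
Const adSaveCreateOverWrite = 2

' Hypothetical helper: like SaveToFile above, but the caller picks the charset.
Sub SaveTextAs(text, filename, charset)
  With CreateObject("ADODB.Stream")
    .Mode = adModeReadWrite
    .Type = adTypeText
    .Charset = charset          ' e.g. "UTF-16" or "UTF-8"
    .Open
    .WriteText text
    .SaveToFile filename, adSaveCreateOverWrite
    .Close
  End With
End Sub

' Assumed example path; writes the same test string as UTF-8.
SaveTextAs Chr(28) & "Hello" & Chr(28), "C:\temp\test-utf8.txt", "UTF-8"
```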


For the record, there are two ways to create a UTF-8-encoded text file:

  • The way Microsoft likes to do it, with a 3-byte Byte Order Mark (BOM) at the start of the file. Most, if not all, Microsoft tools do that when they offer "UTF-8" as an option, and ADODB.Stream is no exception.
  • The way everyone else does it - without a BOM. This is correct for most uses.

To create a UTF-8 file with a BOM, the first code sample above can be used with .Charset set to "UTF-8". To create a UTF-8 file without a BOM, we can use two stream objects:

Const adModeReadWrite = 3
Const adTypeBinary = 1
Const adTypeText = 2
Const adSaveCreateOverWrite = 2

Sub SaveToFile(text, filename)
  Dim iStr: Set iStr = CreateObject("ADODB.Stream")
  Dim oStr: Set oStr = CreateObject("ADODB.Stream")

  ' one stream for converting the text to UTF-8 bytes
  iStr.Mode = adModeReadWrite
  iStr.Type = adTypeText
  iStr.Charset = "UTF-8"
  iStr.Open
  iStr.WriteText text

  ' one stream to write bytes to a file
  oStr.Mode = adModeReadWrite
  oStr.Type = adTypeBinary
  oStr.Open

  ' switch first stream to binary mode and skip UTF-8 BOM
  iStr.Position = 0
  iStr.Type = adTypeBinary
  iStr.Position = 3

  ' write remaining bytes to file and clean up
  oStr.Write iStr.Read
  oStr.SaveToFile filename, adSaveCreateOverWrite
  oStr.Close
  iStr.Close
End Sub
Tomalak
  • Thank you for the very detailed reply and the helpful notes. I really appreciate the effort you put into the answer. Unfortunately, I get the same output as with my original code. I found a solution here on StackOverflow which solves my problem to some extent: https://stackoverflow.com/questions/2241130/char-to-utf-code-in-vbscript However, the only issue is that I need to apply it only to the unprintable ASCII characters (ASCII value < 32 or > 126). I basically just need the hex value to be written to the file for the non-printable ASCII characters. This is what Java does by default. – Melwyn Dsouza Sep 16 '18 at 15:36
  • I admit I have no idea what you want. My code sample above outputs a UTF-16 encoded text file with the "file separator" character (28), encoded as a Unicode character. The way I read your question, this is what you are asking for. – Tomalak Sep 17 '18 at 09:20
  • Do you want to write **the actual string** `"\u001C"` to the text file instead? – Tomalak Sep 17 '18 at 09:23
  • I'm guessing `.Charset = "UTF-8"` will work and do the obvious thing if you want to write UTF-8 but I have no way to verify this. Can somebody who has this environment please confirm (or refute)? – tripleee Aug 07 '19 at 11:11
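
If the goal is to write the literal text `\u001C` rather than the control character itself (as asked in the comments), one possible approach is to escape the string before writing it. The following is only a sketch: EscapeNonPrintable is a made-up helper name, the 32..126 range is taken from the asker's comment, and WScript.Echo assumes the script runs under Windows Script Host (for example via cscript.exe).

```vbscript
Option Explicit

' Hypothetical helper: returns text with every character outside the
' printable ASCII range (32..126) replaced by a literal \uXXXX escape.
Function EscapeNonPrintable(text)
  Dim i, ch, code, result
  result = ""
  For i = 1 To Len(text)
    ch = Mid(text, i, 1)
    code = AscW(ch)
    ' AscW returns a signed 16-bit value; map negatives to 32768..65535
    If code < 0 Then code = code + 65536
    If code < 32 Or code > 126 Then
      ' Hex() drops leading zeros, so pad to four digits
      result = result & "\u" & Right("0000" & Hex(code), 4)
    Else
      result = result & ch
    End If
  Next
  EscapeNonPrintable = result
End Function

WScript.Echo EscapeNonPrintable(Chr(28) & "Hello" & Chr(28))
' prints \u001CHello\u001C
```

The escaped string contains only printable ASCII, so it can be passed to any of the SaveToFile routines above. Note that this sketch does not escape a backslash that is already present in the input, so the output cannot always be unambiguously decoded.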