121

I am trying to create a text file in VB.NET with UTF-8 encoding and without a BOM. How can I do this?
I can write the file with UTF-8 encoding, but how do I remove the byte order mark from it?

Edit 1: I have tried code like this:

    Dim utf8 As New UTF8Encoding()
    Dim utf8EmitBOM As New UTF8Encoding(True)
    Dim strW As New StreamWriter("c:\temp\bom\1.html", True, utf8EmitBOM)
    strW.Write(utf8EmitBOM.GetPreamble())
    strW.WriteLine("hi there")
    strW.Close()

    Dim strw2 As New StreamWriter("c:\temp\bom\2.html", True, utf8)
    strw2.Write(utf8.GetPreamble())
    strw2.WriteLine("hi there")
    strw2.Close()

1.html is created with UTF-8 encoding, but 2.html is created in ANSI encoding format.

Simplified approach - http://whatilearnttuday.blogspot.com/2011/10/write-text-files-without-byte-order.html

VJOY

10 Answers

211

In order to omit the byte order mark (BOM), your stream must use an instance of UTF8Encoding other than System.Text.Encoding.UTF8 (which is configured to generate a BOM). There are two easy ways to do this:

1. Explicitly specifying a suitable encoding:

  1. Call the UTF8Encoding constructor with False for the encoderShouldEmitUTF8Identifier parameter.

  2. Pass the UTF8Encoding instance to the stream constructor.

    ' VB.NET:
    Dim utf8WithoutBom As New System.Text.UTF8Encoding(False)
    Using sink As New StreamWriter("Foobar.txt", False, utf8WithoutBom)
        sink.WriteLine("...")
    End Using

    // C#:
    var utf8WithoutBom = new System.Text.UTF8Encoding(false);
    using (var sink = new StreamWriter("Foobar.txt", false, utf8WithoutBom))
    {
        sink.WriteLine("...");
    }

2. Using the default encoding:

If you do not supply an Encoding to StreamWriter's constructor at all, StreamWriter uses a UTF-8 encoding without BOM by default, so the following should work just as well:

    ' VB.NET:
    Using sink As New StreamWriter("Foobar.txt")
        sink.WriteLine("...")
    End Using

    // C#:
    using (var sink = new StreamWriter("Foobar.txt"))
    {
        sink.WriteLine("...");
    }

Finally, note that omitting the BOM is only permissible for UTF-8, not for UTF-16.
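
One way to see the difference between the two kinds of UTF8Encoding instance described above is to compare their preambles directly, without writing any file. This is only a minimal sketch; the class name `PreambleDemo` and the helper `Describe` are made up for illustration.

```csharp
using System;
using System.Text;

static class PreambleDemo
{
    // Hypothetical helper: formats an encoding's preamble as hex,
    // or "(empty)" when the encoding has no preamble at all.
    public static string Describe(Encoding encoding)
    {
        byte[] preamble = encoding.GetPreamble();
        return preamble.Length == 0 ? "(empty)" : BitConverter.ToString(preamble);
    }

    static void Main()
    {
        // Encoding.UTF8 is configured to emit the BOM.
        Console.WriteLine(Describe(Encoding.UTF8));            // EF-BB-BF
        // UTF8Encoding(false) has an empty preamble, so no BOM is written.
        Console.WriteLine(Describe(new UTF8Encoding(false)));  // (empty)
    }
}
```

A StreamWriter writes whatever the encoding's preamble is at the start of a new file, which is why passing the second instance is enough to suppress the BOM.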

stakx - no longer contributing
  • Not always wise: for example `My.Computer.FileSystem.WriteAllText` writes the BOM if no encoding is specified. – beppe9000 Jun 04 '16 at 15:02
  • `My.Computer.FileSystem.WriteAllText` is an exception in this regard, guessing for backwards VB compatibility perhaps? [`File.WriteAllText`](http://referencesource.microsoft.com/#mscorlib/system/io/file.cs,10d1f3f4dbac8234) defaults to UTF8NoBOM. – jnm2 Jun 06 '16 at 10:13
  • This is especially helpful if you want to write a `*.m3u8` playlist file for VLC. VLC is still not able to read UTF-8 playlist files WITH a BOM! This seems to be fixed according to https://trac.videolan.org/vlc/ticket/21860, but the fix will only be included in VLC v4. – PeterCo Oct 09 '20 at 10:52
29

Try this:

    Encoding outputEnc = new UTF8Encoding(false); // create encoding with no BOM
    using (TextWriter file = new StreamWriter(filePath, false, outputEnc)) // open file with encoding
    {
        // write data here
    } // disposing the writer saves and closes the file
Roman Nikitin
6

Simply use the WriteAllText method of System.IO.File.

See the sample in the documentation for File.WriteAllText, which says:

This method uses UTF-8 encoding without a Byte-Order Mark (BOM), so using the GetPreamble method will return an empty byte array. If it is necessary to include a UTF-8 identifier, such as a byte order mark, at the beginning of a file, use the WriteAllText(String, String, Encoding) method overload with UTF8 encoding.
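
To illustrate the two overloads mentioned in that quote, here is a small sketch that writes the same text with each of them and compares the file sizes. The file names and the `WriteAllTextDemo` class are assumptions made up for this example.

```csharp
using System;
using System.IO;
using System.Text;

static class WriteAllTextDemo
{
    // Hypothetical helper: writes the same text twice and returns both file sizes.
    public static (long withoutBom, long withBom) WriteBoth(string directory)
    {
        string plainPath = Path.Combine(directory, "no-bom.txt");   // illustrative name
        string bomPath = Path.Combine(directory, "with-bom.txt");   // illustrative name

        // No encoding argument: UTF-8 without a BOM.
        File.WriteAllText(plainPath, "hi there");
        // Encoding.UTF8 is configured to emit the BOM.
        File.WriteAllText(bomPath, "hi there", Encoding.UTF8);

        return (new FileInfo(plainPath).Length, new FileInfo(bomPath).Length);
    }

    static void Main()
    {
        var sizes = WriteBoth(Path.GetTempPath());
        Console.WriteLine($"without BOM: {sizes.withoutBom} bytes, with BOM: {sizes.withBom} bytes");
    }
}
```

The second file should come out three bytes larger than the first, since the only difference is the EF BB BF prefix.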

Joe.wang
5

If you do not specify an Encoding when creating a new StreamWriter, the default Encoding object used is UTF-8 without BOM, which is created via new UTF8Encoding(false, true).

So to create a text file without the BOM, use one of the constructors that do not require you to provide an encoding:

new StreamWriter(Stream)
new StreamWriter(String)
new StreamWriter(String, Boolean)
JG in SD
  • What if I need to specify `leaveOpen`? – binki Nov 27 '15 at 15:42
  • @binki in that case you can not use the default encoding that `StreamWriter` uses. You'll need to specify `new UTF8Encoding(false, true)` for your encoding to be able to specify `leaveOpen` and not have the BOM. – JG in SD Nov 30 '15 at 15:38
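
The `leaveOpen` case from the comments above could look roughly like this. It is a sketch only: the `LeaveOpenDemo` class and the buffer size of 1024 are arbitrary choices for illustration.

```csharp
using System;
using System.IO;
using System.Text;

static class LeaveOpenDemo
{
    // Hypothetical helper: writes text without a BOM and leaves the stream open.
    public static void WriteWithoutBom(Stream stream, string text)
    {
        var utf8NoBom = new UTF8Encoding(false, true);
        // 1024 is an arbitrary buffer size picked for this sketch.
        using (var writer = new StreamWriter(stream, utf8NoBom, 1024, leaveOpen: true))
        {
            writer.Write(text);
        } // disposing flushes the writer but does not close the stream
    }

    static void Main()
    {
        using (var stream = new MemoryStream())
        {
            WriteWithoutBom(stream, "hi there");
            // The stream is still usable after the writer was disposed.
            Console.WriteLine(stream.Length);
        }
    }
}
```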
4

An interesting note with respect to this: strangely, the static "CreateText()" method of the System.IO.File class creates UTF-8 files without BOM.

In general this is a source of bugs, but in your case it could have been the simplest workaround :)

Tao
3

I think Roman Nikitin is right. The meaning of the constructor argument is flipped. False means no BOM and true means with BOM.

You get an ANSI encoding because a file without a BOM that contains only ASCII characters is byte-for-byte identical to an ANSI file. Try some special characters in your "hi there" string and you'll see the detected encoding change from ANSI to UTF-8 without BOM.
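
The point about identical bytes can be checked with a short sketch. Note the assumption: ISO-8859-1 is used here as a stand-in for the "ANSI" code page, and the `AsciiOverlapDemo` class is made up for illustration.

```csharp
using System;
using System.Linq;
using System.Text;

static class AsciiOverlapDemo
{
    // Hypothetical helper: true when the text encodes to the same bytes
    // in UTF-8 (no BOM) and in a single-byte encoding.
    public static bool SameBytes(string text)
    {
        Encoding utf8NoBom = new UTF8Encoding(false);
        // Assumption: ISO-8859-1 stands in for the system's ANSI code page.
        Encoding singleByte = Encoding.GetEncoding("iso-8859-1");
        return utf8NoBom.GetBytes(text).SequenceEqual(singleByte.GetBytes(text));
    }

    static void Main()
    {
        Console.WriteLine(SameBytes("hi there"));        // True: ASCII only
        Console.WriteLine(SameBytes("h\u00ED there"));   // False: 'í' takes two bytes in UTF-8
    }
}
```

With no BOM and no non-ASCII bytes, an editor has nothing to distinguish the two encodings by, so reporting ANSI is a reasonable guess on its part.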

jos
1

XML Encoding UTF-8 without BOM
We need to submit XML data to the EPA and their application that takes our input requires UTF-8 without BOM. Oh yes, plain UTF-8 should be acceptable for everyone, but not for the EPA. The answer to doing this is in the above comments. Thank you Roman Nikitin.

Here is a C# snippet of the code for the XML encoding:

    Encoding utf8noBOM = new UTF8Encoding(false);
    XmlWriterSettings settings = new XmlWriterSettings();
    settings.Encoding = utf8noBOM;
    …
    using (XmlWriter xw = XmlWriter.Create(filePath, settings))
    {
        xDoc.WriteTo(xw);
        xw.Flush();
    }

Checking whether this actually removes the three leading bytes from the output file can be misleading. For example, if you use Notepad++ (www.notepad-plus-plus.org), it will report “Encode in ANSI”. I guess most text editors rely on the BOM bytes to tell whether a file is UTF-8. The way to see this clearly is with a binary tool like WinHex (www.winhex.com). Since I was looking for a before-and-after difference, I used the Microsoft WinDiff application.
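
If no hex tool is at hand, the same check can be done in code by reading the first three bytes of the file. This is a minimal sketch; `BomCheck` and `HasUtf8Bom` are names made up for this example.

```csharp
using System;
using System.IO;

static class BomCheck
{
    // Hypothetical helper: true when the file starts with the UTF-8 BOM (EF BB BF).
    public static bool HasUtf8Bom(string path)
    {
        byte[] head = new byte[3];
        using (FileStream stream = File.OpenRead(path))
        {
            int read = stream.Read(head, 0, head.Length);
            return read == 3 && head[0] == 0xEF && head[1] == 0xBB && head[2] == 0xBF;
        }
    }

    static void Main(string[] args)
    {
        // Pass the file to inspect as the first command-line argument.
        Console.WriteLine(HasUtf8Bom(args[0]));
    }
}
```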

0

For VB.NET, this is how to make it work (note that Encoding.ASCII replaces every non-ASCII character with "?", so this only suits pure ASCII data):

    My.Computer.FileSystem.WriteAllText("FileName", Data, False, System.Text.Encoding.ASCII)
Fred Kerber
-1

It might be that your input text contains a byte order mark. In that case, you should remove it before writing.

-1
    Dim sWriter As IO.StreamWriter = New IO.StreamWriter(shareworklist & "\" & getfilename() & ".txt", False, Encoding.Default)

This gives you the results you want (I think).

Mark Hall
Mwenyeji