I have an HttpHandler that reads in a set of CSS files, combines them, and then gzips them. However, some of the CSS files contain a byte order mark (due to a bug in TFS 2005 auto merge), and in Firefox the BOM is being read as part of the actual content, so it's screwing up my class names etc. How can I strip out the BOM characters? Is there an easy way to do this without manually going through the byte array looking for "ï»¿"?
14
-
Is the BOM appearing in the actual text itself, or just at the very start? I'd be surprised to see it anywhere other than at the start of the data - in which case simply ignoring the first 3 bytes (assuming UTF-8) should do the trick. – Jon Skeet Nov 13 '08 at 20:14
-
FWIW, you could open the files in [Notepad++](http://notepad-plus.sourceforge.net/uk/site.htm) and save them without the Byte Order Mark. It's what I had to do in [this question](http://stackoverflow.com/questions/291455/xml-data-at-root-level-is-invalid). – George Stocker Nov 16 '08 at 22:56
-
I wrote the [following post](http://andrewmatthewthompson.blogspot.com/2011/02/byte-order-mark-found-using-net.html) after coming across this issue. Essentially, instead of reading in the raw bytes of the file's contents using the BinaryReader class, I use the StreamReader class with a specific constructor, which automatically removes the byte order mark character from the textual data I am trying to retrieve. – Andrew Thompson Feb 20 '11 at 21:06
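A minimal sketch of the StreamReader approach described in the comment above, assuming the files are UTF-8. The class and method names (`BomFreeReader`, `ReadText`) are hypothetical and not from the linked post; the point is only that `StreamReader` consumes a leading BOM rather than returning it as part of the text.

```csharp
using System.IO;
using System.Text;

public static class BomFreeReader
{
    // Hypothetical helper: decodes a stream as UTF-8 text. StreamReader
    // consumes a leading BOM (EF BB BF) instead of returning it as U+FEFF.
    // Note that disposing the reader also disposes the input stream.
    public static string ReadText(Stream input)
    {
        using (var reader = new StreamReader(input, Encoding.UTF8, detectEncodingFromByteOrderMarks: true))
        {
            return reader.ReadToEnd();
        }
    }
}
```

In a handler, each CSS file could be opened with `File.OpenRead` and passed through `ReadText` before the results are concatenated and compressed.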
5 Answers
8
Expanding on Jon's comment with a sample.
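```csharp
// Requires: using System.Linq; (for Skip)
var name = GetFileName();
var bytes = System.IO.File.ReadAllBytes(name);
// Drops the first three bytes, assuming the file starts with the UTF-8 BOM (EF BB BF).
System.IO.File.WriteAllBytes(name, bytes.Skip(3).ToArray());
```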
-
Quoting the OP: *However, some of the CSS files contain a Byte Order Mark*. Note **some**: the code above doesn't check whether there's a BOM before it skips it... – Pure.Krome Aug 10 '14 at 11:24
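One way to address that concern in memory, without rewriting files on disk, is to test for the three BOM bytes before dropping them. This is only a sketch: `Utf8Bom.Strip` is a hypothetical name, and it handles the UTF-8 BOM only, not UTF-16 or UTF-32 marks.

```csharp
using System;

public static class Utf8Bom
{
    // Hypothetical helper: returns the input unchanged unless it starts
    // with the UTF-8 BOM (EF BB BF), in which case those bytes are dropped.
    public static byte[] Strip(byte[] bytes)
    {
        bool hasBom = bytes.Length >= 3
            && bytes[0] == 0xEF && bytes[1] == 0xBB && bytes[2] == 0xBF;
        if (!hasBom)
        {
            return bytes;
        }

        var result = new byte[bytes.Length - 3];
        Array.Copy(bytes, 3, result, 0, result.Length);
        return result;
    }
}
```

Calling `Utf8Bom.Strip(File.ReadAllBytes(path))` on each file before combining leaves BOM-free files untouched.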
6
Expanding JaredPar's sample to recurse over subdirectories:
```csharp
using System.Linq;
using System.IO;

namespace BomRemover
{
    /// <summary>
    /// Remove UTF-8 BOM (EF BB BF) of all *.php files in current and sub-directories.
    /// </summary>
    class Program
    {
        private static void removeBoms(string filePattern, string directory)
        {
            foreach (string filename in Directory.GetFiles(directory, filePattern))
            {
                var bytes = System.IO.File.ReadAllBytes(filename);
                // Only rewrite the file if it really starts with the UTF-8 BOM.
                if (bytes.Length > 2 && bytes[0] == 0xEF && bytes[1] == 0xBB && bytes[2] == 0xBF)
                {
                    System.IO.File.WriteAllBytes(filename, bytes.Skip(3).ToArray());
                }
            }
            // Recurse into each subdirectory.
            foreach (string subDirectory in Directory.GetDirectories(directory))
            {
                removeBoms(filePattern, subDirectory);
            }
        }

        static void Main(string[] args)
        {
            string filePattern = "*.php";
            string startDirectory = Directory.GetCurrentDirectory();
            removeBoms(filePattern, startDirectory);
        }
    }
}
```
I needed this piece of C# code after discovering that the UTF-8 BOM corrupts the file when you try to do a basic PHP file download.

Olivier de Rivoyre
3
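```csharp
// Read the text (a leading BOM is consumed during decoding), then write it back as UTF-8 without a BOM.
var text = File.ReadAllText(args.SourceFileName);
using (var streamWriter = new StreamWriter(args.DestFileName, args.Append, new UTF8Encoding(false)))
{
    streamWriter.Write(text);
}
```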
-
Looking at this code, it should ideally work. But I am surprised that it is saving the file in ANSI format. – VJOY Mar 13 '10 at 07:42
-
In `new UTF8Encoding(false)`, the parameter indicates whether or not to add the BOM. – Guy Lowe Apr 04 '14 at 01:18
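To see the effect of that parameter, the sketch below writes the same text through a `StreamWriter` with each setting and returns the raw bytes. `PreambleDemo.Encode` is a hypothetical helper for illustration only. A BOM-less file that contains only ASCII characters is byte-for-byte identical to an ANSI file, which may be why an editor reports it as ANSI.

```csharp
using System.IO;
using System.Text;

public static class PreambleDemo
{
    // Hypothetical helper: encodes text via StreamWriter into memory so the
    // presence or absence of the BOM can be inspected in the returned bytes.
    public static byte[] Encode(string text, bool emitBom)
    {
        using (var buffer = new MemoryStream())
        {
            using (var writer = new StreamWriter(buffer, new UTF8Encoding(emitBom)))
            {
                writer.Write(text);
            }
            // ToArray still works after the writer has closed the stream.
            return buffer.ToArray();
        }
    }
}
```

With `emitBom: true` the output for `"a"` starts with EF BB BF followed by 61; with `emitBom: false` it is the single byte 61.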
1
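Another way, assuming you are converting UTF-8 to ASCII:
```csharp
// Non-ASCII characters are replaced with '?' when writing with Encoding.ASCII.
File.WriteAllText(filename, File.ReadAllText(filename, Encoding.UTF8), Encoding.ASCII);
```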

Tim Bailey
0
For larger files, use the following code. It streams the file line by line, so it is memory efficient:
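```csharp
// Streams line by line, so the whole file is never held in memory.
// Note: UnicodeEncoding writes UTF-16 LE; pass new UTF8Encoding(false) instead for BOM-less UTF-8 output.
using (var sr = new StreamReader(path: @"<Input_file_full_path_with_byte_order_mark>",
                                 detectEncodingFromByteOrderMarks: true))
using (var sw = new StreamWriter(path: @"<Output_file_without_byte_order_mark>",
                                 append: false,
                                 encoding: new UnicodeEncoding(bigEndian: false, byteOrderMark: false)))
{
    var lineNumber = 0;
    while (!sr.EndOfStream)
    {
        sw.WriteLine(sr.ReadLine());
        lineNumber += 1;
        // Report progress every 100,000 lines.
        if (lineNumber % 100000 == 0)
            Console.Write("\rLine# " + lineNumber.ToString("000000000000"));
    }
} // Disposing the writer flushes and closes the output file.
```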

Ashokan Sivapragasam