61

I've got an XHTML page validating under the XHTML Strict doctype, but I'm getting this warning, which I'm trying to understand and correct.

How do I locate this errant "Byte-Order Mark"? I'm editing my file using Visual Studio, in case that helps.

Warning Byte-Order Mark found in UTF-8 File.

The Unicode Byte-Order Mark (BOM) in UTF-8 encoded files is known to cause problems for some text editors and older browsers. You may want to consider avoiding its use until it is better supported.

rsturim

6 Answers

88

The location part of your question is easy: The byte-order mark (BOM) will be at the very beginning of the file.
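
If you would rather confirm this from a script than from an editor, here is a minimal sketch in Python. The function name `starts_with_utf8_bom` is made up for illustration; it only relies on the fact that the UTF-8 BOM is the three bytes EF BB BF, which the standard library exposes as `codecs.BOM_UTF8`.

```python
import codecs


def starts_with_utf8_bom(path):
    """Return True if the file at `path` begins with the UTF-8 BOM (EF BB BF)."""
    # Read in binary mode so no decoder gets a chance to hide the BOM.
    with open(path, "rb") as f:
        return f.read(len(codecs.BOM_UTF8)) == codecs.BOM_UTF8
```

Calling `starts_with_utf8_bom("page.xhtml")` (a hypothetical file name) returns `True` when the warning applies to that file.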

When editing the file, VS Code shows the encoding of the current file toward the right of the bottom status bar:

Status bar showing "UTF-8 with BOM"

Click it to open the command palette with the options "Reopen with encoding" and "Save with encoding":

The command palette showing the options

Click "Save with Encoding" to get a list of encodings:

Command palette showing list of file encodings such as UTF-8, UTF-16 LE, UTF-16 BE

Choosing an encoding saves the file with that encoding.

See also this note in the Unicode site's FAQ about the BOM and UTF-8 files. It has no function other than to call out that the file is, in fact, UTF-8. In particular, it has no effect on the byte order (the main reason we have BOMs), because the byte order of UTF-8 is fixed.
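
To illustrate the point about byte order, here is a small Python sketch (the variable names are mine). The BOM is the single code point U+FEFF; in UTF-16 its two bytes swap depending on endianness, while in UTF-8 it always encodes to the same three bytes. The last lines show Python's `utf-8-sig` codec, which drops a leading BOM on decode, whereas plain `utf-8` keeps it as a character.

```python
BOM = "\ufeff"  # the byte-order mark as a Unicode code point

utf8_bom = BOM.encode("utf-8")         # always the same three bytes
utf16_le_bom = BOM.encode("utf-16-le")  # little-endian order
utf16_be_bom = BOM.encode("utf-16-be")  # big-endian order

print(utf8_bom.hex())      # efbbbf
print(utf16_le_bom.hex())  # fffe
print(utf16_be_bom.hex())  # feff

raw = utf8_bom + b"<html>"
kept = raw.decode("utf-8")          # BOM survives as U+FEFF
dropped = raw.decode("utf-8-sig")   # leading BOM is stripped
print(kept == "\ufeff<html>", dropped == "<html>")  # True True
```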

T.J. Crowder
  • 1
    I meant to vote this up, but I voted this down accidentally and didn't realise for 2 days, and now it's locked and it won't let me change it... but this helped me fix w3c warnings that were bugging me, so thanks anyway. – andygoestohollywood Nov 12 '13 at 09:20
  • 2
    @andygoestohollywood: :-) Thanks for explaining, I did wonder at the time why someone had downvoted the answer. I could edit the answer so you could undo it, but I don't see anything constructive to change and I don't like to push it onto the "active" list just to reverse a vote. I'm glad this helped! – T.J. Crowder Nov 12 '13 at 09:50
  • 2
    For anyone viewing this in the current decade, `File -> Advanced Save Options` was removed from VSC some time ago. Tap 'UTF-8 with BOM' in the bottom-right, click 'Save with encoding' and select UTF-8. – Gamma032 Mar 05 '21 at 02:47
29

Here's how I fixed this:

  1. Download and install Notepad++

  2. Open the file with Notepad++

  3. In the menu select "Encoding" and set it to "Encode in UTF-8 without BOM"

  4. Save the file and the BOM will be gone.
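
If installing an editor is not an option, the same effect can be sketched in a few lines of Python. This is not what Notepad++ does internally, just an equivalent under the assumption that the file is UTF-8; the function name `strip_utf8_bom` is hypothetical.

```python
import codecs


def strip_utf8_bom(path):
    """Rewrite the file at `path` without a leading UTF-8 BOM.

    Returns True if a BOM was found and removed, False if the file
    was left untouched.
    """
    with open(path, "rb") as f:
        data = f.read()
    if not data.startswith(codecs.BOM_UTF8):
        return False
    # Write back everything after the three BOM bytes, unchanged.
    with open(path, "wb") as f:
        f.write(data[len(codecs.BOM_UTF8):])
    return True
```

Because this overwrites the file in place, it is worth trying it on a copy first.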

eldarerathis
9

For anyone using Visual Studio who is seeing the annoying red dot on Bitbucket in 2018: in Visual Studio, go to "File" -> "File.cshtml Save As..." and select "Save with Encoding...":

Save with encoding

A dialog then pops up where you can change the encoding. Scroll all the way down the list until you see "Unicode (UTF-8 without signature) - Codepage 65001".

After this, just overwrite your file and upload it to your repo and the BOM will be gone.

Hope it helps. Leo.

Leo
4

For the IntelliJ IDEA editor, just go to File and then File Properties.

Hamidreza Sadegh
1

On Linux:

Open the file with Geany.

In the "Document" menu, uncheck "Write Unicode BOM".

Save the file.

japetko
0

A BOM is sometimes located inside the text rather than at the beginning, for example when the page has been assembled by PHP from other files using include_once(). To remove it, delete the region from at least one character before the BOM to at least one character after it (just in case). The position of the BOM can be found in the F12 Developer Tools of Internet Explorer (and probably Edge), where it is shown as a black diamond (rhombus).
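
As an alternative to hunting for the black diamond in a debugger, a short Python sketch can list every place the UTF-8 BOM bytes occur in the generated output. The function name `find_utf8_boms` is made up, and it assumes you have the assembled page as bytes (for example, saved from the server response).

```python
import codecs


def find_utf8_boms(data):
    """Return the byte offsets of every UTF-8 BOM (EF BB BF) in `data`."""
    offsets = []
    start = 0
    while True:
        i = data.find(codecs.BOM_UTF8, start)
        if i == -1:
            return offsets
        offsets.append(i)
        # Continue searching after this BOM.
        start = i + len(codecs.BOM_UTF8)
```

An offset of 0 is the ordinary leading BOM; any other offset marks a BOM that ended up inside the text, typically at the start of an included file.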

Visual Studio and WebMatrix can save files with or without a signature (the BOM at the beginning).

A BOM causes errors during validation ( https://validator.w3.org/#validate_by_upload ) and in browser consoles. </HEAD> can be treated as an orphaned element without a matching <HEAD>, even though one is clearly present:

Error: Stray end tag head.

<BODY> can be reported as a second <BODY>, even though only one <BODY> exists and everything is correct:

Error: Start tag body seen but an element of the same type was already open.

The entire document can also be reported as lacking a DOCTYPE when one or two BOMs occupy the first line and the DOCTYPE is on the second line, with messages similar to these:

Error: Non-space characters found without seeing a doctype first. Expected e.g. <!DOCTYPE html>.

Error: Element head is missing a required instance of child element title.

Error: Stray doctype.

Error: Stray start tag html.

Error: Stray start tag head.

Error: Attribute name not allowed on element meta at this point.

Error: Element meta is missing one or more of the following attributes: itemprop, property.

Error: Attribute http-equiv not allowed on element meta at this point.

Error: Element meta is missing one or more of the following attributes: itemprop, property.

Error: Attribute name not allowed on element meta at this point.

Error: Element meta is missing one or more of the following attributes: itemprop, property.

Error: Element link is missing required attribute property.

Error: Attribute name not allowed on element meta at this point.

Error: Element meta is missing one or more of the following attributes: itemprop, property.

Error: Attribute name not allowed on element meta at this point.

Error: Element meta is missing one or more of the following attributes: itemprop, property.

Error: Attribute name not allowed on element meta at this point.

Error: Element meta is missing one or more of the following attributes: itemprop, property.

Error: Element title not allowed as child of element body in this context. (Suppressing further errors from this subtree.)

Error: Element style not allowed as child of element body in this context. (Suppressing further errors from this subtree.)

Error: Stray end tag head.

Error: Start tag body seen but an element of the same type was already open.

Fatal Error: Cannot recover after last error. Any further errors will be ignored.

( https://validator.w3.org/#validate_by_uri )

And a stream of messages in the IE F12 Developer Tools console:

HTML1527: DOCTYPE expected. Consider adding a valid HTML5 doctype: "<!DOCTYPE html>".

HTML1502: Unexpected DOCTYPE. Only one DOCTYPE is allowed and it must occur before any elements.

HTML1513: Extra "<html>" tag found. Only one "<html>" tag should exist per document.

HTML1503: Unexpected start tag.

HTML1512: Unmatched end tag.

All of this was caused by a single BOM at the beginning, and the debugger shows one black rhombus in the first line.

Files saved with a signature but not assembled by PHP don't cause such errors, and no black diamonds are visible in the IE debugger, so perhaps PHP transforms the BOM somehow. It seems that the main PHP file must be saved with a signature for this to show up.

These strange characters occur at the beginning and/or at the boundaries of files merged with include_once(), and they do not appear when the files have been saved without a signature beforehand. This is what points to the BOM being involved.

I noticed all of this the day before yesterday, when I started converting my website to HTML5 and validating it.

A BOM can also create a small indent at the beginning of a line: two files can contain identical text, yet one is rendered with an indent.

darekk