I recently ran an HTML file I was writing through this online HTML validator, and one of the diagnostics I got said,
The character encoding was not declared. Proceeding using "windows-1252".
When I create a webpage, I write it in a text editor, which saves it as DOS-text (with CR-LF line endings). When I upload the file to my web-hosting provider, it gets converted (I think) on the server to Unix text (LF line endings). My text editor can also save files as Unicode including UTF-8, but I rarely find that necessary.
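To make the question concrete, here is a minimal Python sketch of how one might check what the saved bytes actually are. The helper name `describe_encoding` is hypothetical and the three-way classification is my own simplification. The point it illustrates is that a file containing only ASCII characters has identical bytes under UTF-8, windows-1252 and ISO-8859-1, so the choice only starts to matter once the file contains non-ASCII characters.

```python
def describe_encoding(data: bytes) -> str:
    """Classify raw file bytes (hypothetical helper, for illustration only)."""
    try:
        data.decode("ascii")
        # Pure ASCII: the same bytes are valid UTF-8, windows-1252 and ISO-8859-1.
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")
        # Contains non-ASCII characters and is well-formed UTF-8.
        return "utf-8"
    except UnicodeDecodeError:
        # Not valid UTF-8; likely a single-byte encoding such as windows-1252.
        return "not utf-8"
```

For a real file, the bytes could be read with `open(path, "rb").read()` and passed to the function; line endings (CR-LF or LF) are plain ASCII and do not affect the result.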
The standard online advice about specifying the character encoding in a web document is to include, just under the <head> tag, <meta charset="utf-8">. There is also advice that you should ensure that what you specify does not conflict with the information sent by the server in the HTTP headers when serving the document. Using Rex Swain's [online] HTTP viewer, I see that in the HTTP headers it just says,
Content-Type: text/html
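As a rough illustration of what "conflict with the HTTP headers" would mean here, the following Python sketch extracts the charset parameter from a Content-Type value using the standard library's `email.message.Message`. The function name `declared_charset` is hypothetical. For the header shown above it finds no charset at all, which suggests the server is not declaring one and so there is nothing for a meta tag to conflict with.

```python
from email.message import Message


def declared_charset(content_type: str):
    """Return the charset parameter of a Content-Type value, or None if absent."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() returns the charset in lower case, or None.
    return msg.get_content_charset()
```

Calling `declared_charset("text/html")` returns `None`, whereas a header such as `text/html; charset=UTF-8` would yield `"utf-8"`.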
Should I follow the standard advice to specify the charset as UTF-8, even though the HTML file is never saved as such, or should I specify it as windows-1252, as assumed by that online validator, or as ISO-8859-1, as per one of the example values on W3Schools? Also, some examples of the charset meta tag show it terminated as />. Which is the preferred syntax, and should there be a space before the slash?