12

I am trying to, simply put, remove a hidden zero-width no-break space character (U+FEFF) from my script. It appears on the webpage at http://cynicode.co.uk (please note that the index page has been tinkered with and is the only one that functions properly at the moment).

Inspecting the HTML elements on the page shows this:

[Screenshot: the page's elements in Chrome's element inspector]

The key point is the red dot between <body> and <!--5-->. Hovering over it shows that it is a \ufeff character. The problem is that, when I look through the script, no such character appears anywhere.

I am using PHP and HTML to construct this page, and the items between <!--4--> and <!--5--> consist of the following. Firstly, on the actual index page itself:

<?php
  echo "<!--4-->";
  echo "<head><meta charset='utf-8' /><link rel='shortcut icon' type='image/ico' href='./images/CyniCode.ico'>
    <title>CyniCode :: Index</title>
    <meta name='description' content='The Cynic&#39;s paradise! Home of Cynical.' />
    <meta name='author' content='Cynical' />
    <meta name='keywords' content='Cynical;Blog;Code' />
    <link type='text/css' rel='stylesheet' href='./css/mystyle.css' />
    <link rel='shortcut icon' type='image/ico' href='./images/CyniCode.ico'>
    <link href='http://fonts.googleapis.com/css?family=Muli' rel='stylesheet' type='text/css' />
    <script type='text/javascript' src='http://static.proofiction.net/jquery/jquery-1.9.1.min.js'></script>
    <script type='text/javascript' src='http://static.proofiction.net/jquery/loginwait.js'></script>
    <script type='text/javascript' src='http://static.proofiction.net/jquery/googleAnalytics.js'></script>
    <script type='text/javascript' src='http://static.proofiction.net/jquery/jquery.bxslider.js'></script>
    <script type='text/javascript' src='http://static.proofiction.net/jquery/jquery.bxslider.min.js'></script>
    <script type='text/javascript' src='https://www.google.com/jsapi'></script>
    <script type='text/javascript'><!--
      google_ad_client = 'ca-pub-xxxxxxxxxxxx';
      /* BiggerNavBox */
      google_ad_slot = '3977705372';
      google_ad_width = 300;
      google_ad_height = 600;
      //-->
    </script>
    <script>
      (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
      (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
      m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
      })(window,document,'script','//www.google-analytics.com/analytics.js','ga');

      ga('create', 'UA-xxxxxxxx-1', 'cynicode.co.uk');
      ga('send', 'pageview'); 
    </script>

    </head>";
  require_once './functions/page.php';

That is everything on the index page. The page.php script it references is my shortcut for setting the page up quickly and cleanly. The only other code that runs between the two comments is this:

<?php
  echo "<!--5-->";

Any help that anyone can offer would be much appreciated. All code snippets were copied and pasted directly from my scripts.

Sebas
J4Numbers
  • May I ask which editor is the one in the snapshot? – VSB Aug 10 '16 at 15:24
  • @VSB there are no editors there. The top screenshot was taken from Google Chrome's element inspector. I tend to use SublimeText or Vim for text editor stuff these days, and JetBrains' PHPStorm/WebStorm for webdev stuff, though Notepad++ still ticks all the boxes. – J4Numbers Aug 12 '16 at 16:01

2 Answers

23

One of your .php files (probably ./functions/page.php) contains a byte order mark (BOM). If you are using an IDE, check that file's encoding properties; with luck you will be able to remove the BOM there.

Edit: If you use *nix, the question "Elegant way to search for UTF-8 files with BOM?" should help.
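
As a rough sketch of that check (written in Python rather than PHP purely so it can be run from a shell; the function name `find_bom_files` and the default `.php` suffix are assumptions for illustration), the following walks a directory and reports every file whose first three bytes are the UTF-8 BOM, EF BB BF:

```python
import codecs
from pathlib import Path


def find_bom_files(root, suffixes=(".php",)):
    """Return paths under `root` whose first bytes are the UTF-8 BOM.

    The function name and the default suffix list are illustrative
    assumptions, not part of any existing tool.
    """
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            with path.open("rb") as handle:
                # codecs.BOM_UTF8 is the byte string b"\xef\xbb\xbf".
                if handle.read(3) == codecs.BOM_UTF8:
                    hits.append(path)
    return hits
```

Calling `find_bom_files(".")` from the site root should list any included script, such as ./functions/page.php, that still starts with a BOM, regardless of what the editor's encoding menu claims.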

Community
Cthulhu
  • Hi. Thanks for the answer. I am using Notepad++ and I have set the encoding of all files to _UTF-8 without BOM_ (I had a large problem a while back with BOM and had to tinker with many things). There is also the `meta charset='utf-8'` which has worked in the past. In fact, this problem has only sprung up recently. I'm really not sure what has changed to cause this to happen. – J4Numbers Aug 03 '13 at 06:35
  • Well, simply check this file's properties specifically. If it does not have BOM the problem is something else. – Cthulhu Aug 03 '13 at 06:39
  • @M477h3w1012 You can check with this http://www.bryntyounce.com/filebomdetector.htm . – Cthulhu Aug 03 '13 at 06:44
  • @Cthulhu first of all, thanks for the file, I'm sure this will be useful for other things :) Secondly... huh. This is going to be an annoyance. It appears that the BOM was still intact. Thankfully this program can remove it (yay). In other words, thanks a bunch! – J4Numbers Aug 03 '13 at 06:50
  • @M477h3w1012 No problems. – Cthulhu Aug 03 '13 at 06:52
  • Thanks a lot for this "encoding" tip. Got the same thing with my *.css file generated from *.less. Changed the *.less encoding to "Unicode (UTF-8 without signature) - Codepage 65001" and everything began to work. – Kamilius Sep 18 '13 at 11:58
12

Here is a solution to a similar issue: Extra BOM character in HTML invalidating DOCTYPE tag

Basically, the source of the issue is PHP files encoded in UTF-8 with a BOM. Re-encoding them as UTF-8 without BOM solves the issue.

[Screenshot: Notepad++ UTF-8 encoding without BOM]
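
If changing the encoding in an editor is not an option, the same re-encoding can be sketched in a few lines. This is a minimal Python sketch, not the method used in the linked answer, and `strip_utf8_bom` is a hypothetical helper name. It rewrites a file without its leading BOM and leaves every other byte untouched:

```python
import codecs
from pathlib import Path


def strip_utf8_bom(path):
    """Remove a leading UTF-8 BOM from the file at `path`, in place.

    Returns True if a BOM was found and removed, False otherwise.
    `strip_utf8_bom` is a hypothetical helper name.
    """
    file_path = Path(path)
    data = file_path.read_bytes()
    if not data.startswith(codecs.BOM_UTF8):
        return False
    # Write back everything after the three BOM bytes, unchanged.
    file_path.write_bytes(data[len(codecs.BOM_UTF8):])
    return True
```

Working on raw bytes avoids any newline or encoding translation, so the only difference in the rewritten file should be the missing EF BB BF prefix.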

diegoubi
  • you saved my day. experienced the same error but with my robots.txt – rethabile Apr 13 '18 at 09:08
  • Great thanks to author! You saved my ass :) In my case this issue happened in functions.php file of Wordpress theme. – PokatilovArt Sep 28 '18 at 23:22
  • Changing the encoding did not help me, so I replaced all occurrences of \ufeff with nothing, and Notepad++ made two replacements. Don't forget to enable extended search mode when doing this. – Max Nov 26 '19 at 03:04
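
The search-and-replace described in the last comment covers the case where U+FEFF also sits somewhere in the middle of a file, for example after two files were concatenated. A minimal Python sketch of that idea follows; the helper name `remove_feff_everywhere` is hypothetical and the file is assumed to be valid UTF-8:

```python
from pathlib import Path


def remove_feff_everywhere(path):
    """Delete every U+FEFF character in a UTF-8 file, in place.

    Returns the number of characters removed. The helper name is
    hypothetical, and the file is assumed to be valid UTF-8.
    """
    file_path = Path(path)
    # Decode as plain UTF-8 so a leading BOM stays visible as U+FEFF
    # and is counted along with any occurrences further in.
    text = file_path.read_bytes().decode("utf-8")
    count = text.count("\ufeff")
    if count:
        file_path.write_bytes(text.replace("\ufeff", "").encode("utf-8"))
    return count
```

The return value plays the same role as the replacement count Notepad++ reports, so a result greater than one hints that a BOM was buried inside the file and not only at its start.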