37

I have an unordered list, and � often (but not always!) appears where I have two spaces between characters. What is causing this, and how do I prevent it?

Eric Laoshi
user1032531

4 Answers

43

This specific character � is usually the sign of an invalid (non-UTF-8) character showing up in an output (like a page) that has been declared to be UTF-8. It happens often when

  • a database connection is not UTF-8 encoded (even if the tables are)

  • an HTML or script source file is stored in the wrong encoding (e.g. Windows-1252 instead of UTF-8). Make sure it's saved as a UTF-8 file; the setting is often in the "Save as..." dialog.

  • an online source (like a widget or an RSS feed) is fetched that isn't serving UTF-8
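
The second case can be reproduced in a few lines. This is only a sketch under an assumption: that the "second space" pasted from Word is a non-breaking space (U+00A0), which Windows-1252 stores as the single byte 0xA0. The sample text is made up for illustration.

```python
# Assumption: the pasted text holds a non-breaking space (U+00A0) followed
# by a normal space, which is common in text copied from Word.
text = "item\u00a0 one"

# Saved as Windows-1252, the non-breaking space becomes the lone byte 0xA0.
raw = text.encode("cp1252")

# A page declared as UTF-8 makes the browser decode those bytes as UTF-8.
# A lone 0xA0 is not valid UTF-8, so it is shown as U+FFFD (the � character).
shown = raw.decode("utf-8", errors="replace")
print(repr(shown))

# Fix: read the bytes with the encoding they were really saved in,
# then write them back out as UTF-8.
fixed = raw.decode("cp1252").encode("utf-8")
print(repr(fixed.decode("utf-8")))
```

Plain ASCII spaces are identical in both encodings, which would explain why the character shows up often but not always.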

Pekka
  • No database and no RSS and using UTF-8. – user1032531 Mar 07 '13 at 16:30
  • @user so what's your situation then? Where does the character come from? – Pekka Mar 07 '13 at 16:34
  • I think I cut and pasted it from an MS Word document long ago. It doesn't appear if I look at the text using a basic text editor (i.e. Note) – user1032531 Mar 07 '13 at 16:36
  • @user make sure the file is stored as UTF-8. You may have a select field to that effect in the "save as..." dialog – Pekka Mar 07 '13 at 16:37
  • But if I cut and paste from the browser's source code to Note, it shows them. Regarding my previous comment, when I open the file directly using MS Note, it does not show them. – user1032531 Mar 07 '13 at 16:38
  • @user make sure the file is stored as UTF-8. You may have a select field to that effect in the "save as..." dialog – Pekka Mar 07 '13 at 16:39
  • 3
    Brilliant! It was saved as Windows 1252. Please add this to your answer. Thanks! – user1032531 Mar 07 '13 at 16:40
  • I was trying to UrlFetch in Apps Script. It turns out it was in 1252 format in the source html. `ret = u.getContentText("Windows-1252");` worked. – Adrian Dec 29 '21 at 18:07
5

I had the same issue.

You can fix it by adding the following line to your template:

<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
yivi
pradip
4

It's a character-set issue. Use a tool that inspects the server's response headers (such as the Firebug extension for Mozilla Firefox) to see which character set the server declares for the content. If the character set declared by the server and the character set of the actual HTML content don't match, you will see strange-looking characters like those little black diamonds.
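
If no browser tool is at hand, the same comparison can be sketched in Python. The helper names `header_charset` and `meta_charset` are made up for this example, and the regular expression is a rough approximation rather than a full HTML parser; it assumes you already have the `Content-Type` header value and the first bytes of the page.

```python
import re
from email.message import Message


def header_charset(content_type):
    """Return the charset parameter of a Content-Type header value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()  # lower-cased, None if absent


def meta_charset(html_bytes):
    """Rough guess at the charset declared in a <meta> tag near the top of the page."""
    match = re.search(rb'charset\s*=\s*["\']?([\w-]+)', html_bytes[:1024], re.I)
    return match.group(1).decode("ascii").lower() if match else None


# Hypothetical values: what the server sent versus what the page declares.
declared_by_server = header_charset("text/html; charset=UTF-8")
declared_in_page = meta_charset(
    b'<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />'
)

if declared_by_server != declared_in_page:
    print("Mismatch:", declared_by_server, "vs", declared_in_page)
```

Neither declaration proves how the file was actually saved, so a mismatch is a hint about where to look, not a diagnosis.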

topcat3
1

I had the same issue when generating HTML output from an XSLT. In addition to Pradip's solution, I was also able to resolve the issue using UTF-32.

<meta http-equiv="Content-Type" content="text/html; charset=UTF-32" />
Nathanael Istre