0

I am getting this output:

FBI believed he had a â€˜doomsday deviceâ€™

instead of

FBI believed he had a ‘doomsday device’ 

When I use

iconv("UTF-8", "ISO-8859-1//IGNORE", $topic);

the output is

FBI believed he had a âdoomsday deviceâ

I am not using any header or charset in my file.

Update

I found out why this is happening.

When the UTF-8 byte sequence is interpreted as if it were ISO-8859-1, the output is

â€™

Explanation

0xE28099 breaks down as 0xE2 (â), 0x80 (€) and 0x99 (™). What was one character in UTF-8 (’) gets mistakenly displayed as three (â€™) when misinterpreted as ISO-8859-1.
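The byte breakdown can be reproduced with a minimal PHP sketch. It assumes the mbstring extension is available and uses Windows-1252 as the wrong charset, since that is the code page where 0x80 and 0x99 map to € and ™. The variable names are illustrative only.

```php
<?php
// U+2019 RIGHT SINGLE QUOTATION MARK, written out as its three UTF-8 bytes.
$quote = "\xE2\x80\x99";
echo bin2hex($quote), "\n"; // e28099

// Treat each of those bytes as a Windows-1252 character and re-encode the
// result as UTF-8. This mimics a reader that guesses the wrong charset.
$mojibake = mb_convert_encoding($quote, 'UTF-8', 'Windows-1252');
echo $mojibake, "\n"; // â€™
```

The one-character input comes out as three characters, matching the breakdown above.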

I still have no solution for converting it.

Navneet Pandey
  • 6
    Why not fix the root of the problem instead? See [UTF-8 all the way through](http://stackoverflow.com/q/279170) – Pekka Dec 06 '12 at 16:02
  • 1
    Are these strings coming from a MySQL database? Try running this before the select queries: `SET NAMES utf8` – Dale Dec 06 '12 at 16:05
  • Yes, these are coming from a MySQL database. Setting the charset to utf8 using the procedural style prints a blank page; no data is displayed. – Navneet Pandey Dec 06 '12 at 16:09
  • @Pekka What type of encoding is the output I am getting? – Navneet Pandey Dec 06 '12 at 16:17
  • 1
    It's probably UTF-8 shown in a single-byte encoding like ISO-8859-1. – Pekka Dec 06 '12 at 16:18
  • 1
    It should probably be noted that browsers don't really support ISO-8859-1 and that characters like `€`, `’` and `‘` are unrepresentable in ISO-8859-1. There is never a reason to use ISO-8859-1 over Windows-1252 in this context (I'm sure it has uses, since its 256 characters are also the first 256 code points in Unicode), because it just has useless control characters in place of characters like `€`. – Esailija Dec 06 '12 at 16:42

1 Answer

2

Well, the output page is being interpreted as Windows-1252, not ISO-8859-1.

I recommend setting your header charset to utf-8:

In the Apache config:

AddDefaultCharset utf-8

In php.ini:

default_charset = "utf-8"

Manually in PHP:

header("Content-Type: text/html; charset=utf-8");

If you cannot do any of the above for some reason, convert to Windows-1252 instead:

iconv("UTF-8", "Windows-1252//IGNORE", $topic);
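
As a rough sketch of what that fallback does to the curly quotes: this assumes the installed iconv implementation recognizes the name Windows-1252 and the //IGNORE suffix (glibc does, others may differ), and it uses a hard-coded $topic as a stand-in for the database value.

```php
<?php
// Hypothetical input: the headline with U+2018 and U+2019 as UTF-8 bytes.
$topic = "FBI believed he had a \xE2\x80\x98doomsday device\xE2\x80\x99";

// Windows-1252 has single-byte slots for the curly quotes (0x91 and 0x92),
// so //IGNORE has nothing to drop for this particular string.
$converted = iconv("UTF-8", "Windows-1252//IGNORE", $topic);

// Print only the quoted part as hex to make the single bytes visible.
echo bin2hex(substr($converted, 22)), "\n";
```

The hex dump should start with 91 and end with 92, one byte per quote. Those bytes only display as quotes if the page is actually read as Windows-1252.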
Esailija