20

I've seen this asked several times, but not with a good resolution. I have the following string:

$string = "<p>Résumé</p>";

I want to print or echo the string, but the output comes back as <p>R�sum�</p>. So I try htmlspecialchars() or htmlentities(), which outputs &lt;p&gt;R&eacute;sum&eacute;&lt;p&gt;, and the browser renders the literal text <p>Résumé<p>. I want it, obviously, to render this:

Résumé

And I'm using UTF-8:

header("Content-type: text/html; charset=UTF-8");

What am I missing here? Why do echo and print output a � for any special character? To clarify, the string is actually an entire HTML file stored in a database. The real-world application is not just that one small line.

Mukyuu
Phil Tune
  • in what encoding is your source file? – Jarry Oct 02 '12 at 22:05
  • @Jarry I set the header to UTF-8. Is that what you mean? – Phil Tune Oct 02 '12 at 22:13
  • no, not the header, the encoding of the text file. I had a similar problem, and it turned out it was because the file encoding was latin-1, and I was setting the header to UTF-8 – Jarry Oct 02 '12 at 22:15
  • You'll have to be more specific. I'm not sure what you mean by the text file. The file itself is PHP. I'm sending that header to the browser. – Phil Tune Oct 02 '12 at 22:20

10 Answers

50

After much banging-head-on-table, I have a somewhat better understanding of the issue, which I wanted to post for anyone else who runs into it.

While the UTF-8 character set will display special characters on the client, the server, on the other hand, may not be so accommodating: if the bytes it sends are not actually UTF-8 (in my case they were ISO-8859-1), special characters such as à and è come out as � and �.

To make sure they print correctly, declare the charset the data is really in, here ISO-8859-1:

<?php
    /*Just for your server-side code*/
    header('Content-Type: text/html; charset=ISO-8859-1');
?>
<!DOCTYPE html>
<html>
    <head>
        <meta charset="utf-8"><!-- Note: the charset in the HTTP header above takes precedence over this tag -->
        <title>Untitled Document</title>
    </head>
    <body>
        <?= "àè" ?>
    </body>
</html>

This will print correctly: àè


Edit (4 years later):

I have a little better understanding now. The reason this works is that the client (browser) is being told, through the response header(), to expect an ISO-8859-1 text/html file. (As others have mentioned, you can also do this by updating your .ini or .htaccess files.) The browser then decodes the bytes as ISO-8859-1, which is what they actually are, so the characters stay intact. Note that a charset given in the HTTP header takes precedence over any <meta charset=""> in the document, so the meta tag in the example has no effect on the result.
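
To check which situation you are in, it helps to look at the bytes rather than at the rendered page. The following is only a minimal sketch, not something from my original setup: charsetFor() is a hypothetical helper name, it needs the mbstring extension, and it assumes that anything which is not valid UTF-8 is ISO-8859-1.

```php
<?php
// Hypothetical helper: pick the charset to declare for a response body.
// Assumption: a body that is not valid UTF-8 is treated as ISO-8859-1.
function charsetFor(string $body): string
{
    return mb_check_encoding($body, 'UTF-8') ? 'UTF-8' : 'ISO-8859-1';
}

$latin1 = "<p>R\xE9sum\xE9</p>";         // é stored as the single byte 0xE9
$utf8   = "<p>R\xC3\xA9sum\xC3\xA9</p>"; // é stored as the two bytes 0xC3 0xA9

echo charsetFor($latin1), "\n"; // ISO-8859-1
echo charsetFor($utf8), "\n";   // UTF-8

// In a real page you would then send this before any output:
// header('Content-Type: text/html; charset=' . charsetFor($html));
```

If the helper reports ISO-8859-1 for your data, either declare that charset as above or convert the data to UTF-8 before sending it.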

Phil Tune
  • This 2009 article also summarizes this issue nicely: http://blog.salientdigital.com/2009/06/06/special-characters-showing-up-as-a-question-mark-inside-of-a-black-diamond/. I think particularly my issue may be in using WAMPServer's PHP.ini default encoding. Which is probably why many people reading this and other posts don't understand what we're talking about. I haven't a clue when it comes to encoding arguments, and truthfully, I and you don't want to have to deal with it when our efforts are being put towards just making cool stuff. – Phil Tune Oct 13 '12 at 18:19
  • `charset=ISO-8859-1'` worked for me. Thanks a lot :-). Upvoted – Sudharshan Nair Mar 12 '19 at 06:40
  • php header tag worked for me. Thanks a lot. – Braham Dev Yadav Jan 02 '23 at 09:16
5

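In PHP there is a function, utf8_encode(), that solves this issue when the string is ISO-8859-1 encoded: it converts it to UTF-8. (Note that utf8_encode() is deprecated as of PHP 8.2; mb_convert_encoding() is the usual replacement.)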

echo utf8_encode("Résumé");

//will output Résumé instead of R�sum�

Check the official PHP page.
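
For newer PHP versions, here is a rough equivalent using mb_convert_encoding(). This is a sketch that assumes the mbstring extension is loaded and that the input really is ISO-8859-1; the strings are written with byte escapes so the example does not depend on how the source file is saved.

```php
<?php
$latin1 = "R\xE9sum\xE9"; // "Résumé" as ISO-8859-1 bytes

// Convert ISO-8859-1 to UTF-8 (what utf8_encode() does).
$utf8 = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');
echo bin2hex($utf8), "\n"; // 52c3a973756dc3a9

// Converting a string that is already UTF-8 double-encodes it,
// which is what shows up as "RÃ©sumÃ©" in the browser.
$twice = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');
echo bin2hex($twice), "\n"; // 52c383c2a973756dc383c2a9
```

So only convert when you know the source encoding; running the conversion on data that is already UTF-8 makes things worse.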

Vikas Kandari
4

You can have a mix of PHP and HTML in your PHP files... just do something like this...

<?php
$string = htmlentities("Résumé");
?>

<html>
<head></head>
<body>
<p><?= $string ?></p>
</body>
</html>

That should output Résumé just how you want it to.

If you are on a PHP version older than 5.4 and don't have short tags enabled, replace the <?= $string ?> with <?php echo $string; ?> (from PHP 5.4 on, <?= is always available).

Justin Wood
1

So I try htmlspecialchars() or htmlentities() which outputs &lt;p&gt;R&eacute;sum&eacute;&lt;p&gt; and the browser renders <p>Résumé<p>.

If you've got it working where it displays Résumé with <p></p> tags around it, then just don't convert the paragraph, only your string. Then the paragraph will be rendered as HTML and your string will be displayed within.
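
When the string is a whole HTML page and you cannot separate the text from the tags by hand, one possible approach is to entity-encode everything and then undo the encoding for the markup characters only. This is a sketch under assumptions: encodeKeepingTags() is a made-up helper name, and the input must be valid UTF-8 (otherwise htmlentities() returns an empty string with these flags).

```php
<?php
// Hypothetical helper: turn special characters into entities but keep the tags.
function encodeKeepingTags(string $html): string
{
    // Step 1: encode everything that has a named entity, including < > and &.
    $encoded = htmlentities($html, ENT_NOQUOTES, 'UTF-8');

    // Step 2: decode only &lt; &gt; and &amp; so the markup works again.
    return htmlspecialchars_decode($encoded, ENT_NOQUOTES);
}

// "\xC3\xA9" is é in UTF-8.
echo encodeKeepingTags("<p>R\xC3\xA9sum\xC3\xA9</p>"), "\n";
// <p>R&eacute;sum&eacute;</p>
```

Because the output is pure ASCII, it renders the same whatever charset the page is declared as.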

NightHawk
  • Thank you, but how might one keep from converting the <> characters? The string I would be passing is actually an entire html page. – Phil Tune Oct 02 '12 at 22:15
  • Take a look at this post: http://stackoverflow.com/questions/1364933/htmlentities-in-php-but-preserving-html-tags – NightHawk Oct 02 '12 at 22:19
  • +1 @NightHawk, that post helped. It seems like there is probably a predefined PHP function out there for this, but that works beautifully. Thanks! – Phil Tune Oct 02 '12 at 22:29
  • So I'm still having issues with this é character. When I `UPDATE` my database it turns out as Ã©. Can't figure out why it's doing that or how to fix it. – Phil Tune Oct 08 '12 at 01:44
0

$str = "Is your name O\'vins?";

// Outputs: Is your name O'vins?
echo stripslashes($str);

Vincy Oommen
0

This works for me:

Create or edit a .htaccess file with these lines:

AddDefaultCharset UTF-8
AddCharset UTF-8 .php

If you prefer, create or edit php.ini instead:

default_charset = "utf-8"
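
If you cannot edit php.ini or .htaccess, the same setting can probably be applied per script. A minimal sketch, assuming it runs before any output is sent:

```php
<?php
// Set the default charset for this request only.
// PHP uses this value in the Content-Type header it sends by default.
ini_set('default_charset', 'UTF-8');

echo ini_get('default_charset'), "\n"; // UTF-8
```

This only changes what PHP declares; the bytes you output still have to be UTF-8 for the page to display correctly.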

quantme
0

Try This

Input:

<!DOCTYPE html>
<html>
<body>

<?php
$str = "This is some <b>bold</b> text.";
echo htmlspecialchars($str);
?>

<p>Converting &lt; and &gt; into entities are often used to prevent browsers from using it as an HTML element. <br />This can be especially useful to prevent code from running when users have access to display input on your homepage.</p>

</body>
</html>

Output:

This is some <b>bold</b> text.

Converting < and > into entities are often used to prevent browsers from using it as an HTML element. This can be especially useful to prevent code from running when users have access to display input on your homepage.
Milan Gajjar
0

This works for me. Try this one before the start of HTML. I hope it will also work for you.

<?php header('Content-Type: text/html; charset=iso-8859-15'); ?>
<!DOCTYPE html>

<html lang="en-US">
<head>
Hafiz Ameer Hamza
0

The following worked for me when having a similar issue lately:

$str = iconv('iso-8859-15', 'utf-8', $str);
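
The source charset you pass matters. As a small illustration (a sketch assuming the iconv extension is available), the byte 0xA4 converts to different characters depending on whether it is read as ISO-8859-15 or ISO-8859-1:

```php
<?php
$byte = "\xA4"; // "€" in ISO-8859-15, "¤" in ISO-8859-1

// Read as ISO-8859-15: becomes U+20AC (euro sign) in UTF-8.
echo bin2hex(iconv('ISO-8859-15', 'UTF-8', $byte)), "\n"; // e282ac

// Read as ISO-8859-1: becomes U+00A4 (currency sign) in UTF-8.
echo bin2hex(iconv('ISO-8859-1', 'UTF-8', $byte)), "\n";  // c2a4
```

For characters like é the two charsets agree, so either works there; they differ only in a handful of positions such as this one.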
Paul Roub
seantunwin
0

One of the best ways to do this is to change the collation in your MySQL database.

step 1: Go to the MySQL database

step 2: Select the text-based column you want to display (e.g., post or comments)

step 3: Edit the column and select the collation below.

utf8mb4_unicode_ci

Make sure to change the collation of every text column in which you want to display special characters.

Sometimes htmlspecialchars_decode() or the other entity functions don't convert your special chars back to normal. In that case, the above method should help.

Mahesh