348

I have several PHP pages echoing out various things into HTML pages with the following code.

<meta http-equiv="Content-type" content="text/html; charset=utf-8" />

However, when I validate using the W3C validator it comes up with:

The character encoding specified in the HTTP header (iso-8859-1) is different from the value in the <meta> element (utf-8).

I am quite new to PHP, and I was wondering if I could and should change the header for the PHP files to match the HTML files.

Peter Mortensen
manycheese

7 Answers

948

Use header() to modify the HTTP header:

header('Content-Type: text/html; charset=utf-8');

Make sure to call this function before any output has been sent to the client. Once output has started, the headers have already been sent and can no longer be changed. You can check for that with headers_sent(). See the manual page for header() for more information.
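A minimal sketch of the check described above. The helper name send_html_utf8_header is hypothetical (it is neither part of PHP nor of this answer); only header(), headers_sent() and error_log() are built-in functions.

```php
<?php
// Hypothetical helper: sends the Content-Type header only while that is still possible.
function send_html_utf8_header(): bool
{
    // headers_sent() fills $file and $line with the place where output started.
    if (headers_sent($file, $line)) {
        error_log("Content-Type not set: output already started at $file:$line");
        return false;
    }

    header('Content-Type: text/html; charset=utf-8');
    return true;
}

// Usage: call it at the very top of the script, before any echo or HTML.
// send_html_utf8_header();
// echo '<p>Hello</p>';
```

Returning a boolean rather than throwing is just one possible design; it lets the calling page decide whether a missing header is fatal.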

Gumbo
  • I would only add that when you set the HTTP header correctly like this, you do not need the `<meta>` tag at all anymore. – Jon Nov 25 '10 at 16:55
  • @Jon: I would use both. The HTTP-equivalent `META` is used when the HTML document is not loaded via HTTP (e.g. from disk). – Gumbo Nov 25 '10 at 16:59
  • This will only work if you're executing PHP. To do it for static pages, you should save your HTML file as UTF-8. Doing so will add the BOM character, UTF-8 encoded, to the beginning of the file (bytes 0xEF, 0xBB, 0xBF). Most web servers will notice this and apply the appropriate header. In fact, saving your PHP file as UTF-8 would accomplish the same thing. – Rahly Nov 25 '10 at 16:59
  • @Jeremy Walton: The UTF-8 BOM is not necessarily added. In fact, it's not even necessary for UTF-8, as UTF-8 only has one byte order (but it could be used to identify UTF-8). – Gumbo Nov 25 '10 at 17:01
  • @Gumbo: sure, I am simplifying here and targeting the by far most common web scenario (the question seems to talk about this scenario). Taking into account the apparent level of the question, why do something when you don't even understand what the advantages it may someday provide are? – Jon Nov 25 '10 at 17:04
  • I'd strongly advise against saving files in UTF-8 format with the UTF-8 BOM described by @Rahly. When serving a PDF, image or other binary data back to the client, you will get into trouble. The leading BOM of your PHP scripts may* distort the output (*may = web server dependent). Simply: always avoid UTF-8 BOMs in your PHP files. – Teson Sep 08 '16 at 08:57
35

First make sure the PHP files themselves are UTF-8 encoded.

Some browsers ignore the meta tag. If you only use ASCII characters, it doesn't matter anyway.

http://en.wikipedia.org/wiki/List_of_HTTP_header_fields

header('Content-Type: text/html; charset=utf-8');
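One way to check that a file really is UTF-8 encoded, and that it does not start with a byte order mark, is sketched below. The function names and the UTF8_BOM constant are hypothetical; the sketch relies on the fact that preg_match() with the /u modifier fails on a subject that is not valid UTF-8.

```php
<?php
// The three bytes of a UTF-8 byte order mark.
const UTF8_BOM = "\xEF\xBB\xBF";

// Hypothetical helper: true if $bytes is a valid UTF-8 sequence.
function is_valid_utf8(string $bytes): bool
{
    // With /u, PCRE validates the subject first; invalid UTF-8 makes
    // preg_match() return false instead of 1.
    return preg_match('//u', $bytes) === 1;
}

// Hypothetical helper: true if $bytes starts with the UTF-8 BOM.
function has_utf8_bom(string $bytes): bool
{
    return strncmp($bytes, UTF8_BOM, 3) === 0;
}

// Usage: inspect the raw bytes of the current script.
// $bytes = file_get_contents(__FILE__);
// var_dump(is_valid_utf8($bytes), has_utf8_bom($bytes));
```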
Peter Mortensen
KingCrunch
16

This is a problem with your web server sending out an HTTP header that does not match the one you define. For instructions on how to make the server send the correct headers, see this page.

Alternatively, you can use PHP to modify the headers with the following code, but this has to be done before any output is sent:

header('Content-Type: text/html; charset=utf-8');

More information on how to send out headers using PHP can be found in the documentation for the header function.

Peter Mortensen
EdoDodo
13

You can also use a shorter way:

<?php header('Content-Type: charset=utf-8'); ?>

See RFC 2616. It's valid to specify only character set.

Peter Mortensen
Jason OOO
  • I like this option, because (I assume) it would allow you to set the other part of the content type separately (for example, you have some text/plain pages, and some text/html pages, but they are all UTF-8). Is my understanding correct? – Eric Seastrand Jan 29 '15 at 15:55
  • I cannot find the part of RFC 2616 that says it's valid to specify that way. `Content-Type = "Content-Type" ":" media-type` and `media-type = type "/" subtype *( ";" parameter )` – AI0867 Apr 13 '16 at 12:27
  • It’s not valid to only specify the charset. It’s not valid per RFC 2616 (which is anyway obsolete) nor per RFC 7231 (which is not obsolete) nor per any other RFC. See http://stackoverflow.com/questions/41994062/content-type-with-charset-only/41994400#41994400 – sideshowbarker Feb 02 '17 at 05:31
  • Are you sure you're not confusing this with HTML5's meta charset attribute? – PHP Guru Nov 19 '20 at 06:36
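The grammar quoted in the comments above (`media-type = type "/" subtype *( ";" parameter )`) can be turned into a rough check. This is a simplified sketch, not a full RFC parser: the function name is hypothetical, and the token and quoted-string patterns are approximations of the RFC definitions.

```php
<?php
// Hypothetical helper: rough check that a Content-Type value has the shape
// type "/" subtype *( ";" parameter ).
function is_valid_content_type(string $value): bool
{
    // Approximation of an HTTP "token" (no separators, spaces or control characters).
    $token  = '[!#$%&\'*+.^_`|~0-9A-Za-z-]+';
    // Approximation of a quoted-string parameter value.
    $quoted = '"(?:[^"\\\\]|\\\\.)*"';
    // One parameter: ";" name "=" value, with optional whitespace around ";".
    $param  = "\\s*;\\s*$token=(?:$token|$quoted)";

    return preg_match("/^$token\\/$token(?:$param)*\\s*$/", $value) === 1;
}

// Usage:
// var_dump(is_valid_content_type('text/html; charset=utf-8')); // full media type
// var_dump(is_valid_content_type('charset=utf-8'));            // charset only
```

Under this grammar a value consisting of only `charset=utf-8` is rejected, because the `type "/" subtype` part is mandatory.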
11

For a correct implementation, you need to change a series of things.

Database (immediately after the connection):

mysql_query("SET NAMES utf8");

HTML meta tag (it is probably already set):

<meta charset="utf-8">

PHP header (before any HTML output):

header('Content-Type: text/html; charset=utf-8');

Table collation (for each column):

utf8_unicode_ci
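The mysql_* functions used above were removed in PHP 7.0. With PDO, the connection charset can be set in the DSN instead; with mysqli, the equivalent call is mysqli_set_charset(). The sketch below assumes PDO with the pdo_mysql driver. The helper name, host, database name and credentials are placeholders, and utf8mb4 (MySQL's full UTF-8 character set) is swapped in for the utf8 used in the answer.

```php
<?php
// Hypothetical helper: builds a PDO MySQL DSN that sets the connection charset.
function mysql_utf8_dsn(string $host, string $dbname): string
{
    // "charset" in the DSN takes the place of a separate SET NAMES query.
    return sprintf('mysql:host=%s;dbname=%s;charset=utf8mb4', $host, $dbname);
}

// Usage (needs pdo_mysql and a reachable server, so it is left commented out;
// all values are placeholders):
// $pdo = new PDO(mysql_utf8_dsn('localhost', 'example_db'), 'example_user', 'example_password');
```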
Peter Mortensen
UnChien Andalou
  • The collation of the database does not influence the output generated by PHP, because the data is encoded to the native format configured for use with PHP before it's ever returned to the user. Secondly, the OP hasn't mentioned he's using MySQL. Thirdly, MyISAM is outdated and should not be recommended unless you know what you're doing. There is a reason InnoDB became the new default. – EWit Aug 18 '14 at 22:32
  • Finally, a complete list of all places to set character encoding. – Filip OvertoneSinger Rydlo May 19 '15 at 11:05
  • mysql_query("SET NAMES utf8"); before my select query fixed the issue for me. Thanks :) – Deepak Goswami Mar 04 '16 at 05:59
9

PHP adds the charset to the Content-Type header automatically when default_charset is set:

ini_set('default_charset', 'utf-8');
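A short sketch of setting the value at runtime and confirming that it took effect. It assumes the script runs before any output has been sent; the variable name and the error message are illustrative only.

```php
<?php
// ini_set() returns the previous value on success and false on failure.
$previous = ini_set('default_charset', 'UTF-8');

if ($previous === false) {
    error_log('default_charset could not be changed');
}

// From here on, the Content-Type header that PHP sends by default
// carries "charset=UTF-8".
$current = ini_get('default_charset');
```

Setting default_charset in php.ini has the same effect for every script, without a call in each file.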
Peter Mortensen
Nikl
3

As explained on http://php.net/default-charset,

"UTF-8" is the default value, and it is used as the default character encoding for htmlentities(), html_entity_decode() and htmlspecialchars() if the encoding parameter is omitted.

The default php.ini sets it to "UTF-8" in the "Data Handling" section:

; PHP's default character set is set to UTF-8.
; http://php.net/default-charset
default_charset = "UTF-8"

You can also send a header with a different encoding, as needed, before any content is output:

header('Content-Type: text/html; charset=utf-8');

or

header('Content-Type: text/html; charset=iso-8859-1');

or any other charset you need to declare.
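Because htmlspecialchars() falls back to default_charset when the encoding parameter is omitted, passing the encoding explicitly keeps the escaping independent of the server's php.ini. A minimal sketch, with a hypothetical wrapper name:

```php
<?php
// Hypothetical wrapper: escapes text for HTML output with the encoding
// stated explicitly instead of relying on default_charset.
function escape_html(string $text): string
{
    // ENT_QUOTES escapes both quote styles; ENT_SUBSTITUTE replaces invalid
    // byte sequences rather than returning an empty string.
    return htmlspecialchars($text, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// Usage: "\xC3\xA9" is the UTF-8 byte sequence for "é".
// echo escape_html("<b>caf\xC3\xA9</b>");
```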