151

I am using a hosting company and it will list the files in a directory if the file index.html is not there. It uses ISO 8859-1 as the default encoding.

If the server is Apache, is there a way to set UTF-8 as the default instead?

I found out that it is actually using a DOCTYPE of HTML 3.2 and there is no charset at all, so it is not setting any encoding. But is there a way to make it use UTF-8?

Peter Mortensen
nonopolarity
  • This question is very old but currently (in 2021), at least in my case (Debian 10), the utf-8 characters are served properly and it seems that it's not needed to uncomment or change the setting `AddDefaultCharset` to utf-8 at all (On Debian, it's in `/etc/apache2/conf-available/charset.conf`). – aderchox Dec 21 '21 at 03:38

13 Answers

191

In httpd.conf add (or change if it's already there):

AddDefaultCharset utf-8
MartinodF
62

Add this to your .htaccess:

IndexOptions +Charset=UTF-8

Or, if you have administrator rights, you could set it globally by editing httpd.conf and adding:

AddDefaultCharset UTF-8

(You can use AddDefaultCharset in .htaccess too, but it won’t affect Apache-generated directory listings that way.)

Mathias Bynens
  • 4
    This is a great solution and less invasive than modifying the httpd.conf file. – Andy Swift Jun 26 '12 at 16:20
  • 1
    on my server, the `.htaccess` can affect all the subdirectories as well, probably apache will look for any `.htaccess` up the parent directory all the way to the root directory of the website folder – nonopolarity Sep 27 '12 at 13:14
  • 2
    Yes, that’s how `.htaccess` works on all servers — it affects all subdirectories as well. However, Apache-generated directory listing pages can’t be forced to UTF-8 by using `.htaccess` (AFAIK). – Mathias Bynens Sep 27 '12 at 16:42
  • 9
    Please note changing **serverwide** settings via `.htaccess` files is generally bad practice. Bugs become harder to track when server settings are distributed across various files. There's a slight performance hit too: with each requested file, Apache has to read the directory's `.htaccess` file and all `.htaccess` files of parent directories. `.htaccess` should therefore only be used for either directory specific settings (e.g. preventing access to a specific directory) or when there is absolutely no possibility to gain administrator rights. – Robbert Sep 07 '13 at 12:58
  • 3
    Up voted, the IndexOptions +Charset=UTF-8 did the trick for me, thanks! – mTorres Sep 14 '13 at 10:14
  • On Debian, you want to change apache2.conf rather than httpd.conf. Source: http://www.control-escape.com/web/configuring-apache2-debian.html – sivi Sep 14 '14 at 16:14
  • With `Options -Indexes` that I keep on for security reasons on Apache, the `IndexOptions +Charset=UTF-8` does not work even for single files – Marco Demaio Jan 19 '19 at 18:17
  • The other answer didn't have a plus: IndexOptions Charset=UTF-8. I wonder which one(s) are right. – Dan Jacobson Apr 01 '23 at 02:27
31

See AddDefaultCharset Directive, AddCharset Directive, and this article.

AddDefaultCharset utf-8

But I have to use Chinese characters now and then. Previously, I converted the Chinese characters to Unicode code points and included them in the document as numeric character references (the &# hack). That is only practical for pages with a few such characters.

There is a better way to do that: encode the charset information in the filename, and Apache will send the proper encoding header based on it. This is possible thanks to the AddCharset lines in the conf file, such as the line below:

conf/httpd.conf:

AddCharset UTF-8 .utf8

So if you have a file whose name ends in .html.utf8, Apache will serve the page as UTF-8 and will send the matching charset in the Content-Type header.
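As an illustration of the filename approach, here is a minimal sketch that maps two extensions to two charsets so that differently encoded files can sit side by side. The `.latin1` extension and the file names in the comments are assumptions chosen for the example, and the mapping relies on mod_mime being loaded.

```apache
# Sketch only: map filename extensions to charsets (requires mod_mime).
# The .latin1 extension is an illustrative choice, not something the
# answer above prescribes.
AddCharset UTF-8      .utf8
AddCharset ISO-8859-1 .latin1

# With these mappings, hypothetical files would be expected to go out as:
#   page.html.utf8    ->  Content-Type: text/html; charset=UTF-8
#   page.html.latin1  ->  Content-Type: text/html; charset=ISO-8859-1
```

The extension only labels the response, so each file still has to be saved in the encoding its name announces.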

Eugene Yokota
28

In file .htaccess, add this line:

AddCharset utf-8 .html .css .php .txt .js

This is for those that do not have access to their server's configuration file. It is just one more thing to try when other attempts failed.

As for performance issues caused by using .htaccess, I have not seen any. My typical page load times are 150-200 ms with or without an .htaccess file.

What good is performance if your page does not render correctly? Most shared servers do not give users access to the configuration file, which is the preferred place to add a character set.

Peter Mortensen
Misunderstood
25

On Ubuntu 12.04, it's sufficient to uncomment the line AddDefaultCharset UTF-8 in /etc/apache2/conf.d/charset. If you're using upstream Apache, the file may be called httpd.conf, and you may have to insert the line.

  • 4
    There is no such file as `/etc/apache2/conf.d/charset`. It is a custom include file by your distribution. As is any other file that’s not `httpd.conf`. – Evi1M4chine Jul 20 '15 at 15:25
  • 3
    It's `/etc/apache2/conf-enabled/charset.conf` on my distribution (Ubuntu 16.04). It also didn't work. – Alator Mar 04 '18 at 18:54
  • Can you [update your answer](https://stackoverflow.com/posts/15253250/edit), e.g. with Linux distribution information, incl. version. E.g., what was the original Linux distribution and version? (But ***without*** "Edit:", "Update:", or similar - the answer should appear as if it was written today.) – Peter Mortensen Aug 15 '21 at 12:44
12

For completeness, on Apache2 on Ubuntu, you will find the default charset in charset.conf in conf-available.

Uncomment the line

AddDefaultCharset UTF-8
David Glance
11

I'm not sure whether you have access to the Apache config (httpd.conf), but if you do, you should be able to set the AddDefaultCharset directive. See:

http://httpd.apache.org/docs/2.0/mod/core.html

or the equivalent Apache 1.x docs (http://httpd.apache.org/docs/1.3/mod/core.html#adddefaultcharset).

Make sure the following is set (AddDefaultCharset is a core directive; the related AddCharset directive comes from mod_mime):

AddDefaultCharset utf-8

However, this only works when "the response content-type is text/plain or text/html".
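Because of the restriction quoted above, files such as stylesheets or scripts do not pick up the default charset. A minimal sketch of one way to cover them, assuming mod_mime is loaded and that the listed extensions match the files actually served (the extension list is an assumption for illustration):

```apache
# Default charset for text/plain and text/html responses (core directive).
AddDefaultCharset utf-8

# Other text types are not covered by the default, so label them by
# extension instead (mod_mime directive; adjust the extension list).
AddCharset utf-8 .css .js
```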

You should also make sure that your pages declare a charset themselves. See this for more info:

http://www.w3.org/TR/REC-html40/charset.html

Jonathan Holloway
10

This is untested, but it will probably work.

In your .htaccess file, add:

<Files ~ "\.html?$">
    Header set Content-Type "text/html; charset=utf-8"
</Files>

However, this will require mod_headers on the server.
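Since the `Header` directive is unavailable without mod_headers, one cautious variant is to wrap the rule in an `<IfModule>` block so that a server lacking the module skips it instead of failing on an unknown directive. This is a sketch under the same "untested" caveat as above; it uses `<FilesMatch>`, the regex form of `<Files>`, and assumes the host allows these directives in .htaccess (AllowOverride FileInfo).

```apache
# Sketch only: apply the header rule only when mod_headers is loaded.
<IfModule mod_headers.c>
    # Matches file names ending in .htm or .html
    <FilesMatch "\.html?$">
        Header set Content-Type "text/html; charset=utf-8"
    </FilesMatch>
</IfModule>
```

The trade-off is that a missing module then goes unnoticed: pages are simply served without the forced charset, so it is worth checking the response headers afterwards.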

Peter Mortensen
MiffTheFox
5

Just a hint if you have long filenames in UTF-8: by default they are shortened to 20 bytes, so the last character may be "cut in half" and then fail to display properly. In that case, you may want to set the following:

IndexOptions Charset=UTF-8 NameWidth=*

The NameWidth=* setting sizes the name column to the longest filename, which prevents your file names from being shortened and keeps them readable.

As other users already mentioned, this should be added either in httpd.conf or apache2.conf (if you do have admin rights) or in .htaccess (if you don't).
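For the .htaccess case, a possible variant uses the `+` prefix on each keyword, which asks mod_autoindex to add the option to whatever IndexOptions the directory inherits rather than replacing them. This is a sketch, assuming the host's AllowOverride setting permits index directives in .htaccess:

```apache
# .htaccess sketch: add both options on top of the inherited IndexOptions.
# Unprefixed keywords would instead discard the inherited settings.
IndexOptions +Charset=UTF-8 +NameWidth=*
```

Whether the prefixed or unprefixed form is preferable depends on whether the server-wide listing options (icons, sorting and so on) should be kept for that directory.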

Peter Mortensen
pstryk
3

With all the HTML files in UTF-8 and no content-type meta tags in them, the only way I got Apache 2.4 to send the needed default charset for these files was to add both directives:

AddLanguage ru .html
AddCharset UTF-8 .html
hon2a
Alex
3

Just leave 'default_charset' empty in WHM: default_charset = ''

P.S.: In WHM, go to Home → Service Configuration → PHP Configuration Editor → click 'Advanced Mode' → find 'default_charset' and leave it blank. Just nothing, not UTF-8 and not ISO.

Peter Mortensen
grrow
1

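<meta charset='utf-8'> sets the encoding when Apache sends no charset of its own; if Apache does send one in the Content-Type header (cf. /etc/apache2/conf.d/charset), browsers give the header precedence over the meta tag.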

If this is not enough, then you probably created your original file with the ISO 8859-1 encoding character set. You have to convert it to the proper character set:

iconv -f ISO-8859-1 -t UTF-8 source_file.php -o new_file.php

Peter Mortensen
Gaby
0

In my case I added this to file .htaccess:

AddDefaultCharset off
AddDefaultCharset windows-1252
Peter Mortensen
Ruslan Novikov