I need to create a website (in PHP) that has filenames that include international characters.

For example: transportører.php (notice the 'o' with the diagonal line through it).

So I happily create the file, save it, and upload it to the web server. Whenever I LINK to this file, however, it all goes wrong. I'll have the usual link syntax:

<a href="transportører.php">My Link Text</a>

Upon clicking such a link, the web browser attempts to navigate to a non-existent page:

The requested URL /transportÃ¸rer.php was not found on this server.

Notice how the filename has been mutated? The "ø" character in "transportører.php" has been changed into the bizarre "Ã¸" symbol (that's not a comma after the "A", by the way, but an actual component of the symbol itself).

There's obviously some sort of translation going on here, but what, why, and how do I prevent it?

unor
Nic

3 Answers


I think there are two possible reasons:

HTML encoding

The encoding of the HTML file may be wrong, so the link actually points to the wrong path. Add

<meta charset="UTF-8">

in the head section of your file.

server settings

If the server is resolving the link wrongly (you can check this by typing the address of your Norwegian-named .php file into the browser and seeing whether it gets replaced), you need to find out which server you are using and investigate in that direction. For Apache, the question "How to change the default encoding to UTF-8 for Apache?" looks promising.

Jonathan Scholbach
  • Thanks for the response, Jonathan. However, I don't think this is it. I'm actually getting a "404" response, because the browser is attempting to open a non-existent file (transportÃ¸rer.php) instead of the linked file (transportører.php). Hence the HTTP headers are not even being read. If you want to reproduce the fault, try browsing to www.cclnorway.co.uk and clicking on the "transportører" link in the horizontal navigation bar just below the banner. – Nic Jan 30 '18 at 11:06
  • @Nic OK, I checked that; it looks like your server is resolving the address into ASCII. I think you need to change settings on your server. I updated my answer accordingly. – Jonathan Scholbach Jan 30 '18 at 11:20

As the URL isn’t percent-encoded in the hyperlink, browsers assume¹ UTF-8 for percent-encoding it, where ø becomes %C3%B8.

However, your server seems to expect/use ISO 8859-1 (instead of UTF-8), where ø becomes %F8.

A quick fix would be to link to the ISO 8859-1 percent-encoded URL:

<a href="transport%F8rer.php">transportører</a>

(A better fix would be to let your server use UTF-8 for everything, and then to use the UTF-8 percent-encoded URL in the hyperlink.)
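To see the two encodings side by side, here is a minimal Python sketch (Python is used purely for illustration; the variable names are made up for this example). It percent-encodes the same filename once as UTF-8 and once as ISO 8859-1:

```python
from urllib.parse import quote

# The filename from the question.
filename = "transportører.php"

# What a browser would typically send: UTF-8 bytes, percent-encoded.
utf8_url = quote(filename, encoding="utf-8")

# What a server expecting ISO 8859-1 would look for.
latin1_url = quote(filename, encoding="iso-8859-1")

print(utf8_url)    # transport%C3%B8rer.php
print(latin1_url)  # transport%F8rer.php
```

The two results differ only in how "ø" is encoded, which is enough for the server to treat them as two different paths.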


¹ Either by default, or because the linking page seems to use UTF-8 (at least according to the HTTP header Content-Type: text/html; charset=UTF-8).

unor

Well, this is embarrassing. Everything was, in actual fact, working correctly. The 404 error made the filename LOOK "wrong" (e.g. transportÃ¸rer.php). However, this is actually correct: it is the UTF-8 encoding of "ø" displayed as two single-byte characters. That is how the browser seems to reference the file "behind the scenes". So to the browser, "transportÃ¸rer.php" is synonymous with "transportører.php".
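The odd-looking name in the error page can be reproduced with a short Python sketch (illustration only; it assumes the error page showed the UTF-8 bytes of the request as ISO 8859-1 text, which matches the symptom but was not verified on the server). It decodes the same percent-encoded request path in both ways:

```python
from urllib.parse import unquote

# The path a browser sends for the link, percent-encoded as UTF-8.
requested = "transport%C3%B8rer.php"

# Decoded as UTF-8, the two bytes C3 B8 form the single character "ø".
as_utf8 = unquote(requested, encoding="utf-8")

# Decoded as ISO 8859-1, each byte becomes its own character: "Ã" and "¸".
as_latin1 = unquote(requested, encoding="iso-8859-1")

print(as_utf8)    # transportører.php
print(as_latin1)  # transportÃ¸rer.php
```

Both strings come from the same bytes, so the name in the 404 message says nothing on its own about whether the request was wrong.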

What was happening was that FileZilla (my FTP client) objects to international characters. It was changing the filename during upload, replacing the international characters with "something else". The filenames LOOKED correct on the screen (when I viewed the website folder with Linux Mint's native FTP client), but the underlying character encoding was NOT correct. The web browsers could tell the difference, and hence didn't associate my links with the (mutated) file names, triggering the 404 error.
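A filename that looks right on screen can still be stored with the wrong bytes. Here is a rough Python sketch for checking the raw bytes of the names in a directory. The helpers `describe_name` and `audit_directory` are hypothetical names for this example, and since it is not known exactly what FileZilla substituted, the check only reports whether each name is plain ASCII, valid UTF-8, or neither:

```python
import os

def describe_name(raw: bytes) -> str:
    """Classify a raw filename as 'ascii', 'utf-8', or 'not utf-8'."""
    try:
        raw.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        raw.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return "not utf-8"

def audit_directory(path: bytes = b".") -> dict:
    """Map each raw filename in `path` to its classification.

    Passing a bytes path makes os.listdir return undecoded bytes names.
    """
    return {name: describe_name(name) for name in os.listdir(path)}

if __name__ == "__main__":
    for name, kind in sorted(audit_directory().items()):
        print(kind, name)
```

For example, `describe_name(b"transport\xc3\xb8rer.php")` returns `"utf-8"`, whereas `describe_name(b"transport\xf8rer.php")` (the ISO 8859-1 form) returns `"not utf-8"`.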

The solution in a nutshell: I used Linux Mint's native FTP client to upload my files, overwriting the ones uploaded by FileZilla, and everything just sprang into life.

Thanks to everyone who offered advice... it was all good stuff, just not the solution in this particular case.

Nic