I use special characters (the Swedish letters å, ä and ö).

I have some folders that contain images for classifieds. The folders are named by category.

for ($i = 1; $i <= 5; $i++) {
    if (file_exists($big_images.$i.'.jpg')) {
        echo "Inne"; // debug output
        unlink($big_images.$i.'.jpg');
    }
    if (file_exists($thumb_images.$i.'.jpg')) {
        unlink($thumb_images.$i.'.jpg');
    }
}

I allow up to 5 images on my site, each with a file name ending in a number from 1 to 5. My problem is that whenever the folder name contains a special character, file_exists returns false, i.e. it doesn't find the file even though it is there.

All documents are in utf-8 format.

This works when there are no special characters in the folder names.

If you need more input, let me know.

1 Answer


What's the server OS?

If it's Windows, you'll not be able to access files under a UTF-8-encoded filename, because the Windows implementation of the C IO libraries used by PHP will only talk in the system default code page. For Western European installs, that's code page 1252. You can convert a UTF-8 string to cp1252 using iconv:

$winfilename= iconv('utf-8', 'cp1252', $utffilename);

(utf8_decode could also be used, but it would give the wrong results for Windows's extension characters that map to the range 0x80-0x9F in cp1252.)
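A minimal sketch of that conversion with a check for failure. The helper name `toCp1252` is made up for illustration, and the byte escapes stand in for UTF-8 text so the example doesn't depend on how the script file itself is saved:

```php
<?php
// Hypothetical helper: convert a UTF-8 file name to cp1252.
// Returns null when iconv reports that the name cannot be converted.
function toCp1252($utf8Name)
{
    // '@' hides the notice iconv raises for characters it cannot convert;
    // the false return value is checked instead.
    $converted = @iconv('UTF-8', 'CP1252', $utf8Name);
    return $converted === false ? null : $converted;
}

// "åäö" as UTF-8 bytes becomes one byte per letter in cp1252.
echo bin2hex(toCp1252("\xC3\xA5\xC3\xA4\xC3\xB6")), "\n"; // e5e4f6

// The euro sign lands in the 0x80-0x9F range mentioned above.
echo bin2hex(toCp1252("\xE2\x82\xAC")), "\n"; // 80

// Greek alpha has no cp1252 equivalent. With glibc's iconv this is
// typically NULL; other iconv implementations may substitute a character.
var_dump(toCp1252("\xCE\xB1"));
```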

Files whose names include characters outside the repertoire of the system code page (e.g. Greek on a Western box) cannot be accessed at all by PHP and other programs using stdio. There are scripting languages that can use native-Unicode filenames through Win32 APIs, but PHP5 isn't one of them.

And of course the step above shouldn't be used when deployed on a different OS where the filesystem is UTF-8-encoded (i.e. modern Linux).
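One way to keep the conversion out of the code path on other systems is to route every path through a single helper. This is only a sketch under assumptions: `fsPath` and `deleteImages` are hypothetical names, the `PHP_OS` prefix check is one common way to detect Windows, and whether a given PHP build on Windows actually needs the conversion is left to the caller as a parameter, since the behaviour described above is for PHP5.

```php
<?php
// Hypothetical helper: return the byte string to pass to filesystem functions.
// $needsCodePage is decided by the caller (an assumption, not auto-detected).
function fsPath($utf8Path, $needsCodePage)
{
    if (!$needsCodePage) {
        return $utf8Path; // UTF-8 filesystem: pass the name through unchanged
    }
    $converted = @iconv('UTF-8', 'CP1252', $utf8Path);
    if ($converted === false) {
        throw new RuntimeException('Path cannot be represented in cp1252');
    }
    return $converted;
}

// The loop from the question, with each path converted before use.
// Returns the number of files that were deleted.
function deleteImages($utf8Dir, $needsCodePage)
{
    $deleted = 0;
    for ($i = 1; $i <= 5; $i++) {
        $file = fsPath($utf8Dir . $i . '.jpg', $needsCodePage);
        if (file_exists($file) && unlink($file)) {
            $deleted++;
        }
    }
    return $deleted;
}

// One common Windows check; treat the result as a starting point only.
$onWindows = strtoupper(substr(PHP_OS, 0, 3)) === 'WIN';
```

A call would then look like `deleteImages($big_images, $onWindows)`, with `$big_images` holding the UTF-8 folder path from the question.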

If you need to be seamlessly cross-server-compatible with PHP, you'll have to refrain from using non-ASCII characters in filenames. Sorry.

bobince
  • I am encountering this issue on a system that, due to use case requirements, *must* be hosted on a Windows machine, and *must* contend with files that have characters outside of the 1252 codepage. I've found that the `DirectoryIterator` and `SplFileInfo` classes are able to interact with these files. `iconv` throws warnings about invalid characters, and `file_exists` will fail in many circumstances. +1 from me for getting me on the track that eventually led to our half-measure solution, and again note that `SplFileInfo` seems to be a more universal solution. – Chris Baker Jan 07 '14 at 16:58
  • Alternatively, URL-encoding the special characters in your filenames can be a good solution. – vdegenne Apr 03 '15 at 21:57
  • Can't emphasize enough: "**…If you need to be seamlessly cross-server-compatible with PHP, you'll have to refrain from using non-ASCII characters in filenames…**" – Uwe Keim Mar 05 '17 at 18:55
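The ASCII-only advice and the URL-encoding suggestion in the comments can be combined. A rough sketch, assuming the category name arrives as UTF-8 and that percent-encoded folder names are acceptable on disk; `asciiFolderName` is a hypothetical helper, not something from the answer:

```php
<?php
// Hypothetical helper: derive an ASCII-only folder name from a UTF-8
// category name by percent-encoding every byte outside A-Z a-z 0-9 - _ . ~
function asciiFolderName($utf8Category)
{
    return rawurlencode($utf8Category);
}

// "Båtar" given as UTF-8 bytes
echo asciiFolderName("B\xC3\xA5tar"), "\n"; // B%C3%A5tar

// The original name can be recovered for display with rawurldecode.
echo rawurldecode('B%C3%A5tar') === "B\xC3\xA5tar" ? "round-trips\n" : "mismatch\n"; // round-trips
```

Because the stored name is plain ASCII, the same folder name works whether or not the server's filesystem layer understands UTF-8.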