I want to save data in files whose names contain Unicode characters (Chinese, Cyrillic, Arabic, ...) using PHP's file_put_contents() function. I don't want to encode the names with something like urlencode(), because nobody could read file names that consist only of non-Latin characters. The three biggest operating systems, Windows, Mac OS X and Linux, support Unicode characters (UTF-8 or UTF-16) in file names and display them without problems, but it seems it is not as easy as just calling something like this:

file_put_contents(__DIR__ . DIRECTORY_SEPARATOR . "こんにちは.txt", "");

On Windows 7 (German localization) the file is stored as:

ã“ã‚“ã«ã¡ã¯.txt

The PHP file itself is saved in UTF-8 encoding. Is there a uniform way to save files with Unicode characters in the file name on these three systems?
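
The garbled name above looks like the UTF-8 bytes of the file name being read one byte at a time in the system's ANSI code page (Windows-1252 on a German Windows 7). The following is only a sketch to make that visible, not an explanation confirmed anywhere in this thread; the variable names are made up for illustration.

```php
<?php
// "\u{...}" escapes (PHP 7+) keep the example independent of how this file is saved.
$name = "\u{3053}\u{3093}\u{306B}\u{3061}\u{306F}.txt"; // こんにちは.txt

// Each kana takes three bytes in UTF-8.
$hex = bin2hex($name);

// Read byte by byte as Windows-1252: e3 is "ã", 93 is "“", 82 is "‚", ab is "«".
// Byte 81 has no printable character in Windows-1252, so it does not show up.
echo $hex, PHP_EOL;
```

Every kana starts with the byte e3, which would explain the repeated "ã" in the stored name.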

StanE
  • Have you seen [this](http://stackoverflow.com/a/25395755/1459926) ? – Alexandru Guzinschi May 09 '15 at 17:08
  • urlify can't help with chinese etc. Use iconv('utf', 'local charset//translit', ...), or use PHP COM objects. – Deadooshka May 09 '15 at 18:44
  • @AlexandruGuzinschi Yes, but this doesn't address my issue. – StanE May 09 '15 at 21:31
  • @Deadooshka This won't work, since it doesn't solve my problem. I don't want to modify the original file name, which must be used as it is. AFAIK, iconv only converts from one encoding to another. In my example I used only Japanese hiragana; sorry if I was not clear: this is about Unicode in general, so the file names can contain mixed scripts (like Cyrillic and Chinese). – StanE May 09 '15 at 21:36
  • try e.g. [WinFsStreamWrapper](https://github.com/chintanbanugaria/92five/blob/master/vendor/patchwork/utf8/class/Patchwork/Utf8/WinFsStreamWrapper.php) and prefix `win://`. It uses COM with Scripting.FileSystemObject – Deadooshka May 09 '15 at 21:48
  • @jmiller If you have access to the Windows machine (_i.e: VPS, dedicated_), you can try installing [php-wfio](https://github.com/kenjiuno/php-wfio) PHP extension. – Alexandru Guzinschi May 10 '15 at 07:49
  • @AlexandruGuzinschi Unfortunately, the installation of an additional php extension is not a possible option. It is a big application, which is used by many users (and many of them don't have access to their virtual system). Otherwise I would write my own extension. ;-) But thank you for this reference. – StanE May 12 '15 at 09:50

2 Answers

file_put_contents(__DIR__ . DIRECTORY_SEPARATOR . iconv("UTF-8", "cp1251", $filename), $data); - works perfectly!

Artem Mostyaev
nuker
  • Unfortunately not working. Try `$filename = "平がなприветSpäß.txt";`. iconv() says "Detected an illegal character in input string" and discards the chars. Using "//TRANSLIT" will transliterate them (I want the original chars), while "//IGNORE" will discard them silently. – StanE Dec 15 '17 at 15:53
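
The objection in the comment above can be checked with a small round-trip test. This is a sketch only: fitsCodePage() is a hypothetical helper, not part of PHP or the iconv extension, and the exact behaviour of iconv() on unconvertible characters depends on the underlying iconv library.

```php
<?php
// Hypothetical helper for illustration; not part of PHP or the iconv extension.
function fitsCodePage(string $name, string $codePage): bool
{
    // @ hides the notice iconv() raises on characters the target cannot hold.
    $converted = @iconv('UTF-8', $codePage, $name);
    if ($converted === false) {
        return false;
    }
    // Only a lossless round trip counts as "fits".
    return @iconv($codePage, 'UTF-8', $converted) === $name;
}

var_dump(fitsCodePage("привет.txt", "CP1251"));           // Cyrillic only
var_dump(fitsCodePage("平がなприветSpäß.txt", "CP1251")); // mixed scripts
```

A Cyrillic-only name should fit into cp1251, while the mixed name from the comment should not, which is why converting to a single code page cannot be a general solution.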

As answered here, it's a known Windows problem (see "glob() can't find file names with multibyte characters on Windows?").

You could try URLify to work around the problem on Windows only; the files should then still have readable names...

Quim Calpe
  • I don't want other file names (they must not be different). URLify changes chars to completely different chars or omits them entirely. On OS X 10.9.5 I have the same problem, but there the file name from the example above results in ?????.txt – StanE May 09 '15 at 21:30
  • Works fine for me in PHP 5.6.3 and Yosemite (OSX 10.10.3). On Windows your only chance is using a COM object... citation from a comment in this (http://stackoverflow.com/a/9882557/3950202) thread: "[...] this was quickly confirmed by one of the core devs in that same thread. Basically your only chance is to use COM or accept the fact that PHP on windows is not able to do this." – Quim Calpe May 11 '15 at 07:26
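
The COM route mentioned in the comments could look roughly like the sketch below. It is untested guesswork in the spirit of the WinFsStreamWrapper linked above, not that library's code: writeFileUnicodeName() is a hypothetical helper, the path is assumed to be absolute, the Windows branch needs the com_dotnet extension, and it assumes 8.3 short names are enabled on the volume.

```php
<?php
// Hypothetical helper: not part of PHP or of any library named in this thread.
function writeFileUnicodeName(string $path, string $data): bool
{
    // Outside Windows, or without the COM extension, assume the filesystem
    // takes the UTF-8 bytes of the name as they are.
    if (DIRECTORY_SEPARATOR !== '\\' || !class_exists('COM')) {
        return file_put_contents($path, $data) !== false;
    }

    // CP_UTF8 tells the COM extension that PHP strings are UTF-8.
    $fso = new COM('Scripting.FileSystemObject', null, CP_UTF8);

    // Create (or truncate) the file through the Unicode-aware COM API.
    $fso->CreateTextFile($path, true)->Close();

    // Assumption: 8.3 short names are enabled, so ShortPath is an alias
    // that PHP's own file functions can open.
    $shortPath = $fso->GetFile($path)->ShortPath;

    return file_put_contents($shortPath, $data) !== false;
}
```

Usage would be something like writeFileUnicodeName(__DIR__ . DIRECTORY_SEPARATOR . "こんにちは.txt", ""). Writing through the short path rather than through the COM text stream keeps the file contents byte-exact.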