5

Can't read UTF-8 file names from Windows file system (main Windows language is English)

<?php

$path_to_read = 'D:\music';

class AudioFilterIterator extends FilterIterator
{
    public function accept()
    {
        // strpos() returns an integer offset or false, so compare explicitly to get a boolean.
        return strpos($this->current(), '.mp3') !== false;
    }
}

$object = new RecursiveIteratorIterator(new RecursiveDirectoryIterator($path_to_read));

$iterator = new AudioFilterIterator($object);

echo "<pre>";

$files = array();

foreach($iterator as $file)
{
    echo $file . "\n";
}

For example, I have a file named "10 Hört auf.mp3", but the output I get is "10 Hort auf.mp3".

How can I fix it?

Danny
  • 1
    Have you tried utf8_encode()? Anyway, it would make more sense to save your files without special characters, so this problem wouldn't exist. – Jonast92 Mar 28 '13 at 14:25
  • Why not use a `readdir()` and a `preg_match()` instead? This code is quite heavy for what it's doing. –  Mar 28 '13 at 14:30
  • 1
    Aren't filenames in Windows encoded in UTF-16? See http://stackoverflow.com/a/2051018/393701 – SirDarius Mar 28 '13 at 14:31
  • Please check if you had added `` in the header. – tuxnani Mar 28 '13 at 14:40
  • 1
    @CamilStaps - I disagree. `readdir()` will require you to implement your own logic to recursively iterate through a directory structure, and using a regular expression to match file names when a simple `strpos()` suffices is extraneous overhead. I think the sample code the OP provided is lightweight and the approach I would recommend. – nickb Mar 28 '13 at 14:53
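As a small illustration of the filtering approach discussed in these comments, here is a minimal sketch of an extension check that could be used inside `accept()`. The helper name `isMp3` is hypothetical and not part of the original code; it uses `pathinfo()` rather than `strpos()`, so a name like `notes.mp3.txt` is not matched and upper-case extensions are accepted.

```php
<?php
// Hypothetical helper (not from the original post): true when the
// path's extension is "mp3", ignoring case.
function isMp3(string $path): bool
{
    // pathinfo() returns the part after the last dot, or '' if there is none.
    return strtolower(pathinfo($path, PATHINFO_EXTENSION)) === 'mp3';
}

var_dump(isMp3('song.MP3'));
var_dump(isMp3('notes.mp3.txt'));
```

This only changes which names are accepted; it has no effect on how the names themselves are encoded.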

2 Answers

0

There is no way around this bug, because PHP works on top of the WinAPI, which has no reasonable UTF-8 support.

Danny
  • you mean there is no problem on linux? – hpaknia Sep 23 '13 at 06:00
  • MS Windows is built on UTF-16, which can be transcoded to UTF-8 without any loss. It even comes with functions to transcode freely between UTF-16 and UTF-8. The rest is just a matter of implementation quality. – Ulrich Eckhardt Feb 10 '15 at 19:31
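The lossless transcoding mentioned in the comment above can be illustrated with a short sketch. It assumes the iconv extension is available and uses a sample string rather than a name read from disk; it shows only that UTF-8 and UTF-16LE convert back and forth without loss, not how PHP itself reads file names.

```php
<?php
// Sample name written with an explicit escape so the source file's
// own encoding does not matter.
$utf8 = "10 H\u{00F6}rt auf.mp3";

// UTF-8 -> UTF-16LE (two bytes per character for this sample).
$utf16 = iconv('UTF-8', 'UTF-16LE', $utf8);

// UTF-16LE -> UTF-8 again.
$back = iconv('UTF-16LE', 'UTF-8', $utf16);

// Compare the round-tripped string with the original.
var_dump($back === $utf8);
```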
0

You can do this easily with the iconv function. Example for Windows:

iconv("WINDOWS-1250", "UTF-8", $name);
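A slightly fuller sketch of the same idea follows. The helper name `fileNameToUtf8` and the default code page are assumptions made for illustration; the right code page depends on the system's ANSI settings (the call above uses WINDOWS-1250).

```php
<?php
// Hypothetical helper: convert a file name from a legacy Windows
// code page to UTF-8. The default code page is an assumption.
function fileNameToUtf8(string $name, string $codePage = 'WINDOWS-1252'): string
{
    $converted = iconv($codePage, 'UTF-8', $name);
    // iconv() returns false on failure; fall back to the original bytes.
    return $converted === false ? $name : $converted;
}

// "\xF6" is the single byte for "ö" in Windows-1252.
echo fileNameToUtf8("10 H\xF6rt auf.mp3"), "\n";
```

This helps only when the name arrives as bytes in that code page. If the name has already been returned with a substituted character (for example "Hort" instead of "Hört"), no conversion can recover the original.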
  • 1
    Do not confuse people. WinAPI can't retrieve UTF-8 names at all, and iconv will help with... nothing. – Danny Feb 10 '15 at 18:45
  • 1
    As I said it works perfectly for me on Windows 7 and xampp. You can try it too before posting a comment. – Sonic03 Feb 12 '15 at 10:54