I want to detect the default encoding that the operating system's filesystem uses. For example, different language versions of Windows use different encodings (ISO-8859-1, MS950, Big5, GB2312, etc.). How can I detect the operating system's encoding in PHP? Any ideas? Thanks.
-
Have you checked the other questions here on SO regarding encoding identification? Look at this one for example: http://stackoverflow.com/questions/910793/php-detect-encoding-and-make-everything-utf-8 Or this one: http://stackoverflow.com/questions/505562/detect-file-encoding-in-php – Till Helge Nov 30 '11 at 14:04
-
I'm not sure the file system delegates an encoding... mb_list_encodings will return a list of supported encodings. – Incognito Nov 30 '11 at 14:07
-
That is not the answer I want, and those are different questions from mine. – Jasper Nov 30 '11 at 14:09
4 Answers
Linux does not have a filesystem encoding: filenames are stored as binary strings and may contain any bytes except NUL and '/'. Interpreting them in a specific encoding is up to the application. Most often this will simply be UTF-8, but it depends on whatever displays the filenames.
Accessing the filesystem on OS X will use UTF-8 in normalization form D.
Unfortunately, I cannot say for certain what it is on Windows. Internally, filenames are stored as a variation of UTF-16, but when I access them through PHP on my machine the API uses CP-1252. This does depend on the language version of Windows.
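If filenames read on OS X need to be compared with strings from elsewhere, the decomposed form (D) can be converted to the composed form (C). This is only a minimal sketch: it assumes the intl extension is loaded (it provides the `Normalizer` class), and the helper name `toNfc` is made up for illustration.

```php
<?php
// Hypothetical helper: convert a UTF-8 filename to normalization form C.
// Falls back to the unchanged input when intl is missing or the input
// is not valid UTF-8 (Normalizer::normalize() returns false then).
function toNfc(string $name): string {
    if (!class_exists('Normalizer')) {
        return $name;
    }
    $nfc = Normalizer::normalize($name, Normalizer::FORM_C);
    return $nfc === false ? $name : $nfc;
}

// "e" followed by a combining acute accent is the form D spelling of "é".
$decomposed = "e\u{0301}";
$composed = toNfc($decomposed);
```

Plain ASCII names are the same in both forms, so the conversion only matters for accented or otherwise composed characters.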

Try
print_r( explode(";", setlocale(LC_ALL, 0)));
Then you need to convert the code page in the result to an encoding name.
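On Windows the code page is usually the numeric suffix after the last dot of the locale string, for example `Chinese (Traditional)_Taiwan.950`. A rough sketch of that conversion follows; the helper name `codePageFromLocale` is hypothetical, and the assumption that a name like `CP950` is accepted by your iconv or mbstring build should be checked on the target machine.

```php
<?php
// Hypothetical helper: pull the code page or charset out of one locale
// string and return it as an encoding name, or null if there is none.
function codePageFromLocale(string $locale): ?string {
    // Capture the part after the last dot, ignoring a modifier like "@euro".
    if (!preg_match('/\.([A-Za-z0-9-]+)(?:@.*)?$/', $locale, $m)) {
        return null; // e.g. the "C" locale carries no code page
    }
    // A purely numeric suffix is treated as a Windows code page: 950 -> CP950.
    return preg_match('/^\d+$/', $m[1]) ? 'CP' . $m[1] : $m[1];
}

// Passing "0" only queries the current setting without changing it.
$current = setlocale(LC_CTYPE, '0');
$encoding = is_string($current) ? codePageFromLocale($current) : null;
```

For example, `codePageFromLocale('English_United States.1252')` gives `CP1252`, while a Unix-style `en_US.UTF-8` gives `UTF-8` unchanged.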

The filesystem doesn't have a single encoding. Each filename can use a different one, so all you need is to find the right encoding to process the filename string.
To detect a filename's encoding, you can simply try to convert the filename to every encoding in a list of encodings you know, and compare the original filename string with the converted string. If they are equal, that encoding is the one you are looking for.
The following code is an example of how I do that conversion and comparison.
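    function getActuallEncoding($text) {
        $encodingList = array('UTF-8', 'gb2312', 'ISO-8859-1', 'big5'); // Add more if you need.
        // mb_detect_encoding() returns false when nothing in the detect order matches.
        $detected = mb_detect_encoding($text, mb_detect_order(), true);
        if ($detected === false) return "UNKNOWN";
        foreach ($encodingList as $oneEncode) {
            // @ hides the notice iconv() raises for bytes it cannot convert.
            $oneResult = @iconv($detected, $oneEncode, $text);
            // Comparing the strings directly replaces the md5() comparison.
            if ($oneResult !== false && $oneResult === $text) return $oneEncode;
        }
        return "UNKNOWN"; // This return value may cause problems; callers must handle it.
    }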
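A related way to narrow the guess is to check in which candidate encodings the raw bytes are valid at all. This is a sketch of a different technique from the function above, not a replacement for it: the helper name `plausibleEncodings` is made up, and it assumes the mbstring extension is available.

```php
<?php
// Hypothetical helper: return every candidate encoding in which the raw
// bytes of $name form a valid string, in the order the candidates are given.
function plausibleEncodings(string $name, array $candidates): array {
    $valid = array();
    foreach ($candidates as $encoding) {
        if (mb_check_encoding($name, $encoding)) {
            $valid[] = $encoding;
        }
    }
    return $valid;
}

// "\xE9" is "é" in ISO-8859-1 but an incomplete sequence in UTF-8.
$matches = plausibleEncodings("\xE9", array('UTF-8', 'ISO-8859-1'));
```

Single-byte encodings such as ISO-8859-1 accept any byte sequence, so they will match almost everything. Put the stricter encodings first in the candidate list and treat the result as a guess.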
Hope that helps.
