PHP has no native Unicode string type: strings are plain byte sequences. The Unicode support once planned for PHP 6 was abandoned, and PHP 7 and later did not add it either. To bridge the gap, there are several extensions such as mbstring, iconv and intl.
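As a rough illustration of why those extensions matter, the sketch below compares the byte-oriented core functions with their mbstring counterparts. It assumes the mbstring extension is loaded; the sample string is made up for the example and is written with byte escapes so it does not depend on how this file is saved.

```php
<?php
// Assumption: ext-mbstring is available.
// "Über" as UTF-8 bytes: Ü is the two-byte sequence 0xC3 0x9C.
$word = "\xC3\x9Cber";

// Core functions operate on bytes.
$bytes = strlen($word);                      // counts bytes
$firstByte = substr($word, 0, 1);            // cuts the Ü in half

// mbstring functions operate on characters in the given encoding.
$chars = mb_strlen($word, 'UTF-8');          // counts characters
$firstChar = mb_substr($word, 0, 1, 'UTF-8'); // returns the whole Ü
```

Here `$bytes` is 5 while `$chars` is 4, and `$firstByte` is no longer valid UTF-8 on its own, which is the kind of damage that later shows up as parser errors.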
Make sure you send the HTTP response with an appropriate Content-Type and charset, e.g.
<?php header('Content-Type: text/html; charset=utf-8');?>
Also check that the XML file prolog contains the proper encoding, e.g.
<?xml version="1.0" encoding="UTF-8"?>
Assuming all of that is correct, the XML file appears to claim UTF-8 while actually containing something else (most likely ISO-8859-1, also known as latin1, or text that has already been mangled into mojibake). You can open the XML file in your favorite editor (I like Sublime) and save it explicitly with UTF-8 encoding. Alternatively, you can use a function to try to repair the string before loading it, like this one from the question Error: "Input is not proper UTF-8, indicate encoding !" using PHP's simplexml_load_string:
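// Re-encodes bytes that look like stray latin1 characters inside otherwise valid UTF-8.
function fix_latin1_mangled_with_utf8_maybe_hopefully_most_of_the_time($str)
{
    // Match a high byte (0xA1-0xFF) that is not followed by two or more UTF-8 continuation bytes.
    return preg_replace_callback('#[\\xA1-\\xFF](?![\\x80-\\xBF]{2,})#', 'utf8_encode_callback', $str);
}

function utf8_encode_callback($m)
{
    // utf8_encode() converts ISO-8859-1 to UTF-8. It is deprecated as of PHP 8.2,
    // where mb_convert_encoding($m[0], 'UTF-8', 'ISO-8859-1') is the equivalent.
    return utf8_encode($m[0]);
}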
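When the whole file is known to be in a single legacy encoding rather than a mix, a simpler approach may be to validate first and convert only if validation fails. The following is a minimal sketch under that assumption, not a drop-in fix: it requires the mbstring and SimpleXML extensions, the helper name load_xml_as_utf8 and the sample data are hypothetical, and the ISO-8859-1 fallback is a guess you would need to confirm for your own data.

```php
<?php
// Hypothetical helper. Assumption: input that is not valid UTF-8
// is encoded entirely in $fallback (ISO-8859-1 by default).
function load_xml_as_utf8(string $xml, string $fallback = 'ISO-8859-1')
{
    // Only touch the string when it fails UTF-8 validation.
    if (!mb_check_encoding($xml, 'UTF-8')) {
        $xml = mb_convert_encoding($xml, 'UTF-8', $fallback);
    }
    // Returns a SimpleXMLElement, or false if the XML is still not parseable.
    return simplexml_load_string($xml);
}

// Sample: the prolog claims UTF-8, but the é is a single latin1 byte (0xE9).
$broken = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><item>caf\xE9</item>";
$doc = load_xml_as_utf8($broken);
$text = (string) $doc; // text content of <item>, now UTF-8
```

Unlike the regex version, this converts the entire string in one go, so it would double-encode a file that mixes valid UTF-8 with stray latin1 bytes. For that mixed case the regex heuristic is the better fit.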
But at the end of the day, it is going to be messy: PHP still doesn't handle Unicode as well as we would all like, and Unicode support simply isn't built into the core.
We suggest you check out Portable UTF-8, a lightweight library for Unicode handling in PHP.