I'm using XMLReader to parse XML from a third party. The files are supposed to be UTF-8, but I'm getting this error:
parser error : Input is not proper UTF-8, indicate encoding !
Bytes: 0x11 0x72 0x20 0x41 in C:\file.php on line 166
Looking at the XML file in Notepad++, the cause is clear: the problematic line contains a DC1 control character (byte 0x11, the first byte listed in the error).
The XML file comes from a third party, and I can't rely on them to fix this or to prevent it from happening again. Could someone recommend a good way of dealing with it? I'd like to simply strip the control character. In this particular case deleting it from the XML file is fine, but I'm concerned that always doing so could lead to unforeseen problems down the road. Thanks.
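For reference, the kind of stripping I have in mind looks roughly like the sketch below. It is only a minimal illustration, not something I'm sure is the right approach: the function name `stripInvalidXmlControlChars` and the sample input are made up for this example, and it assumes the whole document fits in memory as a string. It removes only the C0 control characters that XML 1.0 does not allow and leaves tab, line feed and carriage return alone.

```php
<?php
// Hypothetical helper (name is an assumption for this sketch): removes the
// C0 control characters that XML 1.0 forbids, keeping tab (0x09),
// line feed (0x0A) and carriage return (0x0D).
function stripInvalidXmlControlChars(string $xml): string
{
    // Bytes below 0x20 never occur inside a multi-byte UTF-8 sequence,
    // so matching them byte by byte does not damage valid UTF-8 text.
    $clean = preg_replace('/[\x00-\x08\x0B\x0C\x0E-\x1F]/', '', $xml);

    // preg_replace() returns null on failure; fall back to the input then.
    return $clean ?? $xml;
}

// Made-up sample input containing a DC1 (0x11) byte.
$raw   = "<root><line>foo\x11r A</line></root>";
$clean = stripInvalidXmlControlChars($raw);

// Parse the cleaned string instead of the original file.
$reader = new XMLReader();
$reader->XML($clean);
while ($reader->read()) {
    if ($reader->nodeType === XMLReader::TEXT) {
        echo $reader->value, PHP_EOL;
    }
}
$reader->close();
```

The drawback I can see is that this loads the entire file into a string before parsing, which gives up the streaming benefit of XMLReader for large files, and it silently changes the data the third party sent.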