I was given a UTF-16 XML file to work with. I converted it to UTF-8 (iconv -f UTF16 -t UTF8 'file-utf16.xml' > 'file-utf8.xml'), but the result doesn't seem to be a normal text file. I'm using OS X, and when I open the converted file in Sublime Text 2, the following is shown, and simplexml_load_file returns false:
<?xml version="1.0" encoding="UTF-16" standalone="no"?>
<Item itemno="0000004" desc="" qtyavail="0" unitprice="0" salesprice="0" block="Yes" dnr="No"/>
<Item itemno="000001" desc="" qtyavail="0" unitprice="199.99" salesprice="199.99" block="No" dnr="No"/>
...
When I open it with TextEdit, the characters are all garbled: a mixture of Chinese characters and other symbols, as shown below. There is absolutely no Chinese in the original XML file, only Roman letters, numbers, and other characters typical of XML.
㼼浸敶獲潩㵮ㄢ〮•湥潣楤杮∽呕ⵆ㘱•瑳湡慤潬敮∽潮㼢ਾ䤼整瑩浥潮∽〰〰〰∴搠獥㵣∢焠祴癡楡㵬〢•湵瑩牰捩㵥〢•慳敬灳楲散∽∰戠潬正∽教≳搠牮∽潎⼢ਾ䤼整瑩浥潮∽〰〰•敤捳∽•瑱慹慶汩∽∰甠楮灴楲散∽㤱⸹㤹•慳敬灳楲散∽㤱⸹㤹•汢捯㵫丢≯搠牮∽潎⼢ਾ
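One quick way to test whether these characters are simply the converted file's plain ASCII bytes being read two at a time as UTF-16 (an assumption about what TextEdit is doing, not something confirmed here) is to force that misreading on a short sample. The helper name ascii_as_utf16le is hypothetical:

```shell
# Hypothetical helper: reinterpret the bytes of an ASCII string as UTF-16LE
# code units and print the result as UTF-8.
ascii_as_utf16le() {
    printf '%s' "$1" | iconv -f UTF-16LE -t UTF-8
}

# '<' is byte 0x3C and '?' is 0x3F, so read as one little-endian unit they
# become U+3F3C; 'x' (0x78) and 'm' (0x6D) become U+6D78.
# Compare the printed characters with the start of the TextEdit output.
ascii_as_utf16le '<?xm'
echo
```

If the two printed characters match the start of the garbled text, the file's bytes are fine and the viewer is only decoding them with the wrong encoding.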
Is there something wrong with the encoding? If so, how can I turn this into a regular text file that simplexml_load_file can read? If not, what is the problem here? As it stands, simplexml_load_file returns false on this file.
UPDATE:
Just realized that when I change the string encoding="UTF-16" to encoding="UTF-8" in the XML file, everything works. Is iconv not enough to convert this to UTF-8?
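For reference, here is a minimal sketch of the two steps described above done in one pass: re-encode the bytes, then update the declaration to match. It assumes the declaration sits on the first line and that the source file starts with a byte order mark so iconv can detect the byte order; the function name utf16_xml_to_utf8 is hypothetical.

```shell
# Hypothetical helper: utf16_xml_to_utf8 SRC DST
utf16_xml_to_utf8() {
    src=$1
    dst=$2
    # Step 1: iconv re-encodes the bytes only. It treats the file as plain
    # text, so the encoding="UTF-16" label in the declaration is left as-is.
    # Step 2: sed rewrites that label on line 1 so it matches the new bytes.
    iconv -f UTF-16 -t UTF-8 "$src" \
        | sed '1s/encoding="UTF-16"/encoding="UTF-8"/' > "$dst"
}
```

Usage with the file names from the question:

```shell
utf16_xml_to_utf8 'file-utf16.xml' 'file-utf8.xml'
```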