The DOMDocument::loadHTML() method does not expect a UTF-8 encoded string. That makes it an exception among the methods of the DOM extension, most of which do expect UTF-8 encoded strings. The same, by the way, applies to all methods of the DOM extension that load XML/HTML data from a file, a remote location or a string: they follow different and more complex rules for the encoding of the input.
Encoding for DOMDocument::loadHTML():
If the HTML string you pass in does not contain any hint about its encoding (e.g. inside meta tags), then the string must be encoded as Latin-1.
If the string does contain a hint about its encoding, then it must actually be in that hinted encoding, and that encoding must be one of the supported encodings.
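As a minimal sketch of the second rule, assuming your input is a UTF-8 encoded fragment without any hint of its own: you can prepend a meta tag that names the encoding before handing the string to loadHTML(). The sample string and the variable names are made up for illustration, and the meta http-equiv form is only one way of hinting that I know to be picked up; I can't promise that every other form is.

```php
<?php
// Hypothetical UTF-8 input ("Grüße aus Köln"), written with Unicode escapes
// so the example does not depend on the encoding this file is saved in.
$html = "<p>Gr\u{FC}\u{DF}e aus K\u{F6}ln</p>";

// The hint: a meta tag that names the encoding the string really is in.
$hint = '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">';

$doc = new DOMDocument();
$doc->loadHTML($hint . $html);

// Values read back from the DOM are UTF-8 encoded strings.
$text = $doc->getElementsByTagName('p')->item(0)->textContent;
echo $text, "\n";
```

The prepended meta element ends up in the head of the parsed document, so you may want to remove it again if you serialize the document afterwards.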
Notes:
- I'm not aware whether a list of supported encodings exists.
- As you don't show the HTML code you load in there, I can't say whether it contains a hint about the encoding.
- I'm not aware whether a list of all supported ways to hint the encoding within the HTML for DOMDocument::loadHTML() exists.
However: For an example of how to load an HTML document or fragment in a specific encoding, see this related answer of mine:
It most likely will show you how you can load your HTML. It also explains this in more detail. Let me know if it doesn't solve your issue.