This question tells me
htmlentities is identical to htmlspecialchars() in all ways, except with htmlentities(), all characters which have HTML character entity equivalents are translated into these entities.
Sounds like htmlentities is the one I want.
Then this question tells me I need the "UTF-8" argument to get rid of this error:
Invalid multibyte sequence in argument
So here is my encoding wrapper function (to normalise behaviour across different PHP versions):
function html_entities ($s)
{
    return htmlentities ($s, ENT_COMPAT /* ENT_HTML401 */, "UTF-8");
}
I am still getting the "multibyte sequence in argument" error.
Here is a sample string which triggers the error, and its hex encoding:
Jigue à Baptiste
4a 69 67 75 65 20 e0 20 - 42 61 70 74 69 73 74 65
I notice that the à is encoded as a single byte, 0xe0, which is above 0x80.
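To double-check that observation, here is a minimal sketch that tests whether the sample bytes are well-formed UTF-8. It assumes the mbstring extension is loaded, and `is_valid_utf8` is a hypothetical helper name I made up for this test, not part of my wrapper.

```php
<?php
// Hypothetical helper: true if $s is well-formed UTF-8.
// Assumes the mbstring extension is available.
function is_valid_utf8($s)
{
    return mb_check_encoding($s, 'UTF-8');
}

// The sample string, written as raw bytes so this file's own encoding does not matter.
$sample = "Jigue \xE0 Baptiste";

// 0xe0 on its own, followed by a space, is not a complete UTF-8 sequence.
var_dump(is_valid_utf8($sample));                    // bool(false)

// For comparison: à written as the two-byte UTF-8 sequence 0xc3 0xa0.
var_dump(is_valid_utf8("Jigue \xC3\xA0 Baptiste"));  // bool(true)
```

So the string I am passing in does not appear to be valid UTF-8 in the first place, even though I am telling htmlentities that it is.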
What am I doing wrong?