The Issue
I've been having some trouble with what I think is a UTF-8 encoding issue where posts are not being saved to my database.
The issue occurs when a user copies and pastes text from MS Word. There seems to be a particular combination of characters causing it (I've not yet found any other variations which cause the same issue):
% b
% B
This means that when I var_dump() my input, I get:
string(5) "70�ck"
Instead of:
string(8) "70% back"
Edit: The database error I get is:
Incorrect string value: '\xBAck an...' for column [...]
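To make the symptom easier to reproduce, here is a minimal sketch that rebuilds the 5-byte string from the var_dump() output above and inspects its raw bytes. The variable name $received is just for illustration, and it assumes the mbstring extension is available. It only confirms that the lone \xBA byte from the error message makes the string invalid UTF-8; it doesn't show where that byte comes from.

```php
<?php
// The 5-byte string reported by var_dump(), written with an explicit \xBA escape.
// ($received is a hypothetical name; in the app this would be the posted field.)
$received = "70\xBAck";

// Hex dump of the raw bytes: 37 30 ba 63 6b
echo bin2hex($received), PHP_EOL;

// 0xBA is a UTF-8 continuation byte with no lead byte in front of it,
// so the string as a whole is not valid UTF-8 (requires mbstring).
var_dump(mb_check_encoding($received, 'UTF-8'));   // bool(false)

// The text I expected to receive is plain ASCII and therefore valid UTF-8.
var_dump(mb_check_encoding("70% back", 'UTF-8'));  // bool(true)
```

Running the same two checks on the raw POST value before it reaches the model should at least show whether the byte is already wrong when the request arrives.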
What I've tried
I'm using the Summernote JS plugin. I've tried a different plugin (WYSIHTML5) and I've tried with no plugin at all. I've tried pasting the clipboard text as plain text. I've even got an onPaste callback on the Summernote editor which strips all the stupid encoding/styling from MS Word (which is a Summernote-specific issue, I think).
Unfortunately I've not been able to get anywhere with searching 'encoding issue "% b"' and variations thereof... but I would presume that the combination of characters above is somehow getting translated into a character that is unsupported by the database...
- The database is MySQL 5.7.10 and I'm using utf8_general_ci collation on all columns.
- I've set the charset to UTF-8 within CodeIgniter:
$config['charset'] = 'UTF-8';
- Within CodeIgniter's database config I've specified
'char_set' => 'uft8', 'dbcollat' => 'utf8_general_ci'
- The page's meta tag is set to use utf-8:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
- The form has the accept-charset="utf-8" attribute.
Update: I've also tried the solution suggested in this question
I think I've done all the usual troubleshooting and I'm a bit stuck. Does anyone know why this specific combination of characters causes a problem? Perhaps I'm wrong and it's not an encoding issue at all? Does anyone have any other ideas?