PHP: 7.2.5 Laravel: 7.25
We have a bug where a very small number of users are trying to insert copy with the '' character included. I'm assuming this comes from copying and pasting out of a PDF; I have seen such characters before alongside line breaks. This produces the following error:
SQLSTATE[HY000]: General error: 1366 Incorrect string value: '\xF4\x8F\xB0\x80</...' for column 'body' at row 1 (SQL: update `post` set `body` = <p></p>, `body_raw` = , `post`.`updated_at` = 2020-10-06 10:34:22 where `id` = 1)
Character '':
- Decimal Character Codes: 56319, 56320
- Hexadecimal Character Codes: 0xdbff, 0xdc00
- HTML with named character references:
� �
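To pin down exactly which character this is, the four bytes from the error message can be decoded by hand. The following is a minimal sketch, assuming the mbstring extension is available (`mb_ord` needs PHP 7.2 or later); the variable names are purely illustrative. It suggests the character is U+10FC00, a code point in the Supplementary Private Use Area-B, which matches the surrogate pair listed above.

```php
<?php
// Bytes copied from the "Incorrect string value" part of the MySQL error.
$bytes = "\xF4\x8F\xB0\x80";

// Decode the UTF-8 sequence into a Unicode code point.
$codePoint = mb_ord($bytes, 'UTF-8');

// Split the code point into its UTF-16 surrogate pair.
$offset = $codePoint - 0x10000;
$high   = 0xD800 + ($offset >> 10);   // high (lead) surrogate
$low    = 0xDC00 + ($offset & 0x3FF); // low (trail) surrogate

printf("U+%X high=0x%x low=0x%x\n", $codePoint, $high, $low);
// prints "U+10FC00 high=0xdbff low=0xdc00"
```

Because the code point lies above U+FFFF, it takes four bytes in UTF-8, which MySQL's 3-byte `utf8` character set cannot store.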
From searching on Google, one suggestion is to change the database encoding from utf8 to utf8mb4. This is probably the optimal solution, but we have a large database and I'm uneasy about changing the encoding (though it may well be very safe). I'm concerned about possible data loss or corruption.
As this issue only appears for this one character in our bug system, and the character is 100% not required, I'm inclined to just remove it before saving to the database, to keep the changes to a minimum.
I was thinking of doing the following:
str_replace("","", $post);
But if I paste the character '' into any of my code editors, it disappears (I'm assuming because of the utf8 encoding). What would be the best way to accomplish this?
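One way around the editor problem might be to never paste the character at all and write it as an escape sequence instead. The sketch below assumes the offending character is U+10FC00 (the code point that the bytes `\xF4\x8F\xB0\x80` in the error decode to) and relies on the `\u{...}` escape that PHP 7.0 and later accept in double-quoted strings. `stripFourByteChars` is a hypothetical helper name, not something Laravel provides; it goes further and removes every character outside the Basic Multilingual Plane, which also removes emoji.

```php
<?php
// Hypothetical helper: remove every code point that needs 4 bytes in UTF-8
// (U+10000 to U+10FFFF), i.e. everything MySQL's 3-byte "utf8" cannot store.
function stripFourByteChars(string $text): string
{
    // The /u modifier makes PCRE treat pattern and subject as UTF-8.
    $clean = preg_replace('/[\x{10000}-\x{10FFFF}]/u', '', $text);

    // preg_replace() returns null on error (e.g. invalid UTF-8 input);
    // in that case this sketch simply hands back the original text.
    return $clean ?? $text;
}

// Sample input containing the problem character, written as an escape.
$body = "<p>Hello\u{10FC00}</p>";

// Targeted fix: remove only U+10FC00.
$onlyThatChar = str_replace("\u{10FC00}", '', $body);

// Broader fix: remove all 4-byte characters.
$allFourByte = stripFourByteChars($body);
```

The targeted `str_replace` is the smaller change; the regex version would also catch other 4-byte characters that might trigger the same error 1366 later.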