After a conversion and migration, some content ended up improperly converted: `'` was turned into `â€`. For example, the word `World's` became `worldâ€s`.
This happened in only a few posts. Strangely enough, when I copy and paste the saved value from the terminal it is `world†`, but the output as I see it is:

[screenshot of the output]

The table description is:


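| Field | Type | Null | Key | Default |
|---|---|---|---|---|
| bundle | varchar(128) | NO | MUL | |
| deleted | tinyint(4) | NO | PRI | 0 |
| entity_id | int(10) unsigned | NO | PRI | NULL |
| revision_id | int(10) unsigned | NO | MUL | NULL |
| langcode | varchar(32) | NO | PRI | |
| delta | int(10) unsigned | NO | PRI | NULL |
| field_art_content_value | longtext | NO | | NULL |
| field_art_content_format | varchar(255) | YES | MUL | NULL |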

I tried a select with `field_art_content_value like â€`, but the terminal does not return anything. How do I properly query for those characters?
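For reference, `â€` followed by one more character is what a typographic apostrophe (`’`, U+2019) typically looks like when its UTF-8 bytes are read as Windows-1252. Below is a minimal Python sketch of that round trip and of reversing it, assuming this is what happened during the migration. The function names `to_mojibake` and `from_mojibake` are made up for illustration.

```python
def to_mojibake(text: str) -> str:
    """Encode as UTF-8, then misread the bytes as Windows-1252."""
    return text.encode("utf-8").decode("cp1252")


def from_mojibake(text: str) -> str:
    """Undo the misreading: back to the raw bytes, then decode as UTF-8."""
    return text.encode("cp1252").decode("utf-8")


original = "World\u2019s"        # World’s, with U+2019 as the apostrophe
broken = to_mojibake(original)   # the quote turns into three characters

print(broken)
print(from_mojibake(broken) == original)
```

The reversal only works if every garbled character can be mapped back to a single Windows-1252 byte, so it is a way to test the hypothesis on one value, not a bulk repair.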

awm
  • Please [edit] your question to share a [mcve]. Looks like a [mojibake](https://en.wikipedia.org/wiki/Mojibake) case (example in Python): `"‘ ’".encode('utf-8').decode('cp1252')` returns `'â€˜ â€™'`. The characters are `‘` (U+2018, *Left Single Quotation Mark*) and `’` (U+2019, *Right Single Quotation Mark*). Read and follow [UTF-8 Everywhere](https://utf8everywhere.org/) and [UTF-8 all the way through](https://stackoverflow.com/questions/279170/)… – JosefZ Mar 03 '22 at 15:53
  • Actually, I was able to run a `like` query in MySQL by copying and pasting the characters. My Mac warned me that I was pasting control characters; I clicked "Paste" to proceed, and it worked. I got all affected IDs. The query was: `select entity_id from node__field_art_content where field_art_content_value like '%â€%' collate utf8_bin;`. – awm Mar 03 '22 at 16:09
  • `â€™` is Mojibake for `’`; see https://stackoverflow.com/questions/38363566/trouble-with-utf8-characters-what-i-see-is-not-what-i-stored – Rick James Mar 04 '22 at 17:40
  • I am sure it's related. My case was peculiar. In my screenshot you see the character `â`, but to its right there appear to be two hidden characters: not empty space but control characters. When I put the cursor there, I have to press the right arrow key twice to move past them. The other question definitely lists steps to troubleshoot this by looking at the hex code. Thx – awm Mar 04 '22 at 19:28
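
The two invisible characters after `â` would be consistent with the UTF-8 bytes of `’` having been read as Latin-1 rather than Windows-1252, because bytes 0x80 and 0x99 map to control characters in Latin-1. The Python sketch below, written under that assumption, summarizes both readings and the hex that would be stored if the garbled text were saved again as UTF-8. The helper `describe` is hypothetical.

```python
import unicodedata

QUOTE = "\u2019"  # RIGHT SINGLE QUOTATION MARK


def describe(wrong_codec: str) -> dict:
    """Misread the UTF-8 bytes of QUOTE with wrong_codec and summarize the result."""
    garbled = QUOTE.encode("utf-8").decode(wrong_codec)
    return {
        "repr": ascii(garbled),
        # Unicode general category "Cc" marks a control character.
        "control_chars": sum(unicodedata.category(ch) == "Cc" for ch in garbled),
        # Bytes that end up stored if the garbled text is saved again as UTF-8.
        "stored_hex": garbled.encode("utf-8").hex().upper(),
    }


for codec in ("cp1252", "latin-1"):
    print(codec, describe(codec))
```

Comparing the `stored_hex` values with the output of MySQL's `HEX(field_art_content_value)` for an affected row could indicate which of the two cases applies, if either.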

0 Answers