I am using a WordPress plugin that stores some HTML in a database field. The column is mediumtext with collation utf8_general_ci.

When I output this information, special characters such as " ´ ' ... and some others are displayed as black diamonds with a question mark inside.

When the plugin shows this text it's perfect. This is in the head:

<meta charset="UTF-8" />

When I output elsewhere I have this in my head and all those characters are lost:

<meta http-equiv="content-type" content="text/html; charset=utf-8"/>

The characters are stored properly in MySQL, and I output them via PHP like this:

ob_start();?>
some html
<?php echo $row4['details']; // this is the field I'm talking about ?>
some html
<?php
$details = ob_get_clean();

and later just:

echo $details;

I have read quite a bit about this and the charset is always mentioned, but I think mine is OK.

Thanks for the help!

Edit: Adding Full example

Text shown in WP. As I said in a comment, this might have been copy/pasted from MS Word into WordPress by an editor.

8. Choose the verb to complete the sentence.

______ you ever _____ the proverb, “Time is gold”?
 Has… heard…
 Have… hear…
 Have… heard…

Stored in the database like this (ignore the correct/user answer classes; those work properly):

<span class='watupro_num'>8. </span>Choose the verb to complete the sentence.</p>
    <p>______ you ever _____ the proverb, “Time is gold”?</p></div>
    <ul>
    <li class='answer'><span class='answer'>Has… heard…</span></li>
    <li class='answer user-answer'><span class='answer'>Have… hear…</span></li>
    <li class='answer correct-answer'><span class='answer'>Have… heard…</span></li>
    </ul>

Text as output from the database (what I'm working on)

8. Choose the verb to complete the sentence.

______ you ever _____ the proverb, �Time is gold�?

    Has� heard�
    Have� hear�
    Have� heard�

Text shown after utf8_encode() thanks to Kirit Patel

8. Choose the verb to complete the sentence.

______ you ever _____ the proverb, Time is gold?

    Has heard
    Have hear
    Have heard

I've just discovered characters are still there. I can see a box with numbers while editing (not displayed in preview).

Something like this character

will
  • 1
    With "special character" you refer to stuff like good old acute accent ([U+00B4 ACUTE ACCENT](http://www.fileformat.info/info/unicode/char/b4/index.htm))? I don't think your app is using UTF-8 at all and declaring so in HTML tags will not change the fact. – Álvaro González Mar 11 '17 at 09:26
  • Sounds like you need to set the PHP header and connection to utf8. – Qirel Mar 11 '17 at 09:38
  • It happens to apostrophes, dashes, ellipses, quotation marks, etc. How do I set the PHP header and connection? I don't know what you mean. – will Mar 11 '17 at 10:03
  • 1
    U+00B4 takes two bytes in UTF-8. If your app were using UTF-8 and something was wrong, you'd be seeing the individual bytes (`Â´` in this case). I suspect your app uses Windows-1252 or ISO-8859-1, so MySQL will convert the UTF-8 database value to such a single-byte encoding. If you declare it as UTF-8 (which it is not) you'll see `�`. – Álvaro González Mar 11 '17 at 13:28
  • 1
    My educated guess at this point is that you haven't configured Wordpress to use UTF-8. I know nothing about that software so I can't tell you how to fix it but I'll add an appropriate tag and hopefully call someone's attention. – Álvaro González Mar 11 '17 at 13:33
  • I think you might be pointing in the right direction, @ÁlvaroGonzález. This code is generated by some users (WP editors) who might have copy-pasted text from Word into the WP plugin, and from there into the database. Does anyone know an easy way to tell editors how to paste this text? Maybe pasting into WordPad first and then into WordPress? – will Mar 12 '17 at 10:10
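
The byte-level reasoning in the comments above can be checked directly. This is a minimal sketch, assuming the `mbstring` extension is available; the sample strings are built by hand for illustration and are not taken from the asker's database.

```php
<?php
// U+00B4 ACUTE ACCENT encoded as UTF-8 occupies two bytes: C2 B4.
$utf8 = "\u{00B4}";
echo bin2hex($utf8), "\n";                          // c2b4

// The same character in Windows-1252 / ISO-8859-1 is the single byte B4.
$singleByte = "\xB4";
echo bin2hex($singleByte), "\n";                    // b4

// A lone B4 byte is not valid UTF-8, so a browser told "charset=utf-8"
// renders it as U+FFFD, the black diamond with a question mark.
var_dump(mb_check_encoding($utf8, 'UTF-8'));        // bool(true)
var_dump(mb_check_encoding($singleByte, 'UTF-8'));  // bool(false)
```

Running `bin2hex()` on the real `$row4['details']` value would show which of the two cases applies to the data as PHP receives it.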

2 Answers

You can use utf8_encode() like this:

<?php 

echo utf8_encode($row4['details']);
?>

Hope this helps you.
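
One caveat with this approach: `utf8_encode()` assumes its input is ISO-8859-1, where bytes 0x80 to 0x9F are invisible control characters, and it is deprecated as of PHP 8.2. If the bytes are actually Windows-1252 (an assumption, suggested by the curly quotes and ellipses pasted from Word), a conversion that names the source encoding explicitly may keep those characters. A sketch assuming `mbstring` is available, with a hand-built sample string:

```php
<?php
// Hypothetical sample: "Time is gold" wrapped in Windows-1252 curly quotes
// (bytes 0x93 and 0x94), followed by an ellipsis (byte 0x85).
$raw = "\x93Time is gold\x94\x85";

// Name the source encoding explicitly instead of relying on utf8_encode().
$fixed = mb_convert_encoding($raw, 'UTF-8', 'Windows-1252');

echo $fixed, "\n";           // “Time is gold”…
echo bin2hex($fixed), "\n";  // quotes become e2809c / e2809d, the ellipsis e280a6
```

This only helps if the source bytes really are Windows-1252; applying it to data that is already UTF-8 would double-encode it.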

Amit Verma
Kirit Patel
  • It does look much better, as the diamonds are gone, but some of the characters aren't displayed. I guess this is enough, as it looks fine for the end user even if some characters aren't displayed. I think I will leave it like this, but the solution isn't complete. Thanks – will Mar 11 '17 at 10:11
  • Can you please give full text? – Kirit Patel Mar 11 '17 at 10:21
  • If this workaround solves the issue, my initial guess was correct and `$row4['details']` is encoded as Windows-1252, ISO-8859-1 or something similar (which one is hard to say without further details). – Álvaro González Mar 11 '17 at 13:31
  • Edited original question – will Mar 12 '17 at 10:38

"Black diamonds" are discussed here. The "Something like this" shows 0093, which is an invalid utf8 code for some flavor of quote; on the other hand, hex 93, interpreted as latin1, is “.

Can you simply switch to ASCII quotes and apostrophes?

Otherwise, you need to use utf8 throughout -- starting with whatever is generating the hex 93.
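
For the "utf8 throughout" route, here is a minimal sketch of the two places usually involved on the PHP side: the HTTP header and the database connection charset. It assumes the page talks to MySQL through `mysqli`, which the question does not confirm; the function name and the credentials in the usage comment are hypothetical placeholders.

```php
<?php
// Hypothetical helper: declares UTF-8 to the browser and to MySQL.
function use_utf8_everywhere(mysqli $db): bool
{
    // Tell the browser the response body is UTF-8 (must run before any output).
    if (!headers_sent()) {
        header('Content-Type: text/html; charset=utf-8');
    }
    // Ask MySQL to send and receive UTF-8 on this connection, so it does not
    // convert column values to a single-byte encoding on the way out.
    return $db->set_charset('utf8mb4');
}

// Usage (placeholder credentials, not from the question):
// $db = new mysqli('localhost', 'user', 'password', 'wordpress');
// use_utf8_everywhere($db);
```

If the code runs inside WordPress itself, the connection charset is normally governed by the `DB_CHARSET` constant in `wp-config.php` instead of a hand-made connection.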

Community
Rick James