2

Whenever I copy text from an MS Word document and paste it into the text editor on my PHP page, any apostrophe in the text is converted to a � (diamond) when I later fetch and display it on the page. It only happens when the text is copied from MS Word or a similar source, such as an online page.

The text in Code 1 below is converted to the text in Code 2.

Code 1

Budget – I am not Very Much Sure About the Budget . But I would like you to suggest a Fair Price 1. Timeline – Our Website is stil’l in’’ “ *

• developme’nt Mode So w*&^%$#@!()*& are make this gif Before and Our Website is going to be ready in 20 days so if Gif Can be get ready within this timeline it will be good

Code 2

Budget � I am not Very Much Sure About the Budget . But I would like you to suggest a Fair Price

1. Timeline � Our Website is stil'l in'' " *

� developme'nt Mode So w*&^%$#@!()*& are make this gif Before and Our Website is going to be ready in 20 days so if Gif Can be get ready within this timeline it will be good

Isaac Bennetch
  • htmlspecialchars uses UTF-8: `PHP 5.4 and 5.5 will use UTF-8 as the default. Earlier versions of PHP use ISO-8859-1.` So while that may have worked in the past, you may need to use UTF-8 as the encoding, depending on your PHP version. You can also supply the encoding to htmlspecialchars: http://php.net/manual/en/function.htmlspecialchars.php – ArtisticPhoenix Apr 14 '16 at 17:31
  • @ArtisticPhoenix I figured that out and updated my question, please check –  Apr 14 '16 at 17:55
  • That's called a smart quote, or curly quote; see http://stackoverflow.com/questions/1262038/how-to-replace-microsoft-encoded-quotes-in-php – ArtisticPhoenix Apr 14 '16 at 18:01
  • @ArtisticPhoenix Thanks for the link, but there are still some problems. I updated the question above –  Apr 14 '16 at 18:24
  • That's because the dash and bullet pasted from Word are not stored as valid UTF-8 either. Same deal: you'll have to replace them with equivalent characters – ArtisticPhoenix Apr 15 '16 at 15:56
  • You're not handling encodings correctly throughout your app, period. This problem becomes apparent when your app tries to handle non-ASCII characters. How to remedy that in particular we can't tell you without knowing what your app is and what it does. – deceze May 03 '16 at 04:13

1 Answer

-1

Microsoft Word uses the CP1252 (Windows-1252) character set, not UTF-8. In CP1252 the curly apostrophe is the single byte 0x92, which on its own is not valid UTF-8. To fix it, copy from Word and paste into a text editor that can convert the encoding. Make sure to mark your source file's character encoding as UTF-8 first. In Eclipse, this is File Properties > Text File Encoding > Other > UTF-8. Eclipse then stores the character encoding for that file in .settings/org.eclipse.core.resources.prefs. If you paste the text first and only then mark the file as UTF-8, Eclipse will convert the characters incorrectly.

(Sorry, you do not need to use Notepad++. You may try it if you encounter a difficult character that does not convert: paste the text into Notepad++, use the Encoding > Encode in UTF-8 or Encoding > Convert to UTF-8 menu item to convert it, then copy the text from Notepad++ into your PHP editor.)
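
If the text reaches PHP as CP1252 bytes, the conversion can also be done in code rather than in an editor. The following is only a minimal sketch, not something from the original question: `to_utf8` is a hypothetical helper name, and it assumes the mbstring extension is loaded and that any input which is not already valid UTF-8 was encoded as Windows-1252.

```php
<?php
// Hypothetical helper (not from the original post): returns $text as UTF-8.
// Assumption: input that is not valid UTF-8 is Windows-1252 (CP1252).
function to_utf8($text)
{
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text; // already valid UTF-8, leave it untouched
    }
    return mb_convert_encoding($text, 'UTF-8', 'Windows-1252');
}

// "stil’l" as CP1252 bytes: the curly apostrophe is the single byte 0x92.
$cp1252 = "stil\x92l";

$utf8 = to_utf8($cp1252);

// In UTF-8 the same apostrophe (U+2019) is the three bytes E2 80 99.
echo bin2hex($utf8), "\n";

// Escape for HTML output, telling PHP explicitly that the string is UTF-8.
echo htmlspecialchars($utf8, ENT_QUOTES, 'UTF-8'), "\n";
```

This only helps if the page itself is also served as UTF-8, for example with `header('Content-Type: text/html; charset=utf-8');`, and if the database connection uses the same encoding. Otherwise the browser may still show the diamond.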

Chloe
  • Copying and pasting text is encoding independent. The OS is aware of text on a character basis, and that's how text is copied, not as a binary stream. The OP is simply not handling encodings correctly throughout their app. – deceze May 03 '16 at 04:12
  • @deceze Sorry it is not. Microsoft Word copies its text to the clipboard in its own format, and the pasting app decides which format to take from the clipboard. I deal with this problem all the time. – Chloe May 03 '16 at 04:14