0

I am using a UTF8 library that fixes encoding issues with the text. It has been working great until I deal with a curly apostrophe .

Sometimes in my database, a column field will contain words with these curly apostrophes instead of straight ones. The UTF8 converter I am using for whatever reason will convert it from

you’ll

to

you?ll

Anyone have any wisdom that points me in the right direction as to why this is happening. I am not that familiar with encoding issues.

Cesar Bielich
  • 4,754
  • 9
  • 39
  • 81
  • 1
    I think you would be better off asking on GitHub. If you have a question about using this library the developers might know the answer best. – Dharman Aug 31 '19 at 20:50
  • Besides, that library is clearly about guess-fixing. So chances are it guesses wrong for some string serializations. – mario Aug 31 '19 at 20:51
  • yeah I posted on Github already, just hoping to get some luck here as well. I am trying to mess with the code and see if I can fix it, but this is not my expertise. – Cesar Bielich Aug 31 '19 at 20:58
  • I would wager that you're specifying that your input is ISO8859-1 when it's really cp1252. They are very similar in that 1252 overlaps with most of 8859-1's codepoints, but 1252 also contains a bunch of extra codepoints, eg: `’` and the other "fancy" quotes MS likes to use to ruin peoples' days. You should always _explicitly_ define your input encoding as it's virtually impossible to automatically detect with any level of accuracy. – Sammitch Aug 31 '19 at 21:43

0 Answers0