I have an application which has always worked with no issues. Fast forward to today: all formatting is broken. Basically I am inserting plain text emails into a MySQL db, something that has worked for more than 5 years because nothing has changed. In my PHP code the plain text looked like this:

hello [name],

How are you?

This is a test.

Thank you.

Ceo

Today I looked at the same PHP code containing the email, which is just sitting there as a file. Then I looked at the existing plain text of the email, which has always been in the database, and they both look like this:

hello [name],\r\n\r\nÃ¯Â¿Â½How are you?\r\n\r\nÃ¯Â¿Â½This is a test.\r\n\r\nÃ¯Â¿Â½Thank you.\r\n\r\nÃ¯Â¿Â½
Ceo

Now before I pull all my hair out, do you all know what happened in the MySQL db, the browser, or the server? (Oh, and due to this, I am unable to get emails too.)

The glories of Monday.

Leigh
NULL
  • "*nothing has changed*" - has your webhost upgraded something, or changed some configuration? – eggyal Jul 22 '13 at 16:14
  • that is what I asked them, and they said no. I knew it's them. – NULL Jul 22 '13 at 16:14
  • fancypants there is no encoding here, it is rather changing my plain text format to encoding. – NULL Jul 22 '13 at 16:17
  • From the look of it, your plain text is encoded in UTF-8, and your code is trying to represent it as ISO-8859-1 (Latin-1). Modify the output code so that it reads UTF-8, or (with more difficulty) the input code so that it converts UTF-8 to Latin-1, and you should see the problem go away. – Aaron Miller Jul 22 '13 at 16:19
  • What character set is your database/table/column using? – halfer Jul 22 '13 at 17:05
  • thanks for this, however I think I know what the problem is, I just don't know how to fix it. Basically, when the client copies the "email text" directly from phpMyAdmin and inserts it, there is no problem. However, when the client tries to submit it via a form, all the \n or
    or spaces are lost! Any ideas why?
    – NULL Jul 22 '13 at 18:13
  • ok posted a solution but that didn't work either. – NULL Jul 22 '13 at 18:55

3 Answers

"Ã¯Â¿Â½" has the following characters from latin-1 (iso-8859-1):

   303  195  C3    Ã    LATIN CAPITAL LETTER A WITH TILDE
   257  175  AF    ¯    MACRON
   302  194  C2    Â    LATIN CAPITAL LETTER A WITH CIRCUMFLEX
   277  191  BF    ¿    INVERTED QUESTION MARK
   275  189  BD    ½    VULGAR FRACTION ONE HALF

The byte sequence is, then, C3 AF C2 BF C2 BD. This "smells" like UTF-8. Decoding (per https://en.wikipedia.org/wiki/UTF-8), we turn these into bit-patterns:

  • 11000011
  • 10101111
  • 11000010
  • 10111111
  • 11000010
  • 10111101

That first one (110xxxxx) indicates it's the first byte in a two-byte character, and stripping the marker bits from 11000011 10101111 yields ...00011 ..101111 or 00000000 00000000 00000000 11101111 == U+000000EF.

Similarly, the next two make ...00010 ..111111 or U+000000BF.

Then ...00010 ..111101 or U+000000BD.

U+00EF U+00BF U+00BD (per https://en.wikibooks.org/wiki/Unicode/Character_reference/0000-0FFF) are "ï¿½", which is clearly not right.
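The bit-stripping steps above can be sketched in a few lines of Python. This is only an illustration of the two-byte case described here; the helper name `decode_two_byte` is made up for this example and is not part of any library.

```python
def decode_two_byte(lead: int, cont: int) -> int:
    """Decode one two-byte UTF-8 sequence (110xxxxx 10yyyyyy) into a code point."""
    # Check the marker bits before stripping them.
    if lead & 0b11100000 != 0b11000000:
        raise ValueError(f"{lead:#04x} is not a two-byte lead byte")
    if cont & 0b11000000 != 0b10000000:
        raise ValueError(f"{cont:#04x} is not a continuation byte")
    # Keep the low 5 bits of the lead byte and the low 6 bits of the continuation byte.
    return ((lead & 0b00011111) << 6) | (cont & 0b00111111)


# The six bytes from the latin-1 table above, taken two at a time.
raw = bytes([0xC3, 0xAF, 0xC2, 0xBF, 0xC2, 0xBD])
code_points = [decode_two_byte(raw[i], raw[i + 1]) for i in range(0, len(raw), 2)]
print([f"U+{cp:04X}" for cp in code_points])  # ['U+00EF', 'U+00BF', 'U+00BD']
```

Python's built-in `raw.decode("utf-8")` gives the same three characters; the manual version just makes the marker bits visible.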

However, this answer — https://stackoverflow.com/a/6544206/1105015 — seems to provide some insight. EF BF BD is the UTF-8 representation of the "replacement character" U+FFFD. So it looks like something way up the line got a character that confused your system, it was stored as the replacement character, and then eventually re-rendered as latin-1.
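To make that chain concrete, here is a minimal Python sketch of one way U+FFFD could end up as the six bytes above. The exact steps in the asker's stack are unknown, so the order of the mis-decoding and re-encoding here is an assumption, not a diagnosis.

```python
# Step 1: something upstream substitutes the replacement character U+FFFD
# and stores it as UTF-8.
replacement = "\ufffd"
utf8_bytes = replacement.encode("utf-8")
print(utf8_bytes.hex(" "))  # ef bf bd

# Step 2 (assumed): a layer that believes the bytes are latin-1 reads them
# as three separate characters.
misread = utf8_bytes.decode("latin-1")

# Step 3 (assumed): those three characters are written out again as UTF-8,
# which doubles the byte count.
double_encoded = misread.encode("utf-8")
print(double_encoded.hex(" "))  # c3 af c2 bf c2 bd

# Step 4: viewing the doubled bytes as latin-1 gives the six glyphs from the table.
shown_as_latin1 = double_encoded.decode("latin-1")
print(shown_as_latin1)
```

If this is what happened, the original character is already gone by step 1, so the fix has to be at the point where the data is first read or inserted, not at display time.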

What I'd suggest looking at closely at this point is the encoding you use when inserting into the db. Maybe the only thing that changed is the MySQL client used for that?

Community
Rob Starling
  • thanks for this, however I think I know what the problem is, I just don't know how to fix it. Basically, when the client copies the "email text" directly from phpMyAdmin and inserts it, there is no problem. However, when the client tries to submit it via a form, all the \n or
    or spaces are lost! Any ideas why?
    – NULL Jul 22 '13 at 18:12
  • `phpinfo()` ( http://php.net/manual/en/function.phpinfo.php ) might offer insight into how PHP is set up re: character sets, etc., but you'll want to clean up the output before posting to avoid leaking anything sensitive here by accident. – Rob Starling Jul 24 '13 at 15:11
  • i suspect what changed might have been the way the webserver serves the form itself and not the form-handler. most browsers submit form data in the same charset as the form. see http://stackoverflow.com/questions/153527/setting-the-character-encoding-in-form-submit-for-internet-explorer for ideas. if you "view page info" (in your browser) on the form, what's the encoding? – Rob Starling Jul 24 '13 at 15:17
  • @RobStarling: That is a nice oct/hex/dec/glyph/name breakdown you got there! How did you generate that? Is there a tool for that? – StackzOfZtuff Jun 03 '21 at 07:54
  • @StackzOfZtuff sorry - it was a few years ago and I can't remember. I think it was probably selected lines from a latin-1 reference somewhere. – Rob Starling Jun 04 '21 at 04:47

The database's (or table or column) encoding or collation has somehow been changed. If you want to verify, check that column's encoding, and compare it with the encoding of other columns without the problem.
Fortunately, it's easy to change the encoding to the proper format (within cPanel or phpMyAdmin) without having to update the actual data.

I believe that latin1_swedish_ci is MySQL's default collation (it belongs to the latin1 character set); since utf-8 should be the encoding, the column needs a utf8 character set with a matching collation such as utf8_general_ci.

Hope this helps.

Kneel-Before-ZOD
  • thanks for this, however I think I know what the problem is, I just don't know how to fix it. Basically, when the client copies the "email text" directly from phpMyAdmin and inserts it, there is no problem. However, when the client tries to submit it via a form, all the \n or
    or spaces are lost! Any ideas why?
    – NULL Jul 22 '13 at 18:12
  • okay; when you open the database, specify that the charset be UTF8. If you are using PDO, use something such as `new PDO("mysql:host=localhost;dbname=db_name;charset=utf8", username, password, [array(other attributes)])`. That should resolve it. – Kneel-Before-ZOD Jul 22 '13 at 18:17
  • To clarify, the current problem now is that the newlines and spaces are lost, right? It's no longer displaying weird symbols and codes? – Kneel-Before-ZOD Jul 22 '13 at 18:57
  • correct, no weird symbols and codes, but i am not getting the line breaks. – NULL Jul 22 '13 at 19:10
  • Okay; it's a little hard to pin down what the actual problem is; however, it's one of 2 things. Either the message is sent using **text/html** or the newlines aren't recognized by the OS. Try saving the data directly into a .txt file first and see if the problem exists. That should narrow down where the error is. – Kneel-Before-ZOD Jul 22 '13 at 19:17

OK, so I tried using mysql_real_escape_string.

Now my email looks like this:

hello [name],\\n\\nHow are you?\\n\\nThis is a test.\\n\\nThank you.\\n\\nCeo

It's adding an extra slash to it.

My HTML/PHP code looks like this:

hello [name],\n\n

How are you?\n\n

This is a test.\n\n

Thank you.\n\n

Ceo
NULL