I'm inputting Romanian diacritics in some of my database fields and they are not showing up as expected.

  • Romanian diacritics are: Ă ă Â â Î î Ș ș Ț ț
  • The whole line above displays as: Ä‚ ă Â â ÃŽ î Ș È™ Èš È›
  • When I save a word containing ã it displays as ã

I'd like to know which collation I should set for the table for this to work,

or

whether I should stop using them altogether and save them as plain letters when they are entered, in which case, on input:

  • ã would get saved as a
  • â would get saved as a
  • î would get saved as i
  • ș would get saved as s
  • ț would get saved as t
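
For reference, the fallback mapping above could be sketched in Python as follows. This is only an illustration, not something from the thread: the helper name `strip_diacritics` is hypothetical, and it assumes the text is already correctly decoded Unicode (it will not repair garbled input).

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Return text with combining marks removed (hypothetical helper)."""
    # NFD splits each accented letter into a base letter plus combining marks,
    # e.g. "ș" becomes "s" followed by U+0326 (combining comma below).
    decomposed = unicodedata.normalize("NFD", text)
    # Keep only the characters that are not combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_diacritics("ă â î ș ț"))  # a a i s t
```

The same approach also covers the uppercase forms and the older cedilla variants (ş, ţ), since they decompose in the same way.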

Any thoughts? I've tried setting several UTF-8 collations for the table, including utf8_unicode_ci, as well as latin1, but that doesn't solve the problem.

The current collation for the tables in question is utf8_general_ci.

Using MySQL.

Please let me know in a comment if I left anything out and you need more details.

t1f
  • Have you tried utf8_unicode_ci? – Sloan Thrasher Mar 30 '17 at 00:30
  • @SloanThrasher Hi, thanks for the interest. Yes, I've tried it, same problem. – t1f Mar 30 '17 at 00:31
  • How about utf16_unicode_ci? Also, what encoding is used by the web page that submits the data? – Sloan Thrasher Mar 30 '17 at 00:33
  • @SloanThrasher Just tried it, nope. I'm submitting through phpMyAdmin, in the cPanel of the hosting. Not sure how to answer your second question: the tables are set to `utf8_general_ci`, the InnoDB line below the tables says `InnoDB latin1_swedish_ci` (Default), and `Server connection collation` is set to `utf8_general_ci` – t1f Mar 30 '17 at 00:36
  • Check out this topic: http://stackoverflow.com/questions/4288006/mysql-collation-to-store-multilingual-data-of-unknown-language – Sloan Thrasher Mar 30 '17 at 00:39
  • I just checked my copy of phpMyAdmin, and the page uses utf8, so it may be storing the data correctly but messing it up on display. If you have PHP, write a simple page to retrieve and display the value, and set the page character set to Unicode. – Sloan Thrasher Mar 30 '17 at 00:43
  • Thanks, but neither utf8_general_ci nor utf8_unicode_ci solves the problem. I understand the difference stated there, but they don't seem to work for Romanian. I've also loaded the data in multiple browsers, from phpMyAdmin using the cPanel, and in software written in Delphi (in a TMemo), with no luck. I've also changed the default for the whole db from latin1_swedish_ci to both of those utf8 options, and that doesn't do it either. MySQL Workbench displays it the same way. – t1f Mar 30 '17 at 00:45
  • I just did a simple test with a table with 2 columns, one as utf8 and unicode, and another as utf16 and unicode. The first column stores the data correctly, but it displays as in your question. The second column displays correctly, but only when the web page encoding is set to utf16 and unicode. – Sloan Thrasher Mar 30 '17 at 00:51
  • @SloanThrasher Indeed. I've just set the database, table AND fields to utf16_unicode_ci, and they both display and store properly. Thanks! If you don't mind writing this up as an answer so I can accept it, that would be great! – t1f Mar 30 '17 at 00:59

3 Answers

Set the table to utf16_unicode_ci, and that should do the trick.

I did a simple test with a table with 2 columns, one as utf8 and unicode, and another as utf16 and unicode. The first column stores the data correctly, but it displays as in your question. The second column displays correctly, but only when the web page encoding is set to utf16 and unicode.

Sloan Thrasher

"ã it displays as ã" -- That's Mojibake; see "Trouble with utf8 characters; what I see is not what I stored".

Mojibake is a common problem; utf16 is not the solution. (It may have accidentally worked.)
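
As a rough illustration of how this kind of Mojibake arises, here is a minimal Python sketch. It assumes the text was stored as UTF-8 bytes and then read back as Windows-1252 (used here as a stand-in for MySQL's latin1, which is close but not identical to it); the helper names `to_mojibake` and `repair_mojibake` are hypothetical and not part of any library.

```python
def to_mojibake(text: str) -> str:
    """Encode as UTF-8, then misread the bytes as Windows-1252 (assumed mismatch)."""
    return text.encode("utf-8").decode("cp1252")


def repair_mojibake(garbled: str) -> str:
    """Reverse the misreading; only works if no bytes were lost along the way."""
    return garbled.encode("cp1252").decode("utf-8")


romanian = "Ă ă Â â Î î Ș ș Ț ț"
garbled = to_mojibake(romanian)

print(garbled)                   # every letter shows up as two unrelated characters
print(repair_mojibake(garbled))  # the original letters come back
```

The bytes themselves are unchanged in this sketch; only the interpretation differs. That is consistent with the point above that the character set used when reading and writing matters, rather than the collation.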

Rick James

Sounds like you are using MySQL. What about utf8_romanian_ci (introduced in MySQL 6.0.4)?

On the other hand, you might use utf8_bin, for WordPress (which seems glued to MySQL 5).

Collation charts: http://collation-charts.org

I found some answers here, too: utf8_bin vs utf8_unicode_ci