1

Say I have:

    ID                      Title
 ------------------------------------------------------
|   1   |                   ماهر زين                   |
 ------------------------------------------------------

Currently it's data type is set to VARCHAR(255) with collation=utf8-default collation.

Based On research I had I found that You have to have Table column with the data type set to NVARCHAR to be able to store unicode or arabic characters. So I tried to change the Data Type of my column to NVARCHAR But it gives this error:

Query:

ALTER TABLE `db`.`table` 
CHANGE COLUMN `NAME` `NAME` NVARCHAR(255) CHARACTER SET 'utf8' NULL DEFAULT NULL ;

Error:

Operation failed: There was an error while applying the SQL script to the database. Executing: ALTER TABLE db.table CHANGE COLUMN NAME NAME NVARCHAR(255) CHARACTER SET 'utf8' NULL DEFAULT NULL ;

ERROR 1064: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'CHARACTER SET 'utf8' NULL DEFAULT NULL' at line 2 SQL Statement: ALTER TABLE db.table CHANGE COLUMN NAME NAME NVARCHAR(255) CHARACTER SET 'utf8' NULL DEFAULT NULL

FYI: I'm doing this conversion with MySql workbench manually.

Ahmad Nabi
  • 43
  • 3
  • 11
  • @ÁlvaroGonzález I can see the `NVARCHAR` while going to `workbench -> table -> settings -> column -> Data Type ->` I have `NVARCHAR` available but it is not changable so far – Ahmad Nabi Jun 19 '17 at 15:16
  • I guess you are not allowed to provide a `CHARACTER SET` in this case because `utf8` is predefined but please read the link I posted below - `NVARCHAR` will NOT change anything – Hans Jun 19 '17 at 15:17
  • @Okay the link clearly states that `CHAR & NCHAR` is predefined as `utf8` But the Problem as I said bellow is: i'm requesting the data from eclipse java, which then it show the arabic characters as ?????? ??? – Ahmad Nabi Jun 19 '17 at 15:22
  • Guess you don't have to write `NAME` _twice_ in the command. Nonetheless, your problem seems to be on the client-side (eg lack of `SET NAMES 'UTF-8'`), so this ALTER TABLE wouldn't solve it. – Zsigmond Lőrinczy Jun 19 '17 at 15:31
  • @ZsigmondLőrinczy The query is being generated by MySql workbench. And by the problem do you mean at my eclipse Java code? – Ahmad Nabi Jun 19 '17 at 15:37
  • Without database, is your Java-program able to produce Arabian output? – Zsigmond Lőrinczy Jun 19 '17 at 16:53

2 Answers2

3

There is no need for NVARCHAR here as Mysql handles Unicode fine with VARCHAR. (Actually, NVARCHAR is just VARCHAR with predefined utf8 char set - see https://dev.mysql.com/doc/refman/5.7/en/charset-national.html)

Maybe you are confusing it with MSSQL?

Hans
  • 447
  • 3
  • 10
  • See Actuall thing is that i'm requesting the data from eclipse java, which then it show the arabic characters as ?????? ??? – Ahmad Nabi Jun 19 '17 at 15:13
  • Funny, I didn't knew these aliases existed. – Álvaro González Jun 19 '17 at 15:13
  • Do the number of characters match? I had this happen to me when the font (in MySQL workbench) couldn't display the characters – Hans Jun 19 '17 at 15:14
  • @ReneS The Actual row is as `ماهر زين ومصطفى جيجيلي - بِكَ مُلهِمي - Maher Zain | Mustafa Ceceli - Bika Moulh` and the output I get is: `???? ??? ?????? ?????? - ???? ??????? - Maher Zain | Mustafa Ceceli - Bika Moulh` – Ahmad Nabi Jun 19 '17 at 15:26
2

I see 3 Questions from you that all seem to boil down to Arabic turning into "question marks". Search for that in http://stackoverflow.com/questions/38363566/trouble-with-utf8-characters-what-i-see-is-not-what-i-stored . It discusses the likely causes.

However, you have tried some ALTERs -- These may have made things worse. So let's do some diagnosing. Do SELECT HEX(...) as discussed in that link under "Test the data". The hex for 'ماهر', if correctly stored as utf8, should show as D985 D8A7 D987 D8B1. If you see anything different, the problem gets messier.

3F3F3F3F (hex for 4 question marks) is what you get if latin1 is involved. C399E280A6C398C2A7C399E280A1C398C2B1 would be "double encoding".

Anyway, the likely cause of question marks is

  • The bytes to be stored are not encoded as utf8/utf8mb4. Fix this. -- Dump the Arabic text in HEX from Java.
  • The column in the database is CHARACTER SET utf8 (or utf8mb4). Fix this. -- Please provide SHOW CREATE TABLE for verification.
  • Also, check that the connection during reading is UTF-8. -- Theoretically, you dealt with this via &useUnicode=yes&characterEncoding=UTF-8.
Rick James
  • 135,179
  • 13
  • 127
  • 222