
Hi, I've made an application in Java that lets the user store data in a MySQL database. Whenever I enter text such as به‌رواری ده‌رچوون, I just get ?????? back. Is there any way to fix that? It's really important that this is done before tomorrow. I've never faced a problem like this. I'm creating a database app for a company in Kurdistan, and my friend just asked whether it stores data with their characters. Heart attack! Please help!

I'm using phpMyAdmin on localhost with XAMPP.

1010
  • are you using unicode? are you storing and reading your data with the proper character encoding? – 1010 May 05 '15 at 18:20
  • What is proper character encoding? And yes I am using UTF8_Unicode_ci – Daroan Adnan May 05 '15 at 18:25
  • I mean that you must be consistent with the encoding when reading and writing. [here](https://dev.mysql.com/doc/refman/5.0/en/charset-unicode.html) says that mysql 5.0 supports those arabic characters. – 1010 May 05 '15 at 18:33
  • please post a [mcve](http://stackoverflow.com/help/mcve) showing this behaviour, and the create statement of your table. – 1010 May 05 '15 at 18:41
  • Okay I just managed to make my mysql database capable of handling kurdish characters, but there is still a problem. Whenever I try to add new data via phpmyadmin there seem to be no problem but with my java application which I have built there is. – Daroan Adnan May 05 '15 at 19:03
  • @1010 Do you have any clue? – Daroan Adnan May 05 '15 at 19:06
  • @BK435 Or do you have any clue? – Daroan Adnan May 05 '15 at 19:06
  • Post a new question for the Java issue. Include code samples where you configure/open the connection to the database. Issue will 10-1 be the same: non matching encodings. – Peter Tillemans May 05 '15 at 20:14
  • @DaroanAdnan not without seeing any code. – 1010 May 05 '15 at 21:04
  • check [these questions](http://stackoverflow.com/search?q=arabic+mysql+java), some of them seem to be a similar problem. – 1010 May 05 '15 at 22:22
  • @1010 - since there are at least 4 different ways to screw up with `CHARACTER SET`, your link to "these questions" should be further filtered down to "????" cases. In particular the first one (`ابو نص`) would wrongly imply that the garbled text can be converted back to the right text. – Rick James May 05 '15 at 23:30

2 Answers

2

What happened:

  • you had utf8-encoded data (good)
  • SET NAMES latin1 was in effect (default, but wrong)
  • the column was declared CHARACTER SET latin1 (default, but wrong)

As you INSERTed the data, it was converted to latin1, which does not have values for Arabic (Kurdish/Farsi/etc) characters, so question marks replaced them.
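The same replacement can be reproduced in Java without a database. This is only a sketch of the mechanism: it uses Java's ISO-8859-1 charset as a stand-in for MySQL's latin1 (the two are close but not identical), and the class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

final class Latin1Demo {
    // Encode to a Latin-1 byte sequence and decode it again.
    // Characters that Latin-1 cannot represent are replaced by '?'.
    static String roundTripThroughLatin1(String text) {
        byte[] latin1Bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1Bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Seven code points: heh, zero-width non-joiner, reh, tcheh, waw, waw, noon
        String kurdish = "\u0647\u200C\u0631\u0686\u0648\u0648\u0646";
        System.out.println(roundTripThroughLatin1(kurdish)); // prints "???????"
        System.out.println(roundTripThroughLatin1("abc"));   // prints "abc"
    }
}
```

Once the characters have become `?`, the original text is gone from the byte sequence, which is why the damage cannot be undone afterwards.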

The cure (for future INSERTs):

  • utf8-encoded data (good)
  • mysqli_set_charset($conn, 'utf8') in PHP (or whatever your client needs for establishing the connection's CHARACTER SET)
  • check that the column(s) and/or table default are CHARACTER SET utf8
  • If you are displaying on a web page, <meta...utf8> should be near the top.
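For a Java client, the usual equivalent of the connection-charset step is to pass the MySQL Connector/J properties `useUnicode` and `characterEncoding` in the JDBC URL. The sketch below assumes Connector/J is on the classpath; the helper names, and the host and database name `ab1` borrowed from the other answer, are placeholders rather than anything from the asker's code.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

final class Utf8Connection {
    // Hypothetical helper: builds a JDBC URL that asks the driver to use UTF-8.
    static String buildUrl(String host, String database) {
        return "jdbc:mysql://" + host + "/" + database
                + "?useUnicode=true&characterEncoding=UTF-8";
    }

    // Opens the connection; needs a running MySQL server and the Connector/J driver.
    static Connection open(String host, String database, String user, String password)
            throws SQLException {
        return DriverManager.getConnection(buildUrl(host, database), user, password);
    }

    public static void main(String[] args) {
        // Only prints the URL, so it runs without a database.
        System.out.println(buildUrl("localhost", "ab1"));
    }
}
```

The URL parameters only cover the connection; the columns still have to be declared with a utf8 or utf8mb4 character set for the text to be stored intact.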

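The discussion above is about CHARACTER SET, the encoding of characters. COLLATION, which is used for comparing and sorting, is a separate setting: with CHARACTER SET utf8 use utf8_unicode_ci, and with CHARACTER SET utf8mb4 use utf8mb4_unicode_ci.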

To double check that the data is stored correctly, do SELECT col, HEX(col)....
ه‌رچوون should come back D987E2808CD8B1DA86D988D988D986
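Arabic-script characters in utf8 have hex of D8xx through DBxx. Most are D8xx or D9xx; some Kurdish/Farsi letters, such as چ (DA86), are DAxx or DBxx. E2808C is the zero-width non-joiner.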
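To get the expected hex on the Java side and compare it with what `HEX(col)` returns, something like the following works. It is a small sketch with made-up names, and the string is written with `\u` escapes so the result does not depend on the source file's encoding.

```java
import java.nio.charset.StandardCharsets;

final class Utf8Hex {
    // Returns the UTF-8 bytes of the text as uppercase hex, like MySQL's HEX().
    static String utf8Hex(String text) {
        StringBuilder hex = new StringBuilder();
        for (byte b : text.getBytes(StandardCharsets.UTF_8)) {
            hex.append(String.format("%02X", b & 0xFF));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        // heh, zero-width non-joiner, reh, tcheh, waw, waw, noon
        String sample = "\u0647\u200C\u0631\u0686\u0648\u0648\u0646";
        System.out.println(utf8Hex(sample)); // prints D987E2808CD8B1DA86D988D988D986
    }
}
```

If the database returns a run of 3F instead, the text was already turned into question marks before it was stored.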

(utf8mb4 works just as well as utf8; either works for Arabic.)

Bad news: The data that was inserted and turned into '???' cannot be recovered.

Rick James
  • There is one problem though, I have the COLLATION of the mysql server the same as the data stored in the tables. utf8mb4 that would say, but I can't compare strings of 4 bytes, only 3... – Daroan Adnan May 06 '15 at 16:45
  • Each Arabic _character_ occupies 2 _bytes_. `D987` is hex for 2 bytes. If the `CHARACTER SET` is `utf8`, use `COLLATION utf8_unicode_ci`. for `CHARACTER SET utf8mb4`, use `COLLATION utf8mb4_unicode_ci`. – Rick James May 06 '15 at 20:40
0

Put the code below in your PHP connection file, then set the collation of the database and tables in phpMyAdmin (MySQL) to utf8_general_ci or utf8_unicode_ci:

<?php
// Connection settings (XAMPP defaults; adjust for your setup)
$servername = "localhost";
$username = "root";
$password = "";
$databasename = "ab1";

$conn = new mysqli($servername, $username, $password, $databasename);
if ($conn->connect_error) {
    die("Connection failed: " . $conn->connect_error);
}
// Tell MySQL that this client sends and expects utf8 (same effect as SET NAMES utf8)
if (!$conn->set_charset("utf8")) {
    die("Could not set character set: " . $conn->error);
}
echo "Connected successfully";
?>