I want to insert Arabic text into the database, but I always get characters like this: ابو نص. My pages use UTF-8 encoding, and my database collation is utf8_general_ci.

I have read many questions similar to this one, but I haven't found a solution for my case.

This is a solution, but it is for PHP, and I don't know how to do the same thing in Java.

The insert code (using JdbcTemplate):

final String move_insert =
        "insert into r_movement (PPR,cd_fonc,nom_etabl,ville,delegation,date_debut,date_fin,nbjour,nbmois,nbannees,cina,cinn) "
        + "values (?,?,?,?,?,?,?,?,?,?,?,?)";

// JdbcTemplate.update takes the bind values as varargs, in column order.
getJdbcTemplate().update(move_insert,
        move.getPpr(), move.getFonction(), move.getNom_etabl(), move.getVille(),
        move.getDelegation(), move.getDate_debut(), move.getDate_fin(),
        c.getNbjours(), c.getNbmois(), c.getNbyears(),
        move.getCina(), move.getCinn());

This is my table:

CREATE TABLE `r_movement` (
 `id_move` int(11) NOT NULL AUTO_INCREMENT,
 `PPR` int(11) NOT NULL,
 `cd_fonc` varchar(255) CHARACTER SET utf8 NOT NULL,
 `nom_etabl` varchar(255) CHARACTER SET utf8 NOT NULL,
 `ville` varchar(255) CHARACTER SET utf8 NOT NULL,
 `delegation` varchar(255) CHARACTER SET utf8 NOT NULL,
 `date_debut` date NOT NULL,
 `date_fin` date NOT NULL,
 `nbjour` int(255) NOT NULL,
 `nbmois` int(255) NOT NULL,
 `nbannees` int(255) NOT NULL,
 `CINA` varchar(255) CHARACTER SET utf8 NOT NULL,
 `CINN` varchar(255) CHARACTER SET utf8 NOT NULL,
 PRIMARY KEY (`id_move`)
) ENGINE=InnoDB AUTO_INCREMENT=17 DEFAULT CHARSET=utf8
Souad
  • First step: separate out the database access from the web page part. I suggest you write a short console app which *just* inserts data and then retrieves it. Diagnose the strings by printing out their UTF-16 code units (use `charAt` and convert each `char` to an `int`). Also, please show the code you're using to insert the data. – Jon Skeet Jun 02 '13 at 18:57
  • I also use Arabic characters with MySQL. I use `InnoDB` with the default charset `utf8`, and I have no problem with it. Did you check whether the characters also look like `ابو Ù†Ø` inside the database? – Azad Jun 02 '13 at 19:02
  • My table is InnoDB too, but which language do you use to insert the data, @AzadOmer? – Souad Jun 02 '13 at 19:07
  • @JonSkeet I do separate the database access from the web page; I'm using the MVC pattern. – Souad Jun 02 '13 at 19:10
  • @AzadOmer Yes, the characters look like this, ابو Ù†Ø, in the database and also on the page after inserting. – Souad Jun 02 '13 at 19:12
  • @Souad: I am using Java, and I have a lot of records in Arabic in my `MySQL` database. – Azad Jun 02 '13 at 19:12
  • Which collation does your database use? I have utf8_general_ci. – Souad Jun 02 '13 at 19:15
  • My point is that in order to *diagnose the problem* you should completely separate the two. Work out whether the problem is on the web side or the database side. – Jon Skeet Jun 02 '13 at 19:15
  • Make sure to read http://stackoverflow.com/questions/138948/how-to-get-utf-8-working-in-java-webapps in case it covers what you need to get your setup working. – Bobulous Jun 02 '13 at 19:19
  • I tried inserting Arabic data into the database directly through phpMyAdmin and everything was fine, so the problem is on the web side. – Souad Jun 02 '13 at 19:20
  • @Arkanon I configured my app as described in the link you provided, but should I create the CharsetFilter class and change the web.xml file? – Souad Jun 02 '13 at 19:32
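
The code-unit diagnostic suggested in the comments can be sketched roughly as below. This is only an illustration: the class and method names are made up, and the sample string is the Arabic text from the question written with Unicode escapes so that the source file encoding cannot interfere.

```java
class CodeUnitDump {
    // Formats every UTF-16 code unit of the string as U+XXXX, separated by spaces.
    static String codeUnits(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            // charAt returns a char; the cast to int lets us print it as hex.
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The text from the question, as Unicode escapes.
        String text = "\u0627\u0628\u0648 \u0646\u0635";
        // prints: U+0627 U+0628 U+0648 U+0020 U+0646 U+0635
        System.out.println(codeUnits(text));
    }
}
```

Calling `codeUnits` on the value just before the insert and again on the value read back shows where it breaks: intact Arabic letters fall in the U+0600 to U+06FF range, while a string that was already decoded with the wrong charset shows values such as U+00D8 and U+00D9.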

3 Answers

I finally solved the problem: the encoding filter configuration was missing from the web.xml file.

<filter>
    <filter-name>encoding-filter</filter-name>
    <filter-class>org.springframework.web.filter.CharacterEncodingFilter</filter-class>
    <init-param>
      <param-name>encoding</param-name>
      <param-value>utf-8</param-value>
    </init-param>
    <init-param>
      <param-name>forceEncoding</param-name>
      <param-value>true</param-value>
    </init-param>
  </filter>
  <filter-mapping>
    <filter-name>encoding-filter</filter-name>
    <url-pattern>/*</url-pattern>
    <dispatcher>REQUEST</dispatcher>
    <dispatcher>FORWARD</dispatcher>
  </filter-mapping>

I can now insert Arabic data into the database safely. Thanks!
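
For context, a likely reason the filter matters: as far as I know, the Servlet specification falls back to ISO-8859-1 when a request does not declare an encoding, so UTF-8 form bytes are decoded one byte per character. The sketch below only simulates that effect in plain Java, with no servlet container involved; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

class WrongDecodingDemo {
    // Simulates a server that receives UTF-8 bytes from the browser
    // but decodes them as ISO-8859-1.
    static String decodeAsLatin1(String typedByUser) {
        byte[] wire = typedByUser.getBytes(StandardCharsets.UTF_8);
        return new String(wire, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // The text from the question, as Unicode escapes.
        String typed = "\u0627\u0628\u0648 \u0646\u0635";
        String garbled = decodeAsLatin1(typed);
        System.out.println("typed length:   " + typed.length());   // 6
        System.out.println("garbled length: " + garbled.length()); // 11
        // The first garbled character is U+00D8, the lead byte of the first letter.
        System.out.println(String.format("first char: U+%04X", (int) garbled.charAt(0)));
    }
}
```

Each Arabic letter takes two bytes in UTF-8, so the six typed characters turn into eleven wrong ones before JdbcTemplate ever sees the value. That would also explain why changing only the database settings had no effect.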

Souad

Try setting the character encoding in the connection string, as explained in the docs, e.g.:

jdbc:mysql://localhost/some_db?useUnicode=yes&characterEncoding=UTF-8

You can also set this in the server configuration; see the docs.

akostadinov
  • I tried it and I get this error: **The reference to entity "characterEncoding" must end with the ';' delimiter.** – Souad Jun 02 '13 at 20:07
  • I set this configuration in my **spring-datasource.xml** file. – Souad Jun 02 '13 at 20:09
  • @Souad, you need to escape `&` characters to make the XML valid. By the way, see the description of these options in this doc: http://dev.mysql.com/doc/refman/5.7/en/connector-j-reference-configuration-properties.html – akostadinov Jun 02 '13 at 20:15
  • Now I cannot log in to my application! I have another problem, with Spring Security. – Souad Jun 02 '13 at 20:20
  • @Souad, perhaps with this encoding your password string looks different. Try resetting your password to see if this is the case. – akostadinov Jun 02 '13 at 20:27
  • You mean change it to another one in the database? – Souad Jun 02 '13 at 20:28
  • I changed it to another password and I still can't get into the application. – Souad Jun 02 '13 at 20:32
  • @Souad, try using some simple characters, such as only lowercase Latin letters, which are encoded identically in ASCII and UTF-8. Perhaps there is some conversion happening that breaks stuff, but you would need to debug what's going on. Look at the DB configuration. Put chars in your DB and see how they come out of it to understand where your configuration breaks. – akostadinov Jun 02 '13 at 20:35
  • My password is not in Arabic; it's Latin. I have no problem inserting Arabic data manually into the database. However, Firebug shows these response headers: Server: Apache-Coyote/1.1, Content-Type: text/html;charset=ISO-8859-1, Content-Language: fr, Date: Sun, 02 Jun 2013 20:34:50 GMT, Content-Length: 2435 – Souad Jun 02 '13 at 20:37
  • Please help: how do I insert Arabic data safely into the database? Should I change the charset of the server? @akostadinov – Souad Jun 02 '13 at 21:30
  • @Souad, did you create your fields with the correct character set (not collation)? http://dev.mysql.com/doc/refman/5.0/en/charset-column.html – akostadinov Jun 03 '13 at 03:22
  • I edited my question and added my table structure. Please check it, and thank you @akostadinov. – Souad Jun 03 '13 at 11:42
  • I changed the charset of the Apache server by adding **AddDefaultCharset UTF-8** at the end of httpd.conf; I set the charset of the whole database to UTF-8; I set the charset of the JVM; I added "?useUnicode=true&characterEncoding=utf8" to the connection URL; I changed the charset of every field in the database to UTF-8; and I set the meta tag and the HTML pages to UTF-8. NOTHING HAS CHANGED! I'm losing my mind. – Souad Jun 03 '13 at 11:51
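
The XML error quoted in the comments comes from the bare `&` in the JDBC URL. Here is a minimal Java sketch of the two forms of the same URL. The class and method names are hypothetical, and `localhost` and `some_db` are just the placeholder values from the answer.

```java
class JdbcUrlExample {
    // Hypothetical helper: builds the URL from the answer, with the two
    // Connector/J properties that request UTF-8 on the connection.
    static String jdbcUrl(String host, String database) {
        return "jdbc:mysql://" + host + "/" + database
                + "?useUnicode=yes&characterEncoding=UTF-8";
    }

    // Inside an XML file (such as a Spring datasource definition), a literal '&'
    // has to be written as the entity "&amp;".
    static String escapeAmpersands(String value) {
        return value.replace("&", "&amp;");
    }

    public static void main(String[] args) {
        String url = jdbcUrl("localhost", "some_db");
        System.out.println("In Java code: " + url);
        System.out.println("In XML:       " + escapeAmpersands(url));
    }
}
```

The unescaped form is what the driver should receive. The `&amp;` form belongs only in the XML text, where the parser turns it back into `&` before Spring reads the value.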

This Answer (belatedly) discusses how to recover the mojibaked text.

ابو نص represents ابو نص. Hex: D8A7D8A8D98820D986D8B5. That's 5 Arabic characters (Dxxx), plus a space (20).
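
The hex value can be checked with a few lines of Java. This is only a sketch with made-up names; the string is the Arabic text from the question written as Unicode escapes.

```java
import java.nio.charset.StandardCharsets;

class Utf8HexDemo {
    // Returns the UTF-8 encoding of the string as uppercase hex digits.
    static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            // Mask to 0..255 so negative byte values print as two hex digits.
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints: D8A7D8A8D98820D986D8B5
        System.out.println(utf8Hex("\u0627\u0628\u0648 \u0646\u0635"));
    }
}
```

Comparing this with `SELECT HEX(col)` on the stored row is one way to see whether the database holds the correct bytes or a double-encoded version of them.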

What caused the problem:

  • The bytes you have in the client are correctly encoded in utf8.
  • You connected with latin1, probably by default. (It should have been utf8.)
  • The column in the table was declared CHARACTER SET latin1. (Or possibly it was inherited from the table/database.) (It should have been utf8.)

The fix for the data is a "2-step ALTER".

ALTER TABLE Tbl MODIFY COLUMN col VARBINARY(...) ...;
ALTER TABLE Tbl MODIFY COLUMN col VARCHAR(...) ... CHARACTER SET utf8 ...;

where the lengths are big enough and the other "..." have whatever else (NOT NULL, etc) was already on the column.
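
The idea behind the two-step ALTER (treat the stored characters as raw bytes again, then reinterpret those bytes as utf8) can be illustrated in memory with plain Java. This is a sketch under assumptions, not a replacement for the ALTER statements: it assumes MySQL's latin1 matches windows-1252 for the bytes involved, and the class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class MojibakeRepair {
    // Assumption: MySQL's latin1 corresponds to windows-1252 for these bytes.
    static final Charset CP1252 = Charset.forName("windows-1252");

    // Reproduces the damage: UTF-8 bytes read as if they were windows-1252.
    static String garble(String correct) {
        return new String(correct.getBytes(StandardCharsets.UTF_8), CP1252);
    }

    // Step 1: turn the wrong characters back into the original bytes.
    // Step 2: decode those bytes as UTF-8, which they were all along.
    static String repair(String garbled) {
        byte[] rawBytes = garbled.getBytes(CP1252);
        return new String(rawBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u0627\u0628\u0648 \u0646\u0635";
        String garbled = garble(original);
        System.out.println("garbled length: " + garbled.length()); // 11
        System.out.println("round trip ok:  " + repair(garbled).equals(original)); // true
    }
}
```

The round trip only works while every byte survived the wrong decoding. Text that was truncated or replaced with `?` along the way cannot be recovered like this.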

By the way, utf8_general_ci is a "collation"; only the "character set", utf8, is relevant to this problem.

Rick James