39

as the title said, I have a problem between java and mysql

The mysql DB, tables, and columns are utf8_unicode_ci. I have an application that took some input from an xml, then compose the query...

public String [] saveField(String xmltag, String lang){     
  NodeList nodo = this.doc.getElementsByTagName(xmltag);
  String [] pos = new String[nodo.getLength()];     
  for (int i = 0 ; i < nodo.getLength() ; i++ ) {
     Node child = nodo.item(i);
     pos[i] =  "INSERT INTO table (id, lang, value) VALUES (" +
        child.getAttributes().getNamedItem("id").getNodeValue().toString() + " , " +
        lang + " , " + 
        "'" + child.getFirstChild().getTextContent() + "'" +
        ");";       
    }   
   return pos;
}

this method return an array of String that contains one or more SQL insert Query... then

Class.forName("com.mysql.jdbc.Driver").newInstance();
con = DriverManager.getConnection("jdbc:mysql:///dbname", "user", "pass");
.....
Statement s; s =
this.con.createStatement ();
s.execute(query);

both with s.execyte and s.executeUpdate the special characters are stored as ?

so special char are not stored correctly: מסירות קצרות is stored as ?????????

Hi! is stored as Hi!

Any advice?

Thanks

Marcx
  • 6,806
  • 5
  • 46
  • 69
  • 1
    How are you reading the source XML? Is it coming from a file or is it a string from a web service, or something else? It is possible that you original reading of the xml is causing the problem. – Robert Diana Jul 18 '10 at 12:58
  • it's a string from a webservice, i use db.parse("http://......") to get the xml contents... – Marcx Jul 18 '10 at 13:14

2 Answers2

96

Solved, I forgot to add the encoding when initializing Connection:

before was:

con = DriverManager.getConnection("jdbc:mysql:///dbname", "user", "pass");

now (working):

con = DriverManager.getConnection("jdbc:mysql:///dbname?useUnicode=true&characterEncoding=utf-8", "user", "pass");

Marcx
  • 6,806
  • 5
  • 46
  • 69
  • 2
    shouldn't it be utf8? https://dev.mysql.com/doc/connector-j/5.1/en/connector-j-reference-charsets.html – tObi Feb 19 '18 at 20:11
  • Yes it should but I can't suggest an edit because it is only one character – Phil Jan 09 '19 at 16:21
11

AUGH!

Okay, so, this isn't directly the thing you asked for, but this:

 pos[i] =  "INSERT INTO table (id, lang, value) VALUES (" +
    child.getAttributes().getNamedItem("id").getNodeValue().toString() + " , " +
    lang + " , " + 
    "'" + child.getFirstChild().getTextContent() + "'" +
    ");";       

Set off all my internal "DON'T DO THIS" alarms.

Do you have absolute and complete control over the incoming text? Are you sure someone won't have an apostrophe in the incoming text, even by accident?

Instead of creating SQL text, please refactor your code so that you end up calling:

PreparedStatement pstmt =
    con.prepareStatement("INSERT INTO table (id, lang, value) VALUES (?,?,?)");
// then, in a loop:
pstmt.setString(0, child.getAttributes().getNamedItem("id").getNodeValue().toString());
pstmt.setString(1, lang);
pstmt.setString(2, child.getFirstChild().getTextContent());
pstmt.execute();

That is, let the DB escape the text. Please, unless someday you want to have a conversation like this one. As an advantageous side effect, this approach may solve your problem, assuming that the string values are still correct when you read them from the XML. (As someone else mentioned, it's very possible that things are getting messed up when you read from the XML)

Daniel Martin
  • 23,083
  • 6
  • 50
  • 70
  • lol. I can understand a side comment but awhole answer picking on the poor fellow ? -1 – kellogs Aug 01 '13 at 10:14
  • @DanielMartin +1, is there a way to get final query string from `pstmt` after you set all its value, I need to know that so that I can log what query was executing. – Watt Jan 13 '14 at 21:03
  • @Watt better late than never: `pstmt.toString()` will do the trick – LoicAG Mar 27 '14 at 17:06