2

I want to compare UTF-8 strings in Java and MySQL, and I am confused about how to do it. Is it ordinary string comparison, or is something special required? Any suggestions, please.

jensgram
IamIronMAN

2 Answers

1

Java Strings are sequences of UTF-16 code units.

There are Charset classes to manage conversion to and from other character encodings. UTF-8 and UTF-16 are both encodings of the same Unicode character set, so conversion from UTF-8 to UTF-16 is lossless and should not be problematic. By the time you have a String the conversion will have happened, and all normal String operations apply.
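As a rough illustration of that conversion, here is a minimal sketch that decodes raw UTF-8 bytes into a String and then compares it the ordinary way. The class name `Utf8Decode` and helper `fromUtf8` are made up for this example; `StandardCharsets.UTF_8` requires Java 7 or later.

```java
import java.nio.charset.StandardCharsets;

public class Utf8Decode {
    // Hypothetical helper: decode raw UTF-8 bytes (e.g. read from a file
    // or socket) into a Java String, which is UTF-16 internally.
    public static String fromUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "café": the 'é' (U+00E9) is the two bytes 0xC3 0xA9 in UTF-8.
        byte[] utf8 = {0x63, 0x61, 0x66, (byte) 0xC3, (byte) 0xA9};
        String decoded = fromUtf8(utf8);

        // Once decoded, ordinary String operations apply.
        System.out.println(decoded.equals("caf\u00E9")); // true
        System.out.println(decoded.length());            // 4 (five bytes, four chars)
    }
}
```

With JDBC you normally never do this decoding yourself; the driver hands you a String that has already been converted.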

See http://java.sun.com/javase/technologies/core/basic/intl/faq.jsp

This StackOverflow question addresses the use of UTF-8 in Java/MySQL.

Community
djna
  • @dj thanks. I configured Unicode UTF-8 in MySQL and Java. My next task is string comparison; how do I achieve that? I am looking at the Oracle link that you sent. Can you guide me? Please and thanks. – IamIronMAN Nov 22 '10 at 07:35
  • @dj thanks. What is the difference between UTF-8 and UTF-16? Why does Java use UTF-16 and not UTF-8? MySQL and JDBC are both configured for UTF-8. If I am wrong in my approach, please correct me. Thank you – IamIronMAN Nov 22 '10 at 07:49
  • Your Strings will seamlessly contain a (UTF-16) representation of the UTF-8 values. Just read from the database and look in the String. Read up on all the capabilities of Java Strings. Give a specific comparison problem if that doesn't make sense. – djna Nov 22 '10 at 13:22
1

The obvious approach is to just compare the strings using Java String.equals(String) etc. Java Strings are Unicode Strings*, and Java and MySQL both know how to handle Unicode data.

The only tricky bit is making sure that you've configured MySQL to use Unicode (typically UTF-8). In older versions of MySQL this typically entailed defining the database appropriately AND using certain parameters in the JDBC URL. Either way, consult the MySQL and MySQL connector documentation for the versions you are using.

* Strictly speaking, Java Strings are encoded in a form of UTF-16, with Unicode codepoints beyond 65535 represented as a pair of char values.
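To make the footnote concrete, here is a small sketch showing a code point above 65535 stored as two `char` values. The class name `SurrogateDemo` and its helper are invented for this illustration; the String methods used are standard.

```java
public class SurrogateDemo {
    // Hypothetical helper: number of Unicode code points, as opposed to
    // the number of UTF-16 char values reported by String.length().
    public static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // U+1D11E (MUSICAL SYMBOL G CLEF) lies beyond 65535, so it is
        // stored as the surrogate pair D834 DD1E.
        String clef = "\uD834\uDD1E";

        System.out.println(clef.length());    // 2 char values
        System.out.println(codePoints(clef)); // 1 code point

        // equals() still compares correctly, because both strings hold
        // the same sequence of char values.
        String same = new String(Character.toChars(0x1D11E));
        System.out.println(clef.equals(same)); // true
    }
}
```

For equality checks the pair representation makes no difference; it only matters when counting or indexing individual characters.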

Stephen C
  • @Stephan thanks. Yes, I configured MySQL for Unicode UTF-8 and added ?useUnicode=true&characterEncoding=utf8 to the JDBC URL. Now I can save content in other languages in MySQL and retrieve it in the correct format. My next concern is string comparison for Java and MySQL in UTF-8. How can I achieve that? Can you please guide me? – IamIronMAN Nov 22 '10 at 07:22
  • @Leo-win - if the database and JDBC driver are configured to use UTF-8, then just use `String.equals()` or SQL `=` or `LIKE`. – Stephen C Nov 22 '10 at 13:45
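The two options from the last comment could look roughly like the sketch below. It is only an illustration under assumptions: the class `NameLookup`, the table `people` and the column `name` are hypothetical, the URL parameters are the ones quoted in the comment above, and a MySQL JDBC driver must be on the classpath for a real connection.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class NameLookup {
    // Build a JDBC URL using the parameters quoted in the comment above.
    // Host and database name are placeholders supplied by the caller.
    public static String jdbcUrl(String host, String database) {
        return "jdbc:mysql://" + host + "/" + database
                + "?useUnicode=true&characterEncoding=utf8";
    }

    // Option 1: let MySQL do the comparison with SQL `=`.
    // Table `people` and column `name` are hypothetical.
    public static boolean existsInDatabase(Connection conn, String name)
            throws SQLException {
        String sql = "SELECT 1 FROM people WHERE name = ?";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, name); // the driver encodes the String for the server
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next();
            }
        }
    }

    // Option 2: read the value into Java and compare with String.equals().
    public static boolean sameName(String fromDatabase, String expected) {
        return fromDatabase.equals(expected);
    }
}
```

One difference to keep in mind: SQL `=` and `LIKE` follow the column's collation, so a case-insensitive collation such as `utf8_general_ci` treats "abc" and "ABC" as equal, whereas `String.equals()` compares exactly.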