OK. So let's look at what your code does:
// line 1
String s = "สวัสดี Mr.Java Sp'e c'i'a'l'' '";
We have a String with various international characters in it ... and some "'" characters.
// line 2
s = s.replaceAll("'", "&#39;");
Assuming that those are really "'" characters, we will replace all instances of "'" with an XML / HTML character entity ("&#39;"), giving us:
"สวัสดี Mr.Java Sp&#39;e c&#39;i&#39;a&#39;l&#39;&#39; &#39;"
And so ...
// line 3
s = StringEscapeUtils.escapeHtml(s);
This replaces any active HTML / XML characters with character references. This includes the ampersand characters "&" that you previously inserted as part of each "&#39;". The result is this:
"&#xxxx;&#xxxx;&#xxxx;&#xxxx; Mr.Java Sp&amp;#39;e c&amp;#39;i&amp;#39;a&amp;#39;l&amp;#39;&amp;#39; &amp;#39;"
(The &#xxxx; numeric character references encode those Thai (?) characters.)
When you embed that in an HTML document and display it, the browser decodes each "&amp;" back to "&", so you will see "สวัสดี Mr.Java Sp&#39;e c&#39;i&#39;a&#39;l&#39;&#39; &#39;"
See what has happened? You have HTML-escaped your already HTML-escaped apostrophes!
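To make the double escaping concrete, here is a minimal sketch. It assumes the manually inserted entity was "&#39;". It also uses a small hand-written escapeHtml stand-in (not the Commons Lang implementation) that escapes only &, <, >, " and non-ASCII characters, so it runs without any library on the classpath. The class name DoubleEscapeDemo is made up for illustration.

```java
class DoubleEscapeDemo {

    // Hypothetical stand-in for StringEscapeUtils.escapeHtml.
    // It only covers the cases that matter for this demonstration.
    static String escapeHtml(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            switch (cp) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default:
                    if (cp > 0x7F) {
                        // Non-ASCII: emit a decimal numeric character reference.
                        out.append("&#").append(cp).append(';');
                    } else {
                        // Plain ASCII, including the apostrophe, passes through.
                        out.append((char) cp);
                    }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String s = "Sp'e c'i'a'l";

        // Manual replacement first, then HTML escaping: the "&" gets escaped again.
        String manual = s.replace("'", "&#39;");
        System.out.println(escapeHtml(manual)); // Sp&amp;#39;e c&amp;#39;i&amp;#39;a&amp;#39;l

        // HTML escaping alone leaves the apostrophes untouched.
        System.out.println(escapeHtml(s));      // Sp'e c'i'a'l
    }
}
```

The first line of output is what the browser then renders as the literal text "Sp&#39;e c&#39;i&#39;a&#39;l"; the second renders with plain apostrophes.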
So what do you really need to do?
There is no need to replace apostrophes with "&#39;". Apostrophes are legal in HTML text.
There should be no need to add HTML escapes so that you can store text in a database:
Any modern database will allow you to store Unicode strings without any special encoding.
If you are trying to prevent the database's SQL parser getting confused by quotes in the text you are storing, you are doing it the wrong way. The right way to do this is to use a PreparedStatement, add parameter placeholders to the query, and use the PreparedStatement.setXxx methods to provide the parameter values. The execute (or whatever) will take care of any SQL escaping that needs to be done.
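A minimal sketch of the PreparedStatement approach might look like the following. The table name messages, the column body, and the class MessageStore are hypothetical, and the caller is assumed to supply an open JDBC Connection.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

class MessageStore {

    // Hypothetical table and column; the "?" is a parameter placeholder.
    static final String INSERT_SQL = "INSERT INTO messages (body) VALUES (?)";

    // Stores the text exactly as given: no HTML escaping, no manual quote handling.
    static int save(Connection conn, String body) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            // The value is bound as a parameter, never spliced into the SQL text.
            ps.setString(1, body);
            return ps.executeUpdate();
        }
    }
}
```

Calling MessageStore.save(conn, "สวัสดี Mr.Java Sp'e c'i'a'l'' '") passes the apostrophes and the Thai characters through unchanged. Apply HTML escaping only later, at the point where the text is written into an HTML page.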