I have a PHP script that stores values from a web store in a MySQL database. The store allows customers to leave a message, which can create havoc when they use emojis. To prevent these characters from breaking my script, I apply FILTER_SANITIZE_STRING and FILTER_FLAG_STRIP_HIGH to all my strings before sending them to MySQL.
This works well, except that when I display the data again in a Java program I've written, I get things like "I&#39;m" instead of "I'm".
Is there a way to have Java find and convert the ASCII values back into characters?
My current plan of attack is a function that takes each relevant string column, examines each word looking for &#, finds the position of the semicolon after the &#, replaces that sequence with the corresponding ASCII character, and returns the new string.
It's doable, but I'm hoping there is an existing means to do this without re-inventing the wheel.
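For reference, the manual approach described above could be sketched roughly as follows. This is only a minimal illustration using the standard library: the class name NumericEntityDecoder and the method decode are hypothetical names chosen for this example, and it handles only decimal references of the form &#NNN;, not named entities such as &amp; or hexadecimal references.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of any library.
class NumericEntityDecoder {
    // Matches decimal character references such as &#39; or &#128512;
    private static final Pattern NUMERIC_REF = Pattern.compile("&#(\\d{1,7});");

    // Replaces each decimal character reference with the character it encodes.
    static String decode(String input) {
        if (input == null) {
            return null;
        }
        Matcher m = NUMERIC_REF.matcher(input);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            // Copy the text between the previous match and this one unchanged
            out.append(input, last, m.start());
            int codePoint = Integer.parseInt(m.group(1));
            if (Character.isValidCodePoint(codePoint)) {
                out.appendCodePoint(codePoint);
            } else {
                // Leave references outside the Unicode range untouched
                out.append(m.group());
            }
            last = m.end();
        }
        // Copy whatever follows the last match
        out.append(input, last, input.length());
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(decode("I&#39;m here &#38; happy"));
    }
}
```

Working on the whole string with a regular expression avoids splitting into words first, and text that contains no reference comes back unchanged.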
Edit: Thank you to @rzwitserloot for pointing me in the right direction. For anyone who sees this and does not read my comment on his answer: I ended up using Jsoup. Here is a snippet of the final code related to this on the Java side, for anyone else working through this:
// Connect method opens a connection to the MySQL server
connect();
// Query the MySQL server
resultSet = statement.executeQuery("select * from order_tracking order by DateOrdered");
// If there are any results, iterate through them until the end is reached.
while (resultSet.next()) {
    // Add each returned row into the list to send to the table
    Jsoup.parse(resultSet.getString(2)).text()
    .
    .
    .
}
The .text() at the end of Jsoup.parse(String) gets rid of the HTML structure (i.e. the <head>, <body>, etc. tags) that Jsoup automatically wraps around the input, and returns only the text portion, with the &#39; (or whatever ASCII value it might be) converted back to the proper character.
Thanks!