So I'm trying to insert blog comments into a database for an NLP experiment, but I'm running into a problem: I'm using prepared statements for the inserts, and all the single quotes are turning into question marks.
I'm testing on OS X and don't know which character encoding is in use. I assume it's the default (latin1_swedish_ci, etc.), but after a few hours of scattered Googling I haven't been able to figure out how to determine it. I'm submitting something like "I didn't say that" as a param to
PreparedStatement statement = connect.prepareStatement("INSERT IGNORE INTO bwog.article (article_id, date, title, content, url) VALUES (?, ?, ?, ?, ?)");
...
...
String s = "I didn't say that"; // not a literal in the real code, but println shows it like this
statement.setString(4, s);
and it ends up as "I didn?t say that" in the database after the statement executes.
I assume I'm missing some precondition that I either didn't know about or forgot to fulfill.
SOLUTION: It was character encoding. The database and tables were in UTF-8, but the command-line connection was using latin1 for all the "character_set%" variables, so even though the stored data was fine, it appeared garbled.
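For anyone hitting the same symptom, here is a minimal Java sketch of the general mechanism, not the actual code from this project. It rests on an assumption that is not confirmed above: that the apostrophe in the scraped comment was a typographic one (U+2019) rather than a plain ASCII '. Java's ISO-8859-1 stands in for a latin1 step somewhere along the way, and the class and method names (QuoteEncodingDemo, roundTrip) are made up for the illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class QuoteEncodingDemo {

    // Encode the text to bytes in the given charset, then decode it again.
    // String.getBytes(Charset) substitutes the charset's replacement byte
    // ('?' for ISO-8859-1) for any character the charset cannot represent.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        // ASSUMPTION: the comment contains U+2019 (right single quotation mark).
        // It is written as an escape so the source file's encoding does not matter.
        String s = "I didn\u2019t say that";

        // UTF-8 can represent U+2019, so the text survives unchanged.
        System.out.println(s.equals(roundTrip(s, StandardCharsets.UTF_8))); // true

        // ISO-8859-1 has no U+2019, so the character is replaced.
        System.out.println(roundTrip(s, StandardCharsets.ISO_8859_1)); // I didn?t say that
    }
}
```

A plain ASCII apostrophe would pass through both conversions untouched, which is why the problem only shows up with text copied from blogs and word processors. To see what a MySQL session is actually using, running SHOW VARIABLES LIKE 'character_set%'; in the client lists the variables mentioned in the solution.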