0

I have an issue with invalid characters appearing in an Oracle database. The "¿" or upside-down question mark. I know its caused by UTF8 encoding's being put into the encoding of the Oracle database. Examples of characters I am expecting are ' – ' which looks like a normal hyphen but isnt, and ' ’ ' which should be a normal single quote.

Input: "2020-08-31 – 2020-12-31" - looks like normal hyphen but not Output: "2020-08-31 ¿ 2020-12-31"

I know the primary source of the characters are copy and paste from office programs like word and its conversion to smart quotes and stuff. Getting the users to turn off this feature is not a viable solution.

There was a good article about dealing with this and they offered a number of solutions:

  • switch database to UTF8 - cant do it due to reasons
  • change varchar2 to nvarchar2 - I tried this solution as it would be the best for my needs, but when tested I was still ending up with invalid characters, so its not the only point.
  • use blob/clob - tried and got page crashes
  • filter - not attractive as there are a lot of places that would need to be updated.

So the technologies being used are Eclipse, Spring, JDBC version 4.2, and Oracle 12.

So the information is entered into the form, the form gets saved, it passes from the controller into the DAO and when its checked here, the information is correct. Its when it passes here into the JDBC where I lose sight of it, and once it enters the database I cant tell if its already changed or if thats where its happening, but the database stores the invalid character.

So somewhere between the insert statement to the database this is happening. So its either the JDBC or the database, but how to tell and how to fix? I have changed the field where the information is being stored, its originally a varchar2 but I tried nvarchar2 and its still invalid.

I know that this question is asking about the same topic, but the answers do not help me.

SELECT * FROM V$NLS_PARAMETERS WHERE PARAMETER IN ('NLS_CHARACTERSET', 'NLS_NCHAR_CHARACTERSET');

PARAMETER                                                        VALUE                                                                CON_ID
---------------------------------------------------------------- ---------------------------------------------------------------- ----------
NLS_CHARACTERSET                                                 WE8ISO8859P15                                                             0 
NLS_NCHAR_CHARACTERSET                                           AL16UTF16                                                                 0 

So in the comments I was given a link to something that might help but its giving me an error.

@Override public Long createComment(Comment comment) {
        return jdbcTemplate.execute((Connection connection) -> {
            PreparedStatement ps = connection.prepareStatement("INSERT INTO COMMENTS (ID, CREATOR, CREATED, TEXT) VALUES (?,?,?,?)", new int[] { 1 });
            ps.setLong(1, comment.getFileNumber());
            ps.setString(2, comment.getCreator());
            ps.setTimestamp(3, Timestamp.from(comment.getCreated()));
            ((OraclePreparedStatement) ps).setFormOfUse(4, OraclePreparedStatement.FORM_NCHAR);
            ps.setString(4, comment.getText());
            if(ps.executeUpdate() != 1) return null;
            ResultSet rs = ps.getGeneratedKeys();
            return rs.next() ? rs.getLong(1) : null;
        });

An exception occurred: org.apache.tomcat.dbcp.dbcp2.DelegatingPreparedStatement cannot be cast to oracle.jdbc.OraclePreparedStatement

Fering
  • 322
  • 2
  • 18

1 Answers1

1

Character set WE8ISO8859P15 (resp. ISO 8859-15) does not contain or , so you cannot store them in a VARCHAR2 data field.

Either you migrate the database to Unicode (typically AL32UTF8) or use NVARCHAR2 data type for this column.

I am not familiar with Java but I guess you have to use getNString. The linked document should provide some hints to solve the issue.

Wernfried Domscheit
  • 54,457
  • 9
  • 76
  • 110
  • The article was very helpful, but getting it to work with the application wasnt so straight forward. I now have it so its writting into the database correctly, and it reads but what was an invalid character is now something else, so a simple correction to the encoding and I should be good. Thank you very much for that link – Fering Jul 02 '20 at 17:44