
I have a Java application which connects to an Oracle database. One of the tables contains a blob column, which we populate using utl_raw.cast_to_raw, like this:

public class MyDAO extends JdbcDaoSupport {
    public void persist(String name, String data) {
        String sql = "insert into table (id, name, data) values " +
                     "(sequence.nextval, ?, utl_raw.cast_to_raw(?))";
        getJdbcTemplate().update(sql, name, data);
    }
}

For some reason beyond my knowledge, we don't use Hibernate, JPA or anything like that in this application. Just plain SQL statements and Spring JDBC to execute them. (...) Anyway, the above method works fine. At least with Oracle.

I've set up an in-memory H2 database with minimal data, to use in our JUnit tests. But there is no utl_raw package in H2. Apparently, the way to go is to use the rawtohex function instead. It kind of works... but it doubles the data length. For example, if I persist the string "TEST" in the blob column and later I retrieve its value, it becomes " T E S T". As a byte array, it would be [0, 84, 0, 69, 0, 83, 0, 84]. The H2 documentation states that this is supposed to happen:

Converts a string to the hex representation. 4 hex characters per string character are used.

Weird. I guess it's that way to support UTF-16 strings (?), but I'm using UTF-8. It actually corrupts my data.
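The doubling matches what Java itself produces when a string is encoded as UTF-16 (Java strings are UTF-16 internally, which is presumably why H2's RAWTOHEX behaves this way). A quick way to see the two byte layouts side by side, with plain JDK code and no H2 involved:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class HexWidthDemo {
    public static void main(String[] args) {
        // UTF-16BE uses two bytes per ASCII character, so every
        // character picks up a leading 0 byte -- the " T E S T" effect.
        byte[] utf16 = "TEST".getBytes(StandardCharsets.UTF_16BE);
        // UTF-8 uses one byte per ASCII character.
        byte[] utf8 = "TEST".getBytes(StandardCharsets.UTF_8);

        System.out.println(Arrays.toString(utf16)); // [0, 84, 0, 69, 0, 83, 0, 84]
        System.out.println(Arrays.toString(utf8));  // [84, 69, 83, 84]
    }
}
```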

Is there any other way to store the blob in H2? I've seen this answer where they suggest using a PreparedStatement, but then I'd have to handle the connection myself:

super.getConnection().prepareStatement(/* ... */).executeUpdate();
// release the statement and the connection, handle exceptions

Sure, it's not much uglier than what we have right now, but still, I'm very reluctant to change working production code only to make it compatible with our JUnit test setup.

Another option would be to retrieve the blob as if it were a UTF-16 string, so the 4 hex characters per character would decode correctly. But again, I would be fixing the problem for H2 and breaking it for Oracle instead.

Do I have any other choice? Is there any way I can tell the H2 database to use just 1 byte per character in rawtohex? Or perhaps another function I can use instead of rawtohex?

AJPerez

2 Answers


I would not do the conversion in the database (either Oracle or H2).

H2's documentation recommends using PreparedStatement.setBinaryStream, and that should work on Oracle as well. Depending on the size of the strings you're converting, other approaches might perform better. By the way, is there any particular reason for converting strings to binary objects in the first place?

If you know the strings are always relatively small (e.g. < 1 MB), something like this should be safe:

String sql = "insert into table (id, name, data) values " +
             "(sequence.nextval, ?, ?)";
// Try-with-resources closes the statement even if the update fails
try (PreparedStatement ps = super.getConnection().prepareStatement(sql)) {
    InputStream stream = new ByteArrayInputStream(data.getBytes(StandardCharsets.UTF_8));
    ps.setString(1, name);
    ps.setBinaryStream(2, stream);
    ps.executeUpdate();
}
Kamal
  • Thanks, I'll try it with both databases, although I'm still not convinced to change working code just for testing purposes. – AJPerez Jun 13 '18 at 07:16
  • As for the reason to convert them to binary objects... well, we needed a column to store "large strings", and somebody decided that blobs were the way to do it. I guess a clob would have been a better choice, but the application is already on production, so I'd need a *very* compelling reason to get the change approved. – AJPerez Jun 13 '18 at 07:26
  • I agree about not changing code just for testing purposes. If you're only ever going to use Oracle, then why not use it also for testing? On the other hand, if you might use this code on some other db, then a change such as this will be required. – Kamal Jun 13 '18 at 18:29
  • I can think about a couple compelling reasons for moving to CLOB, depending on the situation. 1. Security: depending on where the data is coming from (and if you sanitise or not), a crafty user could send you an executable file to be stored there and retrieved later. 2. BLOB requires "filtering" if you'd like to use Oracle Text for searching, which is likely less efficient: https://docs.oracle.com/en/database/oracle/oracle-database/12.2/ccapp/indexing-with-oracle-text.html#GUID-C92BF4DC-5911-4507-91EB-946E8540BCB1 – Kamal Jun 13 '18 at 19:00

I have been doing the same thing, using Liquibase at build time to populate an in-memory H2 database.

I encountered the same zero-padding issue, but resolved it by switching from RAWTOHEX to STRINGTOUTF8.
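For reference, a minimal sketch of what that looks like in an H2 insert (the table and column names here are made up for illustration). H2 also provides UTF8TOSTRING for decoding the bytes back:

```sql
-- STRINGTOUTF8 encodes the string as UTF-8 bytes: one byte per
-- ASCII character, so no leading zero bytes in the blob.
insert into my_table (id, name, data)
values (1, 'example', STRINGTOUTF8('TEST'));

-- UTF8TOSTRING decodes the stored bytes back to a string.
select UTF8TOSTRING(data) from my_table where id = 1;
```

Note that these are H2 built-ins, not Oracle functions, so they only help on the H2 side of the test setup.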

Jim