
I have the following schematic implementation of a JAX-RS service endpoint:

@GET
@Path("...")
@Transactional
public Response download() {
    java.sql.Blob blob = findBlob(...);
    return Response.ok(blob.getBinaryStream()).build();
}

Invoking the JAX-RS endpoint fetches a Blob from the database (through JPA) and streams the result back to the HTTP client. The purpose of using a Blob and a stream instead of e.g. JPA's naive BLOB-to-byte[] mapping is to avoid keeping all of the data in memory, and instead to stream it directly from the database to the HTTP response.

This works as intended, but I actually don't understand why. Isn't the Blob handle I get from the database associated with both the underlying JDBC connection and the transaction? If so, I would have expected the Spring transaction to be committed when I return from the download() method, making it impossible for the JAX-RS implementation to later access data from the Blob to stream it back to the HTTP response.

jarnbjo

3 Answers


Are you sure that the transaction advice is running? By default, Spring uses the "proxy" advice mode. The transaction advice would only run if you registered the Spring-proxied instance of your resource with the JAX-RS Application, or if you were using "aspectj" weaving instead of the default "proxy" advice mode.
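The effect of the "proxy" advice mode can be illustrated with a plain JDK dynamic proxy, used here as a simplified stand-in for Spring's transaction proxying (Resource, ResourceImpl, and the counter are hypothetical, not Spring API). Calling the unproxied instance, as the JAX-RS runtime would if the raw resource object were registered, bypasses the advice entirely:

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

public class ProxyDemo {

    interface Resource {
        String download();
    }

    static class ResourceImpl implements Resource {
        public String download() {
            return "blob-data";
        }
    }

    // counts how often the "transactional" advice actually ran
    static int adviceRuns = 0;

    // wraps the target the way a proxying framework would
    static Resource proxied(Resource target) {
        InvocationHandler handler = (proxy, method, args) -> {
            adviceRuns++; // stand-in for "begin transaction"
            try {
                return method.invoke(target, args);
            } finally {
                // stand-in for "commit transaction"
            }
        };
        return (Resource) Proxy.newProxyInstance(
                Resource.class.getClassLoader(),
                new Class<?>[] { Resource.class },
                handler);
    }

    public static void main(String[] args) {
        Resource raw = new ResourceImpl();
        Resource proxy = proxied(raw);

        raw.download();   // advice is bypassed: the raw instance knows nothing about it
        proxy.download(); // advice runs: the call goes through the proxy

        System.out.println("advice ran " + adviceRuns + " time(s)");
    }
}
```

If the JAX-RS runtime holds the raw instance, every request takes the first path and the @Transactional annotation is silently ignored.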

Assuming that a physical transaction is not being re-used as a result of transaction propagation, using @Transactional on this download() method is incorrect in general.

If the transaction advice is actually running, the transaction ends when returning from the download() method. The Blob Javadoc says: "A Blob object is valid for the duration of the transaction in which it was created." However, §16.3.7 of the JDBC 4.2 spec says: "Blob, Clob and NClob objects remain valid for at least the duration of the transaction in which they are created." Therefore, the InputStream returned by getBinaryStream() is not guaranteed to be valid for serving the response; the validity depends on any guarantees provided by the JDBC driver. For maximum portability, you should rely on the Blob being valid only for the duration of the transaction.

Regardless of whether the transaction advice is running, you potentially have a race condition because the underlying JDBC connection used to retrieve the Blob might be re-used in a way that invalidates the Blob.

EDIT: Testing Jersey 2.17, it appears that the behavior of constructing a Response from an InputStream depends on the specified response MIME type. In some cases, the InputStream is read entirely into memory first before the response is sent. In other cases, the InputStream is streamed back.

Here is my test case:

@Path("test")
public class MyResource {

    @GET
    public Response getIt() {
        return Response.ok(new InputStream() {
            @Override
            public int read() throws IOException {
                return 97; // 'a'
            }
        }).build();
    }
}

If the getIt() method is annotated with @Produces(MediaType.TEXT_PLAIN) or no @Produces annotation, then Jersey attempts to read the entire (infinite) InputStream into memory and the application server eventually crashes from running out of memory. If the getIt() method is annotated with @Produces(MediaType.APPLICATION_OCTET_STREAM), then the response is streamed back.

So, your download() method may be working simply because the blob is not being streamed back. Jersey might be reading the entire blob into memory.

Related: How to stream an endless InputStream with JAX-RS

EDIT2: I have created a demonstration project using Spring Boot and Apache CXF:
https://github.com/dtrebbien/so30356840-cxf

If you run the project and execute on the command line:

curl 'http://localhost:8080/myapp/test/data/1' >/dev/null

Then you will see log output like the following:

2015-06-01 15:58:14.573 DEBUG 9362 --- [nio-8080-exec-1] org.apache.cxf.transport.http.Headers    : Request Headers: {Accept=[*/*], Content-Type=[null], host=[localhost:8080], user-agent=[curl/7.37.1]}

2015-06-01 15:58:14.584 DEBUG 9362 --- [nio-8080-exec-1] org.apache.cxf.jaxrs.utils.JAXRSUtils    : Trying to select a resource class, request path : /test/data/1
2015-06-01 15:58:14.585 DEBUG 9362 --- [nio-8080-exec-1] org.apache.cxf.jaxrs.utils.JAXRSUtils    : Trying to select a resource operation on the resource class com.sample.resource.MyResource
2015-06-01 15:58:14.585 DEBUG 9362 --- [nio-8080-exec-1] org.apache.cxf.jaxrs.utils.JAXRSUtils    : Resource operation getIt may get selected
2015-06-01 15:58:14.585 DEBUG 9362 --- [nio-8080-exec-1] org.apache.cxf.jaxrs.utils.JAXRSUtils    : Resource operation getIt on the resource class com.sample.resource.MyResource has been selected
2015-06-01 15:58:14.585 DEBUG 9362 --- [nio-8080-exec-1] o.a.c.j.interceptor.JAXRSInInterceptor   : Request path is: /test/data/1
2015-06-01 15:58:14.585 DEBUG 9362 --- [nio-8080-exec-1] o.a.c.j.interceptor.JAXRSInInterceptor   : Request HTTP method is: GET
2015-06-01 15:58:14.585 DEBUG 9362 --- [nio-8080-exec-1] o.a.c.j.interceptor.JAXRSInInterceptor   : Request contentType is: */*
2015-06-01 15:58:14.585 DEBUG 9362 --- [nio-8080-exec-1] o.a.c.j.interceptor.JAXRSInInterceptor   : Accept contentType is: */*
2015-06-01 15:58:14.585 DEBUG 9362 --- [nio-8080-exec-1] o.a.c.j.interceptor.JAXRSInInterceptor   : Found operation: getIt

2015-06-01 15:58:14.595 DEBUG 9362 --- [nio-8080-exec-1] o.s.j.d.DataSourceTransactionManager     : Creating new transaction with name [com.sample.resource.MyResource.getIt]: PROPAGATION_REQUIRED,ISOLATION_DEFAULT; ''
2015-06-01 15:58:14.595 DEBUG 9362 --- [nio-8080-exec-1] o.s.j.d.DataSourceTransactionManager     : Acquired Connection [ProxyConnection[PooledConnection[org.hsqldb.jdbc.JDBCConnection@7b191894]]] for JDBC transaction
2015-06-01 15:58:14.596 DEBUG 9362 --- [nio-8080-exec-1] o.s.j.d.DataSourceTransactionManager     : Switching JDBC Connection [ProxyConnection[PooledConnection[org.hsqldb.jdbc.JDBCConnection@7b191894]]] to manual commit
2015-06-01 15:58:14.602 DEBUG 9362 --- [nio-8080-exec-1] o.s.jdbc.core.JdbcTemplate               : Executing prepared SQL query
2015-06-01 15:58:14.603 DEBUG 9362 --- [nio-8080-exec-1] o.s.jdbc.core.JdbcTemplate               : Executing prepared SQL statement [SELECT data FROM images WHERE id = ?]
2015-06-01 15:58:14.620 DEBUG 9362 --- [nio-8080-exec-1] o.s.j.d.DataSourceTransactionManager     : Initiating transaction commit
2015-06-01 15:58:14.620 DEBUG 9362 --- [nio-8080-exec-1] o.s.j.d.DataSourceTransactionManager     : Committing JDBC transaction on Connection [ProxyConnection[PooledConnection[org.hsqldb.jdbc.JDBCConnection@7b191894]]]
2015-06-01 15:58:14.621 DEBUG 9362 --- [nio-8080-exec-1] o.s.j.d.DataSourceTransactionManager     : Releasing JDBC Connection [ProxyConnection[PooledConnection[org.hsqldb.jdbc.JDBCConnection@7b191894]]] after transaction
2015-06-01 15:58:14.621 DEBUG 9362 --- [nio-8080-exec-1] o.s.jdbc.datasource.DataSourceUtils      : Returning JDBC Connection to DataSource
2015-06-01 15:58:14.621 DEBUG 9362 --- [nio-8080-exec-1] o.a.cxf.phase.PhaseInterceptorChain      : Invoking handleMessage on interceptor org.apache.cxf.interceptor.OutgoingChainInterceptor@7eaf4562

2015-06-01 15:58:14.622 DEBUG 9362 --- [nio-8080-exec-1] o.a.cxf.phase.PhaseInterceptorChain      : Adding interceptor org.apache.cxf.interceptor.MessageSenderInterceptor@20ffeb47 to phase prepare-send
2015-06-01 15:58:14.622 DEBUG 9362 --- [nio-8080-exec-1] o.a.cxf.phase.PhaseInterceptorChain      : Adding interceptor org.apache.cxf.jaxrs.interceptor.JAXRSOutInterceptor@5714d386 to phase marshal
2015-06-01 15:58:14.622 DEBUG 9362 --- [nio-8080-exec-1] o.a.cxf.phase.PhaseInterceptorChain      : Chain org.apache.cxf.phase.PhaseInterceptorChain@11ca802c was created. Current flow:
  prepare-send [MessageSenderInterceptor]
  marshal [JAXRSOutInterceptor]

2015-06-01 15:58:14.623 DEBUG 9362 --- [nio-8080-exec-1] o.a.cxf.phase.PhaseInterceptorChain      : Invoking handleMessage on interceptor org.apache.cxf.interceptor.MessageSenderInterceptor@20ffeb47
2015-06-01 15:58:14.623 DEBUG 9362 --- [nio-8080-exec-1] o.a.cxf.phase.PhaseInterceptorChain      : Adding interceptor org.apache.cxf.interceptor.MessageSenderInterceptor$MessageSenderEndingInterceptor@6129236d to phase prepare-send-ending
2015-06-01 15:58:14.623 DEBUG 9362 --- [nio-8080-exec-1] o.a.cxf.phase.PhaseInterceptorChain      : Chain org.apache.cxf.phase.PhaseInterceptorChain@11ca802c was modified. Current flow:
  prepare-send [MessageSenderInterceptor]
  marshal [JAXRSOutInterceptor]
  prepare-send-ending [MessageSenderEndingInterceptor]

2015-06-01 15:58:14.623 DEBUG 9362 --- [nio-8080-exec-1] o.a.cxf.phase.PhaseInterceptorChain      : Invoking handleMessage on interceptor org.apache.cxf.jaxrs.interceptor.JAXRSOutInterceptor@5714d386
2015-06-01 15:58:14.627 DEBUG 9362 --- [nio-8080-exec-1] o.a.c.j.interceptor.JAXRSOutInterceptor  : Response content type is: application/octet-stream
2015-06-01 15:58:14.631 DEBUG 9362 --- [nio-8080-exec-1] o.apache.cxf.ws.addressing.ContextUtils  : retrieving MAPs from context property javax.xml.ws.addressing.context.inbound
2015-06-01 15:58:14.631 DEBUG 9362 --- [nio-8080-exec-1] o.apache.cxf.ws.addressing.ContextUtils  : WS-Addressing - failed to retrieve Message Addressing Properties from context
2015-06-01 15:58:14.636 DEBUG 9362 --- [nio-8080-exec-1] o.a.cxf.phase.PhaseInterceptorChain      : Invoking handleMessage on interceptor org.apache.cxf.interceptor.MessageSenderInterceptor$MessageSenderEndingInterceptor@6129236d
2015-06-01 15:58:14.639 DEBUG 9362 --- [nio-8080-exec-1] o.a.c.t.http.AbstractHTTPDestination     : Finished servicing http request on thread: Thread[http-nio-8080-exec-1,5,main]
2015-06-01 15:58:14.639 DEBUG 9362 --- [nio-8080-exec-1] o.a.c.t.servlet.ServletController        : Finished servicing http request on thread: Thread[http-nio-8080-exec-1,5,main]

I have trimmed the log output for readability. The important thing to note is that the transaction is committed and the JDBC connection is returned before the response is sent. Therefore, the InputStream returned by blob.getBinaryStream() is not necessarily valid and the getIt() resource method may be invoking undefined behavior.

EDIT3: A recommended practice for using Spring's @Transactional annotation is to annotate the service method (see Spring @Transactional Annotation Best Practice). You could have a service method that finds the blob and transfers the blob data to the response OutputStream. The service method could be annotated with @Transactional so that the transaction in which the Blob is created would remain open for the duration of the transfer. However, it seems to me that this approach could introduce a denial of service vulnerability by way of a "slow read" attack. Because the transaction should be kept open for the duration of the transfer for maximum portability, numerous slow readers could lock up your database table(s) by holding open transactions.

One possible approach is to save the blob to a temporary file and stream back the file. See How do I use Java to read from a file that is actively being written? for some ideas on reading a file while it's being simultaneously written, though this case is more straightforward because the length of the blob can be determined by calling the Blob#length() method.
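The temporary-file approach can be sketched with nothing but the JDK (spoolToTempFile is a hypothetical helper name; the ByteArrayInputStream stands in for blob.getBinaryStream()): the blob stream is copied to a file while the transaction is still open, and the file remains readable after the transaction has been committed.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class BlobToTempFile {

    /**
     * Copies the blob stream to a temporary file while the transaction
     * (and therefore the stream) is still valid, and returns the file
     * so it can be streamed to the client after the transaction ends.
     */
    static Path spoolToTempFile(InputStream blobStream) throws IOException {
        Path tmp = Files.createTempFile("blob-", ".tmp");
        tmp.toFile().deleteOnExit(); // fallback cleanup when the JVM exits
        Files.copy(blobStream, tmp, StandardCopyOption.REPLACE_EXISTING);
        return tmp;
    }

    public static void main(String[] args) throws IOException {
        // stand-in for blob.getBinaryStream(), obtained inside the transaction
        InputStream in = new ByteArrayInputStream("blob-data".getBytes());
        Path tmp = spoolToTempFile(in);

        // after "commit", the file is still readable and can be streamed back
        System.out.println(new String(Files.readAllBytes(tmp))); // prints "blob-data"
        Files.deleteIfExists(tmp);
    }
}
```

The transaction is then held open only for the duration of the database-to-disk copy, not for the duration of the (possibly slow) client download.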

Daniel Trebbien
  • You are making a few valid points, but I don't think it explains my behaviour. The implementing class is a Spring-managed bean and the JAX-RS server is configured with Spring (using jaxrs:server in the XML context definition). I am also 100% sure that the HTTP response is streamed from the database: we have a very slow network connection between the DB and the HTTP server and fast connections between the HTTP server and clients, so it is easy to see from the way the client receives the data that it is not entirely buffered by the HTTP server before being sent to the client. – jarnbjo Jun 01 '15 at 16:17
  • @jarnbjo: I assume that you are using Apache CXF for the JAX-RS implementation? I just added a complete example to my answer. – Daniel Trebbien Jun 01 '15 at 20:27
  • I've now had time to debug the code to find out what actually happens. Even if the current implementation might not be the most clever approach and there might be other 'recommended' practices, it does actually work. As I pointed out in my own answer, all assumptions regarding the Spring transaction management and the Apache CXF handling of the response stream were correct. I had incorrectly assumed that reading from the BLOB stream would fail after committing the transaction. – jarnbjo Jun 03 '15 at 13:36

I've spent some time now debugging the code, and all my assumptions in the question are more or less correct: the @Transactional annotation works as expected, the transaction (both the Spring and the DB transaction) is committed immediately after returning from the download() method, the physical DB connection is returned to the connection pool, and the content of the BLOB is evidently read later and streamed to the HTTP response.

The reason why this still works is that the Oracle JDBC driver implements functionality beyond what's required by the JDBC specification. As Daniel pointed out, the JDBC API documentation states that "A Blob object is valid for the duration of the transaction in which it was created." The documentation only states that the Blob is valid during the transaction; it does not state (as claimed by Daniel and initially assumed by me) that the Blob becomes invalid after the transaction ends.

Using plain JDBC, retrieving the InputStream from two Blobs in two different transactions on the same physical connection and not reading the Blob data until after the transactions are committed demonstrates this behaviour:

Connection conn = DriverManager.getConnection(...);
conn.setAutoCommit(false);

ResultSet rs = conn.createStatement().executeQuery("select data from ...");
rs.next();
InputStream is1 = rs.getBlob(1).getBinaryStream();
rs.close();
conn.commit();

rs = conn.createStatement().executeQuery("select data from ...");
rs.next();
InputStream is2 = rs.getBlob(1).getBinaryStream();
rs.close();
conn.commit();

int b1 = 0, b2 = 0;
while(is1.read()>=0) b1++;
while(is2.read()>=0) b2++;

System.out.println("Read " + b1 + " bytes from 1st blob");
System.out.println("Read " + b2 + " bytes from 2nd blob");

Even if both Blobs have been selected from the same physical connection and from within two different transactions, they can both be read completely.

Closing the JDBC connection (conn.close()) does however finally invalidate the Blob streams.

jarnbjo
  • §16.3.7 of the JDBC 4.2 spec confirms your interpretation that a `Blob` can be valid outside the transaction (I have updated my answer accordingly). However, reading Oracle's [JDBC Developer's Guide](http://docs.oracle.com/database/121/JJDBC/toc.htm), I am not seeing additional guarantees regarding the validity of a `Blob` outside of the transaction in which it was created/generated. I would have many questions such as what happens when the connection is re-used and the LOB data is modified? What happens when the LOB is deleted? Does this work only within the LOB prefetch size? Etc. – Daniel Trebbien Jun 03 '15 at 15:52

I had a similar problem and I can confirm that, at least in my situation, PostgreSQL throws an Invalid large object descriptor : 0 exception with autocommit when using the StreamingOutput approach. The reason for this is that the transaction is committed when the Response is returned from JAX-RS, while the streaming method executes later. In the meantime the large object descriptor is no longer valid.

I have created a helper method so that the streaming part opens a new transaction and can stream the Blob. com.foobar.model.Blob is just a return class encapsulating the blob, so that the complete entity does not have to be fetched. findByID is a method using a projection on the blob column, fetching only that column.

So StreamingOutput of JAX-RS and Blob under JPA and Spring transactions do work together, but the combination must be tweaked. The same presumably applies to JPA with EJB transactions.

// NOTE: has to run inside a transaction to be able to stream from the DB
@Transactional
public void streamBlobToOutputStream(OutputStream outputStream, Class entityClass, String id, SingularAttribute attribute) {
    BufferedOutputStream bufferedOutputStream = new BufferedOutputStream(outputStream);
    try {
        com.foobar.model.Blob blob = fooDao.findByID(id, entityClass, com.foobar.model.Blob.class, attribute);
        if (blob.getBlob() == null) {
            return;
        }
        InputStream inputStream;
        try {
            inputStream = blob.getBlob().getBinaryStream();
        } catch (SQLException e) {
            throw new RuntimeException("Could not read binary data.", e);
        }
        IOUtils.copy(inputStream, bufferedOutputStream);
        // NOTE: the buffer must be flushed, otherwise data may go missing
        bufferedOutputStream.flush();
    } catch (Exception e) {
        throw new RuntimeException("Could not send data.", e);
    }
}

/**
 * Builds streaming response for data which can be streamed from a Blob.
 *
 * @param contentType        The content type. If <code>null</code> application/octet-stream is used.
 * @param contentDisposition The content disposition. E.g. naming of the file download. Optional.
 * @param entityClass        The entity class to search in.
 * @param id                 The Id of the entity with the blob field to stream.
 * @param attribute          The Blob attribute in the entity.
 * @return the response builder.
 */
protected Response.ResponseBuilder buildStreamingResponseBuilder(String contentType, String contentDisposition,
                                                                 Class entityClass, String id, SingularAttribute attribute) {
    StreamingOutput streamingOutput = new StreamingOutput() {

        @Override
        public void write(OutputStream output) throws IOException, WebApplicationException {
            streamBlobToOutputStream(output, entityClass, id, attribute);
        }
    };
    MediaType mediaType = MediaType.APPLICATION_OCTET_STREAM_TYPE;
    if (contentType != null) {
        mediaType = MediaType.valueOf(contentType);
    }
    Response.ResponseBuilder response = Response.ok(streamingOutput, mediaType);
    if (contentDisposition != null) {
        response.header("Content-Disposition", contentDisposition);
    }
    return response;
}

/**
 * Stream a blob from the database.
 * @param contentType        The content type. If <code>null</code> application/octet-stream is used.
 * @param contentDisposition The content disposition. E.g. naming of the file download. Optional.
 * @param currentBlob The current blob value of the entity.
 * @param entityClass The entity class to search in.
 * @param id          The Id of the entity with the blob field to stream.
 * @param attribute   The Blob attribute in the entity.
 * @return the response.
 */
@Transactional
public Response streamBlob(String contentType, String contentDisposition,
                           Blob currentBlob, Class entityClass, String id, SingularAttribute attribute) {
    if (currentBlob == null) {
        return Response.noContent().build();
    }
    return buildStreamingResponseBuilder(contentType, contentDisposition, entityClass, id, attribute).build();
}

I also have to add to my answer that there can be an issue with Blob behavior under Hibernate. By default, Hibernate merges the complete entity with the DB, even if only one field was changed; i.e. if you update a name field and leave a large Blob image untouched, the image is updated anyway. Even worse, before merging a detached entity, Hibernate has to fetch the Blob from the DB to determine the dirty status. Because blobs cannot be compared byte-wise (they are too large), they are treated as immutable and the equality comparison is based only on the object reference of the blob. The reference fetched from the DB will be a different object, so although nothing was changed, the blob is updated again. At least this was the situation for me. I used the @DynamicUpdate annotation on the entity and wrote a user type that handles the blob differently and checks whether it must be updated.
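The reference-equality dirty check described above can be illustrated with a small, Hibernate-free sketch (EntityState and isDirty are hypothetical stand-ins, not Hibernate API; the plain Object fields stand in for java.sql.Blob handles):

```java
import java.util.Objects;

public class DirtyCheckDemo {

    // hypothetical snapshot of an entity's state, as a dirty checker might hold it
    static class EntityState {
        String name;
        Object blob; // stands in for java.sql.Blob, compared by reference only

        EntityState(String name, Object blob) {
            this.name = name;
            this.blob = blob;
        }
    }

    // blobs are too large to compare byte-wise, so the checker can only
    // compare them by object identity
    static boolean isDirty(EntityState loaded, EntityState current) {
        return !Objects.equals(loaded.name, current.name)
                || loaded.blob != current.blob;
    }

    public static void main(String[] args) {
        Object sameHandle = new Object();

        // same blob handle on both sides: the entity is clean
        System.out.println(isDirty(
                new EntityState("a", sameHandle),
                new EntityState("a", sameHandle))); // prints "false"

        // a re-fetched handle differs by reference, so the entity looks dirty
        // even though the LOB content is unchanged
        System.out.println(isDirty(
                new EntityState("a", sameHandle),
                new EntityState("a", new Object()))); // prints "true"
    }
}
```

This is why merging a detached entity can rewrite an unchanged LOB column: the identity comparison cannot tell "same content, new handle" from a real modification.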

k_o_