31

I'm working on a Java web application in which files will be stored in a database. Originally, we retrieved files already in the DB by simply calling getBytes on our result set:

byte[] bytes = resultSet.getBytes(1);
...

This byte array was then converted into a DataHandler using the obvious constructor:

dataHandler = new DataHandler(bytes, "application/octet-stream");

This worked great until we started trying to store and retrieve larger files. Dumping the entire file contents into a byte array and then building a DataHandler out of that simply requires too much memory.

My immediate idea is to retrieve a stream of the data in the database with getBinaryStream and somehow convert that InputStream into a DataHandler in a memory-efficient way. Unfortunately it doesn't seem like there's a direct way to convert an InputStream into a DataHandler. Another idea I've been playing with is reading chunks of data from the InputStream and writing them to the OutputStream of the DataHandler. But... I can't find a way to create an "empty" DataHandler that returns a non-null OutputStream when I call getOutputStream...

Has anyone done this? I'd appreciate any help you can give me or leads in the right direction.

Peter Mortensen
pcorey

8 Answers

24

An implementation of the answer from Kathy Van Stone:

First, create a helper class that wraps an InputStream in a DataSource:

public class InputStreamDataSource implements DataSource {
    private InputStream inputStream;

    public InputStreamDataSource(InputStream inputStream) {
        this.inputStream = inputStream;
    }

    @Override
    public InputStream getInputStream() throws IOException {
        return inputStream;
    }

    @Override
    public OutputStream getOutputStream() throws IOException {
        throw new UnsupportedOperationException("Not implemented");
    }

    @Override
    public String getContentType() {
        return "*/*";
    }

    @Override
    public String getName() {
        return "InputStreamDataSource";
    }
}

And then you can create a DataHandler from an InputStream:

DataHandler dataHandler = new DataHandler(new InputStreamDataSource(inputStream));

Imports:

import javax.activation.DataHandler;
import javax.activation.DataSource;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
bugs_
19

I also ran into this issue. If your source data is a byte[], Axis (via Axiom) already has a class that wraps the byte array in a DataSource, from which you can create a DataHandler object. Here is the code:

// This constructor takes byte[] as input
ByteArrayDataSource rawData = new ByteArrayDataSource(resultSet.getBytes(1));
DataHandler data = new DataHandler(rawData);
yourObject.setData(data);

Related imports

import javax.activation.DataHandler;
import org.apache.axiom.attachments.ByteArrayDataSource;
Jorge Pombar
  • Since it loads all the data into memory, it would cause problems when managing large data. – Jordan Silva Dec 02 '13 at 14:41
  • There are other implementations of the DataSource interface: I used import javax.mail.util.ByteArrayDataSource; – Croo Mar 23 '21 at 17:23
18

My approach would be to write a custom class implementing DataSource that wraps your InputStream. Then create the DataHandler giving it the created DataSource.

Kathy Van Stone
  • Ah, that's a great idea. I'll try that when I get a chance. – pcorey May 13 '10 at 21:58
  • I thought the same. But beware that the DataHandler must then be used (its input consumed) "inside your loop", while the ResultSet is open. For example, you probably can't pass the DataHandler object to an upper layer. – leonbloy May 13 '10 at 22:02
  • @leonbloy The stated goal was to process the data without copying it from result set. This implies that the result set must be open the entire time regardless of how you do it. – Kathy Van Stone May 13 '10 at 23:04
4

Note that the getInputStream method of a DataSource must return a new InputStream every time it is called. This means you need to copy the data somewhere first, so that each call can open a fresh stream over the copy.

For more information, see http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4267294
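
One way to "copy it somewhere" without holding the whole file in memory is to spool the stream to a temporary file and open a fresh stream from that file on every call. The following is only a minimal sketch using the JDK alone; the class name SpooledContent and its methods are made up for illustration and are not part of JAF or any other library.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper: copies a one-shot stream to a temp file so it can be re-read. */
final class SpooledContent implements AutoCloseable {
    private final Path tempFile;

    private SpooledContent(Path tempFile) {
        this.tempFile = tempFile;
    }

    /** Drains and closes the source; Files.copy streams in chunks, so the payload is not held in memory. */
    static SpooledContent spool(InputStream source) {
        try (InputStream in = source) {
            Path file = Files.createTempFile("spooled-", ".bin");
            // createTempFile already created the file, so it has to be replaced
            Files.copy(in, file, StandardCopyOption.REPLACE_EXISTING);
            return new SpooledContent(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns a new, independent stream on every call. */
    InputStream openStream() throws IOException {
        return Files.newInputStream(tempFile);
    }

    Path file() {
        return tempFile;
    }

    /** Deletes the temporary file; call this once the data has been consumed. */
    @Override
    public void close() {
        try {
            Files.deleteIfExists(tempFile);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A DataSource's getInputStream() could then simply return openStream(), or the spooled file could be handed to javax.activation.FileDataSource via file().toFile(). The trade-off is one extra pass over the data on disk in exchange for bounded memory use.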

Stefan
2

bugs_'s code doesn't work for me. I use a DataSource to create email attachments (from objects that have an InputStream and a name), and the content of the attachments was lost.

It looks like Stefan is right and a new InputStream must be returned every time, at least in my specific case. The following implementation deals with the problem:

public class InputStreamDataSource implements DataSource {

    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private final String name;

    public InputStreamDataSource(InputStream inputStream, String name) {
        this.name = name;
        try {
            int nRead;
            byte[] data = new byte[16384];
            while ((nRead = inputStream.read(data, 0, data.length)) != -1) {
                buffer.write(data, 0, nRead);
            }

            buffer.flush();
            inputStream.close();
        } catch (IOException e) {
            e.printStackTrace();
        }

    }

    @Override
    public String getContentType() {
        return new MimetypesFileTypeMap().getContentType(name);
    }

    @Override
    public InputStream getInputStream() throws IOException {
        return new ByteArrayInputStream(buffer.toByteArray());
    }

    @Override
    public String getName() {
        return name;
    }

    @Override
    public OutputStream getOutputStream() throws IOException {
        throw new IOException("Read-only data");
    }
}
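
On Java 9 or later, the manual copy loop in the constructor above could be replaced by InputStream.transferTo. The snippet below is only a sketch of that idea; the class and method names (StreamBuffering, drain) are hypothetical. Unlike the constructor above, it rethrows a failed read instead of printing the stack trace, so a truncated attachment does not go unnoticed. It still buffers everything in memory, so it suits small payloads only.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Hypothetical helper that drains a stream into a byte array. */
final class StreamBuffering {
    private StreamBuffering() {
    }

    /** Reads the source to the end, closes it, and returns its bytes. */
    static byte[] drain(InputStream source) {
        try (InputStream in = source) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            in.transferTo(buffer); // available since Java 9
            return buffer.toByteArray();
        } catch (IOException e) {
            // Surface the failure rather than continuing with partial data
            throw new UncheckedIOException(e);
        }
    }
}
```

The resulting array can be stored in a field, and getInputStream() can return a new ByteArrayInputStream over it on each call.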
Yuriy N.
0

I ran into a situation where the InputStream was requested from the DataSource twice: when using a logging handler together with the MTOM feature.

With this proxy stream solution, my implementation works fine:

import org.apache.commons.io.input.CloseShieldInputStream;
import javax.activation.DataHandler;
import javax.activation.DataSource;
...

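private static class InputStreamDataSource implements DataSource {
    private final InputStream inputStream;

    // Wrap the stream that this DataSource will expose
    InputStreamDataSource(InputStream inputStream) {
        this.inputStream = inputStream;
    }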

    @Override
    public InputStream getInputStream() throws IOException {
        return new CloseShieldInputStream(inputStream);
    }

    @Override
    public OutputStream getOutputStream() throws IOException {
        throw new UnsupportedOperationException("Not implemented");
    }

    @Override
    public String getContentType() {
        return "application/octet-stream";
    }

    @Override
    public String getName() {
        return "";
    }
}
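
One caveat worth keeping in mind: shielding the shared stream against close() keeps it open, but it does not rewind it. If the first caller reads to the end, a later caller sees end-of-stream immediately. The JDK-only sketch below illustrates this; the class name OneShotDemo is hypothetical, and a plain FilterInputStream with a no-op close() stands in for CloseShieldInputStream.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Hypothetical demo: two consumers sharing one close-shielded stream. */
final class OneShotDemo {
    private OneShotDemo() {
    }

    /** Returns how many bytes the first and the second consumer were able to read. */
    static int[] readTwice(byte[] payload) {
        InputStream shared = new ByteArrayInputStream(payload);
        int first = drain(noClose(shared));
        int second = drain(noClose(shared));
        return new int[] {first, second};
    }

    /** Stand-in for a close shield: delegates reads, ignores close(). */
    private static InputStream noClose(InputStream in) {
        return new FilterInputStream(in) {
            @Override
            public void close() {
                // intentionally left empty: the underlying stream stays open
            }
        };
    }

    /** Reads the stream to the end and counts the bytes. */
    private static int drain(InputStream in) {
        try (InputStream s = in) {
            int count = 0;
            while (s.read() != -1) {
                count++;
            }
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

So this approach fits cases where only one of the callers actually consumes the data; if both need the content, it has to be copied first.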
Grigory Kislin
0

Here is an answer specifically for working with Spring's org.springframework.core.io.Resource object, which is, I think, how a lot of us are getting here. Note that you might need to change the content type in the code below; I'm inserting a PNG file into an HTML-formatted email.

Note: As others have mentioned, wrapping a single InputStream isn't enough, because getInputStream() may be called more than once. Delegating to Resource.getInputStream() does the trick, since a Resource is expected to return a fresh stream on each call.

public class SpringResourceDataSource implements DataSource {
    private Resource resource;

    public SpringResourceDataSource(Resource resource) {
        this.resource = resource;
    }

    @Override
    public InputStream getInputStream() throws IOException {
        return resource.getInputStream();
    }

    @Override
    public OutputStream getOutputStream() throws IOException {
        throw new UnsupportedOperationException("Not implemented");
    }

    @Override
    public String getContentType() {
        return "image/png";
    }

    @Override
    public String getName() {
        return "SpringResourceDataSource";
    }
}

Usage of the class looks like this:

PathMatchingResourcePatternResolver pathMatchingResourcePatternResolver = new PathMatchingResourcePatternResolver();
Resource logoImage = pathMatchingResourcePatternResolver.getResource("/static/images/logo.png");
MimeBodyPart logoBodyPart = new MimeBodyPart();
DataSource logoFileDataSource = new SpringResourceDataSource(logoImage);

logoBodyPart.setDataHandler(new DataHandler(logoFileDataSource));
Gandalf
0
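
import com.itextpdf.text.Document;
import com.itextpdf.text.DocumentException;
import com.itextpdf.text.pdf.PdfWriter;
import com.itextpdf.tool.xml.XMLWorkerHelper;
import org.apache.commons.io.IOUtils;

import javax.activation.DataHandler;
import javax.activation.DataSource;
import javax.mail.util.ByteArrayDataSource; // assumed; the answer does not say which ByteArrayDataSource it uses
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;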

.
.
.

 DataSource ds = new ByteArrayDataSource(convertHtmlToPdf("<span>html here</span>"), "application/pdf");

 DataHandler dataHandler = new DataHandler(ds);

.
.
.

public static byte[] convertHtmlToPdf(String htmlString) throws IOException, DocumentException {
    Document document = new Document();

    ByteArrayOutputStream out = new ByteArrayOutputStream();

    PdfWriter writer = PdfWriter.getInstance(document, out);
    document.open();

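    // Pass an explicit charset; the single-argument overload relies on the platform default
    InputStream in = IOUtils.toInputStream(htmlString, java.nio.charset.StandardCharsets.UTF_8);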
    XMLWorkerHelper.getInstance().parseXHtml(writer, document, in);
    document.close();

    return out.toByteArray();
}

Possible error: the input is parsed as XHTML, so the meta tag must be closed: <meta></meta>
