40

I am using IBM WebSphere Application Server v6 and Java 1.4 and am trying to write large CSV files to the ServletOutputStream for a user to download. Files range from 50-750MB at the moment.

The smaller files aren't causing too much of a problem, but with the larger files it appears that the data is being written into the heap, which then causes an OutOfMemory error and brings down the entire server.

These files can only be served out to authenticated users over HTTPS, which is why I am serving them through a Servlet instead of just sticking them in Apache.

The code I am using is (some fluff removed around this):

    resp.setHeader("Content-length", "" + fileLength);
    resp.setContentType("application/vnd.ms-excel");
    resp.setHeader("Content-Disposition","attachment; filename=\"export.csv\"");

    FileInputStream inputStream = null;

    try
    {
        inputStream = new FileInputStream(path);
        byte[] buffer = new byte[1024];
        int bytesRead = 0;

        do
        {
            bytesRead = inputStream.read(buffer, offset, buffer.length);
            resp.getOutputStream().write(buffer, 0, bytesRead);
        }
        while (bytesRead == buffer.length);

        resp.getOutputStream().flush();
    }
    finally
    {
        if(inputStream != null)
            inputStream.close();
    }

The FileInputStream doesn't seem to be causing a problem: if I write to another file or just remove the write completely, the memory usage doesn't appear to be an issue.

What I am thinking is that everything written to resp.getOutputStream() is being stored in memory until the data can be sent through to the client. So the entire file might be read and buffered in the response, causing my memory issues and the crash!

I have tried buffering these streams and also tried using Channels from java.nio, none of which made any difference to my memory issues. I have also flushed the OutputStream once per iteration of the loop and after the loop, which didn't help.
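
For reference, the Channels variant was roughly along these lines (a reconstructed sketch rather than the exact code; the 8KB buffer size is arbitrary):

    FileChannel in = new FileInputStream(path).getChannel();
    WritableByteChannel out = Channels.newChannel(resp.getOutputStream());
    ByteBuffer buf = ByteBuffer.allocate(8192);

    // read a chunk from the file, then drain the buffer into the response channel
    while (in.read(buf) != -1)
    {
        buf.flip();
        while (buf.hasRemaining())
            out.write(buf);
        buf.clear();
    }
    in.close();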

Robert
  • 1,286
  • 1
  • 17
  • 37
Martin
  • 1,057
  • 1
  • 9
  • 16
  • 2
    Try setting this Websphere Web container custom property - com.ibm.ws.webcontainer.channelwritetype=sync details are here - http://publib.boulder.ibm.com/infocenter/wasinfo/v6r0/index.jsp?topic=/com.ibm.websphere.express.doc/info/exp/ae/rweb_custom_props.html – Davanum Srinivas - dims Apr 01 '10 at 18:17

10 Answers

45

The average decent servletcontainer itself flushes the stream by default every ~2KB. You should really not need to explicitly call flush() on the OutputStream of the HttpServletResponse at intervals when sequentially streaming data from one and the same source. In, for example, Tomcat (and WebSphere!) this is configurable as the bufferSize attribute of the HTTP connector.

The average decent servletcontainer also just streams the data in chunks if the content length is unknown beforehand (as per the Servlet API specification!) and if the client supports HTTP 1.1.

The problem symptoms at least indicate that the servletcontainer is buffering the entire stream in memory before flushing. This can mean that the content length header is not set and/or the servletcontainer does not support chunked encoding and/or the client side does not support chunked encoding (i.e. it is using HTTP 1.0).

To fix the one or other, just set the content length beforehand:

response.setContentLengthLong(new File(path).length());

Or when you're not on Servlet 3.1 yet:

response.setHeader("Content-Length", String.valueOf(new File(path).length()));
BalusC
  • 1,082,665
  • 372
  • 3,610
  • 3,555
  • Is there any way of doing this without knowing the file size? I'm creating a large file using Apache POI, therefore there's no actual file. All I have is `excelFile.write()`. To find out the size, I would need to read the whole stream. – Emil Terman Feb 07 '23 at 08:33
  • 1
    @Emil: Unfortunately no. Your best bet is to create a temp file on a very fast storage system, or to switch to CSV to have a predictable file size. – BalusC Feb 07 '23 at 11:04
1

So, following your scenario, shouldn't you be flush(ing) inside that while loop (on every iteration), instead of outside of it? I would try that, with a bit larger buffer though.
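
Something like this, reusing inputStream and resp from the question (just a sketch; the 64KB buffer size is only a guess at "a bit larger"):

byte[] buffer = new byte[64 * 1024]; // noticeably larger than the original 1KB buffer
OutputStream out = resp.getOutputStream();
int bytesRead;
while ((bytesRead = inputStream.read(buffer)) != -1) {
    out.write(buffer, 0, bytesRead);
    out.flush(); // flush on every iteration instead of once after the loop
}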

Kostas
  • 11
  • 1
1
  1. Kevin's class should close the m_out field if it's not null in its close() method; we don't want to leak things, do we? (See the sketch after this list.)

  2. As well as ServletOutputStream.flush(), the HttpServletResponse.flushBuffer() method may also flush the buffers. However, it appears to be an implementation-specific detail as to whether or not these operations have any effect, or whether HTTP content-length support is interfering. Remember, specifying Content-Length is optional in HTTP 1.0, so things should just stream out if you flush things. But I don't see that happening here.
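
For point 1, the missing close() would look roughly like this (a sketch against Kevin's field names, not tested):

public void close() throws IOException {
    if (m_out != null) {
        m_out.flush();  // push out anything still buffered
        m_out.close();  // don't leak the wrapped stream
    }
}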

SteveL
  • 11
  • 1
  • 1) that's debatable. the class did not create the stream, so you could argue it has no ownership and should not perform close operations. – Renan Dec 09 '11 at 08:18
1

The while condition does not work; you need to check for -1 before using the result. And please use a temporary variable for the output stream: it's nicer to read and it saves calling getOutputStream() repeatedly.

OutputStream outStream = resp.getOutputStream();
byte[] buffer = new byte[8192];
while (true) {
    int bytesRead = inputStream.read(buffer);
    if (bytesRead < 0)
        break;
    outStream.write(buffer, 0, bytesRead);
}
inputStream.close();
outStream.close();
eckes
  • 10,103
  • 1
  • 59
  • 71
1

Does flush work on the output stream?

Really I wanted to comment that you should use the three-arg form of write, as the buffer is not necessarily filled completely by a read (particularly at the end of the file!). Also, a try/finally would be in order unless you want your server to die unexpectedly.
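
Roughly what I mean, reusing the names from the question (just a sketch):

FileInputStream inputStream = null;
try {
    inputStream = new FileInputStream(path);
    OutputStream out = resp.getOutputStream();
    byte[] buffer = new byte[8192];
    int bytesRead;
    while ((bytesRead = inputStream.read(buffer)) != -1) {
        // three-arg write: only write the bytes actually read,
        // since the final read will usually not fill the buffer
        out.write(buffer, 0, bytesRead);
    }
    out.flush();
} finally {
    if (inputStream != null)
        inputStream.close(); // release the file handle even if the client aborts
}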

Tom Hawtin - tackline
  • 145,806
  • 30
  • 211
  • 305
  • Flush does work on the output stream. Yeah, it does have a try/finally block around it; the input stream is closed in it. I have tried both the 1- and 3-arg versions of both read and write and it didn't seem to make a difference, so for readability's sake I used the 1-arg version in the post. – Martin Mar 26 '09 at 11:50
1

I have used a class that wraps the output stream to make it reusable in other contexts. It has worked well for me in getting data to the browser faster, but I haven't looked at the memory implications. (Please pardon my antiquated m_ variable naming.)

import java.io.IOException;
import java.io.OutputStream;

// Wraps an OutputStream and automatically flushes it once a given
// number of bytes has been written since the last flush.
public class AutoFlushOutputStream extends OutputStream {

    protected long m_count = 0;        // bytes written since the last flush
    protected long m_limit = 4096;     // flush threshold in bytes
    protected OutputStream m_out;      // the wrapped stream

    public AutoFlushOutputStream(OutputStream out) {
        m_out = out;
    }

    public AutoFlushOutputStream(OutputStream out, long limit) {
        m_out = out;
        m_limit = limit;
    }

    public void write(int b) throws IOException {
        if (m_out != null) {
            m_out.write(b);
            m_count++;
            if (m_limit > 0 && m_count >= m_limit) {
                m_out.flush();
                m_count = 0;
            }
        }
    }
}
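
Wrapping the response stream then looks something like this (hypothetical usage; the 8KB limit is arbitrary):

OutputStream out = new AutoFlushOutputStream(resp.getOutputStream(), 8 * 1024);
byte[] buffer = new byte[1024];
int bytesRead;
while ((bytesRead = inputStream.read(buffer)) != -1) {
    out.write(buffer, 0, bytesRead); // each byte goes through write(int), flushing every ~8KB
}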
Kevin Hakanson
  • 41,386
  • 23
  • 126
  • 155
1

I'm also not sure if flush() on ServletOutputStream works in this case, but ServletResponse.flushBuffer() should send the response to the client (at least per 2.3 servlet spec).

ServletResponse.setBufferSize() sounds promising, too.
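
A minimal sketch of both calls (the 8KB value is just an example):

// Must be called before any content has been written:
response.setBufferSize(8 * 1024); // smaller response buffer, so it is committed sooner

// ... write a chunk of the file to response.getOutputStream() ...

// Force whatever is currently buffered out to the client:
response.flushBuffer();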

david a.
  • 5,283
  • 22
  • 24
0

Your code has an infinite loop.

do
{
    bytesRead = inputStream.read(buffer, offset, buffer.length);
    resp.getOutputStream().write(buffer, 0, bytesRead);
}
while (bytesRead == buffer.length);

offset has the same value throughout the loop, so if initially offset = 0, it will remain so in every iteration, which can cause an infinite loop and lead to the OOM error.

0

Unrelated to your memory problems, the while loop should be:

while(bytesRead > 0);
james
  • 1,379
  • 7
  • 6
  • Hmm, if I set the while loop to that it would never write anything to the output stream, unless I move an initial read outside the loop. Perhaps using while ((bytesRead = inputStream.read(buffer, offset, buffer.length)) != -1) instead would be safer. Either way, unrelated :( – Martin Mar 26 '09 at 14:20
  • warning: returning 0 bytes is perfectly possible and should not terminate the loop. – eckes Jun 24 '11 at 10:18
-1

IBM WebSphere Application Server uses asynchronous data transfer for servlets by default. That means that it buffers the response. If you have problems with large data and OutOfMemory exceptions, try changing the setting on WAS to use synchronous mode.

Setting the WebSphere Application Server WebContainer to synchronous mode

You must also take care to load the data in chunks and flush them. Sample for loading from a large file:

ServletOutputStream os = response.getOutputStream();
FileInputStream fis = new FileInputStream(file);
try {
    int buffSize = 1024;
    byte[] buffer = new byte[buffSize];
    int len;
    while ((len = fis.read(buffer)) != -1) {
        os.write(buffer, 0, len);
        os.flush();
        response.flushBuffer();
    }
} finally {
    fis.close();
    os.close();
}
zoki
  • 19
  • 3