
I generate a very large .csv file from a database using the method outlined in

https://stackoverflow.com/a/13456219/141172

It works fine, up to a point. When the exported file is too large, I get an OutOfMemoryException.

If I turn off output buffering by modifying that code like this:

protected override void WriteFile(System.Web.HttpResponseBase response)
{
    response.BufferOutput = false; // <--- Added this
    this.Content(response.OutputStream);
}

the file download completes. However, it is several orders of magnitude slower than when output buffering was enabled (measured for the same file with buffering true/false, on localhost).

I understand that it will be slower, but why would it slow to a relative crawl? Is there anything I can do to improve processing speed?

UPDATE

It would also be an option to use File(Stream stream, String contentType) as suggested in the comments. However, I'm not sure how to create that stream. The data is dynamically assembled from a DB query, and a MemoryStream would run out of contiguous physical memory. Suggestions are welcome.
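For reference, one way such a stream could be created is to wrap the row enumeration in a read-only, forward-only Stream that serializes rows on demand, so nothing is materialized up front. This is an untested sketch; `EnumerableStream` and `MyRows` are illustrative names, not part of the actual code:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Sketch: wraps an IEnumerable<string> of CSV rows in a read-only Stream
// so it can be passed to File(stream, "text/csv") without buffering the
// whole export in memory. Rows are pulled lazily as the stream is read.
public class EnumerableStream : Stream
{
    private readonly IEnumerator<string> _rows;
    private byte[] _current = new byte[0];  // bytes of the current row
    private int _offset;                    // read position within _current

    public EnumerableStream(IEnumerable<string> rows)
    {
        _rows = rows.GetEnumerator();
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        int written = 0;
        while (written < count)
        {
            if (_offset == _current.Length)
            {
                if (!_rows.MoveNext()) break;  // no more rows: signal EOF
                _current = Encoding.UTF8.GetBytes(_rows.Current);
                _offset = 0;
            }
            int n = Math.Min(count - written, _current.Length - _offset);
            Array.Copy(_current, _offset, buffer, offset + written, n);
            _offset += n;
            written += n;
        }
        return written;  // 0 means end of stream
    }

    public override bool CanRead { get { return true; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return false; } }
    public override void Flush() { }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }
    public override long Seek(long o, SeekOrigin s) { throw new NotSupportedException(); }
    public override void SetLength(long v) { throw new NotSupportedException(); }
    public override void Write(byte[] b, int o, int c) { throw new NotSupportedException(); }
}
```

Usage would then be something like `return File(new EnumerableStream(MyRows()), "text/csv");` from the action.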

UPDATE 2

It was suggested in the comments that alternately reading from the database and writing to the stream is causing a degradation. I modified the code to perform the stream writing in a separate thread (using the producer/consumer pattern). There is no appreciable difference in performance.
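The producer/consumer split looked roughly like this (a simplified sketch, not the exact code; `MyData` is the DB query enumeration from the export):

```csharp
using System.Collections.Concurrent;
using System.IO;
using System.Threading.Tasks;

// Sketch: one thread reads rows from the database (producer) while the
// request thread writes them to the response stream (consumer), with a
// bounded queue between them so the reader can run ahead of the writer.
private void StreamExportConcurrent(Stream stream)
{
    using (var rows = new BlockingCollection<string>(boundedCapacity: 1000))
    {
        var producer = Task.Factory.StartNew(() =>
        {
            try
            {
                foreach (var row in MyData())  // enumerate the DB query
                    rows.Add(row);             // blocks when the queue is full
            }
            finally
            {
                rows.CompleteAdding();         // unblocks the consumer at EOF
            }
        });

        using (var sw = new StreamWriter(stream))
        {
            // Blocks until rows are available; ends when CompleteAdding is called.
            foreach (var row in rows.GetConsumingEnumerable())
                sw.Write(row);
        }
        producer.Wait();  // surface any producer exception
    }
}
```

As noted above, this restructuring did not measurably help; the bottleneck turned out to be elsewhere (see the accepted answer).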

Eric J.
  • I don't understand why you need that linked answer. What's wrong with simply returning `File(myStream, "text/csv")` from your action? http://msdn.microsoft.com/en-us/library/dd493017(v=vs.100).aspx – ta.speot.is Sep 24 '14 at 07:26
  • Am I reading that wrong? You're passing your response output stream as input to Content? Which then writes to the response output stream again? Which class are you overriding the WriteFile method of? – Erik Funkenbusch Sep 24 '14 at 07:37
  • @ta.speot.is: I'm not sure how to create `myStream`. The data is dynamically assembled based on a DB query, and a MemoryStream would run out of contiguous physical memory. Can you show a pattern to create a `myStream` that can be passed to `File(myStream, "text/csv")`? – Eric J. Sep 24 '14 at 15:20
  • @ErikFunkenbusch: The method being overridden is from FileResult. Content() is an `Action`, allowing me to supply a method that writes to `response.OutputStream` in a structured manner. http://msdn.microsoft.com/en-us/library/system.web.mvc.fileresult.writefile(v=vs.118).aspx – Eric J. Sep 24 '14 at 15:25
  • Well, it would be slow because, with output caching on, when the server needs a new block to send to the client it gets it from cache, which is very fast. So if it takes 100ms to send a block, and 1ms to get the next block, this is 101ms + 101ms +.. etc.. Without caching it may take 100ms to read the next block from database, so now it takes 100ms to send the block and 100ms to get the next block, so it's 200ms + 200ms +... etc... the output cache effectively lets you use the "sending time" to read ahead, while uncached it blocks and reads on demand. – Erik Funkenbusch Sep 24 '14 at 17:00
  • You could implement a mechanism in which you read "just enough" ahead, to keep the buffers full... but this is probably something you'd have to figure out with your data. Maybe you could use an on-the-fly enumerable to always keep the next block ready... – Erik Funkenbusch Sep 24 '14 at 17:03
  • @ErikFunkenbusch: My understanding is that the output cache waits until the page is fully rendered before sending the contents. I believe that is why I get an out of memory exception with the buffer on, for large files. I am currently enumerating the result from the DB, writing to the response stream as I enumerate. – Eric J. Sep 24 '14 at 17:37
  • @EricJ. - Sorry, I thought you were asking why turning output caching off was so much slower (you said you knew it would be slower, but didn't understand why it was that much slower, and I was attempting to explain that). – Erik Funkenbusch Sep 24 '14 at 18:01
  • I don't know what ASP.NET and IIS are doing exactly with output streaming, but maybe the chunks being used are too small. Hook in a `BufferedStream` with a very big buffer, like 4MB. – usr Sep 24 '14 at 18:22
  • @usr: Bingo! The buffered stream speeds things up around 100x. Feel free to turn your comment into an answer. – Eric J. Sep 24 '14 at 18:26

2 Answers


I don't know what ASP.NET and IIS are doing exactly with output streaming, but maybe the chunks being used are too small. Hook in a BufferedStream with a very big buffer, like 4MB.

According to your comments, it worked. Now tune the buffer size back down to save memory and keep a smaller working set, which is friendlier to the CPU cache.

As a subjective comment, I'm disappointed that this is even necessary. IIS should choose appropriate buffer sizes automatically, which is quite easy to do with TCP connections.

EDIT FROM OP

Here is the code derived from this answer:

public ActionResult Export()
{
    // Domain specific stuff here
    return new FileGeneratingResult("MyFile.txt", "text/text",
            stream => this.StreamExport(stream), false);
}

private void StreamExport(Stream stream)
{
    using (BufferedStream bs = new BufferedStream(stream, 256*1024))
    using (StreamWriter sw = new StreamWriter(bs))  
    foreach (var stuff in MyData())
    {
        sw.Write(stuff);
    }
}
Eric J.
usr
  • Performance gain is on the order of 100x. I, too, am surprised that an external `BufferedStream` is needed. – Eric J. Sep 24 '14 at 18:34
  • 1
    A 256KB buffer is subjectively about as fast as a 4MB buffer. As this is a seldom-used feature, I'm not going to tune any further. Anyone reading this handling a high volume of such requests would be well advised to test smaller buffers to find the optimal buffer size for good performance and a small memory footprint. – Eric J. Sep 24 '14 at 18:39
  • @EricJ. I realize this is a little while back now, but I'm running into a similar issue and I find this illuminating. Can you update your question to show how you used `BufferedStream`? I'm confused by how to hook it in. Thank you. – LoJo Oct 06 '15 at 02:16
  • @LoJo: Pasted in a simplified version of the code I created based on this answer. – Eric J. Oct 06 '15 at 02:44
  • 1
    I further increased performance by putting a buffer on the writer. Generating a very large csv file, (~11GB). I'm just going to assume when I put one on my reader I'll have further performance increase as it is also ~11GB. Thanks! – interesting-name-here Feb 20 '17 at 16:54
0

In Eric's latest update, he mentioned using another thread. I too had this problem when implementing database exports. Here is some example code for the solution I used:

Handling with temporary file stream

Rob R