Problem
This has been driving me insane for days. We have a process here to import records from a CSV file into the database through an admin page that resides in an ASP.NET Web Forms (.NET 4.0) project. The process itself was too slow, and I was tasked with making it faster. I started by changing the core logic, which gave a good performance boost.
But if I upload large files (well, relatively large, about 3 MB tops), I have to wait until the upload finishes before I can start importing, and I don't return any progress to the client while I do so. The process itself is not that long: it takes about 5 to 10 seconds to complete. And yes, I've considered creating a separate Task and polling the server, but I thought it was overkill.
What have I done so far?
So, to fix this issue, I decided to read the incoming stream and import the values while I'm reading. I created a generic handler (.ashx) and put the following code inside void ProcessRequest(HttpContext context):
using (var stream = context.Request.GetBufferlessInputStream())
{
}
First I remove the headers, then I read the stream (through a StreamReader) until I get a CRLF, convert the line to my model object, and keep reading the CSV. When I get 200 records or so, I bulk update all of them to the database. Then I keep reading more records until I reach the end of the file or collect another 200 records.
This seems to be working, but then I decided to stream my response as well. First, I disabled BufferOutput:
context.Response.BufferOutput = false;
Then, I added these headers to my response:
context.Response.AddHeader("Keep-Alive", "true");
context.Response.AddHeader("Cache-Control", "no-cache");
context.Response.ContentType = "application/X-MyUpdate";
Then, after sending those 200 records to the database, I write a response:
response.Write(s);
response.Flush();
s is a string with a fixed size of 256 chars. I know 256 chars don't always equate to 256 bytes, but I just wanted to be sure I wouldn't write walls of text to the response and mess something up.
Here's its format:
| pipe (record delimiter)
1 or 0 success or failure
; delimiter
error message (if applicable)
| pipe (next record delimiter, and so on)
Example:
|0;"Invalid price on line 123"|1;|1;|0;"Invalid id on line 127"|
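To make the format concrete, here is a minimal sketch of how the client could split such a string into records. The name parseStatusRecords is hypothetical, and the sketch assumes that error messages never contain the | character and that any padding used to reach 256 chars is whitespace.

```javascript
// Hypothetical helper: turns '|0;"msg"|1;|' into an array of status objects.
// Assumes error messages never contain the '|' character.
function parseStatusRecords(text) {
  return text
    .split('|')
    .map(function (part) { return part.trim(); })
    .filter(function (part) { return part.length > 0; })
    .map(function (part) {
      var sep = part.indexOf(';');
      var flag = sep === -1 ? part : part.slice(0, sep);
      var message = sep === -1 ? '' : part.slice(sep + 1);
      // Strip the surrounding double quotes, if present.
      message = message.replace(/^"|"$/g, '');
      return { ok: flag === '1', message: message };
    });
}
```

For the example line above, this would yield four records: two failures carrying their messages and two successes with an empty message.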
On the client side, here's what I have (just the request part):
function import(){
    var formData = new FormData();
    var file = $('[data-id=file]')[0].files[0];
    formData.append('file', file);
    var xhr = new XMLHttpRequest();
    var url = "/Site/Update.ashx";
    xhr.onprogress = updateProgress;
    xhr.open('POST', url, true);
    xhr.setRequestHeader("Content-Type", "multipart/form-data");
    xhr.setRequestHeader("X-File-Name", file.name);
    xhr.setRequestHeader("X-File-Type", file.type);
    xhr.send(formData);
}
function updateProgress(evt){
    debugger;
}
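For reference, this is roughly what I intend updateProgress to do once data actually arrives: a sketch only, assuming xhr.responseText holds the text accumulated so far whenever a progress event fires, and that records are delimited by | as described earlier. The name createRecordReader is hypothetical.

```javascript
// Hypothetical helper: returns a function that, given the accumulated
// response text, reports only the complete records not seen before.
function createRecordReader(onRecord) {
  var consumed = 0; // index of the last '|' already processed
  return function read(accumulatedText) {
    var lastPipe = accumulatedText.lastIndexOf('|');
    if (lastPipe <= consumed) { return; } // no complete new record yet
    var fresh = accumulatedText.slice(consumed, lastPipe);
    consumed = lastPipe; // the trailing '|' opens the next record
    fresh.split('|').forEach(function (part) {
      part = part.trim();
      if (part.length > 0) { onRecord(part); }
    });
  };
}

// Intended wiring (not executed here; needs a browser XMLHttpRequest):
// var read = createRecordReader(function (rec) { console.log(rec); });
// xhr.onprogress = function () { read(xhr.responseText); };
```

Tracking an offset matters because responseText grows with each chunk rather than holding only the newest one, so without it every record would be reported again on each event.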
What happened :(
- It doesn't send the data immediately to the client when I call response.Flush. I understand there is buffering from the client-side, but it doesn't seem to be working at all, even when I send a lot of dummy data to bypass this issue.
- After some time, when I write too much stuff on Response.Write, the method will become slower and slower until it hangs. Same with Response.Flush. I guess I'm missing something here.
- I created a simple webforms project to test what I've been trying to do. It has a generic handler which will return a number each second for 10 seconds. It actually updates (not always on a 1-sec fashion) and I can see the ongoing progress.
- When I write just a few lines to the response, it actually shows progress, but ALWAYS after the whole process is almost finishing. The main problem is when I get errors and I try to write these to the response. They're longer than success strings because they contain the error message.
I assume that calling Response.Flush is not 100% guaranteed to send the data to the client, correct? Or is the client itself the problem? If it is the client, why does the server hang when I call Response.Write too much?
EDIT: As an addendum, if I throw the same piece of code into an aspx page, it works. So I believe it has something to do with the xhr (XMLHttpRequest) itself, which is not prepared to process streaming data, it seems.
I'll be glad to give more information if needed.