0

I've been experimenting with some old code that needs refactoring in places and was testing if there was any improvement to iis threads etc by uploading file asynchronously (Server Side). Using jQuery file upload client side.

The original code

[HttpPost]
public ActionResult UploadDocument( HttpPostedFileBase uploadedFile ) {

  // Do any validation here

  // Read bytes from http input stream into fileData
  Byte[] fileData;

  using ( BinaryReader binaryReader = 
          new BinaryReader( uploadedFile.InputStream ) ) {

    fileData = binaryReader.ReadBytes( uploadedFile.ContentLength );

  }

  // Create a new Postgres bytea File Blob ** NOT Async **
  _fileService.CreateFile( fileData );

  return Json(
    new {
      ReturnStatus = "SUCCESS" // Or whatever
    }
  );

}

The new code

[HttpPost]
public async Task<ActionResult> UploadDocumentAsync( HttpPostedFileBase uploadedFile ) {

  // Do any validation here

  // Read bytes from http input stream into fileData
  Byte[] fileData = new Byte[uploadedFile.ContentLength];

  await uploadedFile.InputStream.ReadAsync( fileData, 0, uploadedFile.ContentLength );

  // Create a new Postgres bytea File Blob ** NOT Async **
  _fileService.CreateFile( fileData );

  return Json(
    new {
      ReturnStatus = "SUCCESS" // Or whatever
    }
  );

}

The new method appears to work correctly but my question is:

Is the following code the correct (best) way to do it? and are there any gotchas doing it this way? There is a lot of contradictory and out of date information out there. There also seems to be a lot of debate on whether there is any improvement or point in actually doing this. Yes it give back threads to iis etc but is it worth the overhead type of debate.

The code in question

// Read bytes from http input stream into fileData
Byte[] fileData = new Byte[uploadedFile.ContentLength];

await uploadedFile.InputStream.ReadAsync( fileData, 0, uploadedFile.ContentLength );
Todd Menier
  • 37,557
  • 17
  • 150
  • 173
William Humphreys
  • 1,206
  • 1
  • 14
  • 38
  • 2
    async /await is the tail wagging the dog I think. Would love to see some arguments that prove me wrong. This is a good question. Therefore, it will almost certainly be deleted by some crossing guard admin. –  Apr 03 '15 at 14:32
  • Why did you not make CreateFile async as well? This is a test question to see whether you have a particular common misunderstanding or not. Meanwhile I'd like to refer to you to my treatment of this topic: http://stackoverflow.com/a/25087273/122718 and http://stackoverflow.com/a/12796711/122718. – usr Apr 03 '15 at 14:52
  • usr I probably will but thats not really what I was asking. Though on that topic there is much debate on whether fast local database access is quicker left synchronous. – William Humphreys Apr 03 '15 at 14:56

2 Answers2

1

The code is correct. Using async IO is not all or nothing. You can safely mix sync and async IO, as you have done.

Whether you should be using async has been covered already. For ASP.NET the basic rule is to use it in cases where the operation might have a really high latency and is being invoked a lot at the same time. Only in that case is it important to not free threads. If the operations are quick or rare then there are not many threads to free and this is a waste of developer time.

Reading from a buffered (default) input stream is the same as reading from a file. ASP.NET buffers long inputs to files on disk. This is a classic case where async IO does not provide gains in throughput. In fast the file is likely fully cached so that the IO is purely CPU based (a memcpy from the cache). Total waste of dev time and CPU cycles. No benefits at all.

Community
  • 1
  • 1
usr
  • 168,620
  • 35
  • 240
  • 369
  • So in this case would you say its a good thing to Async the file uploads or not. – William Humphreys Apr 03 '15 at 15:45
  • My advice: Since you probably have very few concurrent uploads the question is moot. Use the default which is synchronous. If you really want this to be async make the upload unbuffered. This is an ASP.NET setting somewhere. – usr Apr 03 '15 at 15:48
  • It is quite a misunderstood subject and a lot of conflicting information. It doesn't help that its become quite 'fashionable' now to do it the Async way. – William Humphreys Apr 03 '15 at 15:48
  • That is true. In my answers I'm on a holy crusade to pull people out of this fad. There are very few good uses for async on the server (but they exist). – usr Apr 03 '15 at 15:49
  • The live system this is on does have a lot of concurrent uploads and site hits generally. Really it would benefit from faster etc hardware but that isn't an option for them. – William Humphreys Apr 03 '15 at 15:51
  • If the number of concurrent uploads becomes greater than 10 I'd *consider* going async. When it gets to 100 it's a really good case for async. But few sites have those levels of concurrency. – usr Apr 03 '15 at 15:59
  • I'd gestimate about 30 40 uploads at any peak time though this is in bursts and not 24/7. That's really what prompted me to look at this on refactoring to maybe improve performance. I actually wasn't going to async the database update as I believe this doesn't really gain much improvement. I genuinely don't know this to be the case as all the conflicting information. Its really what prompted the question. The uploads are small as well average 100k max but are over a very slow VPN. – William Humphreys Apr 03 '15 at 16:07
  • 1
    Sounds like you should run a test. Write a slow-uploading client and launch a few hundred of them. – usr Apr 03 '15 at 16:33
  • @usr >> If you really want this to be async make the upload unbuffered... I dont understand why this is true. Is he looking for paralellism in which case he should fire off a task? –  Apr 03 '15 at 16:53
  • @Sam that advice of mine was bogus... If you don't care about temporary disk space usage buffered uploads are not a problem. In fact they make async unnecessary since all data is already available without blocking when the controller is hit. So the new advice is: Use buffered synchronous uploads and measure this. – usr Apr 03 '15 at 17:05
  • I am interested to see your benchmarks. Do you plan to test against a parallel design as I posted in my answer? –  Apr 03 '15 at 17:20
1

To answer your questions specifically:

  1. There's nothing wrong with your code, and in my opinion you should go forward with that change. Depending on a variety of factors (file sizes/buffering as pointed out, concurrency, etc), it may or may not make a noticeable improvement (and any improvement will be in scalability, not speed), but you will almost certainly not be worse off. The "overhead" you speak of involves the state machine written by the compiler, and it is nearly negligable in most situations.

  2. One minor gotcha I can think of: You'd want to take a look at what's happening after await and make sure there's no assumptions that that code will be running on the same thread as the code before the await. Not likely in this case, but something to be aware of.

I would disagree with the notion that async is a "fad" with few good uses on the server. It's a direct response to hardware trends. Processor speeds are not increasing at the rate they once did. Today the race is for more cores, and we're seeing programming languages and frameworks react to that trend with features that make it easier to write software to take advantage of that hardware-level concurrency.

Relatively long-running I/O is commonplace in server applications (working with files, calling external APIs, etc), and freeing threads to do other work during those operations is almost always a benefit.

Community
  • 1
  • 1
Todd Menier
  • 37,557
  • 17
  • 150
  • 173