I plan to read a remote file line by line asynchronously using https://github.com/Dasync/AsyncEnumerable (since there is not yet Async Streams [C# 8 maybe]: https://github.com/dotnet/csharplang/blob/master/proposals/async-streams.md):
public static class StringExtensions
{
public static AsyncEnumerable<string> ReadLinesAsyncViaHttpClient(this string uri)
{
return new AsyncEnumerable<string>(async yield =>
{
using (var httpClient = new HttpClient())
{
using (var responseStream = await httpClient.GetStreamAsync(uri))
{
using (var streamReader = new StreamReader(responseStream))
{
while(true)
{
var line = await streamReader.ReadLineAsync();
if (line != null)
{
await yield.ReturnAsync(line);
}
else
{
return;
}
}
}
}
}
});
}
public static AsyncEnumerable<string> ReadLinesAsyncViaWebRequest(this string uri)
{
return new AsyncEnumerable<string>(async yield =>
{
var request = WebRequest.Create(uri);
using (var response = request.GetResponse())
{
using (var responseStream = response.GetResponseStream())
{
using (var streamReader = new StreamReader(responseStream))
{
while(true)
{
var line = await streamReader.ReadLineAsync();
if (line != null)
{
await yield.ReturnAsync(line);
}
else
{
return;
}
}
}
}
}
});
}
}
It seems that they both run just fine in a simple Console application like below:
public class Program
{
public static async Task Main(string[] args)
{
// Or any other remote file
const string url = @"https://gist.githubusercontent.com/dgrtwo/a30d99baa9b7bfc9f2440b355ddd1f75/raw/700ab5bb0b5f8f5a14377f5103dbe921d4238216/by_tag_year.csv";
await url.ReadLinesAsyncViaWebRequest().ForEachAsync(line =>
{
Console.WriteLine(line, Color.GreenYellow);
});
await url.ReadLinesAsyncViaHttpClient().ForEachAsync(line =>
{
Console.WriteLine(line, Color.Purple);
});
}
}
... but I have some concerns if it is used as part of an ASP.NET Core WebAPI to process the lines and then push them using PushStreamContent:
- https://learn.microsoft.com/en-us/previous-versions/aspnet/hh995285(v=vs.108)
- https://blog.stephencleary.com/2016/10/async-pushstreamcontent.html
The idea would be to have a pipeline of data which leverages async
/ await
so that the number of threads in use is as low as possible and also to avoid an increase in memory (which leverage the enumerable-like feature of AsyncEnumerable).
I read several articles but it seems it's all non .NET Core versions and I don't really know if there would be some potential performance issues / caveats in regard to what I would like to achieve?
- Difference between HttpRequest, HttpWebRequest and WebRequest
- http://www.diogonunes.com/blog/webclient-vs-httpclient-vs-httpwebrequest/
An example of "business" case would be:
using System;
using System.Collections.Async;
using System.IO;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc;
namespace WebApplicationTest.Controllers
{
[Route("api/[controller]")]
[ApiController]
public class DumbValuesController : ControllerBase
{
private static readonly Random Random = new Random();
// GET api/values
[HttpGet]
public async Task<IActionResult> DumbGetAsync([FromQuery] string fileUri)
{
using (var streamWriter = new StreamWriter(HttpContext.Response.Body))
{
await fileUri.ReadLinesAsyncViaHttpClient().ForEachAsync(async line =>
{
// Some dumb process on each (maybe big line)
line += Random.Next(0, 100 + 1);
await streamWriter.WriteLineAsync(line);
});
}
return Ok();
}
}
}