7

I have three controller methods returning IAsyncEnumerable of WeatherForecast. The first one #1 uses SqlConnection and yields results from an async reader. The second one #2 uses EF Core with the ability to use AsAsyncEnumerable extension. The third one #3 uses EF Core and ToListAsync method.

I think the downside of #1 and #2 is if I, for example, do something time-consuming inside while or for each then the database connection will be open till the end. In scenario #3 I'm able to iterate over the list with a closed connection and do something else.

But, I don't know if IAsyncEnumerable makes sense at all for database queries. Are there any memory and performance issues? If I use IAsyncEnumerable for returning let's say HTTP request from API, then once a response is returned it's not in memory and I'm able to return the next one and so on. But what about the database, where is the whole table if I select all rows (with IAsyncEnumerable or ToListAsync)?

Maybe it's not a question for StackOverflow and I'm missing something big here.

#1

[HttpGet("db", Name = "GetWeatherForecastAsyncEnumerableDatabase")]
public async IAsyncEnumerable<WeatherForecast> GetAsyncEnumerableDatabase()
{
    var connectionString = "";
    await using var connection = new SqlConnection(connectionString);

    string sql = "SELECT * FROM [dbo].[Table]";
    await using SqlCommand command = new SqlCommand(sql, connection);

    connection.Open();
    await using var dataReader = await command.ExecuteReaderAsync();
    while (await dataReader.ReadAsync())
    {
        yield return new WeatherForecast
        {
            Date = Convert.ToDateTime(dataReader["Date"]),
            Summary = Convert.ToString(dataReader["Summary"]),
            TemperatureC = Convert.ToInt32(dataReader["TemperatureC"])
        };
    }

    await connection.CloseAsync();
}

#2

[HttpGet("ef", Name = "GetWeatherForecastAsyncEnumerableEf")]
public async IAsyncEnumerable<WeatherForecast> GetAsyncEnumerableEf()
{
    await using var dbContext = _dbContextFactory.CreateDbContext();
    await foreach (var item in dbContext
        .Tables
        .AsNoTracking()
        .AsAsyncEnumerable())
    {
        yield return new WeatherForecast
        {
            Date = item.Date,
            Summary = item.Summary,
            TemperatureC = item.TemperatureC
        };
    }
}

#3

[HttpGet("eflist", Name = "GetWeatherForecastAsyncEnumerableEfList")]
public async Task<IEnumerable<WeatherForecast>> GetAsyncEnumerableEfList()
{
    await using var dbContext = _dbContextFactory.CreateDbContext();
    var result =  await dbContext
        .Tables
        .AsNoTracking()
        .Select(item => new WeatherForecast
        {
            Date = item.Date,
            Summary = item.Summary,
            TemperatureC = item.TemperatureC
        })
        .ToListAsync();

    return result;
}
T. Dominik
  • 406
  • 3
  • 13
  • How many rows does the `[dbo].[Table]` have? – Theodor Zoulias Nov 29 '21 at 01:02
  • It might be 10000 up to 1000000 – T. Dominik Nov 29 '21 at 07:41
  • IAsyncEnumerable has limit. https://stackoverflow.com/questions/58986882/asyncenumerablereader-reached-the-configured-maximum-size-of-the-buffer-when-e – foad abdollahi Nov 29 '21 at 07:47
  • @foad-abdollahi good to know. It seems that it can be changed and also I'm using .NET 6 which support no buffering – T. Dominik Nov 29 '21 at 08:28
  • Except from simple queries like `SELECT * FROM [dbo].[Table]`, is it possible to also have heavier queries that take considerable amount of time to complete? For example queries that execute a user defined function (or subquery) for each row in the `[dbo].[Table]`? – Theodor Zoulias Nov 29 '21 at 11:37
  • @theodor-zoulias Assume that the query is somehow optimized and returned in a satisfying time. Even for complicated queries. But might return a lot of data. So we don't need to expect a timeout issue. – T. Dominik Nov 29 '21 at 11:47
  • What's your *actual* use case here? Returning an `IAsyncEnumerable` will send a JSON array to the caller, even if it writes the elements one by one. The *caller* will have to read and parse the entire payload. You'll avoid some blocking but that won't help the caller at all. To get *real* improvements you'd have to use a protocol that supports streaming like gRPC. Or you could use streaming JSON, ie emiting one JSON object per line. This would allow the caller to process the response line-by-line. – Panagiotis Kanavos Oct 19 '22 at 15:39
  • You can create a service that supports both gRPC and streaming JSON through [JSON transcoding](https://learn.microsoft.com/en-us/aspnet/core/grpc/json-transcoding?view=aspnetcore-7.0) (ie converting gRPC results to JSON) in ASP.NET Core 7. – Panagiotis Kanavos Oct 19 '22 at 15:40
  • Another option is to use SignalR and [stream results to the client](https://learn.microsoft.com/en-us/aspnet/core/signalr/streaming?view=aspnetcore-6.0). In all cases the server sends a stream of objects to the client, which can process them as they arrive. SignalR works with every client. To use gRPC with web browsers you need gRPC-Web, which isn't quite as efficient as the binary gRPC protocol – Panagiotis Kanavos Oct 19 '22 at 15:42

1 Answers1

4

Server-Side

If I only cared with the server I'd go with option 4 in .NET 6 :

  • Use an injected DbContext, write a LINQ query and return the results as AsAsyncEnumerable() instead of ToListAsync()
public class WeatherForecastsController:ControllerBase
{
    WeatherDbContext _dbContext;
    public WeatherForecastsController(WeatherDbContext dbContext)
    {
        _dbContext=dbContext;
    }

    public async IAsyncEnumerable<WeatherForecast> GetAsync()
    {
        return _dbContext.Forecasts.AsNoTracking()
                         .Select(item => new WeatherForecast
                         {
                             Date = item.Date,
                             Summary = item.Summary,
                             TemperatureC = item.TemperatureC
                         })
                         .AsAsyncEnumerable();
    }
}

A new Controller instance is created for every request which mean the DbContext will be around for as long as the request is being processed.

The [FromServices] attribute can be used to inject a DbContext into the action method directly. The behavior doesn't really change, the DbContext is still scoped to the request :

public async IAsyncEnumerable<WeatherForecast> GetAsync([FromServices] WeatherContext dbContext)
{
    ...
}

ASP.NET Core will emit a JSON array but at least the elements will be sent to the caller as soon as they're available.

Client-Side

The client will still have to receive the entire JSON array before deserialization.

One way to handle this in .NET 6 is to use DeserializeAsyncEnumerable to parse the response stream and emit items as they come:

using var stream=await client.GetAsStreamAsync(...);
var forecasts= JsonSerializer.DeserializeAsyncEnumerable(stream, new JsonSerializerOptions
        {
            DefaultBufferSize = 128
        });
await foreach(var forecast in forecasts)
{
...
}

The default buffer size is 16KB so a smaller one is needed if we want to receive objects as soon as possible.

This is a parser-specific solution though.

Use a streaming JSON response

A common workaround to this problem is to use streaming JSON aka JSON per line, aka Newline Delimited JSON aka JSON-NL or whatever. All names refer to the same thing - sending a stream of unindented JSON objects separated by a newline. It's an old technique that many tried to hijack and present as their own

{ "Date": "2022-10-18", Summary = "Blah", "TemperatureC"=18.5 }
{ "Date": "2022-10-18", Summary = "Blah", "TemperatureC"=18.5 }
{ "Date": "2022-10-18", Summary = "Blah", "TemperatureC"=18.5 }

That's not valid JSON but many parsers can handle it. Even if a parser can't, we can simply read one line of text at a time and parse it.

Use a different protocol

Even streaming JSON responses is a workaround. HTTP doesn't allow server-side streaming in the first place. The server has to send all data even if the client only reads the first 3 items since there's no way to cancel the response.

It's more efficient to use a protocol that does allow streaming. ASP.NET Core 6 offers two options:

In both cases the server sends objects to clients as soon as they're available. Clients can cancel the stream as needed.

In a SignalR hub, the code could return an IAsyncEnumerable or a Channel:

public class AsyncEnumerableHub : Hub
{
    ...
    public async IAsyncEnumerable<WeatherForecast> GetForecasts()
    {
        return _dbContext.Forecasts.AsNoTracking()
                         .Select(item => new WeatherForecast
                         {
                             Date = item.Date,
                             Summary = item.Summary,
                             TemperatureC = item.TemperatureC
                         })
                         .AsAsyncEnumerable();
    }
}

In gRPC, the server method writes objects to a response stream :

public override async Task StreamingFromServer(ForecastRequest request,
    IServerStreamWriter<ForecastResponse> responseStream, ServerCallContext context)
{
    ...
    await foreach (var item in queryResults)
    {
        if (context.CancellationToken.IsCancellationRequested)
        {
            return;
        }
        await responseStream.WriteAsync(new ForecastResponse{Forecast=item});
    }
}
Panagiotis Kanavos
  • 120,703
  • 13
  • 188
  • 236