I'm trying to asynchronously add multiple entities to my database, as the "arrange" part of my integration test.

However, I'm getting double entries with my original code:

private async Task<Foo[]> GenerateFoos(params string[] names)
{
    var tasks = names.Select(async name =>
    {
        var foo = new Foo() { Name = name };
        await AddAsync(foo);
        return foo;
    });

    await Task.WhenAll(tasks);

    return tasks.Select(task => task.Result).ToArray();
}

[Test]
public async Task MyTest()
{
    var expected = await GenerateFoos("A", "B", "C");
}

This creates twice as many objects (two of each) in my database, and I can't figure out why.

I've looked up other examples online, but they all assume that the async method (AddAsync in my case) returns the object I want to return. That is not the case here: AddAsync only returns the ID, whereas I want to return the added object itself.

I've reworked the code twice, and neither of these two alternatives inserts duplicates:

// Alternative 1
    
var result = new List<Foo>();

foreach (var name in names)
{
    var foo = new Foo() { Name = name };
    await AddAsync(foo);
    result.Add(foo);
}

return result.ToArray();

// Alternative 2

var foos = names.Select(name => new Foo{ Name = name });

await Task.WhenAll(foos.Select(foo => AddAsync(foo)));

return foos.ToArray();

So I'm pretty sure the error stems from my code, not from AddAsync, but I don't see why the first version behaves differently. While I don't expect the content of AddAsync to matter here, I'm adding it for completeness:

public static async Task AddAsync<TEntity>(TEntity entity)
    where TEntity : class
{
    using var scope = _scopeFactory.CreateScope();

    var context = scope.ServiceProvider.GetService<MyDbContext>();

    context.Add(entity);

    await context.SaveChangesAsync();
}

Can anyone spot why my tasks are being executed twice?

Flater
  • `tasks = names.Select(...).ToList()`? – GSerg Jul 12 '20 at 20:57
  • @GSerg: It does solve the problem, so it'll be a double enumeration issue (I would expect to be warned about that but apparently not...). Can you explain where/why the double enumeration is taking place? Post an answer and you'll get the tick :) – Flater Jul 12 '20 at 21:01

1 Answer

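var tasks contains a query, not the result of a query. Select is lazy (deferred execution): its lambda runs each time the sequence is enumerated, not when tasks is declared.

This query is enumerated twice: once by Task.WhenAll() and once by tasks.Select().
Because the call to AddAsync is inside the lambda that runs for each element on every enumeration, it is executed twice. The second enumeration starts a fresh set of tasks, and task.Result then blocks on those new tasks rather than reading the ones that were already awaited.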
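
The double enumeration can be observed without a database. The following is only a sketch: FakeAddAsync and the calls counter are hypothetical stand-ins for AddAsync that count invocations, and strings are used in place of Foo.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-in for AddAsync: counts calls instead of touching a database.
var calls = 0;
async Task FakeAddAsync(string name)
{
    calls++;
    await Task.Yield();
}

// A lazy query: the lambda has not run yet at this point.
IEnumerable<Task<string>> tasks = new[] { "A", "B", "C" }.Select(async name =>
{
    await FakeAddAsync(name);
    return name;
});

// First enumeration: Task.WhenAll walks the query and starts three tasks.
await Task.WhenAll(tasks);
var callsAfterWhenAll = calls;

// Second enumeration: the lambda runs again and starts three more tasks.
var results = tasks.Select(task => task.Result).ToArray();

Console.WriteLine(callsAfterWhenAll); // 3
Console.WriteLine(calls);             // 6
```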

Materializing the query solves the problem:

var tasks = names.Select(async name =>
    {
        var foo = new Foo() { Name = name };
        await AddAsync(foo);
        return foo;
    })
    .ToList();
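
Another option, not part of the original code, is to avoid the second enumeration altogether: the generic overload of Task.WhenAll returns the results as an array, in the same order as the input tasks. A minimal sketch, again assuming a hypothetical FakeAddAsync in place of AddAsync and strings in place of Foo:

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-in for AddAsync: counts saves instead of touching a database.
var saved = 0;
async Task FakeAddAsync(string name)
{
    saved++;
    await Task.Yield();
}

async Task<string[]> GenerateNames(string[] names)
{
    var tasks = names.Select(async name =>
    {
        await FakeAddAsync(name);
        return name;
    });

    // The query is enumerated exactly once, by Task.WhenAll,
    // which also hands back the results in input order.
    return await Task.WhenAll(tasks);
}

var result = await GenerateNames(new[] { "A", "B", "C" });

Console.WriteLine(string.Join(",", result)); // A,B,C
Console.WriteLine(saved);                    // 3
```

With this shape there is nothing left to enumerate a second time, so .ToList() becomes optional rather than required.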

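Note that your Alternative 2 contains the same double enumeration problem, but this time the second enumeration only causes new instances of Foo to be unnecessarily created. The calls to AddAsync are in a separate query that is enumerated only once, by Task.WhenAll. As a result, the Foo objects returned by foos.ToArray() are not the instances that were passed to AddAsync.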
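
The effect on Alternative 2 can be shown with any reference type. In this sketch StringBuilder is used as a stand-in for Foo (an assumption for illustration only): a lazy query hands out a new object on every enumeration, while a materialized list keeps returning the same one.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// StringBuilder stands in for Foo here; any reference type shows the same effect.
var names = new[] { "A", "B", "C" };

// Lazy: each enumeration runs the lambda again and creates new objects.
IEnumerable<StringBuilder> lazy = names.Select(name => new StringBuilder(name));
var sameWhenLazy = object.ReferenceEquals(lazy.First(), lazy.First());

// Materialized: the objects are created once and stored in the list.
List<StringBuilder> materialized = names.Select(name => new StringBuilder(name)).ToList();
var sameWhenMaterialized = object.ReferenceEquals(materialized.First(), materialized.First());

Console.WriteLine(sameWhenLazy);         // False
Console.WriteLine(sameWhenMaterialized); // True
```

For Alternative 2 this would mean that anything the context sets on the saved entity, such as a generated key, does not show up on the objects that are returned, unless foos is materialized with .ToList() first.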

GSerg