-1

While going over some code in a console application, I saw the nested Task.WhenAll in the SecondInitialize function. I decided to test this function with a large Locations List and see how it reacted.

What I saw is that with about 100 locations, 100*100 = 10,000 Calculate calls, the t.Wait() inside of Start takes about 60 seconds to return or sometimes just hang completely. If i try to click Break All the console application doesn't even respond to my click and visual studio crashes.

When using my 'Easier to read version' inside of SecondInitialize, it also takes a while to return. Consistent behavior.

Now the weird part is, whenever I use the debugger and put a breakpoint inside of the SecondInitialize and then hit continue, it will finish in about 5-7 seconds.

So my question is, why is it hanging taking a long time normally when I see it being faster when I debug inside that function? Another question is whether or not the use of Tasks is being utilized correctly

public void Start()
{
    var t = CacheInitialize(locations, CancellationToken.None);
    t.Wait();
}

public Task CacheInitialize(IList<Location> locations, CancellationToken token)
{
    return SecondInitialize(locations, token);
}

public async Task SecondInitialize(IList<Location> locations, CancellationToken token)
{
    await Task.WhenAll(locations.Select(first =>
    {
        return Task.WhenAll(locations.Where(second => !second.Equals(first)).Select(second =>
        {
            return Calculate(first, second, token);
        }));
    }));

    //Easier to read version of ^
    //var tasks = locations.SelectMany(first => locations.Where(second => !second.Equals(first)).Select(second =>
    //{
    //  return Calculate(first, second, token);
    //}));
    //await Task.WhenAll(tasks);


    //No Tasks.
    //for (int x = 0; x < locations.Length; x++)
    //{
    //    for (int y = 0; y < locations.Length; y++)
    //    {
    //        if (x == y)
    //            continue;
    //        await Calculate(locations[x], locations[y], token).ConfigureAwait(false);
    //    }
    //}
}

public async Task<TripLength> Calculate(Location start, Location finish, CancellationToken token)
{
    if (start == finish)
        return TripLength.Zero;

    var parameters = new RouteParameters
    {
        Coordinates = new []
        {
            new Coordinate(start.Latitude, start.Longitude),
            new Coordinate(finish.Latitude, finish.Longitude)
        }
    };

    var route = await RunRoute(parameters, token);

    return ToTripLength(route);
}


protected Task<RouteResult> RunRoute(RouteParameters routeParams, CancellationToken token)
{
    return Task.Run(async () =>
    {
        var routingTask = Task.Run(() =>
        {
            RouteResult routeResults;
            var status = _routeService.Route(routeParams, out routeResults);
            return routeResults;
        }, token);
    return await routingTask.ConfigureAwait(false);

    }, token);
}
Kelsey Abreu
  • 1,094
  • 3
  • 17
  • 42
  • It would be awesome if you could provide a [mcve], including the `Route` method. – mjwills Jan 21 '19 at 21:25
  • 1
    Explain what you're trying to do, not how you think the solution would look. If I didn't work for an OTA I'd have a real hard time guessing what you want to do. I can't read the code but I suspect you want to speed up mapping search results to your own objects. – Panagiotis Kanavos Jan 22 '19 at 09:39
  • 1
    BTW what is `_routeService` and what does `_routeService.Route` do? It matters a *lot*. Different APIs are used for data parallelism (eg PLINQ), different once are used when you want to make multiple concurrent calls to a remote service (eg ActionBlock/TransformBlock) – Panagiotis Kanavos Jan 22 '19 at 09:43
  • 1
    First, the API meant for data parallelism is PLINQ. Instead of using raw tasks and using multiple nested `Task.WhenAll` statements to get the results, you could write a LINQ query to map one set of results to another and "just" put an `AsParallel()` statement near the top. Instead of having eg 100 tasks trying to get scheduled on 4 cores you could have 4 tasks each with its own partition of 25 data points to calculate – Panagiotis Kanavos Jan 22 '19 at 09:44
  • 1
    Second, calling a remote service is an *async* operation that shouldn't waste tasks. Making 100 concurrent calls to a third-party service will probably get you throttled too. That's where TransformBlock comes in, allowing you to make calls using a specific DoP. – Panagiotis Kanavos Jan 22 '19 at 09:46
  • 1
    Third, the code is overconvoluted, probably because it blocks itself and then tries to fix the problem by adding more code. The entire `runroute` method can be replaced with a single `RunRoute(RouteParameters routeParams, CancellationToken token)=>Task.Run(()=>_routeService,Route(routeParams,var out ruteResults); return results;},token);`. No need for the nested tasks or `ConfigureAwait`s – Panagiotis Kanavos Jan 22 '19 at 09:49
  • This smells like an XY problem. – Ian Kemp Jan 22 '19 at 11:21
  • @IanKemp yes and no. Once you read the code you realize it's trying to calculate lenghts between destinations - that's the X. Using raw tasks instead of a higher level library is the Y. Given that `100` is a standard result size when searching for air tickets, one could even suspect the OP is trying to create multi-hop routes from single segment trips. That would be Problem W or even Problem V - you can't just brute force the travelling salesman problem and calculating 10K routes is a lot cheaper once you find a way to eliminate invalid results – Panagiotis Kanavos Jan 23 '19 at 09:16

2 Answers2

1

The problem seems to be how to calculate routes from all trips connecting a set of locations (origins and destinations?) and calculate the length (cost?) of each route. The expensive work seems to be the call to _routeService.Route and ToTripLength.

Calculating 10K combinations from 100 locations is trivial and doesn't need parallelization. A simple LINQ query would work:

var combinations=( from start in locations
                   from finish in locations
                   where start!=finish
                   select (start,finish))
                 .ToArray();

What happens after that depends on what _routeService.Route does. If it's a local library, this is a data parallelism problem that tries to calculate 10K data points in the most efficient manner. That can be handled with PLINQ

If it's a call to an external service it's a concurrency problem that shouldn't waste CPU time waiting for 10K remote requests to respond.

Assuming _routeService.Route is a local libary, one can use PLINQ. A couple of helper methods will make writing the query easier though :

RouteParameters locationsToParams((Location start,Location finish) combination)
{
    return new RouteParameters {
        Coordinates = new[]
        {
            new Coordinate( start.Latitude, start.Longitude ),
            new Coordinate( finish.Latitude, finish.Longitude )
        }
    };
}

RouteResult  callRoute(RouteParameters routeParams)
{
    _routeService.Route(routeParams, out var routeResults);
    return routeResults;
}

var tripLengths = from cmb in combinations.AsParallel()
                  let routeParams=locationsToParams(cmb)
                  let result=callRoute(routeParams)
                  select ToTripLength(result);
var finalResults = tripLengths.ToArray();

AsParallel() will take the input IEnumerable, in this case combinations, partition it to as many partitions as there are cores and then use one worker task per partition. Each partition's data is fed to its worker task, minimizing the synchronization cost.

This could be used as a quick and rather dirty way to make 10K remote requests, as each call to Route will run on one of the worker tasks. This is wasteful because it blocks a task only to wait for a response. WithDegreeOfParallelism could be used to use more worker tasks than cores, but this still wastes CPU time waiting for a response. Blocking calls start with a SpinWait before a thread is suspended, which means a blocking call to a remote service can use a CPU core while doing nothing. This can seriously harm scalability on a server environment.

var tripLengths = from cmb in combinations.AsParallel()
                                          .WithDegreeOfParalellism(10)
                  let routeParams=locationsToParams(cmb)
                  let result=callRoute(routeParams)
                  select ToTripLength(result);
var finalResults = tripLengths.ToArray();
Panagiotis Kanavos
  • 120,703
  • 13
  • 188
  • 236
0

Since your example is not complete and can’t be compiled it’s kind of hard to see what exactly you are trying to do.

But as far as I can tell there are several problems:

  1. Calling Wait (or Result) on a Task can lead to deadlocks. Using ConfigureAwait( false ) will help avoid such issues but can’t eliminate all of them. So best always await a Task when you want to access it’s result.

  2. I don’t see what you are trying to accomplish with nesting Task.WhenAll within Task.WhenAll. WhenAll returns a single Task which you can just await without Task.WhenAll. Each Task you create will add some performance overhead, so you should try to create as few task as possible.

  3. Using Task.Run with an async delegate to await another Task(created by Task.Run) makes no sense, you are creating more Tasks than you need. You can just await a single Task.Run

I tried to create a working example (it won’t do any work) based on your code to show what you should change. Please note async Main method is only available in C# 7.1 or above.

public class Program
{
    public static async Task Main( String[] args )
    {
        var foo = new Foo();

        var sw = Stopwatch.StartNew();

        await foo.Start();

        sw.Stop();

        Console.WriteLine($"Elapsed {sw.Elapsed} {sw.ElapsedMilliseconds}ms");
        Console.ReadLine();
    }
}

public class Foo
{
    public async Task CacheInitialize( IList<Location> locations, CancellationToken token ) =>
        await SecondInitialize( locations, token )
            .ConfigureAwait( false );

    public async Task<TripLength> Calculate( Location start, Location finish, CancellationToken token )
    {
        if ( start == finish )
            return TripLength.Zero;

        var parameters = new RouteParameters
        {
            Coordinates = new[]
            {
                new Coordinate( start.Latitude, start.Longitude ),
                new Coordinate( finish.Latitude, finish.Longitude )
            }
        };

        var route = await RunRoute( parameters, token );

        return new TripLength();
    }

    public async Task SecondInitialize( IList<Location> locations, CancellationToken token )
    {
        var tasks = new List<Task>( locations.Count );

        foreach ( var outer in locations )
        foreach ( var inner in locations )
        {
            if ( inner.Equals( outer ) )
                continue;

            tasks.Add( Calculate( outer, inner, token ) );
        }

        await Task.WhenAll( tasks );
    }

    public async Task Start()
    {
        var locations = new List<Location>();
        await CacheInitialize( locations, CancellationToken.None )
            .ConfigureAwait( false );
    }

    protected async Task<RouteResult> RunRoute( RouteParameters routeParams, CancellationToken token )
    {
        return await Task
                     .Run( () =>
                           {
                               //RouteResult routeResults;
                               //var status = _routeService.Route( routeParams, out routeResults );
                               //return routeResults;
                               return new RouteResult();
                           },
                           token )
                     .ConfigureAwait( false );
    }
}

public class Coordinate
{
    public Double Latitude { get; }
    public Double Longitude { get; }
    public Coordinate( Double latitude, Double longitude )
    {
        Latitude = latitude;
        Longitude = longitude;
    }
}
public class RouteParameters
{
    public Coordinate[] Coordinates { get; set; }
}
public class TripLength
{
    public static TripLength Zero = new TripLength();
}
public class RouteResult
{
}
public class Location
{
    public Double Latitude { get; }
    public Double Longitude { get; }
}
musium
  • 2,942
  • 3
  • 34
  • 67