I have written this code. It recursively creates folders in the web system by making REST calls. It creates a folder for the root node, then gets all the child nodes and calls itself recursively, in parallel, for each child.

The only problem with the code is that if a node has too many children, or if the hierarchy is too deep, I start getting "TaskCancellation" errors.

I have already tried increasing the timeout to 10 minutes, but that does not solve the problem.

So my question is: how can I start, say, 50 tasks, then wait for one of them to finish and proceed only when there is an open slot among the 50?

Currently I think my code keeps creating tasks without any limit as it flows through the hierarchy.

public async Task CreateSPFolder(Node node, HttpClient client, string docLib, string currentPath = null)
{
    // As posted: nodeName is read before it is assigned, so this line does not compile.
    // Presumably the node's own name was meant to be escaped here.
    string nodeName = Uri.EscapeDataString(nodeName);
    var request = new { __metadata = new { type = "SP.Folder" }, ServerRelativeUrl = nodeName };
    string jsonRequest = JsonConvert.SerializeObject(request);
    StringContent strContent = new StringContent(jsonRequest);
    strContent.Headers.ContentType = MediaTypeHeaderValue.Parse("application/json;odata=verbose");
    // cmd (the request URL) is not defined in the posted snippet.
    HttpResponseMessage resp = await client.PostAsync(cmd, strContent);
    if (resp.IsSuccessStatusCode)
    {
        currentPath = (currentPath == null) ? nodeName : currentPath + "/" + nodeName;
    }
    else
    {
        string content = await resp.Content.ReadAsStringAsync();
        Console.WriteLine(content);
        throw new Exception("Failed to create folder " + content);
    }

    // Recurse into every child at once, then block until all of them finish.
    List<Task> taskList = new List<Task>();
    node.Children.ToList().ForEach(c => taskList.Add(CreateSPFolder(c, client, docLib, currentPath)));
    Task.WaitAll(taskList.ToArray());
}
Knows Not Much
  • One way is to use this approach: http://stackoverflow.com/a/25524803/1239433 – NeddySpaghetti Aug 29 '14 at 05:32
  • While you *can* do this with recursion, I recommend that you treat this problem in a different way: introduce a queue and then process that queue. An ideal queue for this kind of scenario is `ActionBlock` from the TPL Dataflow library. – Stephen Cleary Aug 29 '14 at 12:21

2 Answers

You can use a SemaphoreSlim to control the number of concurrent tasks. You initialize the semaphore to the maximum number of tasks you want to have and then each time you execute a task you acquire the semaphore and then release it when you are finished with the task.

This is a somewhat simplified version of your code that runs forever using random numbers and executes a maximum of 2 tasks at the same time.

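using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

class Program
{
    // At most 2 folder-creation operations run at the same time.
    private static readonly SemaphoreSlim semaphore = new SemaphoreSlim(2, 2);

    public static async Task CreateSPFolder(int folder)
    {
        // Acquire before the try block so a failed wait never reaches Release().
        await semaphore.WaitAsync();
        try
        {
            Console.WriteLine("Executing " + folder);
            Console.WriteLine("WaitAsync - CurrentCount " + semaphore.CurrentCount);

            await Task.Delay(2000); // stands in for the REST call
        }
        finally
        {
            Console.WriteLine("Finished Executing " + folder);
            semaphore.Release();
            Console.WriteLine("Release - CurrentCount " + semaphore.CurrentCount);
        }

        // Pick a random number of children (0 to 9) to simulate the hierarchy.
        var rand = new Random();
        var next = rand.Next(10);
        var children = Enumerable.Range(1, next).ToList();

        // Await the children instead of blocking a thread with Task.WaitAll.
        await Task.WhenAll(children.Select(CreateSPFolder));
    }

    static void Main(string[] args)
    {
        CreateSPFolder(1).Wait();

        Console.ReadKey();
    }
}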
NeddySpaghetti

First of all, I think your problem is not the number of tasks but the number of blocked threads waiting at Task.WaitAll(taskList.ToArray()). It's better to wait asynchronously in such cases, i.e. await Task.WhenAll(taskList);
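
As a minimal sketch of that first point, the recursive walk below awaits its children with Task.WhenAll rather than blocking on Task.WaitAll. The TreeNode type, the FolderWalker class and the Task.Delay standing in for the REST call are assumptions made for illustration; they are not part of the question's code. This only removes the blocked threads and does not by itself cap the number of concurrent requests.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the question's Node type.
public sealed class TreeNode
{
    public string Name { get; set; } = "";
    public List<TreeNode> Children { get; } = new List<TreeNode>();
}

public static class FolderWalker
{
    private static int created;

    // Number of folders "created" so far; only used to observe the sketch.
    public static int Created => Volatile.Read(ref created);

    public static async Task CreateFolderAsync(TreeNode node)
    {
        // Placeholder for the real REST call.
        await Task.Delay(10);
        Interlocked.Increment(ref created);

        // Start all children, then await them together. No thread is
        // blocked while the children run, unlike with Task.WaitAll.
        IEnumerable<Task> childTasks = node.Children.Select(CreateFolderAsync);
        await Task.WhenAll(childTasks);
    }
}
```

Calling `await FolderWalker.CreateFolderAsync(root)` from an async caller walks the whole tree; each level's continuation resumes only after all of its children have completed.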

Secondly, you can use TPL Dataflow's ActionBlock with a MaxDegreeOfParallelism set to 50 and just post to it for every folder to be created. That way you have a flat queue of work to be executed and when it's empty, you're done.

Pseudo code:

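// Requires: using System.Threading.Tasks.Dataflow;
// Declared before assignment so the lambda can refer to the block itself.
ActionBlock<FolderInfo> block = null;
block = new ActionBlock<FolderInfo>(
    async folderInfo =>
    {
        await CreateFolderAsync(folderInfo);
        foreach (var subFolder in GetSubFolders(folderInfo))
        {
            block.Post(subFolder);
        }
    },
    new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 50 });

block.Post(rootFolderInfo);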
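
The pseudo code above does not show how to tell when the queue has drained. One possible way, sketched here under assumptions, is to count outstanding items and complete the block when the count reaches zero. The FolderQueue class, the dictionary-based tree and the Task.Delay placeholder are hypothetical and only serve to keep the example self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class FolderQueue
{
    // Walks a tree given as parent -> children, processing at most
    // maxParallel folders at once. Returns the number of folders processed.
    public static async Task<int> RunAsync(
        string root,
        IReadOnlyDictionary<string, string[]> childrenOf,
        int maxParallel)
    {
        int pending = 1;   // items posted but not yet finished (root counted up front)
        int processed = 0;
        ActionBlock<string> block = null;

        block = new ActionBlock<string>(async folder =>
        {
            await Task.Delay(10); // placeholder for the real REST call
            Interlocked.Increment(ref processed);

            if (childrenOf.TryGetValue(folder, out var children))
            {
                foreach (var child in children)
                {
                    Interlocked.Increment(ref pending); // count before posting
                    block.Post(child);
                }
            }

            // When the last outstanding item finishes, no more work can arrive.
            if (Interlocked.Decrement(ref pending) == 0)
            {
                block.Complete();
            }
        },
        new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = maxParallel });

        block.Post(root);
        await block.Completion; // finishes once Complete() was called and the queue is empty
        return processed;
    }
}
```

Each item increments the counter for its children before decrementing for itself, so the counter can only reach zero once every folder in the tree has been handled.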
i3arnon