TPL vs Multithreading

Question

I am new to threading and I need a clarification for the below scenario.

I am working with the Apple Push Notification service. My application needs to send notifications to 30k users whenever a new deal is added to the website.

Can I split the 30k users into lists of 1000 users each and start multiple threads, or should I use tasks?

Is the following way efficient?

if (lstDevice.Count > 0)
{

    for (int i = 0; i < lstDevice.Count; i += 2)
    {
        splitList.Add(lstDevice.Skip(i).Take(2).ToList<DeviceHelper>());
    }

    var tasks = new Task[splitList.Count];
    int count=0;
    foreach (List<DeviceHelper> lst in splitList)
    {
        tasks[count] = Task.Factory.StartNew(() =>
        {
            QueueNotifications(lst, pMessage, pSubject, pNotificationType, push);
        },
            TaskCreationOptions.None);
        count++;
    }
}

The QueueNotifications method just loops through each list item and creates a payload like this:

foreach (DeviceHelper device in splitList)
{
    if (device.PlatformType.ToLower() == "ios")
    {
        push.QueueNotification(new AppleNotification()
                                    .ForDeviceToken(device.DeviceToken)
                                    .WithAlert(pMessage)
                                    .WithBadge(device.Badge)
                                     );
        Console.Write("Waiting for Queue to Finish...");
    }
}
push.StopAllServices(true);

I don't think that you really need lots of tasks. I think you just need a few and should do your IO async. Aside from this, bundling and using those is a good idea, yes, but the question is not easy to answer: you have to do the performance tests yourself, because we don't know your system — Random Dev, Nov 03 '14 at 08:39
You could also take a look at `Parallel.ForEach` which basically does all the low level stuff that you do, i.e. splitting the input into several partitions and executing them in different tasks. But whether this would be a good solution or not depends on the work done by the `QueueNotifications` method. — Dirk, Nov 03 '14 at 08:47
Splitting into lists tends to be poor design. What happens if one list winds up with a lot of notifications that take longer and one winds up with a lot of notifications that don't take very long? — David Schwartz, Nov 03 '14 at 08:50
My advice is to avoid using threads. Use a higher level abstraction like TPL or P-LINQ. These abstractions are implemented on top of threads by very clever people. Is the CPU even the bottleneck? To create truly scalable push notifications you need to be able to scale across computers. I suggest that you take a look at [Azure Mobile Services](http://azure.microsoft.com/en-us/documentation/services/mobile-services/). — Martin Liversage, Nov 03 '14 at 09:04
Looks like you should be looking at [`TPL Dataflow`](http://msdn.microsoft.com/en-us/library/hh228603(v=vs.110).aspx) — Yuval Itzchakov, Nov 03 '14 at 09:13
PLINQ (ie `Parallel.For`) already does what you ask, partitioning the list then using a limited set of tasks to process each item — Panagiotis Kanavos, Nov 04 '14 at 09:26

score 6 · Answer 1 · answered Nov 03 '14 at 09:37

Technically, it is certainly possible to split a list and start threads that process the chunks in parallel. You can also implement everything yourself, as you already have, but that isn't a good approach. First, splitting a list into chunks that are processed in parallel is exactly what Parallel.For and Parallel.ForEach already do. There is no need to re-implement it yourself.

Now, you keep asking whether something can run 300 or 500 notifications in parallel. But that is not a good question, because it completely misses the point of running something in parallel.

So let me explain why that question is not a good one. First, ask yourself why you want to run something in parallel. The answer is that you want it to run faster by using multiple CPU cores.

Your idea is probably that spawning 300 or 500 threads is faster because more threads means more things running "in parallel". But that is not exactly the case.

First, creating a thread is not "free". Every thread you create has some overhead: it takes CPU time to create and it needs some memory. On top of that, creating 300 threads doesn't mean 300 threads run in parallel. If you have, for example, an 8-core CPU, only 8 threads can really run at the same time. Creating more threads can even hurt performance, because your program now has to switch constantly between threads, and that also costs CPU time.

The result of all that: if the work is lightweight, some small piece of code that doesn't do much computation, creating a lot of threads will slow your application down instead of speeding it up, because managing the threads creates more overhead than you gain over simply running on (for example) 8 CPU cores.

That means that if you have a list of 30,000 items, it usually turns out to be faster to split the list into 8 chunks and work through them on 8 threads than to create 300 threads.
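
To make the "one chunk per core" idea concrete, here is a minimal sketch, not the asker's actual code. It uses `Partitioner.Create` to cut an index range into roughly one chunk per core and lets `Parallel.ForEach` process the chunks. The item array and the per-item work (just incrementing a counter) are placeholders made up for illustration, and `ChunkingSketch` is a hypothetical name.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ChunkingSketch
{
    // Processes `itemCount` placeholder items (itemCount must be > 0)
    // in at most one chunk per core.
    // Returns how many items were processed and how many chunks were used.
    public static (int Processed, int Chunks) Run(int itemCount)
    {
        int[] items = Enumerable.Range(0, itemCount).ToArray(); // stand-in for the device list
        int cores = Environment.ProcessorCount;
        int rangeSize = Math.Max(1, (itemCount + cores - 1) / cores); // ceil(itemCount / cores)

        int processed = 0;
        int chunks = 0;

        // Each tuple is a [fromInclusive, toExclusive) index range.
        Parallel.ForEach(Partitioner.Create(0, itemCount, rangeSize), range =>
        {
            Interlocked.Increment(ref chunks);
            for (int i = range.Item1; i < range.Item2; i++)
            {
                _ = items[i]; // placeholder for the real per-item work
                Interlocked.Increment(ref processed);
            }
        });

        return (processed, chunks);
    }

    public static void Main()
    {
        var (processed, chunks) = Run(30000);
        Console.WriteLine($"{processed} items in {chunks} chunks on {Environment.ProcessorCount} cores");
    }
}
```

Whether one chunk per core is actually the best split depends on the workload, so treat the range size as something to measure rather than a rule.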

Your goal should never be "Can it run xxx things in parallel?" The question should be: how many threads do I need, and how many items should each thread process, to get the work done as fast as possible?

That is an important difference, because just spawning more threads doesn't make something faster.

So how many threads do you need, and how many items should each thread process? You can write a lot of code to test that, but the answer changes from hardware to hardware. A PC with just 4 cores has a different optimum than a system with 8 cores. If what you are doing is I/O bound (for example, reading from or writing to disk or network), you also don't get more speed simply by adding threads.

So what you could do now is test everything and do a lot of benchmarking to find the best thread count.

But that is actually the whole purpose of the TPL with its Task<T> class. It already looks at how many CPU cores your computer has, and when you run your tasks it automatically tries to create as many threads as are needed to get the maximum out of your system.

So my suggestion is to use the TPL with the Task<T> class. In my opinion you should never create threads directly or do the partitioning yourself, because all of that is already done in the TPL.
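
As a rough sketch of what "let the TPL do the partitioning" might look like: `Device` and the `ConcurrentQueue<string>` below are hypothetical stand-ins for the asker's `DeviceHelper` and push broker (I don't know the real push library's API, or whether its queue method is thread-safe), and the `MaxDegreeOfParallelism` cap is optional.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-in for the asker's DeviceHelper type.
public sealed class Device
{
    public string PlatformType { get; set; }
    public string DeviceToken { get; set; }
}

public static class TplSketch
{
    // Queues one payload per iOS device and returns the queued payloads.
    // ConcurrentQueue is thread-safe, so no explicit locking is needed here.
    public static ConcurrentQueue<string> QueueAll(IEnumerable<Device> devices, string message)
    {
        var queue = new ConcurrentQueue<string>();
        var options = new ParallelOptions { MaxDegreeOfParallelism = Environment.ProcessorCount };

        // Parallel.ForEach partitions the input and schedules the work itself.
        Parallel.ForEach(devices, options, device =>
        {
            if (string.Equals(device.PlatformType, "ios", StringComparison.OrdinalIgnoreCase))
            {
                queue.Enqueue(device.DeviceToken + ":" + message);
            }
        });
        return queue;
    }

    public static void Main()
    {
        // 30,000 made-up devices, every second one marked as iOS.
        var devices = Enumerable.Range(0, 30000)
            .Select(i => new Device { PlatformType = i % 2 == 0 ? "iOS" : "android", DeviceToken = "token" + i })
            .ToList();
        Console.WriteLine(QueueAll(devices, "New deal!").Count);
    }
}
```

Compared with the manual version, there is no chunk list and no task array to manage. The loop returns only after every device has been handled.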

It's quite useful theory. Thanks!! Can I get a sample piece of code? How is my code different from what you describe? — dotnet developer, Nov 03 '14 at 10:00
_"The `Task` class already looks at your computer how many cpu-cores it have"_ Not really. Tasks are executed using a scheduler and in this case the scheduler is using the .NET thread pool which uses the CPU count to limit the number of threads. However, using TPL alone does not by some magic process parallelize your workload to maximize performance. You will have to split your work into smaller tasks and creating more tasks than the thread pool can handle will queue them up and potentially waste resources. — Martin Liversage, Nov 03 '14 at 10:24
Yes, the `ThreadPool` is used. But no, even the default ThreadPool Scheduler does a lot more than just setting the Thread amount to the number of the CPU count. Even the default ThreadPool creates more threads if it is needed. — David Raab, Nov 03 '14 at 12:08

score 0 · Answer 2 · edited May 23 '17 at 11:57

I think the Task class is a good choice for your aim, because it makes the asynchronous processing easy to handle and you don't have to deal with threads directly.

Maybe this helps: Task vs Thread differences

But to get a better answer, you should improve your question and give us more details.

You should be careful about creating too many parallel threads, because this can slow down your application. Read this nice article from SO: How many threads is too many?. The best thing is to make the number configurable and then test some values.

answered Nov 03 '14 at 08:42

BendEg

Thanks for the prompt response. So you mean the code which I have written will meet the requirement? Is TPL able to handle and run ~300 tasks in parallel (considering in my case 30k users split into 300 lists)? Please clarify. – dotnet developer Nov 03 '14 at 08:46
Maybe it is important how **QueueNotifications** works, because maybe you can improve things inside (threading). – BendEg Nov 03 '14 at 08:51

score 0 · Answer 3 · answered Nov 03 '14 at 09:01

I agree that Task is a good choice. However, creating too many tasks also brings risks to your system, and how you want to handle failures is another factor in choosing a solution. Personally, I prefer MSQueue combined with a thread pool.

Phong Vo

Martin Liversage · Answer 4 · 2014-11-03T11:14:14.683

If you want to parallelize the creation of the push notifications and maximize performance by using all CPUs on the computer, you should use Parallel.ForEach:

Parallel.ForEach(
  devices,
  device => {
    if (device.PlatformType.ToUpperInvariant() == "IOS") {
      push.QueueNotification(
        new AppleNotification()
          .ForDeviceToken(device.DeviceToken)
          .WithAlert(message)
          .WithBadge(device.Badge)
      );
    }
  }
);
push.StopAllServices(true);

This assumes that calling push.QueueNotification is thread-safe. Also, if this call locks a shared resource you may see lower than expected performance because of lock contention.

To avoid this lock contention you may be able to create a separate queue for each partition that Parallel.ForEach creates. I am improvising a bit here because some details are missing from the question. I assume that the variable push is an instance of the type Push:

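Parallel.ForEach(
  devices,
  // localInit: create one Push instance per partition
  () => new Push(),
  // body: queue on the partition-local instance
  // (renamed so it cannot clash with an outer push variable)
  (device, _, localPush) => {
    if (device.PlatformType.ToUpperInvariant() == "IOS") {
      localPush.QueueNotification(
        new AppleNotification()
          .ForDeviceToken(device.DeviceToken)
          .WithAlert(message)
          .WithBadge(device.Badge)
      );
    }
    return localPush;
  },
  // localFinally: must be a delegate, so the call is wrapped in a lambda
  localPush => localPush.StopAllServices(true)
);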

This will create a separate Push instance for each partition that Parallel.ForEach creates and when the partition is complete it will call StopAllServices on the instance.

This approach should perform no worse than splitting the devices into N lists, where N is the number of CPUs, and starting either N threads or N tasks to process each list. With manual splitting, if one thread or task "gets behind", the total execution time will be the execution time of this "slow" thread or task. With Parallel.ForEach all CPUs are used until all devices have been processed.

Hi Martin, but every list should run push.stopallservices() at the end of the loop. In the parallel.foreach, where do I accommodate this piece of code? Can you provide a link to a good tutorial on parallel.foreach? — dotnet developer, Nov 03 '14 at 09:54
@dotnetdeveloper: If I understand you correctly you should simply call `push.StopAllServices(true)` when `Parallel.ForEach` completes. This means you will call this method once after all notifications have been queued. — Martin Liversage, Nov 03 '14 at 10:02
As per your edited code, Push.stopallservices() runs only once at the end of the loop, right? That's not what I want; it should be run for each chunk. — dotnet developer, Nov 03 '14 at 10:04
@dotnetdeveloper: But if you want to "chunk" the devices yourself (perhaps you want a `push` instance for each chunk?) then what is your question? You have already provided a solution using TPL and you will not gain anything by switching to threads. The purpose of my answer is to point out that you in general will get better performance by using `Parallel.ForEach` instead of "chunking" the input yourself. — Martin Liversage, Nov 03 '14 at 10:29
I just want to clarify whether my approach is correct or whether I should be using asynchronous threading or any other efficient method suggested by stackoverflow members. If you claim TPL is better than multithreading, then I have no issues continuing with my own code. Thanks!! — dotnet developer, Nov 03 '14 at 10:32
