We have a web application that monitors several social media platforms, such as Twitter and Facebook. The data we receive from them is saved to the database. There is a lot of it, so I was thinking about using Tasks to save the data to the database.

But I have some concerns about how this works. I can't find in the docs what happens if I start 10 threads per user for 2,000 users, or maybe 10,000 users.

What exactly will Task do? Does it have some sort of queue for these threads?

Any light on this matter would be really appreciated. Thanks!

Quoter
  • See http://stackoverflow.com/questions/4130194/what-is-the-difference-between-task-and-thread – Sam Leach Oct 08 '13 at 07:38
  • You'll need 1000 cores to run 1000 threads. Anything more than the number of cores on a server might slow the system down instead of speeding it up. – SWeko Oct 08 '13 at 07:42
  • @SWeko: That's not true. I have 1650 threads running on my machine and I (sadly) do not have 1650 cores on my CPU. Keep in mind that threads can do more than computation; they can wait for IO or network, just sleep until they are needed, etc. That being said, creating 10k worker threads is very likely a stupid idea and a thread pool works much better. – Joey Oct 08 '13 at 07:43
  • I have 1284 :) At best, they waste memory (there is a 1MB minimum) :) However, having a single process with that many threads would be a red flag. – SWeko Oct 08 '13 at 07:46
  • You may be better off just using a few dedicated threads (Perhaps [`Parallel.ForEach(`](http://msdn.microsoft.com/en-us/library/system.threading.tasks.parallel.foreach.aspx)) and using the [producer consumer model](http://stackoverflow.com/questions/12323940/howto-parallel-foreach-executes-many-processes-after-each-process-run-a-new-pr/12324559#12324559) to queue up the work and let it be processed by the queue. – Scott Chamberlain Oct 08 '13 at 07:56
  • I had a similar problem and started using a thread pool. See this question/answer: http://stackoverflow.com/questions/18954514/usage-multithreading-could-lead-to-excessive-memory-use – Willem de Jong Oct 08 '13 at 07:40
  • @SWeko 1MB is the *default* Windows stack size, but I believe .NET defaults with something much smaller. It is still, of course, a waste. – Cory Nelson Oct 14 '13 at 21:09

3 Answers

You should be using asynchronous I/O for this, via the Asynchronous Programming Model (APM) or, in current .NET, `async`/`await`.

By using asynchronous I/O you can avoid creating hundreds of threads.

Behind the scenes it can use I/O Completion Ports to provide callbacks when data has arrived or been sent, which means a thread doesn't have to be wasted waiting.

See here for some details:

http://msdn.microsoft.com/en-us/library/vstudio/hh191443.aspx

http://msdn.microsoft.com/en-us/library/vstudio/hh300224.aspx
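
As a rough illustration of the point above, here is a minimal sketch using `async`/`await`. The names `AsyncSaveDemo`, `SaveAsync` and `SaveAllAsync` are hypothetical, and `Task.Delay` is only a stand-in for a real asynchronous database call, so treat it as a sketch under those assumptions rather than a finished implementation. It starts 10 saves for each of 2,000 users without creating a thread per save.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class AsyncSaveDemo
{
    private static int _saved;

    // Hypothetical stand-in for an asynchronous database write. Real code
    // would await its data-access library's async call instead of Task.Delay.
    private static async Task SaveAsync(int userId, int itemId)
    {
        await Task.Delay(100);              // no thread is blocked during the wait
        Interlocked.Increment(ref _saved);  // continuation runs after the wait completes
    }

    // Starts users * itemsPerUser saves and returns how many completed.
    public static async Task<int> SaveAllAsync(int users, int itemsPerUser)
    {
        _saved = 0;
        var saves =
            from user in Enumerable.Range(0, users)
            from item in Enumerable.Range(0, itemsPerUser)
            select SaveAsync(user, item);

        await Task.WhenAll(saves);          // enumerates the query once, starting every save
        return Volatile.Read(ref _saved);
    }

    public static async Task Main()
    {
        int saved = await SaveAllAsync(2000, 10);
        Console.WriteLine($"Completed {saved} saves.");
    }
}
```

With a real database you would probably also want to cap how many saves are in flight at once, since the connection pool is limited even when threads are not.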

Matthew Watson

A Task is just a wrapper around a delegate that can be scheduled to a TaskScheduler's queue. A Task does not create any threads itself. When you call task.Start(), it internally calls the scheduler's QueueTask(Task) method.

By default tasks are scheduled on TaskScheduler.Default, which internally uses the ThreadPool. You can specify a custom task scheduler when creating a TaskFactory: the class has a constructor that accepts a TaskScheduler parameter. The current scheduler is always available through the TaskScheduler.Current property (it is the default scheduler unless the code is already running inside a task on another scheduler).
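
To make the queueing visible, here is a small sketch with a custom scheduler passed to a TaskFactory. `CountingScheduler` and `SchedulerDemo` are hypothetical names invented for this illustration. The scheduler only counts calls to QueueTask and then hands each task to the ThreadPool, which is an assumption about what a simple scheduler might do, not a description of how TaskScheduler.Default is implemented.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical scheduler for illustration: counts queued tasks,
// then runs each one on a ThreadPool thread.
public sealed class CountingScheduler : TaskScheduler
{
    private int _queued;
    public int QueuedCount => Volatile.Read(ref _queued);

    protected override void QueueTask(Task task)
    {
        Interlocked.Increment(ref _queued);
        ThreadPool.QueueUserWorkItem(_ => TryExecuteTask(task));
    }

    // Never run tasks inline on the waiting thread in this sketch.
    protected override bool TryExecuteTaskInline(Task task, bool taskWasPreviouslyQueued) => false;

    // Only used by debuggers; this sketch does not track pending tasks.
    protected override IEnumerable<Task> GetScheduledTasks() => Array.Empty<Task>();
}

public static class SchedulerDemo
{
    // Starts taskCount tasks through the custom scheduler and
    // returns how many times QueueTask was called.
    public static int Run(int taskCount)
    {
        var scheduler = new CountingScheduler();
        var factory = new TaskFactory(scheduler);

        var tasks = new Task[taskCount];
        for (int i = 0; i < taskCount; i++)
        {
            tasks[i] = factory.StartNew(() => { /* work would go here */ });
        }

        Task.WaitAll(tasks);
        return scheduler.QueuedCount;
    }

    public static void Main()
    {
        Console.WriteLine($"QueueTask calls: {Run(20000)}");
    }
}
```

Each StartNew call ends up as one QueueTask call; how many threads actually execute the work is up to the scheduler, not the number of tasks.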

Alexander Simonov

A Task uses a Thread to do the work.

Tasks are queued to the ThreadPool.

To answer your question, you'd need to try it out and see whether the application copes.

You could probably share threads so it would not be 10*2000 threads.
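
One way to try it out is to start 10 × 2000 tasks and record which threads actually run them. The sketch below does that; `SharedThreadsDemo` and `CountDistinctThreads` are hypothetical names, and the work inside each task is trivial, so real database writes may well behave differently.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

public static class SharedThreadsDemo
{
    // Runs taskCount trivial tasks and returns the number of
    // distinct threads that executed at least one of them.
    public static int CountDistinctThreads(int taskCount)
    {
        var threadIds = new ConcurrentDictionary<int, byte>();

        var tasks = Enumerable.Range(0, taskCount)
            .Select(_ => Task.Run(() =>
            {
                // Record the managed ID of whichever thread runs this task.
                threadIds.TryAdd(Environment.CurrentManagedThreadId, 0);
            }))
            .ToArray();

        Task.WaitAll(tasks);
        return threadIds.Count;
    }

    public static void Main()
    {
        int threads = CountDistinctThreads(10 * 2000);
        Console.WriteLine($"20000 tasks ran on {threads} distinct threads.");
    }
}
```

The exact number depends on the machine and the thread pool's settings, but it should be far below the number of tasks.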

Sam Leach