
I have some long-running operations that number in the hundreds. At the moment they are each on their own thread. My main goal in using threads is not to speed these operations up. The more important thing in this case is that they appear to run simultaneously.

I'm aware of cooperative multitasking and fibers. However, I'm trying to avoid anything that would require touching the code inside the operations, e.g. peppering them with calls like yieldToScheduler(). Nor do I want to require that these routines be restructured to emit queues of bite-sized task items...I want to treat them as black boxes.

For the moment I can live with these downsides:

  • The maximum number of threads tends to be O(1000)
  • The cost per thread is O(1MB), mostly stack space

To address the bad cache performance due to context-switches, I did have the idea of a timer which would juggle the priorities such that only idealThreadCount() threads were ever at Normal priority, with all the rest set to Idle. This would let me widen the timeslices, which would mean fewer context switches and still be okay for my purposes.

Question #1: Is that a good idea at all? One certain downside is that it won't work on Linux (the docs say QThread::setPriority() has no effect there).

Question #2: Any other ideas or approaches? Does QtConcurrent have this scenario in mind?

(Some related reading: how-many-threads-does-it-take-to-make-them-a-bad-choice, many-threads-or-as-few-threads-as-possible, maximum-number-of-threads-per-process-in-linux)


2 Answers

  1. IMHO, this is a very bad idea. If I were you, I would try really, really hard to find another way to do this. You're combining two really bad ideas: creating a truckload of threads, and messing with thread priorities.

  2. You mention that these operations only need to appear to run simultaneously. So why not try to find a way to make them appear to run simultaneously, without literally running them simultaneously?

– Terry Mahaffey
  • I did mention fibers and cooperative multitasking. So I know alternatives exist, but the ones I know of require meddling with the long-running operations' code itself. I'm interested in finding something that is more pre-emptive... and for my situation, I'm able to be a bit experimental. Kind of like the "one-process-per-tab" philosophy of Google Chrome, which I'm sure some people would be dubious of if they'd heard it proposed (many probably still are...but I'm sold) – HostileFork says dont trust SE Dec 29 '09 at 05:42
  • Google Chrome one-process-per-tab was about tab isolation, so a crash in one tab doesn't affect others. The scale is also different. It has nothing at all to do with this project or your proposed idea. Your logic is deeply flawed here. I stand by my advice. But it's your project, so do what you'd like. Good luck. – Terry Mahaffey Dec 29 '09 at 07:38
  • I don't think you've done due diligence in pointing out any "deeply flawed" argument in what I suggest above. You just said "lots of threads are bad, setting thread priorities is bad" and that's a knee-jerk reaction. I was hoping for some more nuanced discourse on this question. – HostileFork says dont trust SE Dec 29 '09 at 20:17
  • Clarified: I'm not going to downvote you or anything and I appreciate your input. But I don't feel you added anything that wasn't already covered in the SO threads I linked, or in my description of the question, hence I don't consider it a satisfactory answer. – HostileFork says dont trust SE Dec 29 '09 at 20:26

It's been 6 months, so I'm going to close this.

Firstly I'll say that threads serve more than one purpose. One is speedup...and a lot of people are focusing on that in the era of multi-core machines. But another is concurrency, which can be desirable even if it slows the system down when taken as a whole. Yet concurrency can be achieved using mechanisms more lightweight than threads, although it may complicate the code.

So this is just one of those situations where the tradeoff of programmer convenience against user experience must be tuned to fit the target environment. It's like how Google's process-per-tab approach in Chrome would have been ill-advised in the era of Mosaic (even if process isolation was preferable, all else being equal). If today's OS, memory, and CPU couldn't deliver a good browsing experience that way... they wouldn't be doing it that way now.

Similarly, creating a lot of threads when there are independent operations you want to be concurrent saves you the trouble of sticking in your own scheduler and yield() operations. It may be the cleanest way to express the code, but if it chokes the target environment then something different needs to be done.

So I think I'll settle on the idea that in the future, when our hardware is better than it is today, we probably won't have to worry about how many threads we make. But for now I'll take it on a case-by-case basis. i.e. If I have 100 instances of concurrent task class A, 10 instances of concurrent task class B, and 3 instances of concurrent task class C... then switching A to a fiber-based solution and giving it a pool of a few threads is probably worth the extra complication.

  • Hi Brian! It sounds like the question really is, when do you use the OS's threading facilities, and when do you "roll your own" multi-threading mechanism? And I suppose the answer depends on how well-suited the OS's threads are to the task at hand, vs how clever you are at making something better-suited. You can always "roll your own" pre-emptive multitasking using setjmp() and longjmp(), but it's tricky to get right and may not be worth the effort... – Jeremy Friesner Jun 07 '10 at 01:07