1

Suppose you need to build a spreadsheet-like engine that has to be extremely fast, where each cell's dependencies could sit on a parallel calculation branch. Could a thread be created for each parallel branch? Aren't threads costly in terms of memory? With 1,000 formula rows, or even 1 million, you would seemingly have to create the same number of threads. Is that realistic?

If it isn't realistic, is there an alternative to threads for this kind of scenario?

user310291

6 Answers

3

For CPU-intensive tasks, the optimal number of threads is usually the same as the number of CPUs. If you are not careful, the overhead of creating a thread can be much higher than the cost of the work that thread does.

It's worth noting that the CPU is often not the main issue. Often memory bandwidth or cache utilisation is more of an issue, in which case one efficiently written thread can outperform an attempt to distribute the work across many threads. If the work each thread does is CPU-intensive and uses relatively little memory bandwidth, having multiple threads can help.
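As a rough illustration of sizing the thread count to the CPU count, here is a minimal sketch that adds two columns by splitting them into one contiguous chunk per core. The class name `ChunkedColumnAdd` and its `add` method are hypothetical, invented for this example. Because an element-wise add is cheap compared with its memory traffic, this may well not beat a plain single-threaded loop; measure before relying on it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example class: adds two columns using one chunk per CPU core.
class ChunkedColumnAdd {

    static double[] add(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("columns must have the same length");
        }
        int n = a.length;
        double[] result = new double[n];

        // One worker per available core, not one thread per cell.
        int workers = Math.max(1, Runtime.getRuntime().availableProcessors());
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            int chunk = (n + workers - 1) / workers; // ceiling division
            List<Callable<Void>> tasks = new ArrayList<>();
            for (int start = 0; start < n; start += chunk) {
                final int from = start;
                final int to = Math.min(n, start + chunk);
                // Each task walks a contiguous range, which is friendly to the cache.
                tasks.add(() -> {
                    for (int i = from; i < to; i++) {
                        result[i] = a[i] + b[i];
                    }
                    return null;
                });
            }
            // invokeAll blocks until every chunk is done; get() surfaces failures.
            for (Future<Void> f : pool.invokeAll(tasks)) {
                f.get();
            }
            return result;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while adding columns", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a chunk failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        double[] sums = add(new double[] {1, 2, 3}, new double[] {10, 20, 30});
        System.out.println(java.util.Arrays.toString(sums));
    }
}
```

In a real engine the pool would be created once and reused rather than built per call; it is created inside `add` here only to keep the sketch self-contained.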

Peter Lawrey
  • Thanks for the remark about caching. Still, I can't see how it could play a role if, for example, I have to add cells Ai + Bi with very different numbers 1 million times? – user310291 Apr 29 '12 at 20:00
  • I don't know if you are trying to add the same values one million times (in which case I don't know why you would do this), one million different values, or one million different cells. If you are trying to add one million different cells together, the operation (the add) is fairly cheap compared with the memory operations (two loads and a store). This means you have to determine how to access memory as efficiently as possible, and improving the throughput of the add (which is what using multiple CPUs will gain you) is less important. – Peter Lawrey Apr 30 '12 at 07:05
3

In modern Java programming, you should avoid creating threads directly and use executors instead. The rest of the world calls them work queues. See Item 68 in Effective Java by Joshua Bloch.
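To make the executor idea concrete for the spreadsheet case, here is a minimal sketch in which each cell is a task and a dependent cell runs once its inputs are ready. It assumes Java 8 or later for `CompletableFuture`, which is newer than this answer; the class `CellGraph`, its `evaluate` method, and the three formulas are hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical example: three cells, where C depends on A and B.
class CellGraph {

    // Evaluates A = x * x, B = y + 1 and C = A + B on a shared pool.
    static double evaluate(double x, double y) {
        // The pool size is bounded by the core count, not by the number of cells.
        ExecutorService pool =
                Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
        try {
            // A and B have no dependencies, so they can run on parallel branches.
            CompletableFuture<Double> cellA = CompletableFuture.supplyAsync(() -> x * x, pool);
            CompletableFuture<Double> cellB = CompletableFuture.supplyAsync(() -> y + 1.0, pool);

            // C is queued as a task only after both A and B have completed.
            CompletableFuture<Double> cellC = cellA.thenCombineAsync(cellB, Double::sum, pool);

            return cellC.join();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(evaluate(3.0, 4.0)); // 3*3 + (4+1) = 14.0
    }
}
```

The same shape scales to many cells: each formula becomes a queued task, and the number of threads stays fixed regardless of how many formulas there are.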

Personally, I strongly prefer the APIs of Grand Central Dispatch. The Java version is called HawtDispatch. That API is simpler, and just works.

nes1983
2

Your best bet is the Task Parallel Library (in .NET) or the Fork/Join framework in Java. They do use threads, but they optimize the number of threads and put work items on a work queue for you. They take care of a lot of low-level optimization problems in really clever ways. You just use constructs like Parallel.For, etc.
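On the Java side, a Fork/Join computation might look like the following minimal sketch, which sums a column by recursively splitting it until the pieces are small enough to process directly. The class `ColumnSum` and the `THRESHOLD` value are assumptions chosen for illustration, and `ForkJoinPool.commonPool()` requires Java 8 (on Java 7 you would create a `ForkJoinPool` yourself).

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical example: sums a column of cells with the Fork/Join framework.
class ColumnSum extends RecursiveTask<Double> {

    // Assumed cut-off below which splitting costs more than it saves; tune by measuring.
    private static final int THRESHOLD = 10_000;

    private final double[] cells;
    private final int from; // inclusive
    private final int to;   // exclusive

    ColumnSum(double[] cells, int from, int to) {
        this.cells = cells;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Double compute() {
        if (to - from <= THRESHOLD) {
            // Small enough: sum sequentially.
            double sum = 0.0;
            for (int i = from; i < to; i++) {
                sum += cells[i];
            }
            return sum;
        }
        int mid = (from + to) >>> 1;
        ColumnSum left = new ColumnSum(cells, from, mid);
        ColumnSum right = new ColumnSum(cells, mid, to);
        left.fork();                        // schedule the left half for another worker
        double rightSum = right.compute();  // compute the right half in this worker
        return left.join() + rightSum;      // wait for the left half and combine
    }

    static double sum(double[] cells) {
        return ForkJoinPool.commonPool().invoke(new ColumnSum(cells, 0, cells.length));
    }

    public static void main(String[] args) {
        double[] column = new double[1_000_000];
        java.util.Arrays.fill(column, 1.0);
        System.out.println(sum(column));
    }
}
```

The framework's worker threads steal queued halves from one another, so the number of threads stays close to the number of cores however many subtasks are created.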

Stilgar
1

The Task Parallel Library can help you utilize the CPU as much as possible, and does most of the heavy lifting of thread creation for you.

If you have a very large number of (very) parallelizable computations, and you need the absolute best performance you can get, you will have to look beyond the CPU. There are alternatives that combine LINQ/TPL with the GPU, such as Microsoft Accelerator and Brahma. See, for example, Utilizing the GPU with C#.

Anders Forsgren
1

The only things besides threads that come to mind are SIMD instructions (unless you want to use special hardware, which means you'd have to use a lower-level language). You'd have to use an external library for these, though, to gain access to the processor's or graphics card's functions. CUDA or OpenCL might also interest you. On the other hand, you normally don't want to create as many threads as you described. You could use a thread pool with a fixed or dynamic number of threads, which manages how many threads are created and executes tasks from a queue. There is also a Fork/Join feature in Java 7 that helps with thread management.

I'd say have a look at thread pools; with these you can balance out the overhead created by too many threads.

Since you are looking for information, this might help out a bit with threads too.

Amsel
1

Please also take a look at Ateji PX. It's an extension to the Java language for parallelization that may help you. It was a commercial product, but it has since become available for free.

dajood