I am curious because I could not find an answer by Googling: is it possible to further optimize code that uses MPI by adding vectorisation, such as SSE or a later version of SSE?

What I am interested in is squeezing more performance out of the CPU, if possible. This is an idea for my thesis, but I still have to discuss it with my mentor and see what we can come up with.

If so, could you please point me to a reference where I can start reading? :D

Lum Zhaveli
  • 3
    Well yes, of course... There are many levels at which you can exploit parallelism. One level is across computers/nodes/clusters; for that you use MPI. Another level is the cores on the same computer; for that you can use OpenMP/pthreads/... . At the bottom level you optimize single-threaded code to milk the most performance out of a single core; for that you use vector instructions such as those provided by the SSE and AVX instruction set extensions. – Iwillnotexist Idonotexist Jan 28 '16 at 03:00
  • I guess I have my answer. I can write single-threaded code using SSE and just use MPI to distribute it across computers/nodes/clusters. Also, I am not interested in using CUDA or OpenCL for now. Pthreads and fibers sound great to me, especially fibers, since their context-switching cost is small and developers are doing very interesting things with them when building game engines. Thanks for the quick answer. – Lum Zhaveli Jan 28 '16 at 03:04
  • 2
    You can freely mix vectorisation (SIMD) with multiprocessing and message passing as those are all orthogonal technologies. Performance gains are not orthogonal though and very much algorithm- and hardware-specific. In any case, your question is too broad for Stack Overflow. – Hristo Iliev Jan 28 '16 at 06:22
  • 1
    For a broad question [here is a broad answer](http://stackoverflow.com/questions/20933746/parallel-programming-using-haswell-architecture/20948208#20948208). – Z boson Jan 28 '16 at 10:35
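
To make the layering described in the comments concrete, here is a minimal sketch in C of the bottom level only: a per-process kernel vectorised with SSE intrinsics. The function name `local_dot` and the dot-product workload are hypothetical and chosen purely for illustration; they do not come from the question. The `#ifdef __SSE__` guard is an assumption that lets the code fall back to a plain scalar loop on targets without SSE.

```c
#include <assert.h>
#include <stddef.h>
#ifdef __SSE__
#include <xmmintrin.h>   /* SSE intrinsics: __m128, _mm_mul_ps, ... */
#endif

/* Hypothetical per-rank kernel: dot product of this process's local chunk.
 * Nothing in here knows about MPI; it only exploits SIMD within one core. */
float local_dot(const float *a, const float *b, size_t n)
{
    assert(a != NULL && b != NULL);

    size_t i = 0;
    float sum = 0.0f;

#ifdef __SSE__
    /* Process four floats per iteration using 128-bit registers. */
    __m128 acc = _mm_setzero_ps();
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);   /* unaligned loads are allowed */
        __m128 vb = _mm_loadu_ps(b + i);
        acc = _mm_add_ps(acc, _mm_mul_ps(va, vb));
    }
    /* Horizontal sum of the four partial sums. */
    float lanes[4];
    _mm_storeu_ps(lanes, acc);
    sum = lanes[0] + lanes[1] + lanes[2] + lanes[3];
#endif

    /* Scalar tail (or the whole loop when SSE is unavailable). */
    for (; i < n; ++i)
        sum += a[i] * b[i];

    return sum;
}

/* In an MPI program, each rank would call local_dot() on its own chunk and
 * then combine the partial results across ranks, for example with:
 *
 *     float local = local_dot(a_chunk, b_chunk, chunk_len);
 *     float global;
 *     MPI_Allreduce(&local, &global, 1, MPI_FLOAT, MPI_SUM, MPI_COMM_WORLD);
 *
 * a_chunk, b_chunk and chunk_len are placeholder names for the rank's data. */
```

The point of the sketch is that the two layers do not interfere: the SIMD code lives entirely inside the per-process kernel, and MPI only moves data and combines results between processes. Whether the SSE version is actually faster depends on the algorithm and the hardware, as noted in the comments above, and a compiler may already auto-vectorise the scalar loop at higher optimisation levels.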

0 Answers