I want to write my own parallel code, or at least test whether manually parallelizing some of my code is faster than letting Eigen use its own internal parallel routines.
I have been following this guide and added the following directive at the top of a header file (I also tried it at the top of main):
#define EIGEN_DONT_PARALLELIZE
Yet when I ask Eigen to print the number of threads it is using, via Eigen::nbThreads, I consistently get two. I have tried to force the issue with the initParallel() method, which is designed for user-defined parallel regions, but to no avail. Could it be that I need to place my preprocessor token somewhere else? I am using gcc 8.1 and CLion with CMake. I have also tried to force the issue with setNbThreads(0).
To eventually use OpenMP in my own code, I have set up OpenMP as recommended here and added this to my CMakeLists.txt: target_link_libraries(OpenMP::OpenMP_CXX).
Or could it be that Eigen just tells me how many cores are available in principle? That does not sound like what the documentation says.
Edit
I am not sure if this is important, but CLion (my editor) complains that the macro EIGEN_DONT_PARALLELIZE is never used. I looked in Eigen/Core and saw that it is used only as a condition in an if statement, so I ignored this editor warning, but maybe I should not have?
I have now reproduced this behaviour with a much smaller example.