I have the following Fortran code that calls a C++ function, parallel_klu. Every time parallel_klu is called it creates eight threads, each executing another function called factor, and the threads are destroyed when it returns to Fortran.

program linkFwithCPP        
   use iso_c_binding
   implicit none 
   interface
   subroutine my_routine (Ax, Ai, Ap, b, n, nz) bind (C, name = "parallel_klu")
   import :: c_double
   import :: c_int
   ! The arrays hold data on entry, so they are declared intent(inout) rather than intent(out)
   integer (c_int), intent (inout), dimension (*) :: Ap
   integer (c_int), intent (inout), dimension (*) :: Ai
   real (c_double), intent (inout), dimension (*) :: Ax
   real (c_double), intent (inout), dimension (*) :: b
   integer(c_int), value :: n
   integer(c_int), value :: nz
   end subroutine
   end interface


   ! Use the C-interoperable kind, since n and nz are passed by value as c_int
   integer(c_int), parameter :: n = 10
   integer(c_int), parameter :: nz = 34

   ! declare the 4 arrays
   real (c_double), dimension (0:nz-1) :: Ax
   real (c_double), dimension (0:n-1) :: b
   integer (c_int), dimension (0:n) :: Ap
   integer (c_int), dimension (0:nz-1) :: Ai
   !Inputing array values
    open(unit = 1, file = 'Ax.txt')
    read(1,*) Ax         
    close(1)

    open(unit = 2, file = 'Ai.txt')
    read(2,*) Ai            
    close(2)

    open(unit = 3, file = 'Ap.txt')
    read(3,*) Ap            
    close(3)

    open(unit = 4, file = 'b.txt')
    read(4,*) b            
    close(4)


    ! Call C++ function
    call my_routine(Ax, Ai, Ap, b, n, nz)
    write (*, *) "Fortran: b:", b
    pause
 end program linkFwithCPP  

The function in C++ is the following:

// threadList, Q, P, Pinv and factor are not shown; they are assumed to be declared elsewhere.
void parallel_klu(double Ax[], int Ai[], int Ap[], double b[], const int n, const int nz)
{
    // start eight threads, each running factor on its own range
    threadList.push_back(std::thread(factor, Ap, Ai, Ax, b, Q, P, Pinv, n, 0, 5000));
    threadList.push_back(std::thread(factor, Ap, Ai, Ax, b, Q, P, Pinv, n, 5000, 10000));
    threadList.push_back(std::thread(factor, Ap, Ai, Ax, b, Q, P, Pinv, n, 10000, 15000));
    threadList.push_back(std::thread(factor, Ap, Ai, Ax, b, Q, P, Pinv, n, 15000, 20000));
    threadList.push_back(std::thread(factor, Ap, Ai, Ax, b, Q, P, Pinv, n, 20000, 25000));
    threadList.push_back(std::thread(factor, Ap, Ai, Ax, b, Q, P, Pinv, n, 25000, 30000));
    threadList.push_back(std::thread(factor, Ap, Ai, Ax, b, Q, P, Pinv, n, 30000, 35000));
    threadList.push_back(std::thread(factor, Ap, Ai, Ax, b, Q, P, Pinv, n, 35000, 40000));

    // wait for all threads to finish
    for (auto& threadID : threadList) {
        threadID.join();
    }
}

Is there a way to avoid creating the eight threads every time parallel_klu is called? For example, could they be created on the very first call and then just be signalled to execute again on later calls?

francescalus
Anas

1 Answer

First, I would strongly suggest figuring out if the thread creation is killing your application's performance.
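
One rough way to check this is to time how long it takes to create and join eight threads that do no work, and compare that with the time one call to parallel_klu takes. The sketch below is only illustrative: the helper name spawn_join_cost_us is made up for this example, and the number it reports will vary with the platform and system load.

```cpp
#include <cassert>
#include <chrono>
#include <thread>
#include <vector>

// Hypothetical helper (not part of the question's code): average time in
// microseconds to create and join `count` threads that do nothing.
double spawn_join_cost_us(int count, int repeats) {
    assert(count > 0 && repeats > 0);
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    for (int r = 0; r < repeats; ++r) {
        std::vector<std::thread> threads;
        threads.reserve(count);
        for (int i = 0; i < count; ++i) threads.emplace_back([] {});
        for (auto& t : threads) t.join();
    }
    const std::chrono::duration<double, std::micro> elapsed = clock::now() - start;
    return elapsed.count() / repeats;
}
```

If the result of spawn_join_cost_us(8, 1000) is small compared with the run time of a single factorization, reusing the threads is unlikely to make a noticeable difference.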

If it is, you can use a thread pool: a set of threads that you send tasks to and that stay constructed and idle when there is nothing to do. A pool also lets you fix the number of worker threads. Some good suggestions are given in this answer.
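
To make the idea concrete, here is a minimal sketch of such a pool using only the standard library. It is an illustration under assumptions, not a drop-in replacement: WorkerPool, fill_range and parallel_fill are names invented for this example, and fill_range merely stands in for factor. The pool is a function-local static, so the eight workers are created on the first call and reused on every later call.

```cpp
#include <algorithm>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical fixed-size pool: workers are created once and then reused.
class WorkerPool {
public:
    explicit WorkerPool(std::size_t count) {
        assert(count > 0);
        for (std::size_t i = 0; i < count; ++i)
            workers_.emplace_back([this] { work(); });
    }

    ~WorkerPool() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        wake_.notify_all();
        for (auto& w : workers_) w.join();
    }

    // Queue one task and return immediately.
    void submit(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push(std::move(task));
            ++pending_;
        }
        wake_.notify_one();
    }

    // Block until every submitted task has finished.
    void wait_idle() {
        std::unique_lock<std::mutex> lock(mutex_);
        idle_.wait(lock, [this] { return pending_ == 0; });
    }

private:
    void work() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                wake_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // stopping and nothing left to run
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();
            std::lock_guard<std::mutex> lock(mutex_);
            if (--pending_ == 0) idle_.notify_all();
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_, idle_;
    std::queue<std::function<void()>> tasks_;
    std::size_t pending_ = 0;
    bool stopping_ = false;
    std::vector<std::thread> workers_;  // declared last: starts after the state above exists
};

// Stand-in for the question's factor(): writes to its own index range only.
void fill_range(std::vector<int>& out, int begin, int end) {
    for (int i = begin; i < end; ++i) out[i] = i * 2;
}

// Plays the role of parallel_klu: the pool is built on the first call only.
void parallel_fill(std::vector<int>& out) {
    static WorkerPool pool(8);
    const int n = static_cast<int>(out.size());
    const int chunk = (n + 7) / 8;
    for (int begin = 0; begin < n; begin += chunk) {
        const int end = std::min(begin + chunk, n);
        pool.submit([&out, begin, end] { fill_range(out, begin, end); });
    }
    pool.wait_idle();  // same effect as joining, but the threads stay alive
}
```

Between calls the workers sleep on a condition variable, which is the "signal" asked about in the question. One thing to keep in mind with this design is that the static pool is only torn down at program exit, so the worker threads live for the rest of the process.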

Community
rubenvb