I have a list of function pointers called tasks_ready_master. The pointers point to functions (tasks) defined in a separate module. I want to execute them in parallel using threads. Each thread has a queue called "thread_queue" of capacity 1, which holds the task that the thread should execute. Once the task is done, it is removed from the queue. There is also a queue, called "master_queue", that holds all the tasks. This is my implementation of the execution subroutine:
subroutine master_worker_execution(self,var,tasks_ready_master,first_task,last_task)
  type(tcb),dimension(20)::tasks_ready_master !< the master array of tasks
  integer::i_task !< the task counter
  type(tcb)::self !< self
  integer,intent(in)::first_task,last_task
  type(variables),intent(inout)::var !< the variables
  !OpenMP variables
  integer::num_thread !< the rank of the thread
  integer:: OMP_GET_THREAD_NUM !< function to get the rank of the thread
  type(QUEUE_STRUCT),pointer:: thread_queue
  type(QUEUE_STRUCT),pointer::master_queue
  logical::success
  integer(kind = OMP_lock_kind) :: lck !< a lock
  call OMP_init_lock(lck) !< lock initialization
  !$OMP PARALLEL PRIVATE(i_task,num_thread,thread_queue) &
  !$OMP SHARED(tasks_ready_master,self,var,master_queue,lck)
  num_thread=OMP_GET_THREAD_NUM() !< the rank of the thread
  !$OMP MASTER
  call queue_create(master_queue,last_task-first_task+1) !< create the master queue
  do i_task=first_task,last_task
    call queue_append_data(master_queue,tasks_ready_master(i_task),success) !< add the list elements to the queue (full queue)
  end do
  !$OMP END MASTER
  !$OMP BARRIER
  if (num_thread .ne. 0) then
    do while (.not. queue_empty(master_queue)) !< loop while the master queue is not empty
      call queue_create(thread_queue,1) !< create a thread queue of capacity 1
      call OMP_set_lock(lck) !< set the lock
      call queue_append_data(thread_queue,master_queue%data(1),success) !< add the first element of the master queue to the thread queue
      call queue_retrieve_data(master_queue) !< remove the first element of the master queue
      call OMP_unset_lock(lck) !< unset the lock
      call thread_queue%data(1)%f_ptr(self,var) !< execute the one and only element of the thread queue
      call queue_retrieve_data(thread_queue) !< remove the element
    end do
  end if
  !$OMP MASTER
  call queue_destroy(master_queue) !< destroy the master queue
  !$OMP END MASTER
  call queue_destroy(thread_queue) !< destroy the thread queue
  !$OMP END PARALLEL
  call OMP_destroy_lock(lck) !< destroy the lock
end subroutine master_worker_execution
The problem is that I get a segmentation fault:
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
Backtrace for this error:
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
Backtrace for this error:
#0 0x7f30fd3ca700 in ???
#0 0x7f30fd3ca700 in ???
#1 0x7f30fd3c98a5 in ???
#1 0x7f30fd3c98a5 in ???
#2 0x7f30fd06920f in ???
#2 0x7f30fd06920f in ???
#3 0x56524a0f1d08 in __master_worker_MOD_master_worker_execution._omp_fn.0
at /home/hakim/stage_hecese_HPC/OpenMP/hecese_OMP/master_worker.f90:70
#4 0x7f30fd230a85 in ???
#3 0x56524a0f1ad7 in __queue_MOD_queue_destroy
at /home/hakim/stage_hecese_HPC/OpenMP/hecese_OMP/queue.f90:64
#4 0x56524a0f1d94 in __master_worker_MOD_master_worker_execution._omp_fn.0
at /home/hakim/stage_hecese_HPC/OpenMP/hecese_OMP/master_worker.f90:81
#5 0x7f30fd227e75 in ???
#6 0x56524a0f1f68 in __master_worker_MOD_master_worker_execution
at /home/hakim/stage_hecese_HPC/OpenMP/hecese_OMP/master_worker.f90:54
#7 0x56524a0f29b5 in __app_management_MOD_management
at /home/hakim/stage_hecese_HPC/OpenMP/hecese_OMP/app_management_without_t.f90:126
#8 0x56524a0f579b in hecese
at /home/hakim/stage_hecese_HPC/OpenMP/hecese_OMP/program_hecese.f90:398
#9 0x56524a0ed26e in main
at /home/hakim/stage_hecese_HPC/OpenMP/hecese_OMP/program_hecese.f90:13
Segmentation fault (core dumped)
When I remove the while loop, it works (no segfault). I don't understand where the mistake comes from.
While debugging with gdb, I am led to the lines that call queue_append_data and queue_retrieve_data.
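For comparison, here is a minimal self-contained sketch of the dequeue step in which the emptiness test and the removal happen inside one critical section. It is not my actual code: it assumes a plain integer counter in place of my queue module, and `next_task`, `done`, and `queue_access` are hypothetical names chosen for illustration.

```fortran
program pop_under_lock
  implicit none
  integer, parameter :: n_tasks = 8
  integer :: next_task      ! hypothetical stand-in for the head of the master queue
  integer :: my_task        ! task taken by the current thread (0 means "none left")
  integer :: done(n_tasks)  ! how many times each task was executed

  next_task = 1
  done = 0

  !$omp parallel private(my_task) shared(next_task, done)
  do
     ! The test for "empty" and the removal are done together, so another
     ! thread cannot take the last task between the test and the removal.
     !$omp critical (queue_access)
     if (next_task <= n_tasks) then
        my_task = next_task
        next_task = next_task + 1
     else
        my_task = 0
     end if
     !$omp end critical (queue_access)

     if (my_task == 0) exit

     ! "Execute" the task outside the critical section; each index was
     ! handed to exactly one thread, so this update does not conflict.
     done(my_task) = done(my_task) + 1
  end do
  !$omp end parallel

  if (any(done /= 1)) error stop "a task was skipped or run twice"
  print *, "all tasks executed exactly once"
end program pop_under_lock
```

In this sketch a thread only leaves the loop after it has itself observed, under the critical section, that nothing is left to take.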
This is the output I get when I use valgrind:
==13100== Memcheck, a memory error detector
==13100== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==13100== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==13100== Command: ./output_hecese_omp
==13100==
==13100== Thread 3:
==13100== Jump to the invalid address stated on the next line
==13100== at 0x0: ???
==13100== by 0x10EB64: __master_worker_MOD_master_worker_execution._omp_fn.0 (master_worker.f90:73)
==13100== by 0x4C8BA85: ??? (in /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0)
==13100== by 0x4F1D608: start_thread (pthread_create.c:477)
==13100== by 0x4DD7292: clone (clone.S:95)
==13100== Address 0x0 is not stack'd, malloc'd or (recently) free'd
==13100==
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
Backtrace for this error:
#0 0x4888700 in ???
#1 0x48878a5 in ???
#2 0x4cfb20f in ???
#3 0x0 in ???
==13100==
==13100== Process terminating with default action of signal 11 (SIGSEGV)
==13100== at 0x4CFB169: raise (raise.c:46)
==13100== by 0x4CFB20F: ??? (in /usr/lib/x86_64-linux-gnu/libc-2.31.so)
==13100==
==13100== HEAP SUMMARY:
==13100== in use at exit: 266,372 bytes in 121 blocks
==13100== total heap usage: 194 allocs, 73 frees, 332,964 bytes allocated
==13100==
==13100== LEAK SUMMARY:
==13100== definitely lost: 29,280 bytes in 3 blocks
==13100== indirectly lost: 2,416 bytes in 2 blocks
==13100== possibly lost: 912 bytes in 3 blocks
==13100== still reachable: 233,764 bytes in 113 blocks
==13100== suppressed: 0 bytes in 0 blocks
==13100== Rerun with --leak-check=full to see details of leaked memory
==13100==
==13100== For lists of detected and suppressed errors, rerun with: -s
==13100== ERROR SUMMARY: 3 errors from 1 contexts (suppressed: 0 from 0)
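The valgrind report shows a jump to address 0x0 at master_worker.f90:73. That would be consistent with a call through a procedure pointer that is not associated, although I have not confirmed which statement line 73 is. Below is a small sketch of how such a call could be guarded with `associated()`. The type `task_slot`, the interface `task_iface`, and the routines `add_one` and `run_if_set` are hypothetical and only stand in for my `tcb` type and tasks.

```fortran
module task_demo
  implicit none

  abstract interface
     subroutine task_iface(x)
       integer, intent(inout) :: x
     end subroutine task_iface
  end interface

  ! Hypothetical stand-in for a task record holding a procedure pointer.
  type :: task_slot
     procedure(task_iface), pointer, nopass :: f_ptr => null()
  end type task_slot

contains

  subroutine add_one(x)
    integer, intent(inout) :: x
    x = x + 1
  end subroutine add_one

  ! Call the task only if the pointer is associated; report whether it ran.
  subroutine run_if_set(slot, x, ran)
    type(task_slot), intent(in) :: slot
    integer, intent(inout) :: x
    logical, intent(out) :: ran
    ran = associated(slot%f_ptr)
    if (ran) call slot%f_ptr(x)
  end subroutine run_if_set

end module task_demo

program guard_demo
  use task_demo
  implicit none
  type(task_slot) :: slot
  integer :: x
  logical :: ran

  x = 0
  call run_if_set(slot, x, ran)      ! f_ptr is null here, so nothing is called
  if (ran .or. x /= 0) error stop "null pointer was called"

  slot%f_ptr => add_one
  call run_if_set(slot, x, ran)      ! f_ptr now points to add_one
  if (.not. ran .or. x /= 1) error stop "task did not run"

  print *, "ok"
end program guard_demo
```

A guard like this does not explain why the pointer would be null, but it turns the crash into a condition that can be reported and traced.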