I was trying to parallelize a code but it only deteriorated the performance. I wrote a Fortran code which runs several Monte Carlo integrations and then finds their mean.
implicit none
integer, parameter :: n=100
integer, parameter :: m=1000000
real, parameter :: pi=3.141592654
real MC,integ,x,y
integer tid,OMP_GET_THREAD_NUM,i,j,init,inside
read*,init
call OMP_SET_NUM_THREADS(init)
call random_seed()
!$OMP PARALLEL DO PRIVATE(J,X,Y,INSIDE,MC)
!$OMP& REDUCTION(+:INTEG)
do i=1,n
inside=0
do j=1,m
call random_number(x)
call random_number(y)
x=x*pi
y=y*2.0
if(y.le.x*sin(x))then
inside=inside+1
endif
enddo
MC=inside*2*pi/m
integ=integ+MC/n
enddo
!$OMP END PARALLEL DO
print*, integ
end
As I increase the number of threads, run-time increases drastically. I have looked for solutions for such problems and in most cases shared memory elements happen to be the problem but I cannot see how it is affecting my case.
I am running it on a 16 core processor using Intel Fortran compiler.
EDIT: The program after adding implicit none
, declaring all variables and adding the private clause