I want to parallelize a pair of nested loops in C using pthreads (my machine has four cores). Inside the loops I simply assign one value to every element of a two-dimensional array.
When I parallelized it with four threads, the program actually became slower by a factor of 3. My guess is that the threads are somehow blocking each other.
This is the loop I want to parallelize:
for ( i = 0; i < 1000; i++ )
{
    for ( j = 0; j < 1000; j++ )
    {
        x[i][j] = 5.432;
    }
}
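For a baseline to compare against, the serial loop can be timed on its own. The following is a minimal sketch, not the code that was actually measured: it assumes the array is a static 1000x1000 array of `double` (its declaration is not shown here) and uses POSIX `clock_gettime`; the function names `fill_serial` and `time_fill_serial` are hypothetical.

```c
#define _POSIX_C_SOURCE 199309L  /* exposes clock_gettime under strict C modes */
#include <assert.h>
#include <time.h>

#define N 1000

/* Assumption: the array is a 1000x1000 array of double. */
static double x[N][N];

/* The serial loop, wrapped in a function with local loop counters. */
static void fill_serial(double value)
{
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++) {
            x[i][j] = value;
        }
    }
}

/* Returns the wall-clock time of one serial fill, in seconds. */
static double time_fill_serial(double value)
{
    struct timespec start, end;
    int rc;

    rc = clock_gettime(CLOCK_MONOTONIC, &start);
    assert(rc == 0);
    fill_serial(value);
    rc = clock_gettime(CLOCK_MONOTONIC, &end);
    assert(rc == 0);
    (void)rc;  /* silences the unused warning when NDEBUG is defined */

    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}
```

Calling `time_fill_serial(5.432)` several times and keeping the smallest result gives a steadier number than a single run. Since the loop performs only one million stores, it is likely to finish very quickly, so any fixed per-thread cost is easy to notice in a comparison.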
I tried to parallelize it like this.
/* Writes rows 1..499, columns 1..499. */
void* assignFirstPart(void *val) {
    for ( i = 1; i < 500; i++ )
    {
        for ( j = 1; j < 500; j++ )
        {
            w[i][j] = 5.432;
        }
    }
    return NULL;
}
/* Writes rows 500..999, columns 500..999. */
void* assignSecondPart(void *val) {
    for ( ia = 500; ia < 1000; ia++ )
    {
        for ( ja = 500; ja < 1000; ja++ )
        {
            w[ia][ja] = 5.432;
        }
    }
    return NULL;
}
/* Writes rows 1..999, columns 500..999. */
void* assignThirdPart(void *val) {
    for ( ib = 1; ib < 1000; ib++ )
    {
        for ( jb = 500; jb < 1000; jb++ )
        {
            w[ib][jb] = 5.432;
        }
    }
    return NULL;
}
/* Writes rows 500..999, columns 500..999. */
void* assignFourthPart(void *val) {
    for ( ic = 500; ic < 1000; ic++ )
    {
        for ( jc = 500; jc < 1000; jc++ )
        {
            w[ic][jc] = 5.432;
        }
    }
    return NULL;
}
success = pthread_create( &thread5, NULL, &assignFirstPart, NULL );
if( success != 0 ) {
    printf("Couldn't create thread 1\n");
    return EXIT_FAILURE;
}
success = pthread_create( &thread6, NULL, &assignSecondPart, NULL );
if( success != 0 ) {
    printf("Couldn't create thread 2\n");
    return EXIT_FAILURE;
}
success = pthread_create( &thread7, NULL, &assignThirdPart, NULL );
if( success != 0 ) {
    printf("Couldn't create thread 3\n");
    return EXIT_FAILURE;
}
success = pthread_create( &thread8, NULL, &assignFourthPart, NULL );
if( success != 0 ) {
    printf("Couldn't create thread 4\n");
    return EXIT_FAILURE;
}
pthread_join( thread5, NULL );
pthread_join( thread6, NULL );
pthread_join( thread7, NULL );
pthread_join( thread8, NULL );
As I said, parallelizing the loop this way slows my program down massively, so I'm probably doing something fundamentally wrong. I'd be grateful for any advice.
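For comparison, here is a minimal sketch of one common way to split this kind of loop. It is a sketch under stated assumptions, not a confirmed fix. It differs from the four thread functions above in three ways. First, each thread gets a contiguous band of whole rows, so no two threads write the same elements, whereas above the second and fourth functions cover the same index range and the third overlaps both. Second, indices start at 0, so row 0 and column 0 are included. Third, the loop counters are local variables; the functions above do not declare theirs, so those are presumably shared. The names `struct band`, `fill_band` and `fill_parallel` are hypothetical, and the element type `double` is assumed.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define N 1000
#define NUM_THREADS 4

/* Assumption: the array is a 1000x1000 array of double. */
static double w[N][N];

/* Hypothetical per-thread argument: a half-open range of rows to fill. */
struct band {
    int row_begin;  /* first row written by this thread */
    int row_end;    /* one past the last row written */
    double value;   /* value stored in every element of the band */
};

/* Thread entry point: fills rows [row_begin, row_end) of w. */
static void *fill_band(void *arg)
{
    const struct band *b = arg;

    /* i and j are locals, so every thread has its own private counters. */
    for (int i = b->row_begin; i < b->row_end; i++) {
        for (int j = 0; j < N; j++) {
            w[i][j] = b->value;
        }
    }
    return NULL;
}

/* Fills w with NUM_THREADS threads.
 * Returns 0 on success, or the error code from pthread_create. */
static int fill_parallel(double value)
{
    pthread_t threads[NUM_THREADS];
    struct band bands[NUM_THREADS];
    const int rows_per_thread = N / NUM_THREADS;
    int started = 0;
    int rc = 0;

    /* This sketch only handles row counts that divide evenly. */
    assert(N % NUM_THREADS == 0);

    for (int t = 0; t < NUM_THREADS; t++) {
        bands[t].row_begin = t * rows_per_thread;
        bands[t].row_end = (t + 1) * rows_per_thread;
        bands[t].value = value;
        rc = pthread_create(&threads[t], NULL, fill_band, &bands[t]);
        if (rc != 0) {
            break;  /* stop creating threads; join the ones already running */
        }
        started++;
    }

    for (int t = 0; t < started; t++) {
        pthread_join(threads[t], NULL);
    }
    return rc;
}
```

A single call such as `fill_parallel(5.432)` would stand in for the four hand-written functions and their `pthread_create` calls; with GCC or Clang the program is built with `-pthread`. Whether this version beats the serial loop still has to be measured, because the amount of work per thread is small compared with the cost of starting and joining threads.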