I am benchmarking matrix-vector multiplication with OpenMP. One function stores the matrix as a 1D array, and the other stores it as a 2D array (an array of row pointers). In every run, the 2D version is faster, and I can't figure out why.
I have looked through textbooks but have not found an explanation.
Matrix as a 1D array:
    void matrix_mult_vector_v2(const double* A, const double* x, double* result, int n_rows, int n_cols) {
        int row, col;
        double sum;
        #pragma omp parallel shared(A, x, result, n_rows, n_cols) private(row, col, sum)
        {
            #pragma omp for schedule(static)
            for (row = 0; row < n_rows; row++) {
                sum = 0.0;
                for (col = 0; col < n_cols; col++) {
                    /* row-major offset of element (row, col) */
                    int i = col + row * n_cols;
                    sum += A[i] * x[col];
                }
                result[row] = sum;
            }
        }
    }
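One variation that might be worth measuring alongside the version above: compute the base pointer of each row once, so the inner loop indexes `a_row[col]` in the same way the 2D version indexes `A[row][col]`. This is only a sketch and has not been benchmarked. The name `matrix_mult_vector_v2_rowptr` and the `size_t` cast (added to avoid `int` overflow in `row * n_cols` for very large matrices) are assumptions, not part of the original code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical variant of matrix_mult_vector_v2: same row-major 1D storage,
 * but the row offset is computed once per row instead of once per element. */
void matrix_mult_vector_v2_rowptr(const double *A, const double *x,
                                  double *result, int n_rows, int n_cols)
{
    assert(n_rows >= 0 && n_cols >= 0);

    /* Variables declared inside the loop body are private to each thread. */
    #pragma omp parallel for schedule(static)
    for (int row = 0; row < n_rows; row++) {
        /* Start of this row inside the flat array. */
        const double *a_row = A + (size_t)row * (size_t)n_cols;
        double sum = 0.0;
        for (int col = 0; col < n_cols; col++) {
            sum += a_row[col] * x[col];
        }
        result[row] = sum;
    }
}
```

If this variant runs at the same speed as the 2D version, the per-element index arithmetic would be a plausible suspect. If not, the difference is likely somewhere else, such as how the two matrices are allocated or how the timing is done.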
Matrix as a 2D array:
    /*
     * Matrix multiply vector
     * result = A * x where A is a matrix and x is a vector
     */
    void matrix_mult_vector(double** A, double* x, double* result, int n_rows, int n_cols)
    {
        int row, col;
        double sum;
        #pragma omp parallel shared(A, x, result, n_rows, n_cols) private(row, col, sum)
        {
            //#pragma omp parallel for collapse(2)
            #pragma omp for schedule(static)
            for (row = 0; row < n_rows; row++) {
                sum = 0.0;
                for (col = 0; col < n_cols; col++) {
                    //#pragma omp atomic
                    sum += A[row][col] * x[col];
                }
                result[row] = sum;
            }
        }
    }
Neither version produces any errors. In both cases the result should be A*x, where A is a matrix and x is a vector.
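For anyone who wants to reproduce the comparison, here is a minimal timing sketch. It is not the harness used for the measurements above. The names `wall_time`, `matvec_1d`, `matvec_2d` and `compare_layouts` are hypothetical, the kernels are condensed copies of the two functions, and the 2D row pointers are assumed to point into the same contiguous buffer as the 1D array (a per-row `malloc` would give a different memory layout). It takes the best time over several repeats and checks that both layouts give the same result.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Seconds from omp_get_wtime(); falls back to clock() (CPU time) without OpenMP. */
static double wall_time(void) {
#ifdef _OPENMP
    return omp_get_wtime();
#else
    return (double)clock() / CLOCKS_PER_SEC;
#endif
}

/* Condensed copy of the 1D kernel (row-major flat array). */
static void matvec_1d(const double *A, const double *x, double *result,
                      int n_rows, int n_cols) {
    #pragma omp parallel for schedule(static)
    for (int row = 0; row < n_rows; row++) {
        double sum = 0.0;
        for (int col = 0; col < n_cols; col++)
            sum += A[(size_t)row * (size_t)n_cols + (size_t)col] * x[col];
        result[row] = sum;
    }
}

/* Condensed copy of the 2D kernel (array of row pointers). */
static void matvec_2d(double **A, const double *x, double *result,
                      int n_rows, int n_cols) {
    #pragma omp parallel for schedule(static)
    for (int row = 0; row < n_rows; row++) {
        double sum = 0.0;
        for (int col = 0; col < n_cols; col++)
            sum += A[row][col] * x[col];
        result[row] = sum;
    }
}

/* Hypothetical harness: stores the best time of each kernel in *t_1d and
 * *t_2d and returns the largest absolute difference between the two result
 * vectors, or -1.0 if an allocation fails. */
double compare_layouts(int n_rows, int n_cols, int repeats,
                       double *t_1d, double *t_2d) {
    assert(n_rows > 0 && n_cols > 0 && repeats > 0);
    size_t n = (size_t)n_rows * (size_t)n_cols;
    double *flat = malloc(n * sizeof *flat);
    double **rows = malloc((size_t)n_rows * sizeof *rows);
    double *x = malloc((size_t)n_cols * sizeof *x);
    double *r1 = malloc((size_t)n_rows * sizeof *r1);
    double *r2 = malloc((size_t)n_rows * sizeof *r2);
    double max_diff = -1.0;

    if (flat && rows && x && r1 && r2) {
        /* Same data for both layouts: rows[] is a view into flat[]. */
        for (size_t i = 0; i < n; i++) flat[i] = (double)(i % 7) + 0.5;
        for (int c = 0; c < n_cols; c++) x[c] = (double)(c % 3) - 1.0;
        for (int r = 0; r < n_rows; r++) rows[r] = flat + (size_t)r * (size_t)n_cols;

        *t_1d = *t_2d = 1e300;
        for (int k = 0; k < repeats; k++) {
            double t0 = wall_time();
            matvec_1d(flat, x, r1, n_rows, n_cols);
            double t1 = wall_time();
            matvec_2d(rows, x, r2, n_rows, n_cols);
            double t2 = wall_time();
            if (t1 - t0 < *t_1d) *t_1d = t1 - t0;
            if (t2 - t1 < *t_2d) *t_2d = t2 - t1;
        }

        max_diff = 0.0;
        for (int r = 0; r < n_rows; r++) {
            double d = r1[r] - r2[r];
            if (d < 0.0) d = -d;
            if (d > max_diff) max_diff = d;
        }
    }
    free(flat); free(rows); free(x); free(r1); free(r2);
    return max_diff;
}
```

A call such as `compare_layouts(4000, 4000, 10, &t1, &t2)` from `main`, built with optimization and OpenMP enabled (for example `-O2 -fopenmp` with GCC), gives two timings to compare. The sizes and repeat count here are arbitrary choices, not the ones behind the observation above.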