I have two functions that I am trying to make run up to 4x faster.
void get_each_fifth(const matrix_t *matrix, long results[RESULTS_LEN]) {
    for (int i = 0; i < matrix->rows; i++) {
        for (int j = 0; j < matrix->cols; j++) {
            int q = j % RESULTS_LEN;
            results[q] += MGET(matrix, i, j);
        }
    }
}
The function above needs to be 4x faster. It sums the integers in the matrix according to their column. Elements in columns 0, 5, 10, etc. are added to the first element of the results array. Elements in columns 1, 6, 11, etc. are added to the second element of the array. This pattern continues for the remaining columns. To summarize, the numbers in column i go into element i % 5 of the results array.
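To make the bucketing rule concrete, here is a minimal sketch of it applied to a single row. The helper name `bucket_row` is hypothetical and not part of my actual code; it only restates the column i to element i % 5 rule. It accumulates with `+=` just like `get_each_fifth`, so it assumes the caller has zeroed `results` first.

```c
#include <assert.h>
#include <stddef.h>

#define RESULTS_LEN 5

/* Hypothetical helper (not in the original code): adds one row of `cols`
   ints into results[], sending column j to element j % RESULTS_LEN.
   Assumes the caller has already zeroed results[]. */
void bucket_row(const int *row, long cols, long results[RESULTS_LEN]) {
    assert(row != NULL && results != NULL);
    for (long j = 0; j < cols; j++) {
        results[j % RESULTS_LEN] += row[j];
    }
}
```

For the row {1, 2, 3, 4, 5, 6, 7}, columns 0 and 5 land in element 0 (1 + 6) and columns 1 and 6 land in element 1 (2 + 7), so the results array ends up as {7, 9, 3, 4, 5}.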
long get_each(const matrix_t *matrix) {
    long sum = 0;
    for (int i = 0; i < matrix->rows; i++) {
        for (int j = 0; j < matrix->cols; j++) {
            sum += MGET(matrix, i, j);
        }
    }
    return sum;
}
This one needs to be 2x faster; it sums all of the elements in the matrix and returns the result.
MGET and MSET are defined as:
#define MGET(mat, i, j) ((mat)->data[((i)*((mat)->cols)) + (j)])
#define MSET(mat, i, j, x) ((mat)->data[((i)*((mat)->cols)) + (j)] = (x))
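As far as I can tell, these macros index `data` in row-major order: element (i, j) lives at offset i * cols + j, so walking j in the inner loop touches consecutive ints. Below is a minimal sketch of what that layout implies, assuming the struct shown in this question. The name `sum_flat` is hypothetical; it is only meant to show that the nested loop visits the same elements as one pass over rows * cols ints, not to claim any measured speedup.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    long rows;
    long cols;
    int *data;
} matrix_t;

#define MGET(mat, i, j) ((mat)->data[((i)*((mat)->cols)) + (j)])

/* Hypothetical helper (not in the original code): sums every element in a
   single pass, relying on data being one contiguous row-major block. */
long sum_flat(const matrix_t *matrix) {
    assert(matrix != NULL && matrix->data != NULL);
    long n = matrix->rows * matrix->cols;
    long sum = 0;
    for (long k = 0; k < n; k++) {
        sum += matrix->data[k];
    }
    return sum;
}
```

With a 2x3 matrix backed by {1, 2, 3, 4, 5, 6}, MGET(&m, 1, 0) reads data[3], and sum_flat returns 21, the same total the nested loop would produce.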
The matrix_t struct is defined as:
typedef struct {
    long rows;
    long cols;
    int *data;
} matrix_t;
and is allocated with this function:
void set_up_matrix(matrix_t *matrix, int rows, int cols) {
    if (matrix == NULL) {
        return;
    }
    matrix->rows = rows;
    matrix->cols = cols;
    matrix->data = malloc(sizeof(int) * rows * cols);
    srand(2021);
    for (int i = 0; i < rows; i++) {
        for (int j = 0; j < cols; j++) {
            MSET(matrix, i, j, rand() % 100);
        }
    }
}
and RESULTS_LEN is defined as:
#define RESULTS_LEN 5
Any help would be appreciated!