Apply an operation to an entire block of memory in C

Question

As far as I know, to multiply an array by a scalar in C you have to iterate over each element with a for-loop. I also know that the source code of the R software environment is written primarily in C. Now, when I have a big matrix in R, such as mat = matrix(5, nrow = 1100, ncol = 1100), multiply it by a constant, and measure the time of the operation like so:

t_start = Sys.time()
mat = mat * 5
print(Sys.time()-t_start)

output:

Time difference of 0.005984068 secs

But doing the same thing with explicit for-loops takes much longer:

t_start = Sys.time()
for(i in 1:1100)
{
  for(j in 1:1100)
  {
    mat[i,j] = mat[i,j] * 5
  }
}
print(Sys.time()-t_start)

output:

Time difference of 0.1437349 secs

The second way is ~24 times slower. I'm assuming that behind the scenes the first way is also done with for-loops; if so, why is the time difference so big?

I'm wondering whether there is a better way to apply an operation to an entire block of memory in C without iterating over each element in a loop. I would like answers from a C perspective, as I'm currently working in C. The R code above was only meant to show two different ways of doing this that R provides and C does not.

No, there is no better way in C unless there are some language and/or hardware extensions enabling it. — Eugene Sh., Mar 07 '19 at 15:57
The C code also uses loops, as @EugeneSh. says, there is no way around it. The difference you're seeing is the difference between loops in compiled code vs. interpreted code. — duckmayr, Mar 07 '19 at 15:58
The difference is very likely on the R side. Multiplying the matrix will probably call a precompiled and optimized C function to do the work. On the other hand, the loops performed in R only multiply one element at a time, with little to optimize on the C side. All the loop management, indexing, etc. is done in R. — Gerhardh, Mar 07 '19 at 16:01
A background read on R’s “vectorization”: https://stackoverflow.com/q/51681978/4891738 — Zheyuan Li, Mar 07 '19 at 16:05
The answer by @李哲源 linked is really great and explains in much more detail the issue I mentioned. — duckmayr, Mar 07 '19 at 16:13
The C compiler does not need to use the loop as it is - it is allowed to compile to any program whose observable behaviour is indistinguishable from the original (with run time not considered observable behaviour) — Antti Haapala -- Слава Україні, Mar 07 '19 at 16:15
@AnttiHaapala I guess the question is about what the *programmer* can do with the language, rather than what the compiler will understand from that. — Eugene Sh., Mar 07 '19 at 16:17
Well, yes, but what I mean to convey is that the C compiler can vectorize the for loop as it is, but it wouldn't make sense in R. — Antti Haapala -- Слава Україні, Mar 07 '19 at 16:20

score 1 · Accepted Answer · answered Mar 08 '19 at 08:49

Even a plain for-loop in C is faster than the first (vectorized) way in R, so you don't have to worry about for-loops being slow in C. See the results below.

C language for loop: 0.00093478 secs

gcc -otest test.c -g -Wall -O2
./test
Time difference of 0.00093478 secs

The first way in R: 0.004915237 secs

./Rscript first.R 
Time difference of 0.004915237 secs

test.c code:

#include <stdio.h>
#include <time.h>
#include <stdlib.h>

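typedef struct matrix {
    int nrow;
    int ncol;
    int *buf;       /* contiguous storage for all nrow * ncol elements */
    int *array[];   /* flexible array member: one pointer per row into buf */
} matrix;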

/* Return the current monotonic clock reading in seconds. */
double sys_time(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec / 1000000000.0;
}

/* Multiply every element by 5 in place and print the elapsed time. */
void test(matrix *mat)
{
    int i, j;
    double t_start, t_end;

    t_start = sys_time();
    for(i = 0; i < mat->nrow; i++) {
        for(j = 0; j < mat->ncol; j++) {
            mat->array[i][j] *= 5;
        }
    }
    t_end = sys_time();
    printf("Time difference of %g secs\n", t_end - t_start);
}

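/* Allocate an nrow x ncol matrix with every element set to val.
 * Exits the program if memory cannot be allocated. */
matrix *create_matrix(int val, int nrow, int ncol)
{
    matrix *mat;
    int *buf;
    int i, j;

    /* One allocation for the header plus the row-pointer table,
     * and one contiguous allocation for all the elements. */
    mat = malloc(sizeof(*mat) + nrow * sizeof(int *));
    buf = malloc(sizeof(int) * nrow * ncol);
    if (mat == NULL || buf == NULL) {
        perror("malloc");
        exit(EXIT_FAILURE);
    }
    mat->buf = buf;
    mat->nrow = nrow;
    mat->ncol = ncol;
    for(i = 0; i < nrow; i++) {
        for(j = 0; j < ncol; j++)
            buf[j] = val;
        mat->array[i] = buf;    /* row i starts here */
        buf += ncol;
    }
    return mat;
}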

void destroy_matrix(matrix *mat)
{
    free(mat->buf);
    free(mat);
}

int main()
{
    matrix *mat;

    mat = create_matrix(5, 1100, 1100);
    test(mat);
    destroy_matrix(mat);

    return 0;
}
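
Since the question asks about operating on a whole block of memory, here is a minimal sketch of how that usually looks in C: the loop is still there, but it is hidden inside a small function that walks one contiguous buffer. The name scale_block is hypothetical and not part of test.c above. As the comments point out, an optimizing compiler may turn a flat loop like this into SIMD instructions by itself, although that depends on the compiler, flags, and target.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the original post): multiply every
 * element of a contiguous block of n ints by factor, in place. */
void scale_block(int *buf, size_t n, int factor)
{
    assert(buf != NULL || n == 0);

    /* One flat loop over contiguous memory. With optimization enabled,
     * the compiler is free to vectorize this on its own. */
    for (size_t i = 0; i < n; i++)
        buf[i] *= factor;
}
```

With the layout used in test.c, mat->buf is a single contiguous allocation, so a call such as scale_block(mat->buf, (size_t)mat->nrow * mat->ncol, 5) would cover the whole matrix without going through the row pointers.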
