-3

I am rephrasing this question based on the comments received.

I have a loop that runs 30 Billion times and assigns values to a chunk of memory assigned using malloc();

When the loop contains a condition it runs much slower than when the condition is not present. Review the scenarios below:

Scenario A: Condition is present and program is slow (43 sec)

Scenario B: Condition is not present and program is much faster (4 sec)

// gcc -O3 -c block.c && gcc -o block block.o



#include <stdio.h>
#include <stdlib.h>


#define LEN 3000000000

int main (int argc, char** argv){

    long i,j;

    unsigned char *n = NULL;
    unsigned char *m = NULL;

    m = (unsigned char *) malloc (sizeof(char) * LEN);

    n = m;

    srand ((unsigned) time(NULL));  

    int t = (unsigned) time(NULL);

    for (j = 0; j < 10; j++){

        n = m;

        for (i = 0; i < LEN; i++){


            //////////// A: THIS IS SLOW
            /*
            if (i % 2){
                *n = 1;         

            } else {
                *n = 0;
            }   
            */
            /////////// END OF A


            /////////// B: THIS IS FAST

            *n = 0;

            i % 2;

            *n = 1;

            /////////// END OF B

            n += 1;

        }
    }


    printf("Done. %d sec \n", ((unsigned) time(NULL)) - t );

    free(m);

    return 0;
}

Regards, KD

user5119599
  • 43
  • 1
  • 4
  • 1
    Without any context we can only guess. One guess is that the 0 is used as a termination character for some string but as said we need some context. – Gerhardh Dec 28 '16 at 15:05
  • 4
    The clairvoyance department is on a maternity leave. All questions about non-working code must be accompanied by a [mcve]. The management wishes to apologise for any inconvenience caused. – n. m. could be an AI Dec 28 '16 at 15:10
  • Post a minimal simplified code that we can run and compile. And if you're right, I predict much up voting. – Dellowar Dec 28 '16 at 15:11
  • What loop is that? – Weather Vane Dec 28 '16 at 15:15
  • 3
    `while (*n);` I guess... – Eugene Sh. Dec 28 '16 at 15:16
  • The compiler is most likely being optimizing away the loop into a memset-to-zero. My GCC 5.3.0 here, however, optimizes a loop filling a memory block to a memset whether the char being set is zero or one. – Hisham H M Dec 28 '16 at 15:31
  • Kindly take this off HOLD status. It is now re-worded into a clearer format – user5119599 Dec 28 '16 at 16:26
  • This has now became a duplicate of http://stackoverflow.com/questions/11227809/why-is-it-faster-to-process-a-sorted-array-than-an-unsorted-array –  Dec 28 '16 at 19:13

1 Answers1

1

You can use gcc -S -O3 to have a look at the resulting assembler. Here is an example on an Intel box:

Fast version:

    movl    %eax, %r12d
    .p2align 4,,10
    .p2align 3
.L2:
    movl    $3000000000, %edx
    movl    $1, %esi
    movq    %rbp, %rdi
    call    memset
    subq    $1, %rbx
    jne .L2

Slow version:

    movl    $10, %edi
    movl    %eax, %ebp
    movl    $3000000000, %esi
    .p2align 4,,10
    .p2align 3
.L2:
    xorl    %edx, %edx
    .p2align 4,,10
    .p2align 3
.L5:
    movq    %rdx, %rcx
    andl    $1, %ecx
    movb    %cl, (%rbx,%rdx)
    addq    $1, %rdx
    cmpq    %rsi, %rdx
    jne     .L5
    subq    $1, %rdi
    jne     .L2

Conclusion: the compiler is smarter than you think. It is able to optimize the inner loop as a memset (which is faster because it uses SSE/AVX or REP instructions on Intel). However, this optimization cannot kick in if the condition is kept - because the result is different.

Didier Spezia
  • 70,911
  • 12
  • 189
  • 154