4

For example, suppose you have a function sort(int* numbers, size_t count) with a bubble-sort implementation, and a C compiler recognizes this pattern. Would the compiler be allowed to change it to another algorithm, such as quicksort?

Another example would be adding all integers from 0 up to (but not including) n, where the compiler could replace the for loop with (n*(n-1))/2.

JCWasmx86
    As long as the observable behavior is the same, the compiler is allowed to do pretty much anything. And the time it takes to execute is not part of the observable behavior (otherwise, all optimizations would be forbidden). That said, the compiler/optimizer would have to be able to prove equivalence for all possible inputs, and that will be impractical for most situations. – Hulk Jul 30 '20 at 13:35
    Related (C++, but similar arguments apply here): https://stackoverflow.com/questions/6664471/observable-behaviour-and-compiler-freedom-to-eliminate-transform-pieces-c-co – Hulk Jul 30 '20 at 13:41
    Also see: https://stackoverflow.com/a/46455917/2513200 – Hulk Jul 30 '20 at 13:43
  • Another related question: https://stackoverflow.com/questions/53100198/what-is-the-precise-relationship-between-c-and-its-abstract-machine – Hulk Jul 30 '20 at 14:11
  • To be able to replace a loop with your second example `(n*(n-1))/2`, the compiler would have to be able to prove that the product `n*(n-1)` cannot overflow, or it would have to generate a fallback branch that resorts to the original code for large `n`. You see, this gets complicated quickly, even for simple algorithms like that. – Hulk Jul 30 '20 at 14:27

7 Answers

3

The word algorithm here might be a little strong, but yes: by default, modern compilers generally take as much freedom to change your code as current compiler technology allows. At the same time, just about every modern compiler gives the application programmer some control over what the compiler can and cannot do.

Given the default settings, modern compilers have the ability to use replacement constructs for simple algorithms and to some extent, can even be directed how they choose to replace them based on programmer settable preferences. Compilers provide many variations and levels of optimization options, most commonly related to speed, and/or memory usage. But the scope of what is changed is limited to simpler constructs that can be made more efficient, or totally ignored, depending on the construct.

In your specific example, of a function with the sort algorithm, the algorithm itself would not be replaced, but some of the component sections within it could be optimized, depending on how they are coded.

Conversely, you can also set current compilers to leave your code just as you wrote it. For example in newer versions of GCC (since v4.4) you can completely disable optimizations using these pragma surrounds:

#pragma GCC push_options
#pragma GCC optimize ("O0")

normal C code

#pragma GCC pop_options
ryyker
    It should be emphasised that you could implement compiler optimisations to recognise a specific implementation of bubblesort and replace it with quicksort, and it would be fully compliant, but the effort required would outweigh the general benefit. – OrangeDog Jul 30 '20 at 16:04
  • @OrangeDog - If you are saying that a compiler _could be_ designed to do this very specific task, then I agree, but the same discussion would have to include how impractical that would be. I.e., most of the efficiencies gained (or lost) by using the various optimization options offered by compilers apply to small constructs, or variables (not using the `volatile` keyword), or anything that is seen as a redundancy that can be eliminated by ignoring or modifying it. Not by re-factoring entire large, complicated algorithms. – ryyker Jul 30 '20 at 16:22
  • the efficiency gain when it is applied would be huge. The problem is the development and runtime of an algorithm to detect when the optimisation can be made vs. the number of times it will ever actually happen. – OrangeDog Jul 30 '20 at 16:48
  • @OrangeDog Yes, and I really value compiler speeds today. And where is the fun in using a compiler where all of your hard-earned algorithms are _canceled_ in favor of some compiler writer's algorithm? – ryyker Jul 30 '20 at 16:54
3

If the called function is within the optimization domain (and there are ways to achieve that even if the function is in another file), then yes, in the same way as having the called function in the same file. But that doesn't mean the compiler assumes, based on the name of a function, what the called code really is: you can easily make your own C library, link it in, and have the named functions (`qsort`, for example) do whatever you want. So this is arguably undesirable, yet the tools do it:

#include <stdio.h>
void fun ( void )
{
    printf("0");
}
Disassembly of section .text:

00000000 <fun>:
   0:   e3a00030    mov r0, #48 ; 0x30
   4:   eafffffe    b   0 <putchar>

The compiler has replaced the desired function call with some other function call.

So yes, the compiler can and will do this if the optimizer is programmed to. If it has visibility into both parts, it is more likely to optimize them as a whole:

unsigned int more_fun ( unsigned int a, unsigned int b )
{
    return(a+b);
}
unsigned int fun ( void )
{
    return(more_fun(1,2));
}

Disassembly of section .text:

00000000 <more_fun>:
   0:   e0800001    add r0, r0, r1
   4:   e12fff1e    bx  lr

00000008 <fun>:
   8:   e3a00003    mov r0, #3
   c:   e12fff1e    bx  lr

which is not replacing one function with another as in the printf example, but is replacing the function call with a possibly more optimized inline solution.

And yes certainly some compilers will optimize out a dead-ish loop:

unsigned int fun ( void )
{
    unsigned int ra;
    for(ra=0;ra<10;ra++)
    {
    }
    return(ra);
}

00000000 <fun>:
   0:   e3a0000a    mov r0, #10
   4:   e12fff1e    bx  lr

Granted, this is a trivial one, but I have generated pseudo-randomizers that were dead code like this. It depends on the compiler/optimizer and the code; if it has been programmed to, it will reduce the loop/code to something simpler.

Or perhaps this is what you were asking:

unsigned int more_fun ( unsigned int n )
{
    return((n*(n-1))/2);
}
unsigned int fun ( void )
{
    return(more_fun(10));
}

00000000 <more_fun>:
   0:   e2403001    sub r3, r0, #1
   4:   e0020093    mul r2, r3, r0
   8:   e1a000a2    lsr r0, r2, #1
   c:   e12fff1e    bx  lr

00000010 <fun>:
  10:   e3a0002d    mov r0, #45 ; 0x2d
  14:   e12fff1e    bx  lr

The math is removed where possible.

You can't assume a particular compiler will do this, nor expect two compilers to produce the same code/optimization. But CAN and WILL a compiler do this? Yes, if it is programmed to and has visibility into all the code it needs to optimize.

As far as "allowed": the compiler's job is to functionally translate one language into another, which is in part why any two compilers translating the same input language to the same output target can and will produce different output code. It is a functional translation; there isn't a single right answer for the output of a compile, so it can vary. Where the language permits, as shown above, the compiler generates functionally equivalent code; that's its job, so it is allowed to do that job. Although I would not consider `putchar` a `printf` replacement, as the compiler didn't look at my `printf` code; at the same time, I assume this compiler (gcc) has a command-line option to not permit that. Likewise, it is not uncommon for compilers to replace, say, a struct assignment between two structs of the same type (`a = b;`) with a `memcpy`, but at the same time to have a command-line option saying: I don't want you to replace the code I asked for with a `memcpy`, I wanted it done discretely.

Between gcc and clang you will see these behaviors so as far as the word "allowed" goes, I guess it is allowed. As shown you can examine the compilers output and see for your use case, what the compiler did.

halfer
old_timer
    Immediately after saying “But that doesn't mean that the compiler assumes based on the name of a function what the called code really is,” you give an example in which the compiler assumes what `printf` does (albeit based on specifications of the C standard). You should reword to clarify what you mean, or fix this. – Eric Postpischil Jul 30 '20 at 20:12
3

Compilers do this regularly for simpler patterns recognised by the optimiser.

In your particular case, the change of algorithm might have counterproductive side-effects, as bubble sort can be more efficient than quicksort on arrays that are already mostly sorted.

Note also that the optimisation must keep the same semantics: replacing a stable sorting algorithm with a non-stable one such as quicksort is only acceptable if it can be asserted that stability is not required. Sorting a simple array of int seems unaffected by this issue, but on implementations with negative zeroes or padding bits in integer types, it could make a difference. If the array has a floating-point type, the stability of the sorting algorithm definitely can affect the final order.

chqrlie
3

Yes, compilers can and will do so. As an example, consider this population count function using Kernighan's algorithm:

int popcount(unsigned x)
{
    int c;

    for (c = 0; x != 0; x &= x - 1)
        c++;

    return (c);
}

Modern clang and gcc versions recognise this algorithm and given suitable flags, compile it down to two instructions:

popcount:
    popcntl %edi, %eax
    retq

This is because the compiler recognises that the function implements a population count and replaces the whole algorithm with a machine instruction that performs the same computation. Isn't that neat?

fuz
1

Compiler technology is not able to recognize such complex patterns (and engineers do not even try to, as the general problem is undecidable) -- but yes, for the patterns it does recognize, the C standard allows the compiler to replace the initial code with other code whose semantics do not change.

The optimization techniques use different representations of the code for each particular optimization. A complex compiler can translate the input code into many different intermediate languages, and each representation allows better searching for the patterns of some optimization.

But in no case do compilers try to look for patterns representing complex algorithms. Of course you can try to do it, but this is of little use, and it is more affordable to employ a good programmer than to try to implement a system that would, in effect, have to solve the halting problem.

alinsoar
  • Re “But in no case the compilers try to look for pattersn representing complex algorithms”: What research did you do to support this assertion? I worked with a compiler designed to detect certain signal-processing algorithms and replace them with other code, including calls to highly optimized routines and various operations for cache management. Did you do research and discover no such compiler exists? – Eric Postpischil Jul 30 '20 at 20:03
  • Re “Of course you can try to do it, but this is useless and it is more affordable to employ a good programmer”: On what basis is this claim made? Clang optimizes `int foo(int n) { int sum = 0; for (int i = 1; i < n; ++i) sum += i; return sum; }` to code that does a multiply and shift, with no loop. That is a completely different algorithm. – Eric Postpischil Jul 30 '20 at 20:09
  • Re “try to solve the halting problem”: This has nothing to do with the Halting Problem. – Eric Postpischil Jul 30 '20 at 20:09
  • @EricPostpischil I do not want to be ironical, but do you consider your `int foo ...` to be a complex pattern? Even students integrate into their playground shifting instead of multiplication by 2. I saw a researcher working at IBM who wrote a recognizer for the halting problem, trying to match patterns such that it answers `y or n` depending on whether the input program halts or not. Of course, his program uses the Curry-Howard correspondence for the inputs that are in the class to which this correspondence can be applied, but for the rest his machine provides an answer only in most cases, not in all. – alinsoar Jul 31 '20 at 09:14
  • If you worked with signal processing and claim that you are competent in recognizing SP algorithms, I would like to know whether the compiler into which you integrated them is able to make mplayer or ffmpeg optimize the cache better than they do, and to convert video formats faster. – alinsoar Jul 31 '20 at 09:18
  • There are algorithms to convert recursion into the closed form of the series it computes, but this is not too complex if only arithmetic is involved, with no other side effects. I am not aware of any such algorithm being integrated into a compiler, as the OP asked. – alinsoar Jul 31 '20 at 09:29
1

Even if a compiler could recognize an algorithm it should not be allowed to replace the algorithm with a functional equivalent algorithm.

The reason is that the algorithm may have been chosen because of idiosyncrasies of the data it has to process and the compiler cannot know about these.

Note that I say algorithm. That is a lot more than e.g. calculating the length of a string such as strlen.

In the sort example, if the data are always nearly sorted, then bubble sort may have the least overhead. Though bubble sort is generally not efficient, it could be the most efficient choice here.

Paul Ogilvie
    I've downvoted this answer because it represents an *opinion* and doesn't actually answer the question. In [my answer](https://stackoverflow.com/a/63340875/417501) I show how the compiler does actually replace algorithms with other algorithms, so compilers do perform the optimisation you say they should not do. – fuz Aug 26 '20 at 14:54
  • @fuz, note that the `popcntl` instruction is a hardware implementation of the algorithm. _The algorithm has not been replaced_. The compiler just recognized (clever in itself) that the C code represented Brian Kernighan's algorithm to count set bits in an integer. – Paul Ogilvie Aug 26 '20 at 15:16
  • @fuz, also please be more precise in your wording. It is not a "population count algorithm" (which could count the population of a country), but "an algorithm to count set bits in an integer" – Paul Ogilvie Aug 26 '20 at 15:19
  • the operation of computing how many bits are set in a number is called *population count.* And yes, the compiler did certainly change the algorithm by replacing it with a specialised instruction. The differences can even be observed, for example in that the `popcnt` instruction always takes the same amount of time. – fuz Aug 26 '20 at 17:45
1

Theoretically it is possible, and the compiler is free to choose how the program is executed as long as the observable behaviour of the program stays the same.

0___________