-1

In my source code I saw a weird behavior of arm compiler where it did redundant iteration over a string, which unnecessary. I display here a minimal example that shows that,and ask my question below that

#include <string.h>
#define MIN(x, y) (((x) < (y)) ? (x) : (y))

int MAX_FILE_NAME = 2500;
int F(char *file){
    int file_len = MIN(strlen(file), MAX_FILE_NAME - 1);
    return file_len;
}
int main(void) {
    F(__FILE__);
    return 0 ;
}

compiled with:

arm-none-eabi-gcc -nostdlib -Xlinker -Map="m7_experiments.map" -Xlinker --cref -Xlinker --gc-sections -Xlinker -print-memory-usage -mcpu=cortex-m7 -mfpu=fpv5-sp-d16 -mfloat-abi=hard -mthumb -T "m7_experiments_Debug.ld" -o "m7_experiments.axf"  ./src/cr_startup_cm7.o ./src/crp.o ./src/flashconfig.o ./src/m7_experiments.o   

Leads to:

Dump of assembler code for function F:
   0x00000104 <+0>:     push    {r4, lr}
   0x00000106 <+2>:     mov     r4, r0
   0x00000108 <+4>:     bl      0x13c <strlen>
   0x0000010c <+8>:     mov     r2, r0
   0x0000010e <+10>:    ldr     r3, [pc, #20]   ; (0x124 <F+32>)
   0x00000110 <+12>:    ldr     r0, [r3, #0]
   0x00000112 <+14>:    subs    r0, #1
   0x00000114 <+16>:    cmp     r2, r0
   0x00000116 <+18>:    bcc.n   0x11a <F+22>
   0x00000118 <+20>:    pop     {r4, pc}
   0x0000011a <+22>:    mov     r0, r4
   0x0000011c <+24>:    bl      0x13c <strlen>
   0x00000120 <+28>:    b.n     0x118 <F+20>
   0x00000122 <+30>:    nop
   0x00000124 <+32>:    lsls    r0, r3, #6
   0x00000126 <+34>:    movs    r0, r0

Note how in the case that the file length is shorter than the defined one, instead of just getting it's length from $r2 it's being computed again, worsening the time run to be as long as 2* file length. which seems unnecessary. Is there some way to justify the compiler behavior in this case? I'm interested to know.

e.ad
  • 380
  • 2
  • 10
  • Are you asking why the optimized compiler code didn't substitute its own version of `strlen` which stops at `MAX_FILE_NAME - 1` before encountering a `0` terminator? – Weather Vane Feb 16 '23 at 19:10
  • 2
    You are compiling without optimisations. Why do you expect optimised code? – fuz Feb 16 '23 at 19:56
  • Need to DV. What will be result of `MIN(x++)`? – 0___________ Feb 16 '23 at 20:37
  • @fuz in my real code example it happens with O3. also, one may consider that some optimizations are relevant even on default compilation process – e.ad Feb 18 '23 at 18:56
  • @e.ad Without optimisations, the compiler can and will generate code as stupid as it likes. Do not expect any optimisations, even those you deem relevant. – fuz Feb 18 '23 at 22:48

2 Answers2

5

It is redundant. But that is because of your code, not the compiler. That macro is going to expand to this:

// x = strlen(file)
// y = MAX_FILE_NAME - 1
(((strlen(file)) < (MAX_FILE_NAME - 1)) ? (strlen(file)) : (MAX_FILE_NAME - 1))

Remember, the preprocessor is essentially just a glorified copy and paste machine. You're calling strlen twice. Try this:

size_t file_len = strlen(file);
file_len = MIN(file_len, MAX_FILE_NAME - 1);
Jason
  • 2,493
  • 2
  • 27
  • 27
  • Amazing, forgot that fact, thanks – e.ad Feb 16 '23 at 19:12
  • Do you think it makes sense that a compiler could detect this sort of inefficiency? For example, if the file pointer is const, then it's data won't change so the two calls can be determined to be redundant – e.ad Feb 16 '23 at 19:13
  • 1
    @e.ad I mean, it would have to be special case built in to the compiler, so unlikely. The compiler does not have any idea what `strlen` means. It's all just C code to the compiler. The compiler would have to *know* (which it likely doesn't) that `strlen` would return the same result for the same string. – Jason Feb 16 '23 at 19:16
  • make sense. thanks again, will accept your answer when time limit passes – e.ad Feb 16 '23 at 19:17
  • 1
    The compiler *could* know about standard library functions. C compilers these days know about `printf` format strings, for example. – Erik Eidt Feb 16 '23 at 19:20
  • 3
    @Jason On the most popular targets, you will find that the compiler _does_ know about `strlen` and a whole bunch of other standard library functions. A common issue people run into when writing their own standard library implementation (without the appropriate compiler flags) is that the compiler recognises code that performs strlen, memcpy, etc. and replaces them with a call to those functions, so in these cases you end up with infinite recursion. :) – Siguza Feb 16 '23 at 19:20
  • So in this case, will the optimizing compiler realize that it might not need to find the true string length? – Weather Vane Feb 16 '23 at 19:23
  • @WeatherVane Possibly, but... wouldn't `int F(char *file){` discard the qualifier. I just tried this out of curiosity, and gcc didn't throw any warnings. – Jason Feb 16 '23 at 19:25
  • @Siguza : 1) can you pls reference me to "compiler recognize CODE that does strlen,etc", 2) I'm not following to the end of your example: I have memcpy implementation that is replaced by call to std:memcpy, and I have call in my code to my mem_cpy, as far as I can see it's ends up with the std:memcpy, isn't it? thanks – e.ad Feb 16 '23 at 19:25
  • @Jason computing the string length twice is one issue that an optimizing compiler might realise it doesn't need to do anyway. Another issue is whether compiler will code a string length function that stops before the `0` terminator is reached. – Weather Vane Feb 16 '23 at 19:28
  • @e.ad Siguza is referring to `strlen` being a standard library function. The compiler can have advanced knowledge of these functions since it can assume they will exist (if linking to libc). I believe he is saying that would not work for example on a function I write called `string_length`. That is just a function that the compiler must work out. – Jason Feb 16 '23 at 19:28
  • @WeatherVane Sorry, I'm not following "Another issue is whether compiler will code a string length function that stops before the 0 terminator is reached." E: Are you saying an extra step beyond optimizing out the second `strlen` call? – Jason Feb 16 '23 at 19:31
  • 1
    @Jason Surely an optimizing compiler would notice that it does not need to call `strlen` twice. I am asking if an optimizing compiler will use its own version of `strlen` that also stops at `MAX_FILE_NAME - 1` before reaching the string terminator. If the string is longer, there is no need to find its terminator. Compilers are allowed to code in *any legal way* that the result will be the same as when unoptimized. – Weather Vane Feb 16 '23 at 19:37
  • 2
    @e.ad here you go: https://godbolt.org/z/vvdd65b4n – Siguza Feb 16 '23 at 19:44
  • 1
    @e.ad - *The compiler does not have any idea what strlen means.* isn't true; GCC and clang both treat it as a builtin function unless you use `-fno-builtin-strlen`. They will constant-propagate through it for string literals, and with optimization enabled do know it's a pure function (no side effects and depending only on its input) and will do constant-subexpression-elimination on multiple calls with the same arg, if they can prove the pointed-to string wasn't written between calls. https://godbolt.org/z/oh6d99M9Y shows both effects. Even at `-O0`, `strlen("abcd")` becomes 4 – Peter Cordes Feb 16 '23 at 20:04
  • @e.ad: GCC and clang will also inline some functions in some cases, e.g. GCC used to inline `strlen` loops instead of calling the standard library implementation, but cases like [Why is this code using strlen heavily 6.5x slower with GCC optimizations enabled?](https://stackoverflow.com/q/55563598) showed that was a bad idea on x86, especially with the `-O1` strategy of using `repnz scasb` which is very slow for large strings compared to a SIMD loop. – Peter Cordes Feb 16 '23 at 20:09
  • @Siguza you have infinitive loop here., – 0___________ Feb 16 '23 at 20:38
  • @0___________ my code (on the left) does not have an infinite loop. That's the whole point I wanted to demonstrate. – Siguza Feb 16 '23 at 20:48
  • @Siguza you have label strlen and jmp to strlen – 0___________ Feb 16 '23 at 20:54
  • @0___________ That's what he was demonstrating. – Jason Feb 16 '23 at 21:15
  • @Siguza: The compiler's `jmp strlen` is supposed to be a tailcall of the library function, but you named your C function `strlen` so it's actually jumping to itself. Name the C function something else; compiling definitions / implementations of standard C functions is problematic without `-fno-builtin`, since GCC will want to replace them with a call to that standard function. – Peter Cordes Feb 16 '23 at 21:31
  • 1
    @PeterCordes I'm well aware of that, that's what I wanted to demonstrate. See [my first comment](https://stackoverflow.com/questions/75476861/unoptimized-behaviour-of-arm-compiler/75476917?noredirect=1#comment133169765_75476917). – Siguza Feb 16 '23 at 21:36
2

Is there some way to justify the compiler behavior in this case? I'm interested to know.

The compiler is playing it safe.

With higher levels of optimization, the compiler uses inside knowledge of strlen() and "knows" strlen(file) will return the same value with a 2nd call.


Consider:

int file_len = MIN(rand(), MAX_FILE_NAME - 1);

MIN() might not return the minimum even with optimizations enabled as it should call rand() a 2nd time, if the first was less.


Consider:

int file_len = MIN(some_user_funciton(file), MAX_FILE_NAME - 1);

Compiler likely has little clue about some_user_funciton(file) and so calls some_user_funciton(file) 2nd time when needed.

chux - Reinstate Monica
  • 143,097
  • 13
  • 135
  • 256
  • Cool example, thanks. Is it possible for the compiler to determine which functions are deterministic? or it's like a supported list? I mean, for example, imagine we replace rand() by time() – e.ad Feb 16 '23 at 20:00
  • 1
    @"Good" compilers know about the standard library functions. Some can even analyze local functions and assess if deterministic. Code for portability and avoid passing function arguments to macros such as `MIN()`. As your good code is tried on another site, best to avoid some coder cursing the prior author. – chux - Reinstate Monica Feb 16 '23 at 20:07
  • 1
    @e.ad: `strlen` is a builtin by default, GCC and clang can replace `strlen("abcd")` with a constant 4. For other less-widely-used functions that the compiler internals don't have special knowledge of, the library header declaration can use `__attribute__((const))` to declare that it's a pure function which only reads its args, not any global state. e.g. most math functions are like that. Or `__attribute__((pure))` to declare that it has no side effects (but might read global variables). https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#Common-Function-Attributes – Peter Cordes Feb 16 '23 at 20:12