There may be a very simple solution to this problem, but it has been bothering me for a while, so I have to ask.
In our embedded projects, it is common to have simple get/set functions for many variables, defined in separate C files. Those functions are then called from many other C files. When I look at the assembly listing, the function calls are never replaced with move instructions. A faster way would be to declare the monitored variables as globals and avoid the unnecessary function calls.
Say you have a file.c with variables that need to be monitored from another C file, main.c: for example debugging variables, hardware registers, ADC values, etc. Is there a compiler optimization that replaces simple get/set functions with assembly move instructions, thus avoiding the overhead of the function calls?
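For comparison, here is a minimal sketch of the global-variable alternative mentioned above. The name monitored_signal is hypothetical, and a plain variable stands in for the hardware register so the snippet can run on a host. The three parts would normally live in file.h, file.c and main.c respectively.

```c
#include <stdint.h>

/* file.h (sketch): expose the variable itself instead of get/set functions.
   monitored_signal is a hypothetical name, not taken from the real project. */
extern volatile int32_t monitored_signal;

/* file.c (sketch): the single definition. On a real target this might be a
   fixed hardware address; a plain variable is assumed here for illustration. */
volatile int32_t monitored_signal = 0;

/* main.c (sketch): each access is an ordinary load, with no function call.
   Because the object is volatile, both reads are actually performed. */
int32_t read_twice_and_add(void)
{
    int32_t first = monitored_signal;
    int32_t second = monitored_signal;
    return first + second;
}
```

The trade-off is that every file including the header can also write the variable, which is usually what the get/set functions were meant to prevent.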
file.h
#ifndef FILE_H
#define FILE_H
#include <stdint.h>
int32_t get_signal(void);
void set_signal(int32_t x);
#endif
file.c
#include "file.h"
#include <stdint.h>
static volatile int32_t *signal = SOME_HARDWARE_ADDRESS;
int32_t get_signal(void)
{
    return *signal;
}
void set_signal(int32_t x)
{
    *signal = x;
}
main.c
#include "file.h"
#include <stdio.h>
int main(int argc, char *args[])
{
    // Do something with the variable
    for (int i = 0; i < 10; i++)
    {
        // Cast so that %d is correct even where int32_t is not int
        printf("signal = %d\n", (int)get_signal());
    }
    return 0;
}
If I compile the above code with gcc -Wall -save-temps main.c file.c -o main.exe, it gives the following assembly listing for main.c. The call get_signal instruction is always there, even when compiling with the -O3 flag, which seems silly as we are only reading a memory address. Why bother calling such a simple function?
The same applies to the simple set function: it is always called, even though the function only writes to one memory location and does nothing else.
main.s
main:
pushq %rbp
.seh_pushreg %rbp
movq %rsp, %rbp
.seh_setframe %rbp, 0
subq $48, %rsp
.seh_stackalloc 48
.seh_endprologue
movl %ecx, 16(%rbp)
movq %rdx, 24(%rbp)
call __main
movl $0, -4(%rbp)
jmp .L4
.L5:
call get_signal
movl %eax, %edx
leaq .LC0(%rip), %rcx
call printf
addl $1, -4(%rbp)
.L4:
cmpl $9, -4(%rbp)
jle .L5
movl $0, %eax
addq $48, %rsp
popq %rbp
ret
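A likely reason the call survives is that main.c only sees the declaration of get_signal, so the compiler has no function body to substitute when it compiles that file on its own. One common way around this, sketched below under assumptions, is to define the accessors as static inline in the header. The names signal_ptr and fake_register are hypothetical; fake_register stands in for SOME_HARDWARE_ADDRESS so the snippet runs on a host.

```c
#include <stdint.h>

/* Stand-in for the hardware register (assumption). In a real header the
   pointer would refer to a fixed address instead, so that every file
   including the header accesses the same location. */
static volatile int32_t fake_register;
static volatile int32_t *const signal_ptr = &fake_register;

/* The bodies are visible in every file that includes the header, so an
   optimizing build can usually replace each call with a load or a store. */
static inline int32_t get_signal(void)
{
    return *signal_ptr;
}

static inline void set_signal(int32_t x)
{
    *signal_ptr = x;
}
```

Whether the calls are actually inlined depends on the optimization level; an unoptimized build will typically still emit them.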
UPDATED 2023-02-13
The question was closed with several links to answers about inline and link-time optimization. I don't think the same question has been answered before, or at least the solution is not obvious for my get function. What is there to inline if a function just returns a value and does nothing else?
Anyway, as suggested, one solution is to add the compiler flags -O2 -flto, which correctly replaces the call get_signal instruction with a move instruction, giving the following partial output:
main:
subq $40, %rsp
.seh_stackalloc 40
.seh_endprologue
call __main
movl tmp.0(%rip), %edx
movl $10, %eax
.p2align 4,,10
.p2align 3
.L4:
movl signal(%rip), %ecx
addl %ecx, %edx
subl $1, %eax
jne .L4
leaq .LC0(%rip), %rcx
movl %edx, tmp.0(%rip)
call printf.constprop.0
xorl %eax, %eax
addq $40, %rsp
ret
.seh_endproc
Thank you.