I have the following C program

    int main() {
        char string[] = "Hello, world.\r\n";
        __asm__ volatile ("syscall;" :: "a" (1), "D" (0), "S" ((unsigned long) string), "d" (sizeof(string) - 1));
    }

which I want to run under Linux on 64-bit x86. I call the syscall for "write" with 0 as fd argument because this is stdout.

If I compile under gcc with -O3, it does not work. A look into the assembly code

    .file   "test_for_o3.c"
    .text
    .section    .text.startup,"ax",@progbits
    .p2align 4,,15
    .globl  main
    .type   main, @function
main:
.LFB0:
    .cfi_startproc
    subq    $40, %rsp
    .cfi_def_cfa_offset 48
    xorl    %edi, %edi
    movl    $15, %edx
    movq    %fs:40, %rax
    movq    %rax, 24(%rsp)
    xorl    %eax, %eax
    movq    %rsp, %rsi
    movl    $1, %eax
#APP
# 5 "test_for_o3.c" 1
    syscall;
# 0 "" 2
#NO_APP
    movq    24(%rsp), %rcx
    xorq    %fs:40, %rcx
    jne .L5
    xorl    %eax, %eax
    addq    $40, %rsp
    .cfi_remember_state
    .cfi_def_cfa_offset 8
    ret
.L5:
    .cfi_restore_state
    call    __stack_chk_fail@PLT
    .cfi_endproc
.LFE0:
    .size   main, .-main
    .ident  "GCC: (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0"
    .section    .note.GNU-stack,"",@progbits

tells us that gcc has simply not put the string data into the assembly code. If I declare "string" as "volatile" instead, it works fine.

However, the idea of "volatile" is just to use it for variables that can change their values by (from the view of the executing function) unexpected events, isn't it? "volatile" can make code much slower, hence it should be avoided if possible.

As I would suppose, gcc must assume that the content of "string" must not be ignored because the pointer "string" is used as an input parameter in the inline assembly (and gcc has no idea what the inline assembly code will do with it).

If this is "allowed" behaviour of gcc, where can I read more about all the formal constraints I have to be aware of when writing code for -O3?

A second question would be what exactly the "volatile" qualifier on the inline assembly statement does. I have gotten used to marking all inline assembly statements with "volatile" because, in some situations, they did not work otherwise.

Kolodez
  • 7
    obviously a different meaning of 'simple C hello world' than the one I am used to – pm100 Nov 29 '18 at 23:26
  • Ah, this is because when you pass a pointer through a register, it doesn't guarantee that the data the pointer points to has been realized into memory. GCC with optimizations on can exhibit this behavior when using local arrays of data on the stack. A quick fix would be to add a `"memory"` clobber to your inline assembly template. This will ensure any data not realized into memory by the compiler is updated and then reloaded from memory if necessary after the inline assembly template is finished. – Michael Petch Nov 29 '18 at 23:30
  • This is a duplicate of another question, I just can't find it. – Michael Petch Nov 29 '18 at 23:34
  • 1
    So try: `__asm__ volatile ("syscall;" :: "a" (1), "D" (0), "S" ((unsigned long) string), "d" (sizeof(string) - 1) : "memory"); }` – Michael Petch Nov 29 '18 at 23:37
  • 3
    @MichaelPetch: I think you're looking for [get string length in inline GNU Assembler](https://stackoverflow.com/a/45656087) where David Wohlferd shows how to use a dummy memory operand along with a pointer input operand, to tell gcc that the contents of the pointed-to memory are also an input. (also [Looping over arrays with inline assembly](https://stackoverflow.com/q/34244185) where I mentioned it). I've been meaning to write a canonical Q&A about pointer input operands, because it's worth its own Q&A. – Peter Cordes Nov 29 '18 at 23:40
  • There's already [Informing clang that inline assembly reads a particular region of memory](https://stackoverflow.com/q/31628554) without an answer, but I plan to at some point write one showing that optimization sees dead stores, like this case with the array init. – Peter Cordes Nov 29 '18 at 23:41
  • 2
    Yep, that's the one. I mentioned the `memory` clobber because it's the easiest to get right, but adding the dummy constraint is the preferable option. The alternative is `__asm__ volatile ("syscall;" :: "a" (1), "D" (0), "S" ((unsigned long) string), "d" (sizeof(string) - 1), "m" (*(const char (*)[]) string)); }` – Michael Petch Nov 29 '18 at 23:41
  • @PeterCordes: Come to think of it, that isn't the one I'm thinking of. There is an existing question specifically involving syscall or int 0x80 where the fellow had the data on the stack and the optimizer didn't realize it into memory. I remember commenting on or answering that question. In fact I believe it is the first question that had us discovering the issue originally. – Michael Petch Nov 29 '18 at 23:45
  • 2
    "I call the syscall for "write" with 0 as fd argument because this is stdout." Umm, fd 0 is stdin. stdout is fd 1. – ottomeister Nov 30 '18 at 03:54
  • Thank you! If I just add `"m" (string)`, it is also working. Is it necessary to cast it to an array of `const char` instead of array of unqualified `char`? – Kolodez Nov 30 '18 at 09:20
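
For reference, the two fixes suggested in the comments above can be sketched side by side. This is only a minimal sketch, assuming GCC (or a compatible compiler) on x86-64 Linux. The helper names `write_with_clobber` and `write_with_mem_operand` are hypothetical and do not appear in the thread. Beyond what the comments propose, the sketch also returns the syscall result from `rax` and lists `rcx` and `r11` as clobbers, because the `syscall` instruction overwrites both registers. The callers are expected to pass fd 1 for stdout, as pointed out above.

```c
#include <stddef.h>

/* Hypothetical helper: raw write syscall (number 1 on x86-64 Linux).
 * The "memory" clobber tells the compiler that the asm may read or write
 * any memory, so pending stores to buf must be done before the syscall. */
long write_with_clobber(int fd, const char *buf, size_t len)
{
    long ret;
    __asm__ volatile ("syscall"
                      : "=a" (ret)                 /* result comes back in rax */
                      : "0" (1L),                  /* syscall number in rax */
                        "D" ((long) fd),           /* rdi: file descriptor */
                        "S" (buf),                 /* rsi: buffer address */
                        "d" (len)                  /* rdx: byte count */
                      : "rcx", "r11", "memory");   /* syscall overwrites rcx, r11 */
    return ret;
}

/* Hypothetical helper: same syscall, but with a dummy "m" input operand
 * instead of the "memory" clobber. The cast to an array of unknown size
 * says that the bytes buf points to are an input, not just the pointer. */
long write_with_mem_operand(int fd, const char *buf, size_t len)
{
    long ret;
    __asm__ volatile ("syscall"
                      : "=a" (ret)
                      : "0" (1L),
                        "D" ((long) fd),
                        "S" (buf),
                        "d" (len),
                        "m" (*(const char (*)[]) buf)  /* pointed-to data is read */
                      : "rcx", "r11");
    return ret;
}
```

With either helper, something like `write_with_clobber(1, string, sizeof(string) - 1)` should keep the array initialization alive at -O3, since the compiler now knows the asm statement reads the array contents. The dummy operand is the narrower of the two: it only describes the one buffer, whereas the `"memory"` clobber makes the compiler assume all memory may have been read or changed.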

0 Answers