
I'm trying to write a "hello world" program to test inline assembler in g++ (I'm still learning AT&T syntax).

The code is:

#include <stdlib.h>
#include <stdio.h>

# include <iostream> 

using namespace std;

int main() {
    int c,d;

    __asm__ __volatile__ (
        "mov %eax,1;    \n\t"
        "cpuid;         \n\t"
        "mov %edx, $d;  \n\t"
        "mov %ecx, $c;  \n\t"
    );

    cout << c << " " << d << "\n";

    return 0;
}

I'm getting the following error:

inline1.cpp: Assembler messages:
inline1.cpp:18: Error: unsupported instruction `mov'
inline1.cpp:19: Error: unsupported instruction `mov'

Can you help me get it working?

Thanks

Chocksmith
  • Compiling with: g++ inline1.cpp -o test – Chocksmith Jan 25 '17 at 12:50
  • AT&T syntax has almost all operands reversed. Instead of `dst, src` it is `src, dst`. Your code clobbers registers but doesn't use an extended assembler template to tell GCC that. Have you considered the CPUID intrinsic? – Michael Petch Jan 25 '17 at 12:51
  • I do not want to use intrinsics. Just want to learn/test assembler inlining. I was trying to write a simple code before testing something more complex and use an extended template. – Chocksmith Jan 25 '17 at 12:53
  • If you [read more about the GAS AT&T syntax](https://en.wikibooks.org/wiki/X86_Assembly/GAS_Syntax) you will soon learn that most instructions have a *suffix* to tell the size of the operation. For example `movb` to move a byte. And of course (as already mentioned) the order of operands is reversed. – Some programmer dude Jan 25 '17 at 12:54
  • If you want to learn assembly I highly recommend not using inline assembly. It is hard to get right with _GCC_ and fraught with gotchas. You can always write a separate assembly file, assemble it, and link it with your _C_ code. https://gcc.gnu.org/wiki/DontUseInlineAsm – Michael Petch Jan 25 '17 at 12:56
  • Just trying to learn "inline assembler" in g++. I have many thousands of hours of experience programming assembler in micro controllers. – Chocksmith Jan 25 '17 at 12:58
  • `int c,d; int a = 1; __asm__ __volatile__ ( "cpuid" : "=c"(c), "=d"(d), "+a"(a) :: "ebx" );` – Michael Petch Jan 25 '17 at 13:08
  • https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html – Michael Petch Jan 25 '17 at 13:24
  • Make sure you actually **need** inline assembly, as opposed to standalone. – Jester Jan 25 '17 at 13:26
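
The operand-order and suffix points raised in the comments above can be sketched in a few lines. This is only an illustration under assumptions: the function names `load_immediate` and `copy_through_eax` are made up for this example, it uses GCC's extended asm (so registers are written with `%%`), and the non-x86 branch exists only so the snippet still compiles elsewhere.

```cpp
#include <cstdint>

// Hypothetical helper: AT&T syntax puts the source first and the destination
// last, and '$' marks an immediate value.
static inline int32_t load_immediate() {
    int32_t value;
#if defined(__i386__) || defined(__x86_64__)
    // movl = 32-bit move; %0 is the output operand picked by the compiler.
    __asm__("movl $42, %0" : "=r"(value));
#else
    value = 42;  // fallback so the sketch builds on non-x86 targets
#endif
    return value;
}

// Hypothetical helper: copies src to the result by way of EAX.
static inline int32_t copy_through_eax(int32_t src) {
    int32_t dst;
#if defined(__i386__) || defined(__x86_64__)
    __asm__("movl %1, %%eax\n\t"  // source operand -> EAX
            "movl %%eax, %0"      // EAX -> destination operand
            : "=r"(dst)           // output
            : "r"(src)            // input
            : "eax");             // EAX is overwritten, so it is a clobber
#else
    dst = src;  // fallback so the sketch builds on non-x86 targets
#endif
    return dst;
}
```

Calling `load_immediate()` yields 42 and `copy_through_eax(x)` returns `x` unchanged; the point is only the `src, dst` ordering and the explicit `l` suffix.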

2 Answers


Your assembly code is not valid. Please read up on Extended Asm carefully. Here's another good overview. Here is some CPUID example code from here:

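#include <stdint.h>
// Run CPUID for the given leaf ("code"); EAX is returned in *a and EDX in *d.
static inline void cpuid(int code, uint32_t* a, uint32_t* d)
{
    // CPUID also overwrites EBX and ECX, so they are listed as clobbers.
    asm volatile ( "cpuid" : "=a"(*a), "=d"(*d) : "0"(code) : "ebx", "ecx" );
}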

Note the format:

  • the first colon is followed by the output operands: "=a"(*a), "=d"(*d); "=a" is eax and "=d" is edx
  • the second colon is followed by the input operands: "0"(code); "0" means that code should occupy the same location as output operand 0 (eax in this case)
  • the third colon is followed by the list of clobbered registers: "ebx", "ecx"
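
The same three-part layout applies to any instruction. Here is a minimal sketch that uses `addl` instead of `cpuid` so the result is predictable. The name `add_u32` is invented for this illustration, and the fallback branch is only there to keep the snippet buildable on non-x86 targets:

```cpp
#include <cstdint>

// Hypothetical helper showing outputs, inputs and clobbers in one statement.
static inline uint32_t add_u32(uint32_t x, uint32_t y) {
    uint32_t sum;
#if defined(__i386__) || defined(__x86_64__)
    __asm__("addl %2, %0"     // operand 0 (which starts out as x) += operand 2
            : "=r"(sum)       // outputs: operand 0, any general register
            : "0"(x), "r"(y)  // inputs: x shares operand 0's register; y is operand 2
            : "cc");          // clobbers: addl modifies the flags
#else
    sum = x + y;  // fallback so the sketch builds on non-x86 targets
#endif
    return sum;
}
```

The matching constraint `"0"(x)` plays the same role here as `"0"(code)` does in the CPUID example: the input is loaded into whatever register was chosen for output operand 0.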
AMA
  • movl and -m32 did not work. Still trying to understand what the code does :-) – Chocksmith Jan 25 '17 at 13:19
  • @Chocksmith rewrote the answer once again, hopefully it is more helpful now :) – AMA Jan 25 '17 at 14:34
  • *"Your target is 64 bits and mov expects a 64-bit type."* No, in GAS syntax, `mov` expects the size the instruction's suffix indicates that it should expect. I think you probably know this, but the bottom part of your answer is either confusing or wrong. – Cody Gray - on strike Jan 25 '17 at 19:06
  • @CodyGray I've edited the answer. Do you think it is less confusing now? – AMA Jan 25 '17 at 20:41
  • Yes, much better. Unfortunately, the downvote wasn't mine, so I can't remove it! Presumably, the reason it isn't able to deduce that `movl` was meant is because (A) the operands were in the wrong order, and (B) registers have to be escaped with a *double* `%`. – Cody Gray - on strike Jan 26 '17 at 11:10
  • Very useful! Thanks for your answer! – Chocksmith Jan 26 '17 at 13:19
  • One nitpick is that the OP wanted _ECX_ and _EDX_ in this case. – Michael Petch Jan 27 '17 at 00:29
  • @CodyGray: The actual source of the `unsupported instruction 'mov"` error in the question's code is `mov %edx, $d`, i.e. an immediate destination. (It's GNU C Basic Asm, no operands, so the single `%` isn't a meta character.) I was hoping this would be a useful duplicate for [Unsupported instruction \`mov' (SOLVED)](https://stackoverflow.com/q/72071627) since it's the same bug, although it seems it was a different reason for having the bug. (Some mistaken idea about what `$` does, rather than not swapping the operand-order for Intel -> AT&T) – Peter Cordes May 01 '22 at 02:41
  • The 2nd half of this answer is still wrong, if it's talking about the code in the question, not some updated attempt. Nothing to do with operand-size. – Peter Cordes May 01 '22 at 02:42
  • @PeterCordes removed the second part. – AMA May 02 '22 at 07:23
  • @Chocksmith feel free to remove the approval if you feel it does not answer the question. No problem with me there :) – AMA May 02 '22 at 07:23

I kept @AMA's answer as the accepted one because it was complete enough. But after giving it some thought, I concluded that it is not 100% correct.

The code I was trying to implement in GCC is the one below (Microsoft Visual Studio version).

int c,d;
_asm
{
  mov eax, 1;
  cpuid;
  mov d, edx;
  mov c, ecx;
}

When cpuid executes with eax set to 1, feature information is returned in ecx and edx.
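
To make those two registers concrete, here is a small sketch that decodes a few of the leaf 1 feature bits. The struct and function names are invented for illustration; the bit positions (EDX bit 23 for MMX, 25 for SSE, 26 for SSE2; ECX bit 0 for SSE3, 28 for AVX) follow the CPUID documentation for EAX=1. It takes the register values as plain integers, so it does not depend on how they were obtained.

```cpp
#include <cstdint>

// Hypothetical container for a handful of CPUID leaf 1 feature flags.
struct Leaf1Features {
    bool mmx, sse, sse2;  // reported in EDX
    bool sse3, avx;       // reported in ECX
};

// True if the given bit of reg is set.
static inline bool bit_set(uint32_t reg, unsigned bit) {
    return ((reg >> bit) & 1u) != 0;
}

// Decode the ECX/EDX values that CPUID returns for EAX=1.
static inline Leaf1Features decode_leaf1(uint32_t ecx, uint32_t edx) {
    Leaf1Features f;
    f.mmx  = bit_set(edx, 23);
    f.sse  = bit_set(edx, 25);
    f.sse2 = bit_set(edx, 26);
    f.sse3 = bit_set(ecx, 0);
    f.avx  = bit_set(ecx, 28);
    return f;
}
```

With the variables from the snippet above, this would be called as `decode_leaf1(c, d)`.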

The suggested code returns the values from eax ("=a") and edx ("=d"). This can easily be seen in gdb:

(gdb) disassemble cpuid
Dump of assembler code for function cpuid(int, uint32_t*, uint32_t*):
  0x0000000000000a2a <+0>:  push   %rbp
  0x0000000000000a2b <+1>:  mov    %rsp,%rbp
  0x0000000000000a2e <+4>:  push   %rbx
  0x0000000000000a2f <+5>:  mov    %edi,-0xc(%rbp)
  0x0000000000000a32 <+8>:  mov    %rsi,-0x18(%rbp)
  0x0000000000000a36 <+12>: mov    %rdx,-0x20(%rbp)
  0x0000000000000a3a <+16>: mov    -0xc(%rbp),%eax
  0x0000000000000a3d <+19>: cpuid   
  0x0000000000000a3f <+21>: mov    -0x18(%rbp),%rcx
  0x0000000000000a43 <+25>: mov    %eax,(%rcx)        <== HERE
  0x0000000000000a45 <+27>: mov    -0x20(%rbp),%rax
  0x0000000000000a49 <+31>: mov    %edx,(%rax)        <== HERE
  0x0000000000000a4b <+33>: nop
  0x0000000000000a4c <+34>: pop    %rbx
  0x0000000000000a4d <+35>: pop    %rbp
  0x0000000000000a4e <+36>: retq   
End of assembler dump.

The code that generates something closer to what I want is (EDITED based on feedback in the comments):

static inline void cpuid2(uint32_t* d, uint32_t* c)
{
   int a = 1;
   asm volatile ( "cpuid" : "=d"(*d), "=c"(*c), "+a"(a) :: "ebx" );
}

The result is:

(gdb) disassemble cpuid2
Dump of assembler code for function cpuid2(uint32_t*, uint32_t*):
   0x00000000000009b0 <+0>:   push   %rbp
   0x00000000000009b1 <+1>:   mov    %rsp,%rbp
   0x00000000000009b4 <+4>:   push   %rbx
   0x00000000000009b5 <+5>:   mov    %rdi,-0x20(%rbp)
   0x00000000000009b9 <+9>:   mov    %rsi,-0x28(%rbp)
   0x00000000000009bd <+13>:  movl   $0x1,-0xc(%rbp)
   0x00000000000009c4 <+20>:  mov    -0xc(%rbp),%eax
   0x00000000000009c7 <+23>:  cpuid  
   0x00000000000009c9 <+25>:  mov    %edx,%esi
   0x00000000000009cb <+27>:  mov    -0x20(%rbp),%rdx
   0x00000000000009cf <+31>:  mov    %esi,(%rdx)
   0x00000000000009d1 <+33>:  mov    -0x28(%rbp),%rdx
   0x00000000000009d5 <+37>:  mov    %ecx,(%rdx)
   0x00000000000009d7 <+39>:  mov    %eax,-0xc(%rbp)
   0x00000000000009da <+42>:  nop
   0x00000000000009db <+43>:  pop    %rbx
   0x00000000000009dc <+44>:  pop    %rbp
   0x00000000000009dd <+45>:  retq   
End of assembler dump.

Just to be clear... I know that there are better ways of doing it. But the purpose here is purely educational. Just want to understand how it works ;-)

-- edited (removed personal opinion) ---

Chocksmith
  • This code only happens to generate what you want by sheer luck. `"0"(1)` on the last parameter means that it will use the same constraint as the 0th one, which is `"=d"(*d)`. This means that 1 is going to be passed into the assembler template through register _EDX_, but you really need to pass the value 1 through _EAX_! You got lucky that the value 1 is already in _EAX_ and then copied to _EDX_. – Michael Petch Jan 26 '17 at 22:47
  • As well, any register (or memory) that is modified but not made known to _GCC_ (through the output constraints) has to be specified in the clobber list. CPUID will destroy the values in EAX, EBX, ECX, EDX. You haven't told _GCC_ that all of them are being overwritten. – Michael Petch Jan 26 '17 at 22:51
  • Based on the code in my comment under your original question, and adjusting it to work within the function you created in this answer, the following code would work properly: `static inline void cpuid2(uint32_t* d, uint32_t* c) { int a = 1; __asm__ __volatile__ ( "cpuid" : "=c"(*c), "=d"(*d), "+a"(a) :: "ebx" ); }` – Michael Petch Jan 26 '17 at 22:55
  • In that case _ECX_ and _EDX_ are marked as outputs, which is good. So the compiler knows that those two registers can't be relied on to have the same values as they had before entering your assembly template. The `"+a"` says that the value in _EAX_ will be used as both input and output. The value upon input will be the 1 (which we placed in the variable a), and because it is listed as an output as well, the compiler knows that it can't rely on the value in _EAX_ being the same. The problem is that GCC doesn't know that _EBX_ has been modified. You need to specify it separately in the clobber list. – Michael Petch Jan 26 '17 at 22:58
  • Failure to get all of this correct can cause code to be generated that appears to work, but as code is added (or you turn optimizations on) you may discover that it no longer works, or even worse, that it seems to work but the rest of the program behaves unexpectedly at times. – Michael Petch Jan 26 '17 at 23:00
  • @MichaelPetch, many thanks. I'll edit the answer. Before that... `asm volatile ( "cpuid" : "=d"(*d), "=c"(*c) : "a"(1) : "ebx" )` would be a valid answer as well, right? (No need for a local variable.) – Chocksmith Jan 26 '17 at 23:13
  • No it wouldn't because _EAX_ gets clobbered by CPUID. You have to specify that _EAX_ is actually both an input and an output because CPUID will overwrite _EAX_. And if you tell it to make it an output you need a temporary variable to store the value back into it (even though that value won't be used). – Michael Petch Jan 26 '17 at 23:18
  • You are right. I didn't know that CPUID also overwrites EAX when EAX=1. It is documented here: http://x86.renejeschke.de/html/file_module_x86_id_45.html I will edit the answer again. – Chocksmith Jan 26 '17 at 23:21
  • looks right now. It is tricky. Thanks for your help! By the way I see no need for a downvote on the answer... It is constructive. Tks! – Chocksmith Jan 26 '17 at 23:28
  • I removed my downvote. Someone else downvoted it as well. The only reason I don't upvote is because your answer doesn't really explain how the assembler template works, what all the input and output constraints are doing, and why. – Michael Petch Jan 26 '17 at 23:29
  • The example compiler output would be much better with optimization enabled, so it would only be the actually necessary instructions, not all this debug-mode noise of other stores. Also, this doesn't need to be `asm volatile`. CPUID with EAX=1 will give the same outputs every time you run it. (Unless some of the things in that leaf can differ across CPU cores, or after VM migration?) Otherwise the asm statement only needs to run to produce the output. I assume you don't care about it as a serializing instruction (extra slow memory barrier) since there's no `"memory"` clobber. – Peter Cordes May 01 '22 at 02:49
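
The points in the last comment can be sketched as follows. This is one possible variant, not the answer's code: `CpuidRegs` and `cpuid_leaf` are hypothetical names, all four registers are declared as outputs so that no clobber list is needed, and `volatile` is dropped so the compiler is free to remove or merge calls whose results are unused or repeated. The fallback branch only keeps the snippet buildable on non-x86 targets.

```cpp
#include <cstdint>

// Hypothetical container for the four registers written by CPUID.
struct CpuidRegs {
    uint32_t eax, ebx, ecx, edx;
};

// Hypothetical wrapper: leaf goes in through EAX, subleaf through ECX.
// (Leaf 1 ignores the subleaf; it is passed so ECX has a defined value.)
static inline CpuidRegs cpuid_leaf(uint32_t leaf, uint32_t subleaf = 0) {
    CpuidRegs r = {0, 0, 0, 0};
#if defined(__i386__) || defined(__x86_64__)
    __asm__("cpuid"
            : "=a"(r.eax), "=b"(r.ebx), "=c"(r.ecx), "=d"(r.edx)  // all four are outputs
            : "0"(leaf), "2"(subleaf));  // inputs share EAX (operand 0) and ECX (operand 2)
#else
    (void)leaf;     // fallback so the sketch builds on non-x86 targets;
    (void)subleaf;  // the result stays zeroed there
#endif
    return r;
}
```

Building with optimization enabled (for example `g++ -O2`) should then leave little more than the `cpuid` instruction and the stores of its results. If CPUID were wanted for its serializing side effect rather than its outputs, `volatile` and a `"memory"` clobber would be needed again.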