
I'm trying to compile a simple C program (Win7 32bit, Mingw32 Shell and GCC 5.3.0). The C code is like this:

#include <stdio.h>
#include <stdlib.h>

#define _set_tssldt_desc(n,addr,type) \
__asm__ ("movw $104,%1\n\t" \
    :\
    :"a" (addr),\
     "m" (*(n)),\
     "m" (*(n+2)),\
     "m" (*(n+4)),\
     "m" (*(n+5)),\
     "m" (*(n+6)),\
     "m" (*(n+7))\
    )

#define set_tss_desc(n,addr) _set_tssldt_desc(((char *) (n)),addr,"0x89")


char *n;
char *addr;

int main(void) {
  char *n = (char *)malloc(100*sizeof(int));
  char *addr =  (char *)malloc(100*sizeof(int));
  set_tss_desc(n, addr);
  free(n);
  free(addr);
  return 0;
}

_set_tssldt_desc(n,addr,type) is a macro whose body is inline assembly. set_tss_desc(n,addr) is a second macro that wraps _set_tssldt_desc(n,addr,type), and it is invoked in the main function.

When I try to compile this code, the compiler shows the following error:

$ gcc test.c
    test.c: In function 'main':
    test.c:5:1: error: 'asm' operand has impossible constraints
     __asm__ ("movw $104,%1\n\t" \
     ^
    test.c:16:30: note: in expansion of macro '_set_tssldt_desc'
     #define set_tss_desc(n,addr) _set_tssldt_desc(((char *) (n)),addr,"0x89")
                                  ^
    test.c:25:3: note: in expansion of macro 'set_tss_desc'
       set_tss_desc(n, addr);
       ^

The strange thing is that if I comment out the macro invocation in the main function, the code compiles successfully.

int main(void) {
  char *n = (char *)malloc(100*sizeof(int));
  char *addr =  (char *)malloc(100*sizeof(int));
  //I comment it out and code compiled.
  //set_tss_desc(n, addr); 
  free(n);
  free(addr);
  return 0;
}

Or, if I delete one of the memory operands from the operand list of the asm statement, it also compiles.

#include <stdio.h>
#include <stdlib.h>

#define _set_tssldt_desc(n,addr,type) \
__asm__ ("movw $104,%1\n\t" \
    :\
    :"a" (addr),\
     "m" (*(n)),\
     "m" (*(n+2)),\
     "m" (*(n+4)),\
     "m" (*(n+5)),\
     "m" (*(n+6))\
    )
//I DELETE "m" (*(n+7)) , code compiled

#define set_tss_desc(n,addr) _set_tssldt_desc(((char *) (n)),addr,"0x89")


char *n;
char *addr;

int main(void) {
  char *n = (char *)malloc(100*sizeof(int));
  char *addr =  (char *)malloc(100*sizeof(int));
  set_tss_desc(n, addr); 
  free(n);
  free(addr);
  return 0;
}

Can someone explain to me why that is and how to fix this?

Michael Petch
  • I'd guess out of registers. You're compiling on *mingw*, in hosted, under a certain calling convention. – Antti Haapala -- Слава Україні Sep 02 '17 at 18:18
  • Upgrading your compiler may help. GCC 5 is fairly old. GCC 7.2 is the latest. I'd recommend the [Nuwen Mingw distribution](https://nuwen.net/mingw.html) as an alternative small development distro for Windows. – tambre Sep 02 '17 at 18:26
  • Any code that is using user-defined inline assembler is not a 'simple C program'. – Jonathan Leffler Sep 02 '17 at 19:07
  • If I look beyond the fact that you are modifying memory via a constraint but declare it as read only - the issue is that with no optimizations the compiler is running out of general purpose registers. One register is being used for each memory reference. You should consider passing the array `n` in as a single constraint. But the bigger question is why you are going to be using inline assembly. Inline assembly should be used as a last resort and only if you know what is going on and all the nuances. – Michael Petch Sep 02 '17 at 19:35
  • I think though based on the function description you are doing some kind of OS development. You should do all the TSS structure manipulation in _C_ and then use inline assembly to call the `lidt` instruction passing the TSS structure address as a constraint. I could be wrong about my guess as to what your actual intentions are, but anything that limits inline assembly usage is a good thing. – Michael Petch Sep 02 '17 at 20:05
  • When I said `lidt` above, I meant `lgdt` since the TSS descriptor is an entry in the GDT - sorry. I'm assuming the macro argument you never use in the example of 0x89 is meant to be the flags for present / non busy task. Maybe you should ask another question about a way of creating an array of structures in _C_ that can act as a descriptor table. Then it is a matter of just filling in the data structure using traditional _C_ code and ultimately passing that to the `lgdt` instruction via inline assembly. – Michael Petch Sep 02 '17 at 20:16

1 Answer


As @MichaelPetch says, you're approaching this the wrong way. If you're trying to set up an operand for lgdt, do that in C and only use inline-asm for the lgdt instruction itself. See the tag wiki, and the tag wiki.

Related: a C struct/union for messing with Intel descriptor-tables: How to do computations with addresses at compile/linking time?. (The question wanted to generate the table as static data, hence asking about breaking addresses into low / high halves at compile time).

Also: Implementing GDT with basic kernel for some C + asm GDT manipulation. Or maybe not, since the answer there just says the code in the question is problematic, without a detailed fix.

Linker error setting loading GDT register with LGDT instruction using Inline assembly has an answer from Michael Petch, with some links to more guides/tutorials.


It's still useful to answer the specific question, even though the right fix is https://gcc.gnu.org/wiki/DontUseInlineAsm.

This compiles fine with optimization enabled.

With -O0, gcc doesn't notice or take advantage of the fact that the operands are all small constant offsets from each other, and can use the same base register with an offset addressing mode. It wants to put a pointer to each input memory operand into a separate register, but runs out of registers. With -O1 or higher, CSE does what you'd expect.

You can see this in a reduced example with the last 3 memory operands commented, and changing the asm string to include an asm comment with all the operands. From gcc5.3 -O0 -m32 on the Godbolt compiler explorer:

#define _set_tssldt_desc(n,addr,type)     \
__asm__ ("movw $104,%1\n\t"               \
    "#operands: %0, %1, %2, %3\n"         \
     ...

void simple_wrapper(char *n, char *addr) {
  set_tss_desc(n, addr);  
}


        pushl   %ebp
        movl    %esp, %ebp
        pushl   %ebx
        movl    8(%ebp), %eax
        leal    2(%eax), %ecx
        movl    8(%ebp), %eax
        leal    4(%eax), %ebx
        movl    12(%ebp), %eax
        movl    8(%ebp), %edx
#APP    # your inline-asm code
        movw $104,(%edx)
        #operands: %eax, (%edx), (%ecx), (%ebx)
#NO_APP
        nop                    # no idea why the compiler inserted a literal NOP here (not .p2align)
        popl    %ebx
        popl    %ebp
        ret

But with optimization enabled, you get

simple_wrapper:
        movl    4(%esp), %edx
        movl    8(%esp), %eax
#APP
        movw $104,(%edx)
        #operands: %eax, (%edx), 2(%edx), 4(%edx)
#NO_APP
        ret

Notice how the later operands use base+disp addressing modes.


Your constraints are totally backwards. You're writing to memory that you've told the compiler is an input operand. It will assume that the memory is not modified by the asm statement, so if you load from it in C, it might move that load ahead of the asm. And other possible breakage.

If you had used "=m" output operands, this code would be correct (but still inefficient compared to letting the compiler do it for you.)
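As a minimal sketch of what "=m" output operands could look like (not the asker's full macro): the struct and function names below are hypothetical, and only the first store (104) and the type byte (0x89) from the question are shown. The 16-bit store gets a uint16_t operand so the compiler knows both bytes are written. On non-x86 targets the sketch falls back to plain C.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical view of the first six bytes of a descriptor.
   Field names are illustrative, not taken from the question. */
struct desc_head {
    uint16_t limit;     /* bytes 0-1, written by movw */
    uint16_t base_low;  /* bytes 2-3, untouched here  */
    uint8_t  base_mid;  /* byte 4, untouched here     */
    uint8_t  type;      /* byte 5, written by movb    */
};

static void set_limit_and_type(struct desc_head *d)
{
    assert(d != 0);
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* Each location the asm writes is declared as an "=m" OUTPUT,
       with an operand type as wide as the store that targets it. */
    __asm__ ("movw $104, %[limit]\n\t"
             "movb $0x89, %[type]"
             : [limit] "=m" (d->limit),
               [type]  "=m" (d->type));
#else
    /* Assumed fallback for other targets: the same stores in plain C. */
    d->limit = 104;
    d->type = 0x89;
#endif
}
```

Because the operands are outputs, the compiler will not hoist later loads of d->limit or d->type above the asm statement.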

You could have written your asm to do the offsetting itself from a single memory operand, but then you'd need to tell the compiler about all the memory written by the asm statement; e.g. "=m" (*(struct {char a; char x[];} *) n) to tell it that you write the entire object starting at n. (See this answer).

AT&T syntax x86 memory operands are always offsetable, so you can use 2 + %[nbase] instead of a separate operand, if you do

asm("movw $104,    %[nbase]\n\t"
    "movw $123, 2 + %[nbase]\n\t"
    : [nbase] "=m" (*(struct {char a; char x[];} *) n)
    : [addr] "ri" (addr)
);

gas will warn about 2 + (%ebx) or whatever it ends up being, but that's ok.

Using a separate memory output operand for each place you write will avoid any problems about telling the compiler which memory you write. But you got it wrong: you've told the compiler that your code doesn't use n+1 when in fact you're using movw $104 to store 2 bytes starting at n. So that should be a uint16_t memory operand. If this sounds complicated, https://gcc.gnu.org/wiki/DontUseInlineAsm. Like Michael said, do this part in C with a struct, and only use inline asm for a single instruction that needs it.
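Doing this part in C with a struct might look roughly like the sketch below. The struct and function names are hypothetical; the field offsets follow the operands in the question (n, n+2, n+4, n+5, n+6, n+7), which match the standard 8-byte x86 descriptor layout. The limit of 104 and the type byte 0x89 are the values from the question, and the byte at offset 6 is assumed to be zero.

```c
#include <assert.h>
#include <stdint.h>

/* One 8-byte GDT entry. Field names are illustrative. */
struct gdt_descriptor {
    uint16_t limit_low;         /* offset 0: limit bits 0-15         */
    uint16_t base_low;          /* offset 2: base bits 0-15          */
    uint8_t  base_mid;          /* offset 4: base bits 16-23         */
    uint8_t  access;            /* offset 5: type / present byte     */
    uint8_t  limit_high_flags;  /* offset 6: limit bits 16-19, flags */
    uint8_t  base_high;         /* offset 7: base bits 24-31         */
};

/* Hypothetical helper: fill a TSS descriptor for a TSS at address `base`. */
static void fill_tss_desc(struct gdt_descriptor *d, uint32_t base)
{
    assert(d != 0);
    d->limit_low        = 104;
    d->base_low         = (uint16_t)(base & 0xFFFF);
    d->base_mid         = (uint8_t)((base >> 16) & 0xFF);
    d->access           = 0x89;   /* present, available 32-bit TSS */
    d->limit_high_flags = 0;      /* assumption: no high limit bits or flags */
    d->base_high        = (uint8_t)(base >> 24);
}
```

With the table filled in like this, inline asm is only needed for the single instruction that loads it, and the compiler is free to merge or reorder the stores.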

It would obviously be more efficient to use fewer, wider store instructions. IDK what you're planning to do next, but any adjacent constants should be coalesced into a 32-bit store, like movl $(104 + (0x1234 << 16)), %[n0] or something. Again, https://gcc.gnu.org/wiki/DontUseInlineAsm.
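The coalescing idea can also be sketched in plain C. The helper names here are hypothetical, and 0x1234 is just the placeholder constant from the paragraph above. The two adjacent 16-bit fields are combined into one 32-bit value and written with a single memcpy, which compilers typically turn into one store.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Combine two adjacent 16-bit fields into the 32-bit value that a
   little-endian machine would store at the lower field's address. */
static uint32_t low_dword(uint16_t limit, uint16_t base_low)
{
    return (uint32_t)limit | ((uint32_t)base_low << 16);
}

/* Write both fields at once; memcpy avoids alignment and aliasing
   problems that a cast to uint32_t * could cause. */
static void store_low_dword(unsigned char *desc, uint16_t limit, uint16_t base_low)
{
    uint32_t w = low_dword(limit, base_low);
    assert(desc != 0);
    memcpy(desc, &w, sizeof w);
}
```

For example, low_dword(104, 0x1234) yields 0x12340068.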

Peter Cordes
  • I missed that. If you take a page from Linus you can do what you want without a warning and achieve the same behaviour. But you probably already know the trick Linus uses in the kernel. The trick will work for clang and gcc. – Michael Petch Sep 02 '17 at 21:12