
According to this answer, numeric constants passed to variadic functions are always treated as int if they fit in one. This makes me wonder why the following code works with both int and long long. Consider the following function call:

testfunc(4, 1000, 1001, 1002, 1003);

testfunc looks like this:

#include <stdarg.h>
#include <stdio.h>

void testfunc(int n, ...)
{
    int k;
    va_list marker;

    va_start(marker, n);
    for(k = 0; k < n; k++) {
        int x = va_arg(marker, int);
        printf("%d\n", x);
    }
    va_end(marker);
}

This works fine. It prints 1000, 1001, 1002, 1003. But to my surprise, the following code works as well:

void testfunc(int n, ...)
{
    int k;
    va_list marker;

    va_start(marker, n);
    for(k = 0; k < n; k++) {
        long long x = va_arg(marker, long long);
        printf("%lld\n", x);
    }
    va_end(marker); 
}

Why does it work with long long too? I thought that integer constants were passed as int if they fit in one (cf. the link above).

Heck, it's even working when alternating between int and long long. This is confusing the heck out of me:

void testfunc(int n, ...)
{
    int k;
    va_list marker;

    va_start(marker, n);
    for(k = 0; k < n; k++) {

        if(k & 1) {
            long long x = va_arg(marker, long long);
            printf("B: %lld\n", x);
        } else {
            int x = va_arg(marker, int);
            printf("A: %d\n", x);
        }
    }
    va_end(marker); 
}

How can this be? I thought all my parameters were passed as int... why can I arbitrarily switch back and forth between int and long long with no trouble at all? I'm really confused now...

Thanks for any light shed onto this!

Andreas
  • What chip are you using? What's the ABI (Application Binary Interface) for it? You may be getting away with it because you're on a 64-bit machine and the first N variadic arguments are passed in 64-bit registers, which are big enough for `long long`. Try calling the function with 20 arguments, or more than 32. The idea is that if there are too many arguments to fit in the registers, the extras will be passed on the stack, as `int`. – Jonathan Leffler Oct 30 '16 at 15:46
  • I'm on 64-bit Ubuntu with gcc. – Andreas Oct 30 '16 at 15:46
  • Undefined behavior includes behavior that works by accident. It certainly will on a 64-bit Intel processor: the ABI demands 8 bytes per argument, the upper 32 bits will always be zero, and the processor is little-endian. Tinker with any of these accidental properties and you'll have an accident. – Hans Passant Oct 30 '16 at 15:48
  • Alright, increasing the argument count to 40 leads to trouble all the way. Thanks, question answered... – Andreas Oct 30 '16 at 15:49
  • If you dive into the assembly code, you'll be able to see exactly what's happening. – jdigital Oct 30 '16 at 15:50

1 Answer


That has nothing to do with C. It is just that the system you used (x86-64) passes the first few arguments in 64-bit registers, even for variadic arguments.

Essentially, on the architecture you used, the compiler produces code that uses a full 64-bit register for each argument, including variadic arguments. This is the ABI agreed upon for the architecture, and has nothing to do with C per se; all programs, no matter how they were produced, are supposed to follow the ABI of the architecture they run on.

On Windows, x86-64 uses rcx, rdx, r8, and r9 for the first four (integer or pointer) arguments, in that order, and the stack for the rest. On Linux, the BSDs, Mac OS X, and Solaris, x86-64 uses rdi, rsi, rdx, rcx, r8, and r9 for the first six (integer or pointer) arguments, in that order, and the stack for the rest.

You can verify this with a trivial example program:

extern void func(int n, ...);

void test_int(void)
{
    func(0, 1, 2);
}

void test_long_long(void)
{
    func(0, 1LL, 2LL);
}

If you compile the above to x86-64 assembly (e.g. gcc -Wall -O2 -march=x86-64 -mtune=generic -S) on Linux, a BSD, Solaris, or Mac OS (X or later), you get approximately the following (AT&T syntax; source, target operand order):

test_int:
        movl    $2, %edx
        movl    $1, %esi
        xorl    %edi, %edi
        xorl    %eax, %eax
        jmp     func

test_long_long:
        movl    $2, %edx
        movl    $1, %esi
        xorl    %edi, %edi
        xorl    %eax, %eax
        jmp     func

i.e. the functions are identical, and do not push the arguments to the stack. Note that jmp func is equivalent to call func; ret, just simpler.
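To make the register behaviour described above concrete, here is a minimal sketch in plain C that models it. The function names are hypothetical, and the model assumes what the listing shows: the argument is loaded with a 32-bit move, which on x86-64 clears the upper 32 bits of the full 64-bit register. It is only an illustration of the mechanism, not something the ABI or the C standard promises.

```c
#include <stdint.h>

/* Models `movl $value, %esi`: writing a 32-bit register on x86-64
   zero-extends into the full 64-bit register. */
uint64_t model_movl(int32_t value)
{
    return (uint64_t)(uint32_t)value;
}

/* Models va_arg(ap, long long): the whole 64-bit register is read. */
int64_t read_as_long_long(uint64_t reg)
{
    return (int64_t)reg;
}

/* Models va_arg(ap, int): only the low 32 bits are read.
   The conversion back to a signed type wraps on mainstream compilers
   (it is implementation-defined in ISO C). */
int32_t read_as_int(uint64_t reg)
{
    return (int32_t)(uint32_t)reg;
}
```

For a small positive constant such as 1000, both readers return 1000, which matches what the question observed. For -1 the model gives -1 when read as int but 4294967295 when read as long long, so the apparent interchangeability already breaks down for negative values.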

However, if you compile for x86 (-m32 -march=i686 -mtune=generic), you get approximately

test_int:
        subl    $16, %esp
        pushl   $2
        pushl   $1
        pushl   $0
        call    func
        addl    $28, %esp
        ret

test_long_long:
        subl    $24, %esp
        pushl   $0
        pushl   $2
        pushl   $0
        pushl   $1
        pushl   $0
        call    func
        addl    $44, %esp
        ret

which shows that the x86 calling conventions on Linux, the BSDs, etc. pass the variadic arguments on the stack. The int variant pushes one 32-bit constant per argument (pushl $x pushes a 32-bit constant x to the stack), while the long long variant pushes each 64-bit constant as two 32-bit halves, high half first.
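The 32-bit stack layout above also explains what goes wrong when int arguments are read back as long long: one 64-bit read swallows two adjacent 32-bit slots. The following sketch models that with an ordinary array. The function name is hypothetical, and it assumes a little-endian machine, as x86 is.

```c
#include <stdint.h>
#include <string.h>

/* Models two adjacent 32-bit stack slots holding int arguments, then reads
   them back as a single 64-bit value, the way va_arg(ap, long long) would
   on 32-bit x86. Assumes a little-endian machine. */
int64_t read_two_int_slots_as_int64(int32_t first, int32_t second)
{
    int32_t slots[2] = { first, second };
    int64_t combined;

    /* memcpy avoids aliasing problems; it copies the 8 bytes unchanged. */
    memcpy(&combined, slots, sizeof combined);
    return combined;
}
```

With the arguments from the question, read_two_int_slots_as_int64(1000, 1001) yields 1000 + 1001 * 2^32 on a little-endian machine rather than 1000, and the second argument is consumed along with the first.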

Therefore, because of the underlying ABI of the operating system and architecture you use, your variadic function shows the "anomaly" you observed. To see the behaviour you expect from the C standard alone, you need to work around the underlying ABI quirk -- for example, by starting your variadic functions with at least six arguments, to occupy the registers on x86-64 architectures, so that the rest, your truly variadic arguments, are passed on the stack.
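If the goal is simply code that behaves the same on every ABI, the type given to va_arg has to match the (promoted) type of what the caller actually passed. A minimal sketch, using a hypothetical helper sum_long_longs that requires every variadic argument to be a long long:

```c
#include <stdarg.h>

/* Hypothetical helper: sums n variadic arguments. The caller must pass
   every one of them as long long (LL suffix or an explicit cast). */
long long sum_long_longs(int n, ...)
{
    va_list ap;
    long long total = 0;

    va_start(ap, n);
    for (int k = 0; k < n; k++) {
        /* The type here matches what the caller passed, so this is
           well defined regardless of registers or stack layout. */
        total += va_arg(ap, long long);
    }
    va_end(ap);
    return total;
}
```

A call such as sum_long_longs(4, 1000LL, 1001LL, 1002LL, 1003LL) is then well defined everywhere, whereas sum_long_longs(4, 1000, 1001, 1002, 1003) would pass ints and fall back into the accidental behaviour discussed above.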

Nominal Animal