
I was trying to implement my own variadic function with the code below, but instead I got undefined behavior (UB).

#include <stdio.h>

void test(int a, ...)
{
    char* arg_a = (char*)&a;
    char* arg_b = arg_a + sizeof(int);
    printf("%c", *arg_b);
}

int main(){
    test(1, 'a');
}

So why does this program not print the letter a? Isn't argument 1 of test() expected to be written into the function's stack frame at a low address, e.g. 0000 0004 (since 0000 0000 would be reserved for the return address), followed by the second argument at the next higher address?

I guess this result is due to some compiler optimization, or is there something else going on?

ryyker
KMG
  • 5
    (a) It is common for C implementations to pass some arguments in registers, in which case they will not be on the stack at all. (If the address of a parameter is taken, the compiler will store the corresponding register to the stack to create an address for it. No address was taken in the code in the question, so the compiler did not do this.) (b) Even if arguments are passed on the stack, the C standard does not define the behavior of attempting to access them using pointer hacks like this. The compiler is permitted to optimize the code without regard to what is attempted here. – Eric Postpischil Jul 23 '20 at 14:18
  • 4
    Basically, the variable argument mechanism must be accessed through the facilities declared in `<stdarg.h>`. It is not possible to implement it through pointer hacks. – Eric Postpischil Jul 23 '20 at 14:18
  • 2
    In addition, stacks can grow downward in memory. And there is probably a return address on the stack somewhere. And those local variables, too. Var args are probably passed in a more complicated way than just pushed onto the stack one at a time... I would stick with `va_start, va_end, va_arg`. – 001 Jul 23 '20 at 14:26
  • @ericpostpischil You say that no address was taken in this code. Do you mean for VAR_ARGS, for int a, or for both? – KMG Jul 23 '20 at 14:30
  • 1
    @KhaledGaber: In `test`, the value `'a'` is passed as an argument. That argument corresponds to the `...` in the declaration. Nothing in the `test` routine takes the address of that argument. (And nothing can, since there is no parameter name for which an `&name` expression can be formed, because there is only the `...` in the declaration, and the `<stdarg.h>` features for accessing the arguments are not used.) So the argument is simply left in a register; it is not copied to the stack. – Eric Postpischil Jul 23 '20 at 14:37
  • 1
    When I build this code, using gcc, I get the following warning occurring at the line: `printf("%c", *arg_b);`: _warning: '*((void *)&a+4)' is used uninitialized in this function [-Wuninitialized]|_, and even though it builds, the Code::Blocks disassembler fails to produce the assembly for the `void test()` function. – ryyker Jul 23 '20 at 14:38
  • Related: [how-are-variable-arguments-implemented-in-gcc](https://stackoverflow.com/questions/12371450/how-are-variable-arguments-implemented-in-gcc) – Felix G Jul 23 '20 at 14:39
  • @FelixG - I do not believe the two questions are that similar. The details of what this OP is asking differ from that in the link. – ryyker Jul 23 '20 at 14:40
  • @FelixG Yeah, that question is actually what made me think I could implement this code, since it said that the variables were on the stack. Maybe they just meant large variables :) – KMG Jul 23 '20 at 14:40
  • @ryyker True... I changed my comment to "related" instead – Felix G Jul 23 '20 at 14:41
  • @EricPostpischil So what if I passed a large struct instead? Would it be stored on the function's stack or somewhere else? – KMG Jul 23 '20 at 14:45
  • @KhaledGaber: ABIs typically specify to pass structures up to a certain size in registers and larger structures on the stack. Nonetheless, if you have two arguments that **must** be passed on the stack, and you attempt to access one using a pointer derived from the address of the other, the compiler is permitted to optimize the code in such a way that the access will not work. – Eric Postpischil Jul 23 '20 at 14:47
  • Thank you all, it's clear now. – KMG Jul 23 '20 at 14:50

1 Answer


Isn't argument 1 of test() expected to be written into the function's stack frame at a low address, e.g. 0000 0004 (since 0000 0000 would be reserved for the return address), followed by the second argument at the next higher address?

This is how it used to work a long time ago, but essentially all ABIs for full-fat processors (as opposed to microcontrollers), defined since the mid-1990s, put the first several arguments in registers instead, to make function calls faster. They do this whether or not the callee is variadic. You can't access registers with pointer arithmetic, so the thing you're trying to do is flat-out impossible. Because of this change, if you look at the contents of the stdarg.h provided by any current-generation compiler, you will see that va_start, va_arg, and va_end are defined using compiler intrinsics, something like

#define va_start(ap, last_named_arg) __builtin_va_start(ap, last_named_arg)
// etc

You might have been confused into thinking that arguments still go on the stack because most of the 32-bit x86 ABIs were defined in the late 1980s (contemporary to the 80386 and 80486) and they do put all the arguments on the stack. The only exception I remember offhand is Win32 "fastcall". The 64-bit x86 ABIs, however, were defined in the early 2000s (contemporary to the AMD K8) and they put arguments in registers.

Your code will not work reliably even if you compile it for 32-bit x86 (or any other old ABI that does put all the arguments on the stack), because it breaks the rules in the C standard about offsetting pointers. The pointer arg_b doesn't point to "whatever happens to be in memory next to a", it points to nothing. (Formally, it points one element past the end of a one-element array, because all non-array objects are treated as the sole element of a one-element array for pointer arithmetic purposes. You are allowed to do the arithmetic that computes this pointer, but not to dereference it.) Dereferencing arg_b gives the program undefined behavior, which means the compiler is allowed to arbitrarily "miscompile" it.

zwol
  • Thanks a lot, zwol. But if the code is compiled on a 32-bit OS, which rule about offsetting pointers in C does it break, so that the program may not even work on those platforms? – KMG Jul 23 '20 at 15:20
  • @KhaledGaber See the last sentence of my answer. By way of illustration, on my computer clang 10 generates assembly language that would decompile to `void test(int unused, ...) { putchar('\0'); }` for your function `test`. – zwol Jul 23 '20 at 15:46