Why does the spec prohibit passing class types to variable-argument C++ functions?

Question

Passing non-PODs to variable argument functions such as printf is undefined behaviour (1, 2), but I don't understand why the C++ standard was set this way. Is there anything inherent in variable arg functions that prevents them from accepting classes as arguments?

The variable-arg callee indeed knows nothing about their type - but nor does it know anything about built-in types or plain PODs it accepts.

Also, these are necessarily cdecl functions, so the caller can be responsible e.g. for copying them upon passing and destroying them on return.

Any insight would be appreciated.

EDIT: I still see no reason why the suggested variadic semantics won't work, but zneak's answer demonstrates well what it would take to adjust compilers to it - so I accepted it. Ultimately, it might be some historical glitch.

`these are necessarily cdecl functions` well at least the C++ standard does not require this. — cpplearner, Aug 24 '16 at 19:15
Now that we have variadic templates, why bother with the old C-style ellipsis arguments? — Jarod42, Aug 24 '16 at 19:18
@cpplearner: True. The standard has no concept of `cdecl` functions - but in practise for a varargs function, the call**er** has to be responsible for cleanup, because the call**ee** doesn't know how many args were passed: `printf("%s", "a", "b");` is entirely legal. — Martin Bonner supports Monica, Aug 24 '16 at 19:18
It's not clear how they should behave, so it's forbidden. Similarly, before C++11 you couldn't use `union`s with non-trivial types until it was agreed upon how should they work. — milleniumbug, Aug 24 '16 at 19:19
@milleniumbug: Why not "copy construct into the argument list, and destruct afterwards"? Sounds the most natural analog of how they work for pods. — Martin Bonner supports Monica, Aug 24 '16 at 19:22
The references in the question don't contain any link about the standard, or I cannot see that. *"why the C++ standard was set this way"*, where in the standard has been expressed such thing? — BiagioF, Aug 24 '16 at 19:24
Are you asking why you can't pass a *class* to a varargs function, or why you can't pass a non-POD object? i.e. given `class Foo {}; Foo a; void f(int a, ...) {}`, are you asking about `f(42, Foo);` or `f(42, a);`? — Ray, Aug 24 '16 at 19:28
My guess: In the old old days of C, the compiler understood only three types for function arguments - `int`, `double`, and `pointer`. It was able to convert any arguments used in a function call to one of the above types. If it couldn't do that, you were SOL. Hence, `struct`s were completely out of the reckoning as a type that could be used in variable argument function calls. — R Sahu, Aug 24 '16 at 19:28
@Ray I'm asking about non-POD object arguments. In your terminology you can't pass a class as argument to *any* function, variadic or not. — Ofek Shilon, Aug 24 '16 at 19:31
This is only UB in C++03 and earlier. It's conditionally-supported in C++11. — T.C., Aug 24 '16 at 19:32
@BiagioFesta the 2nd link directly quotes the C++11 standard, which is a bit less stern than 'undefined behaviour' but still leaves me wondering: "C++11 5.2.2/7: Passing a potentially-evaluated argument of class type having a non-trivial copy constructor, a non-trivial move constructor, or a non-trivial destructor, with no corresponding parameter, is conditionally-supported with implementation-defined semantics." — Ofek Shilon, Aug 24 '16 at 19:33
Can you clarify what you mean by a "cdecl function"? That's not a term I've ever seen (in either language's standard). Do you mean `extern "C"`? And the C++ and C standards say nothing about register allocation - that's purely a platform ABI issue (and certainly varies between platforms). — Toby Speight, Aug 24 '16 at 19:34
@OfekShilon I suspected that was the question, but wanted to confirm. Since you're asking why you *can't* do it, it might be a good idea to edit the title to ask which of the illegal statements you're actually asking about. — Ray, Aug 24 '16 at 19:37
@TobySpeight here's a meatier discussion about this exactly: http://stackoverflow.com/questions/2512746/in-c-do-variadic-functions-those-with-at-the-end-of-the-parameter-list — Ofek Shilon, Aug 24 '16 at 19:37
@OfekShilon What you've just cited is about *"C++11"* and there is no mention about an undefined behaviour anyway. Moreover what about C++ < 11? Again: there is no reference to any kind of standard about undefined behaviour. — BiagioF, Aug 24 '16 at 19:39
@BiagioFesta as commented above, it is UB in C++03 (which I can't quote), but the question remains: can you shed some light on why it is only conditionally supported in C++11? Why is it even problematic? — Ofek Shilon, Aug 24 '16 at 19:42
@RSahu To clarify, that was in pre-ANSI C. You can pass a structure to a function, variadic or not, in any version of standard C. The only thing va_arg doesn't work with is function pointers, due to the manner in which it constructs the pointer type from its parameter (it appends a *). — Ray, Aug 24 '16 at 19:54
@BiagioFesta: see [section 5.2.2](http://www.open-std.org/Jtc1/sc22/wg21/docs/papers/2005/n1804.pdf), 6-7: _"... If the argument has a non-POD class type (clause9), the behavior is undefined. ..."_ — Akos Bannerth, Aug 24 '16 at 20:02
@AkosBannerth What you've just linked is *11* years old. The new C++17 standard is almost out; can we consider for a moment that C++ is an evolving language? Anyway, see [section 5.2.2](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4296.pdf) 7. What you've cited does not exist any more. But if you guys want, just change the question title to *"Why didn't C++ support non-POD classes 10 years ago?"* — BiagioF, Aug 24 '16 at 20:11
@BiagioFesta does C++17 make any guarantees about passing class objects to variadic functions? If not - do you know why? — Ofek Shilon, Aug 24 '16 at 20:13
@Ray: You should be able to use a function pointer type using a `typedef` name. — Keith Thompson, Aug 24 '16 at 20:18
@KeithThompson Probably. (It certainly *works* in gcc, as do other ways; they just might be implementation-dependent.) C99 6.7.7.7 Example 4 seems to support the idea that you can append * to a function type decl and end up with a pointer to function, but I can't find where that syntax is actually discussed. — Ray, Aug 24 '16 at 20:41
Note that you *can* pass a `class` object to a varargs function in MSVC++ (not in GCC, though). — dan04, Aug 24 '16 at 22:37
Is it possible complications arise with C++ exception semantics? — Ofek Shilon, Aug 25 '16 at 10:24
The C++ standard allows implementations wide latitude about the low-level details of both non-PODS types and variadic argument passing, some implementations document how they handle such things, and some programs rely upon such details. Some implementations could not support passing non-PODS types to variadic functions without changing how non-PODS types or variadic functions work *even in programs that don't pass non-PODS types to variadic functions*, and mandating such support would thus be a breaking change. — supercat, Aug 25 '16 at 21:53

zneak · Accepted Answer · 2016-08-25T21:05:53.540

12

The calling convention does specify who does the low-level stack dance, but it doesn't say who's responsible for "high-level" C++ bookkeeping. At least on Windows, a function that accepts an object by value is responsible for calling its destructor, even though it is not responsible for the storage space. For instance, if you build this:

#include <stdio.h>

struct Foo {
    Foo() { puts("created"); }
    Foo(const Foo&) { puts("copied"); }
    ~Foo() { puts("destroyed"); }
};

void __cdecl x(Foo f) { }

int main() {
    Foo f;
    x(f);
    return 0;
}

you get:

x:
    mov     qword ptr [rsp+8],rcx
    sub     rsp,28h
    mov     rcx,qword ptr [rsp+30h]
    call    module!Foo::~Foo (00000001`400027e0)
    add     rsp,28h
    ret

main:
    sub     rsp,48h
    mov     qword ptr [rsp+38h],0FFFFFFFFFFFFFFFEh
    lea     rcx,[rsp+20h]
    call    module!Foo::Foo (00000001`400027b0) # default ctor
    nop
    lea     rax,[rsp+21h]
    mov     qword ptr [rsp+28h],rax
    lea     rdx,[rsp+20h]
    mov     rcx,qword ptr [rsp+28h]
    call    module!Foo::Foo (00000001`40002780) # copy ctor
    mov     qword ptr [rsp+30h],rax
    mov     rcx,qword ptr [rsp+30h]
    call    module!x (00000001`40002810)
    mov     dword ptr [rsp+24h],0
    lea     rcx,[rsp+20h]
    call    module!Foo::~Foo (00000001`400027e0)
    mov     eax,dword ptr [rsp+24h]
    add     rsp,48h
    ret

Notice how main constructs two Foo objects but destroys only one; x takes care of the other one. That obviously wouldn't work if the object was passed as a vararg.

EDIT: Another problem with passing objects to functions with variadic parameters is that in its current form, regardless of the calling convention, the "right thing" requires two copies, whereas normal parameter passing requires just one. Unless C++ extended C variadic functions by making it possible to pass and/or accept references to objects (which is extremely unlikely to ever happen, given that C++ solves the same problem in a type-safe way using variadic templates), the caller needs to make one copy of the object, and va_arg only allows the callee to get a copy of that copy.
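
To make the variadic-template alternative mentioned above concrete, here is a minimal sketch. `Tracked`, `total_name_length` and `demo` are hypothetical names invented for this illustration, not anything from the answer's code. The point is only that each argument is bound by reference, so no copy constructor runs and the callee knows every argument's static type.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical non-trivial type that counts how often it is copied.
struct Tracked {
    static int copies;
    std::string name;
    explicit Tracked(std::string n) : name(std::move(n)) {}
    Tracked(const Tracked& other) : name(other.name) { ++copies; }
};
int Tracked::copies = 0;

// Base case: no arguments left.
inline std::size_t total_name_length() { return 0; }

// Every argument is bound by const reference, so nothing is copied.
template <typename... Rest>
std::size_t total_name_length(const Tracked& first, const Rest&... rest) {
    return first.name.size() + total_name_length(rest...);
}

// Usage: two non-trivial objects go in, and no copy constructor is invoked.
inline void demo() {
    Tracked a("foo"), b("quux");
    assert(total_name_length(a, b) == 7);
    assert(Tracked::copies == 0);
}
```

With a C-style ellipsis the same call would need at least one copy per argument at the call site, plus whatever `va_arg` does in the callee.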

Microsoft's CL tries to get away with one bitwise copy and one full copy construction of that bitwise copy at the va_arg site, but it can have nasty consequences. Consider this example:

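#include <stdarg.h>  // va_list, va_start, va_arg, va_end
#include <stdio.h>   // printf
#include <stdlib.h>  // free
#include <string.h>  // _strdup (MSVC-specific)

struct foo {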
    char* ptr;

    foo(const char* ptr) { this->ptr = _strdup(ptr); }
    foo(const foo& that) { ptr = _strdup(that.ptr); }
    ~foo() { free(ptr); }

    void setPtr(const char* ptr) {
        free(this->ptr);
        this->ptr = _strdup(ptr);
    }
};

void variadic(foo& a, ...)
{
    a.setPtr("bar");

    va_list list;
    va_start(list, a);
    foo b = va_arg(list, foo);
    va_end(list);

    printf("%s %s\n", a.ptr, b.ptr);
}

int main() {
    foo f = "foo";
    variadic(f, f);
}

On my machine, this prints "bar bar", even though it would print "foo bar" if I had a non-variadic function whose second parameter accepted another foo by copy. This is because a bitwise copy of f happens in main at the call site of variadic, but the copy constructor is only invoked when va_arg is called. Between the two, a.setPtr invalidates the original f.ptr value, which is however still present in the bitwise copy, and by pure coincidence _strdup returns that same pointer (albeit with a new string inside). Another outcome of the same code could be a crash in _strdup.

Note that this design works great for POD types; it only falls apart when constructors and destructors need side effects.
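
For contrast, here is a small sketch of the POD case, where a bitwise copy is a complete copy and nothing can go stale between the call site and `va_arg`. `Point`, `sum_points` and `demo_pod` are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cstdarg>

// Hypothetical POD type: no user-provided constructors or destructor.
struct Point {
    int x;
    int y;
};

// Sums the coordinates of `count` Point arguments passed through the ellipsis.
int sum_points(int count, ...) {
    va_list args;
    va_start(args, count);
    int total = 0;
    for (int i = 0; i < count; ++i) {
        Point p = va_arg(args, Point);  // plain copy out of the argument area
        total += p.x + p.y;
    }
    va_end(args);
    return total;
}

// Usage: both structs arrive intact in the callee.
inline void demo_pod() {
    Point a = {1, 2};
    Point b = {3, 4};
    assert(sum_points(2, a, b) == 10);
}
```

Because `Point` owns no resources, it does not matter when or how often the bytes are copied, which is exactly the property that `foo` above lacks.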

The original point that calling conventions and parameter passing mechanisms don't necessarily support non-trivial construction and destruction of objects still stands: this is exactly what happens here.
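
If you want the compiler to tell you which types fall into this problematic category, a rough check can be built from `<type_traits>`. This is only a sketch: `is_trivially_passable`, `Plain` and `Owning` are hypothetical names, and the trait is an approximation of the C++11 5.2.2/7 wording quoted in the comments on the question (non-trivial copy constructor, move constructor, or destructor), not an exact restatement of it.

```cpp
#include <string>
#include <type_traits>

// Hypothetical trait: true when T can be copy-constructed, move-constructed
// and destroyed trivially. This approximates "not covered by the
// conditionally-supported clause" for class types.
template <typename T>
struct is_trivially_passable
    : std::integral_constant<bool,
          std::is_trivially_copy_constructible<T>::value &&
          std::is_trivially_move_constructible<T>::value &&
          std::is_trivially_destructible<T>::value> {};

struct Plain { int value; };          // trivial special members
struct Owning { std::string text; };  // non-trivial copy and destructor

static_assert(is_trivially_passable<int>::value, "int is trivially passable");
static_assert(is_trivially_passable<Plain>::value, "Plain is trivially passable");
static_assert(!is_trivially_passable<Owning>::value, "Owning is not");
```

A wrapper around a variadic function could `static_assert` on such a trait for each argument type, turning the silent misbehavior shown above into a compile error.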

EDIT: answer originally said that the construction and destruction behavior was specific to cdecl; it is not. (Thanks Cody!)

edited Aug 25 '16 at 21:05

answered Aug 24 '16 at 20:56

zneak


Thanks for your answer. If this is the case, how come `std::string str("Hello");printf("%s", str)` doesn't leak memory? I just tested and it doesn't - there isn't even a std::string copy ctor called. – Ofek Shilon Aug 25 '16 at 10:12
The terminology in this answer is somewhat misleading. Although there is a `__cdecl` calling convention on the Windows platform, it exists only for 32-bit code. Your object code example is obviously 64-bit code, since it uses the 64-bit registers. There is no `__cdecl` for 64-bit platforms on Windows. In fact, there are only two calling conventions for AMD64: (a) the standard Microsoft 64-bit calling convention, which doesn't have a name, and (b) `__vectorcall`. – Cody Gray - on strike Aug 25 '16 at 10:13
It just so happens that if you pass an object with a destructor, the 32-bit `__cdecl`, like the 32-bit `__stdcall` and `__fastcall`, all transfer a copy of that entire object on the stack. This changes the object's ownership, making the recipient responsible for managing its lifetime. The Microsoft 64-bit calling convention, like `__vectorcall`, is the same in spirit, but different in implementation because it passes a copy of the destructor-containing object in a register, only falling back to the stack if all registers are used. – Cody Gray - on strike Aug 25 '16 at 10:18
@OfekShilon most STLs implement [small string optimizations](https://github.com/elliotgoodrich/SSO-23/blob/master/README.md). In MSVC's case (at least for 64 bits), strings under 16 characters don't need a dynamic allocation. I only have a Windows machine at work so I can't check the constructor/destructor situation. – zneak Aug 25 '16 at 14:28
@zneak The same holds for a hand-crafted toy class without such optimizations. It's hard to demo in a comment - implement some TestVariadic(int, ...) and test it with a 2nd argument of class C with ctor, dtor and copy ctor. When you set breakpoints or inspect the disassembly (MSVC 2015, no optimizations) you'll see no copy ctor calls. – Ofek Shilon Aug 25 '16 at 14:33
@zneak I mean to say: it seems *variadic* functions don't behave like regular cdecl functions of the sort you demonstrated. – Ofek Shilon Aug 25 '16 at 14:46
@OfekShilon, I get that part. I'll check that once I get to a Windows machine, probably in a little more than an hour. – zneak Aug 25 '16 at 14:47
@zneak this doesn't seem like a windows artifact. I'm able to reproduce disassembly pretty much identical to yours for your (non variadic) example – Ofek Shilon Aug 25 '16 at 14:49
@OfekShilon, CL special-cases parameter passing for variadic functions (as that native scheme cannot work). The object is bitwise-copied to a new location on the stack and the address of the bitwise copy is passed in. The variadic function then invokes the copy constructor passing that address at the `va_arg` site. This most likely corresponds to the implementation-defined semantics of the newer C++ standards. However, the delayed copy causes all sorts of strange things that I can only categorize as UB. For instance, [this program](http://pastebin.com/CEgV038K) prints "bar bar". – zneak Aug 25 '16 at 17:05
Or maybe this is the closest to copying an object before passing it as a parameter that CL can do because of structural limitations, and the scenario is just unsupported because it doesn't map closely enough to the parameter passing semantics that CL wants. – zneak Aug 25 '16 at 17:33
@zneak: "At least on Windows..." - things like that are determined by the implementation (by the compiler), not by the OS. – AnT stands with Russia Aug 25 '16 at 21:08
@zneak it seems the problem in your program isn't variadic class arguments, but rather that variadic() takes the first foo by reference - which breaks a different language law. MSVC doesn't even let you compile it unless you define `_CRT_NO_VA_START_VALIDATION`. If you remove the reference from the first foo the program seems to run fine. – Ofek Shilon Aug 26 '16 at 11:30
@OfekShilon, the problem is absolutely how MSVC passes objects as variadic parameters. If you prefer, I can take the first parameter by pointer and get the same result. I could also pass multiple objects that have pointers or references to one another. – zneak Aug 26 '16 at 14:10
@zneak Thanks for your efforts, but I'm still not sure this is the answer. You've shown how POD-specific optimizations break when non-PODs are plugged in. Still, couldn't a different C++ standard prohibit these optimizations and work? Is there anything inherent in variadic functions that prevents the expected semantics? (caller constructs and destructs the object arguments) – Ofek Shilon Aug 26 '16 at 15:43
@OfekShilon, this is the only right thing to do for PODs, not an optimization. The extension of this mechanism for non-POD types requires two full copies of each object passed as a variadic argument. You might not be able to fix the double copy issue without changing how va_lists work in C. Ultimately, there might be no better answer than "it broke a bunch of assumptions and nobody cares enough". – zneak Aug 26 '16 at 16:14
@zneak the bitwise copy is definitely a POD-specific optimization that could be generalized to 'copy construction' if the standard mandated it. But I agree that this is probably the best answer out there.. – Ofek Shilon Aug 26 '16 at 16:38
What's to stop the C++ standard from defining its own calling convention? I don't see how the fact that no convenient calling convention currently exists is a deal breaker. – Jordan Melo Aug 31 '16 at 17:25
@JordanMelo, see second half of the answer and the double-copy problem. – zneak Aug 31 '16 at 17:34
@zneak The second half is definitely an interesting reason why the standard probably wouldn't adopt this, and I like how you brought up that variadic templates are a better tool for the same job, but I feel like the first half, in my opinion, is not very compelling as the calling convention doesn't have to be static. – Jordan Melo Aug 31 '16 at 18:22

score 9 · Answer 2 · edited Jun 20 '20 at 09:12

I'm recording this, because it's too big to be a comment, and it was reasonably time consuming to hunt this down, so no one else wastes time looking down this route.

The text was first changed to something similar to the current wording in the draft standard in N2134 released 2006-11-03.

With some effort, I was able to trace back the wording to DR506.

Paper J16/04-0167=WG21 N1727 suggests that passing a non-POD object to ellipsis be ill-formed. In discussions at the Lillehammer meeting, however, the CWG felt that the newly-approved category of conditionally-supported behavior would be more appropriate.

The paper referenced (N1727), says very little on the subject:

The existing wording (5.2.2¶7) makes it undefined behavior to pass a non-POD object to an ellipsis in a function call:

{Snip}

Once again, the CWG saw no reason not to require implementations to issue a diagnostic in such cases.

However, this doesn't tell me very much about why it was the way it was to begin with, which is what you want to know. Turning the clock back further to when that language was first written is not possible for me, because the oldest freely available draft standard is from 2005 and already has the wording you're wondering about; all standards prior to this either require authentication or are simply contentless.

In C++98 and C++03 there is "If the argument has a non-POD class type (clause 9), the behavior is undefined.". I guess this would have been for C compatibility: C code that passed structs via varargs should continue to work. — M.M, Aug 25 '16 at 21:17

Leon · Answer 3 · 2016-08-24T20:42:41.963

I guess the problem is/was the breach of type safety. Generally, passing a derived class object where a base class object is expected should be safe. If the base class object is taken by value, then the derived class object will be simply sliced. If it is taken by pointer/reference - the pointer/reference to the derived class object is adjusted properly during compilation. This doesn't work with variable-argument functions, where interpretation of the input types is performed by the code rather than by the compiler.

Example:

struct A { char c; };
struct B { int i; };
struct D : A, B { double d; };

// This is similar to printf, but also handles the
// format specifier %b assuming an object of type B
void non_pod_printf(const char* fmt, ...);

D d1, d2;

// I bet that the code inside non_pod_printf will fail to correctly
// handle the d1 and d2 arguments even though the language rules
// ensure that D is a B
non_pod_printf("%d %b %b", 123, d1, d2);
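
The adjustment that the compiler performs for typed parameters, and that a varargs callee would have to redo by hand, can be observed directly. The sketch below is self-contained and repeats the three structs; `offset_of_b_in_d` and `demo_offset` are hypothetical helpers. It assumes a common ABI in which `A` is laid out first, so the `B` subobject does not start at the address of the `D` object.

```cpp
#include <cassert>
#include <cstddef>

struct A { char c; };
struct B { int i; };
struct D : A, B { double d; };

// Byte distance from the start of a D object to its B base subobject.
std::ptrdiff_t offset_of_b_in_d(const D& d) {
    const B* as_b = &d;  // implicit D* -> B* conversion adjusts the address
    return reinterpret_cast<const char*>(as_b) -
           reinterpret_cast<const char*>(&d);
}

// Usage: with static types the compiler finds B::i; code that read the raw
// bytes of a D as if they were a B would have to apply the offset itself.
inline void demo_offset() {
    D d = D();
    d.i = 42;
    const B& as_b = d;
    assert(as_b.i == 42);
    assert(offset_of_b_in_d(d) > 0);  // assumption: A precedes B in the layout
}
```

A callee doing `va_arg(list, B)` on an argument that was passed as a `D` has no way to learn that offset, since the ellipsis carries no type information.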

EDIT

As a now deleted comment pointed out, A, B and D in the example above are actually POD types. However, the problem that I am bringing to your attention has to do with inheritance, which is possible with POD types but in the majority of cases involves non-POD types.
