Does the C++ specification permit closure implementation by passing stack-frame address offsets?

Question

I don't have access to the C++ language specification at-present, but the non-authoritative cppreference.com website says:

http://en.cppreference.com/w/cpp/language/lambda

The lambda expression constructs an unnamed prvalue temporary object of unique unnamed non-union non-aggregate type, known as closure type.

I know the specification also states that only non-capturing lambdas may decompose to function pointers (copied from A positive lambda: '+[]{}' - What sorcery is this?):

The closure type for a lambda-expression with no lambda-capture has a public non-virtual non-explicit const conversion function to pointer to function having the same parameter and return types as the closure type’s function call operator. The value returned by this conversion function shall be the address of a function that, when invoked, has the same effect as invoking the closure type’s function call operator.

In C# and C++, a lambda method with variable capture (i.e. a closure) looks like this when used (warning: C#/C++-style pseudocode ahead):

class Foo {

    void AConventionalMethod() {

        String x = null;
        Int32 y = 456;

        y = this.arrayOfStringsField.IndexOfFirstMatch( (element, index) => {
            x = this.DoSomethingWithAString( element );
            y += index;
            return index > 5; // bool return type
        } );
    }
}

This can be considered roughly equivalent to doing this:

class Closure {

    Foo foo;
    String x;
    Int32 y;

    Boolean Action(String element, Int32 index) {
        this.x = this.foo.DoSomethingwithAString( element );
        this.y += index;
        return index > 5;
    }
}

class Foo {

    void AConventionalMethod() {

        String x = null;
        Int32 y = 456;

        {
            Closure closure = new Closure() { foo = this, x = x, y = y };
            Int32 tempY = this.arrayOfStringsField.IndexOfFirstMatch( closure.Action ); // passes `closure` as `this` inside the Delegate
            x = closure.x;
            y = closure.y;
            y = tempY;
        }
    }
}

Note the hidden this pointer/reference parameter in the Closure::Action method. This is fine in C# where all function-pointers are Delegate types that can always handle a this member, but in C++ it means you can only use lambda variable capture with a std::function parameter (which I understand, has the this parameter) - or when using a lambda as an argument to a C-style function-pointer parameter you cannot use variable-capture, and no capture means no closure, which means no need for a this parameter.

But I noticed that in many cases the use of this anonymous closure value is an information-theoretic waste of space, as the necessary closure information is already on the stack - all the lambda function needs is the stack address of the callee from which it can access and mutate those captured variables. In the case where a this-saving Delegate or std::function is used then the hidden parameter need only store the callee's stack frame's memory address:

void AConventionalMethod() {

    void* frame;
    frame = &frame; // get current stack frame 'base address'

    string x = nullptr;
    int32_t y = 456;

    y = this.arrayOfStrings( lambdaAction );
}

static bool lambdaAction(void* frame, string element, int32_t index) {

    Foo* foo = reintepret_cast<Foo*>( frame + 0 );
    string* x = reintepret_cast<string*>( frame + 4 ); // the compiler would know what the `frame + n` offsets are)
    int32_t* y = reintepret_cast<int32_t*>( frame + 8 );

    *x = foo->doSomethingWithAString( element );
    *y = *y + index;
    return index > 5;
}

But the above doesn't improve anything: the information-equivalence here is the same as the closure example, except the Closure type stores the actual full addresses or values, which saves the pointer-arithmetic inside the lambdaAction function - and this approach still requires the (ab)use of a hidden parameter.

But as I saw that all you need is to pass a single value: a memory address, I realised that the address could be written as a literal value in a newly emitted wrapper function, which then invokes the original lambda. This emitted function could even exist on the stack if the execution environment permits executable code on the stack (unlikely in today's environment, but possible in a constrained or bare-metal system, or kernel-mode code):

void AConventionalMethod() {

    void* frame;
    frame = &frame;

    String x = null;
    Int32 y = 456;

    // the below code would actually be generated by the compiler:
    char wrapper[] = __asm {
        push frame         ; does not alter any existing arguments
                           ; but pushes/adds 'frame' as a new
                           ; argument, and in a right-to-left calling-
                           ; convention order this means that 'frame'
                           ; becomes the first argument for 'lambdaAction'
        call lambdaAction
        return             ; EAX return value preserved
    };

    y = a_higher_order_function_with_C_style_function_pointer_parameter( wrapper ); // a
}

static bool lambdaAction(void* frame, string element, int32_t index) {
    // same as previous example's lambdaAction
}

In the event that stack is non-executable, the short wrapper dynamic function could be allocated in a separate stack-strategy allocator (given its lifespan is strictly limited by the lifetime of the parent AConventionalMethod function) which has memory pages marked for writing and execution.

Now for my question:

Given that I've explained this alternative strategy for implementing lambda variable capture and closures - and given my personal belief that this is feasible - why does the C++ specification proscribe variable capture for lambdas used with C-style function-pointer parameter arguments instead of leaving it implementation-defined? Is this strategy infeasible or does it have other shortcomings (though I believe it would work in reentrant and recurisve-call scenarios)?

_"I don't have access to the C++ language specification at-present"_ You've noticed that the website you're citing has this section accessible through the bottom link _External links_: [Useful Resources](http://en.cppreference.com/w/cpp/links)? — user0042, Sep 24 '17 at 07:05
I doesn't matter whether you can look it up in the specs. It's an interesting and useful question regardless. — Henri Menke, Sep 24 '17 at 07:34
"why does the C++ specification proscribe variable capture for lambdas used with C-style function-pointer parameter arguments". Where? What part of the C++ specification do you have in mind? The question is not very clear. What are "C-style function-pointer parameter arguments"? — n. m. could be an AI, Sep 24 '17 at 07:46
Stack frames are an implementation detail, the standard doesn't mention them at all, so it certainly doesn't care how your compiler implements the requirements. IMHO, the biggest thing a compiler might do is to optimize away the lambda completely (i.e. inline the parent function and everything with it). Otherwise, requiring that a function depends on the stack arrangement of a *different* function is something that I don't see ever happening in any reasonable compiler implementation. — vgru, Sep 24 '17 at 08:11

score 7 · Answer 1 · answered Sep 24 '17 at 07:50

The C++ standard requires the lambda with state behave as-if it was an object of unspecified size with an operator() that can be invoked upon it.

If you provably never interact with it in way that requires it to be an object, it need not exist. Eliminating inlined lambdas is both easy and common in C++.

Now, a lambda with capture may not be converted to a bare function pointer. Even if it ends up capturing nothing. It shouldn't have an operator convert to function pointer.

As a general rule, C++ compilers may not add heap allocations unrequested by code; heap allocation can fail at runtime, and an operation throwing an out of memory error in observable behaviour. So an allocating operation does not behave as-if it did not allocate. Only when the standard leaves it open to the implementation if an operation can throw is adding heap allocation permitted. You can allocate, catch any exceptions, and have a backup plan (there is even an algorithm whose performance requirements basically require that implementation).

In short, creating dynamic functions to execute is not banned by C++ but it must behave as-if you captured state. Using a common offset pointer to replace the lambda's this is also not banned. If a reference is captured by reference this common offset pointer won;t generally work, as the lambda reference is to the original data (not the captured reference) which isn't at a static offset from the stack frame at moment of lambda creation.

References in C++ have very flexible runtime existence. They have very few memory "existence" and layout requirements, and are easy to optimize out of existence. Converting a pile of references with static mutual offset into one pointer is legal in C++ to the best of my knowledge. I am unaware of a compiler that tries, however; usually when you know that much, you are a baby step away from elininating the references entirely.

Does the C++ specification permit closure implementation by passing stack-frame address offsets?

1 Answers1