Is lazy evaluation efficient/optimizable?

Question

I've found many uses for lazy evaluation, such as a tool for optimization (e.g. matrices).

Another use is syntactic sugar. But before I go overboard and make my code look a lot cleaner at the cost of runtime overhead: do compilers know how to optimize this kind of thing? Or should I use it only when the lazy version, overhead included, is still faster than the non-lazy one?

The following is an example of what I mean. It's not my actual use case, just a simplified comparison of non-lazy vs. lazy evaluation.

// original
template < class out,
           class in >
out && lazy_cast( in && i )
{
    return ( out && )( i );
}

// usage
char c1 = 10;
int c2 = lazy_cast< int >( c1 );

// lazy
#include <utility>  // for std::forward
template < class in >
class C_lazy_cast
{
public:
    in && i;

    template < class out >
    operator out &&( )
    {
        return ( out && )( i );
    }
};

template < class in >
C_lazy_cast< in > lazy_cast( in && i )
{
    return { std::forward< in >( i ) };
}

// usage
char c1 = 10;
int c2 = lazy_cast( c1 );

For the sake of completeness, information about MSVC, GCC, and clang should be enough.
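
For reference, the idea of letting the target type be deduced from the conversion context can be sketched as below. This is a minimal by-value variant, not the code from the question: the names `LazyCastProxy` and `lazy_cast_value` are hypothetical, the input is copied instead of held by reference, and `static_cast` replaces the C-style cast.

```cpp
#include <cassert>

// Hypothetical by-value variant of the question's C_lazy_cast.
template <class In>
class LazyCastProxy {
public:
    constexpr explicit LazyCastProxy(In value) : value_(value) {}

    // Out is deduced from the type the proxy is being converted to,
    // so the caller never has to spell it out.
    template <class Out>
    constexpr operator Out() const { return static_cast<Out>(value_); }

private:
    In value_;  // stored by value, so the proxy cannot dangle
};

// Hypothetical factory: deduces In from the argument.
template <class In>
constexpr LazyCastProxy<In> lazy_cast_value(In value) {
    return LazyCastProxy<In>(value);
}

// Usage mirroring the question: the target type comes from the declaration.
inline int widen_example() {
    char c1 = 10;
    int c2 = lazy_cast_value(c1);  // Out is deduced as int
    assert(c2 == 10);
    return c2;
}
```

A templated conversion operator like this converts to any type that `static_cast` accepts, so it can become ambiguous when the proxy is passed to an overloaded function. Whether that trade-off is acceptable depends on the call sites.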

I don't quite understand what you intend to do with `lazy_cast`. You... passed around some references? Why not just pass around the reference itself? — Passer By, Aug 26 '20 at 06:49
@PasserBy It's just a wrapper around c-style casting. Theoretically, it should result in 0 lines of code. — j__, Aug 26 '20 at 22:16
But why won't you just pass the reference around? What's the point of having the cast not at the point of use? — Passer By, Aug 27 '20 at 09:04
@PasserBy By passing the reference around, we can eliminate the need for template arguments as they are deduced during the implicit conversion. — j__, Aug 27 '20 at 16:00

Benny K · Accepted Answer · 2020-08-27T08:04:19.197

The compiler can do a lot more than you may think. I'm not sure what you were going for in your example, but consider the following piece of code:

template <typename TLazyChar>
struct lazyUpperImpl{
    TLazyChar in;
    lazyUpperImpl(TLazyChar in_):in(in_){}
    char constexpr operator()(){
        auto c = in();
        if (c >= 'a' && c <= 'z'){
            c = c - 'a' + 'A';
        }
        return c;
    }
};

template <typename TLazyNumeric>
struct lazyAdd5Impl{
    TLazyNumeric in;
    lazyAdd5Impl(TLazyNumeric in_):in(in_){}
    int constexpr operator()(){
        return in() + 5;
    }
};

template <typename Tout, typename TLazyIn>
struct lazyCastImpl {
    TLazyIn in;
    lazyCastImpl(TLazyIn in_):in(in_){}

    Tout constexpr operator()(){
        return static_cast<Tout>(in());
    }
};

template <typename Tout, typename TLazyIn>
auto constexpr lazyCast(TLazyIn in){
    return lazyCastImpl<Tout, TLazyIn>(in);
}

template <typename TLazyChar>
auto constexpr lazyUpper(TLazyChar in){
    return lazyUpperImpl<TLazyChar>(in);
}

template <typename TLazyNumeric>
auto constexpr lazyAdd5(TLazyNumeric in){
    return lazyAdd5Impl<TLazyNumeric>(in);
}

int foo(int in){
    auto lazyInt = [in](){return in;};
    auto x =
        lazyAdd5(
            lazyCast<int>(
                lazyUpper(
                    lazyCast<char>(lazyInt)
                )
            )
        ) ();

    return x;
}

int main(){
    return foo(109);
}

With gcc 10.2, clang 10.0.1 and msvc 19.24, the code for foo becomes a simple sequence of instructions: conditionally subtract 32 (the distance between 'a' and 'A' in ASCII) and always add 5.

For example, the assembly generated by msvc is:

    movzx   eax, cl
    cmp     cl, 97                      ; 00000061H
    jl      SHORT $LN26@foo
    cmp     cl, 122                     ; 0000007aH
    jg      SHORT $LN26@foo
    lea     eax, DWORD PTR [rcx-32]
$LN26@foo:
    movsx   eax, al
    add     eax, 5
    ret     0

The output from msvc is arguably the least elaborate (and thus the easiest to understand) of the three.
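
Read back into source form, that assembly corresponds roughly to the hand-written function below. It is a sketch for comparison only: `foo_direct` and `check_foo_direct` are hypothetical names, and the comments mapping each line to a wrapper are an interpretation of the listing, assuming an ASCII character set.

```cpp
#include <cassert>

// Hand-written equivalent of foo, with no lazy wrappers involved.
int foo_direct(int in) {
    char c = static_cast<char>(in);            // lazyCast<char>
    if (c >= 'a' && c <= 'z') {                // lazyUpper
        c = static_cast<char>(c - 'a' + 'A');  // subtracts 32 in ASCII
    }
    return static_cast<int>(c) + 5;            // lazyCast<int>, then lazyAdd5
}

// Spot check with the input used in main(): 109 is 'm' in ASCII.
inline void check_foo_direct() {
    assert(foo_direct(109) == 'M' + 5);
}
```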

Moreover, if the input value is known at compile time, the call to foo is folded into a single return new_value instruction.

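Note that in the foo example above, the compiler cannot eliminate the if condition, because the input is only known at run time, and the 'lazyCast' wrappers themselves are effectively no-ops: only the conversions they perform remain.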
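
When the input is known at compile time, the folding can also be requested explicitly instead of being left to the optimizer. The sketch below assumes C++14 or later and uses hypothetical constexpr-friendly versions of the wrappers (`IntSource`, `CastStep`, `UpperStep`, `Add5Step` and their factory functions are illustrative names, not the answer's code). The `static_assert` only compiles if the whole chain is evaluated during compilation.

```cpp
// Lazy source that simply yields a stored int.
struct IntSource {
    int value;
    constexpr explicit IntSource(int v) : value(v) {}
    constexpr int operator()() const { return value; }
};

template <typename Tout, typename TLazyIn>
struct CastStep {
    TLazyIn in;
    constexpr explicit CastStep(TLazyIn in_) : in(in_) {}
    constexpr Tout operator()() const { return static_cast<Tout>(in()); }
};

template <typename TLazyIn>
struct UpperStep {
    TLazyIn in;
    constexpr explicit UpperStep(TLazyIn in_) : in(in_) {}
    constexpr char operator()() const {
        char c = in();
        if (c >= 'a' && c <= 'z') {
            c = static_cast<char>(c - 'a' + 'A');
        }
        return c;
    }
};

template <typename TLazyIn>
struct Add5Step {
    TLazyIn in;
    constexpr explicit Add5Step(TLazyIn in_) : in(in_) {}
    constexpr int operator()() const { return in() + 5; }
};

// Factory functions so the wrapper types are deduced, as in the answer.
template <typename Tout, typename TLazyIn>
constexpr CastStep<Tout, TLazyIn> cast_step(TLazyIn in) {
    return CastStep<Tout, TLazyIn>(in);
}

template <typename TLazyIn>
constexpr UpperStep<TLazyIn> upper_step(TLazyIn in) {
    return UpperStep<TLazyIn>(in);
}

template <typename TLazyIn>
constexpr Add5Step<TLazyIn> add5_step(TLazyIn in) {
    return Add5Step<TLazyIn>(in);
}

// Same chain as foo, but usable in constant expressions.
constexpr int foo_constexpr(int in) {
    return add5_step(cast_step<int>(upper_step(cast_step<char>(IntSource(in)))))();
}

// 109 is 'm' in ASCII, so the chain yields 'M' + 5.
static_assert(foo_constexpr(109) == 'M' + 5, "chain must fold at compile time");
```

With a run-time argument, `foo_constexpr` behaves like an ordinary function, so the branch remains in that case.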

Here is another interesting example, in which two if statements and two mathematical expressions simply cancel each other out.
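
The linked example is not reproduced here. As a separate, hypothetical illustration of the same idea, the chain below applies two conditional negations and a matching +5/-5 pair, so the result always equals the input (assuming `in + 5` does not overflow). The names `plus5`, `minus5`, `negate_if_odd` and `round_trip` are made up for this sketch. Whether a given compiler actually reduces `round_trip` to `return in;` is not guaranteed and is best checked in the generated assembly.

```cpp
#include <cassert>

// Lazy steps written as closures (C++14 return type deduction).
template <typename TLazyInt>
auto plus5(TLazyInt in) {
    return [in]() { return in() + 5; };
}

template <typename TLazyInt>
auto minus5(TLazyInt in) {
    return [in]() { return in() - 5; };
}

// Negates odd values; applying it twice restores the original value,
// because the negation of an odd number is still odd.
template <typename TLazyInt>
auto negate_if_odd(TLazyInt in) {
    return [in]() {
        int v = in();
        return (v % 2 != 0) ? -v : v;
    };
}

// Two ifs and two arithmetic steps that cancel out overall.
int round_trip(int in) {
    auto lazyInt = [in]() { return in; };
    return minus5(negate_if_odd(negate_if_odd(plus5(lazyInt))))();
}

// Spot check: 8 becomes 13, then -13, then 13 again, then 8.
inline void check_round_trip() {
    assert(round_trip(8) == 8);
}
```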

For real programs, actual efficiency is very difficult, if not impossible, to predict from the assembly alone: it depends on the machine executing the program and on the state that machine is in (e.g. is it running other programs at the same time?). Even then, you'll need to run tests to be sure which version works best.
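
A minimal timing sketch along those lines is shown below. Everything in it is an assumption made for illustration: the helper `time_ns`, the `Timing` struct, and the two candidates `eager_add5` and `lazy_add5` are hypothetical, and a loop like this only gives a rough, noisy number for one machine in one state.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Calls fn(i) for i in [0, iterations) and returns the elapsed time in
// nanoseconds. Writing each result to a volatile sink keeps the compiler
// from discarding the calls altogether.
template <typename Fn>
std::int64_t time_ns(Fn fn, int iterations, volatile int& sink) {
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        sink = fn(i);
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}

// Two candidates that compute the same value.
inline int eager_add5(int x) { return x + 5; }

inline int lazy_add5(int x) {
    auto lazy = [x]() { return x + 5; };  // evaluation is deferred until the call
    return lazy();
}

struct Timing {
    std::int64_t eager_ns;
    std::int64_t lazy_ns;
};

// Times both candidates over the same inputs.
inline Timing run_comparison(int iterations) {
    assert(eager_add5(1) == lazy_add5(1));  // only compare equivalent code
    volatile int sink = 0;
    const std::int64_t eager = time_ns(eager_add5, iterations, sink);
    const std::int64_t lazy = time_ns(lazy_add5, iterations, sink);
    return Timing{eager, lazy};
}
```

Repeating `run_comparison` several times and comparing the spread of the results gives a better picture than a single run.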

And, of course, "best" is subjective. You'll need to decide whether you prefer to optimize the running time (which may be broken down even further), the executable file's size, or the power requirements.

This requires a lot of studying to get right, but you can rest assured that there are some very talented people working on these questions for a living, and they generally do a very good job.

Thank you for this answer! That helps me feel more confident in compilers. Would you mind posting godbolt links for other compilers and discussing the results in your answer? — j__, Aug 26 '20 at 22:18
@lajoh90686 Edited the answer, and the code in the first link. — Benny K, Aug 27 '20 at 08:06
