I would like to better understand overloading the && and || operators and the resulting loss of their short-circuit behaviour.

This question stems from my attempt at implementing a lazily evaluated value holder for a template type T. Consider such a struct, e.g.

template <typename T>
struct Value {
  // value holder & accessor
  T m_val;
  virtual T getValue() {return m_val}; // trivial, but potentially complicated computation in derived classes

  // will come into play below
  template <typename U, typename V>
  struct LogicalAnd;
};

The value of type T that an instance holds is accessed through getValue(). The types I am interested in are primarily arithmetic types. The intended use is that the value is always read through getValue(), so that I can later define an arbitrary derived class MyValue<T> that performs some heavy computation and stores and returns different values over the runtime of the program.

Given that, I'd like to be able to "book" operations, e.g. &&, between two instances so that the result reflects their lazily updated values. A straightforward operator overload does not achieve this, since it returns a basic data type value as evaluated at that line:

// overload && operator
template <typename U, typename V>
bool operator&&(Value<U>& a, Value<V>& b) {
    return (a && b);    
}  // returns a basic data type bool evaluated at above line. X

So instead, I can write a proxy class that represents the operation I want, and overload the operator so that it returns an instance of the proxy.

// logical AND of two existing Values
template <typename T>
template <typename U, V>
struct Value<T>::LogicalAnd {
  virtual T getValue() override {
    return m_valA->getValue() && m_valB->getValue();
  }
  Value<U>* m_valA;
  Value<V>* m_valB;
};

// overload && operator
template <typename U, typename V>
auto operator&&(Value<U>& a, Value<V>& b) {
    return typename Value<decltype(a.m_val && b.m_val)>::template LogicalAnd<U,V>{&a, &b};          
}  // returns a Value<X>::LogicalAnd<U,V>, X is most likely bool.

Then, I can get the lazy evaluation I want by calling getValue() of this proxy instance:

Value<double> a{1.0};
Value<int> b{0.0};
c = a&&b; // c is Value<bool>::LogicalAnd<double,int>
c.getValue(); // = false
b.m_val = 1.0; // update the value of b
c.getValue(); // = true!!!

The question is whether this would still be considered losing short-circuit behaviour of the operation. As I see it, it should be at least effectively preserved, since the LogicalAnd implementation uses the built-in && operator and the heavy computation is done on the fly by the operands' getValue() methods. In other words, whenever I call c.getValue(), if a.getValue() is already false, then b.getValue() is not called. Would a use case such as this warrant a "good-enough" practice for overloading these operators?
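
To make that last claim concrete, here is a minimal compilable sketch of the idea. It is a simplification under my own assumptions, not the code above: `Counted`, `calls`, `LazyAnd` and `rhs_calls_when_lhs_false` are hypothetical names, and the proxy is a free-standing struct rather than a nested, derived one. It counts calls to getValue() so that the skipped call on the right-hand operand can be observed.

```cpp
// Hypothetical stand-in for Value<T> that counts getValue() calls.
template <typename T>
struct Counted {
    T m_val;
    mutable int calls = 0;
    explicit Counted(T v) : m_val(v) {}
    T getValue() const { ++calls; return m_val; }
};

// Hypothetical proxy: stores pointers to the operands and applies the
// built-in && only when getValue() is called. The operands must outlive it.
template <typename U, typename V>
struct LazyAnd {
    const Counted<U>* m_a;
    const Counted<V>* m_b;
    bool getValue() const { return m_a->getValue() && m_b->getValue(); }
};

// Overloaded &&: both operands are evaluated here, but only as references;
// no getValue() call happens at this point.
template <typename U, typename V>
LazyAnd<U, V> operator&&(const Counted<U>& a, const Counted<V>& b) {
    return {&a, &b};
}

// Returns how many times b.getValue() ran while a held false.
int rhs_calls_when_lhs_false() {
    Counted<bool> a{false};
    Counted<int> b{1};
    auto c = a && b;  // builds the proxy only
    c.getValue();     // a yields false, so b.getValue() is skipped
    return b.calls;
}
```

In this sketch `rhs_calls_when_lhs_false()` returns 0, and a proxy built from two `Counted` objects picks up a later change to `b.m_val` on the next `getValue()` call.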

  • Thanks for the clarification about the first part of my question/assumption. But if the primary goal I want to achieve is the lazy evaluation of the RHS operand, doesn't the above concept effectively achieve this by returning the proxy `struct` instead of a `bool`? Indeed both `expression1` and `expression2` are being "evaluated", only in the sense that their pointers are being stored. And afterwards, their `getValue()` are called to return basic data types, which will be short-circuited since I am using the built-in operation there. Is this not the case? – taehyounpark Oct 08 '21 at 01:56
  • Why introduce the `LogicalAnd` template instead of having `operator&&` return `a.getValue() && b.getValue()`? In either case, short-circuiting would avoid the second call to `getValue()` if the first call returned something that converts to `false`, assuming that `operator&&(const T&, const T&)` has not been defined (both cases make this assumption). – JaMiT Oct 08 '21 at 01:58
  • If I simply return `a.getValue() && b.getValue()`, then the result is just a basic data type `bool`. But a `Value` struct is hypothetically intended to calculate & store values which could change throughout run-time and I want the `&&` operation to reflect that. So then the easiest way (I see) is to define another class deriving from `Value` that performs my desired operation on `Value::getValue()` and `Value::getValue()` at any given time. – taehyounpark Oct 08 '21 at 02:04
  • You find it easier to **1)** Define a template to handle this task, **2)** invoke `operator&&` at a time other than when you need its result, and **3)** call something other than `operator&&` when you need the result of `operator&&`; than to simply call `operator&&` when you need its result? – JaMiT Oct 08 '21 at 02:11
  • *"then the result is just a basic data type `bool`."* -- Well, yeah. You said that `LogicalAnd` uses the built-in `&&` operator. The built-in operator returns a `bool`, so the result *is* "just" a `bool`. – JaMiT Oct 08 '21 at 02:14
  • Impressively, yes to all of 1,2,3 because the point of my question is to achieve lazy evaluation of data types such that arbitrary operations can be booked at an initialization step rather than invoking them inside the actual processing of a dataset. In a similar vein, the end results are both "just" bools as you point out, but one is literally evaluated at that line whereas the other is invoked through `getValue()`, which may return different values. – taehyounpark Oct 08 '21 at 02:28
  • Huh? I think the root problem with your question is that you have defined your approach via code rather than text. As is typical for such questions, your code does **not** convey what you think it conveys. Please use your code only as an illustration of the description that precedes the code. Write out an explanation of what you are doing and why you believe it works. Only after you have fully explained your situation in words should you provide your code. The code should serve to fill in gaps in the explanation, not to be the explanation. – JaMiT Oct 08 '21 at 02:34
  • The root cause is that it makes everyone second guess themselves. Look at this comment thread, for example. I had to read your post, your example code, and the entire comment thread, and then I *still* had to think about what your code was doing and why it was doing it. Code is written once and read many times. It's not sustainable if I have to read a thesis to understand what you're doing. – Silvio Mayolo Oct 08 '21 at 02:39
  • Apologies for the confusion and thanks for the comments, I've made edits to the question in the hope that the context of my question is clearer based on the questions and discussions as had in this thread. – taehyounpark Oct 08 '21 at 03:23
  • Your question is clearer, but your code example is a bit of a mess. To get your code to compile without warnings, I had to add a semicolon, add `typename`, change `0.0` to `0`, add a type for `c`, remove `override`, and wrap the final code block inside a function definition. (I also needed to add constructors, which surprised me.) Then there's the issue of your "straightforward implementation" exhibiting infinite recursion, so I suspect a change is warranted there as well. That's a lot of work that should not be asked of the reader. – JaMiT Oct 08 '21 at 16:48

2 Answers

The question is whether this would still be considered losing short-circuit behaviour of the operation.

You still lose short-circuit behaviour, because both sides are evaluated.

If you write something like the following (I'm not sure whether your overload handles rvalues, though):

c = a && Value<int>{ complex_function() };
d = a && compute_value();

complex_function() and compute_value() are still called, even when a.getValue() is false.

It is true, though, that your lazy evaluation might not call rhs.getValue() when a.getValue() is false.
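
As a minimal sketch of the difference, the following compares an overloaded && with the built-in one. All names here (`Flag`, `make_flag`, `make_bool`, `evaluations`, `count_overloaded`, `count_builtin`) are made up for illustration and are not part of the question's code. An overloaded operator is an ordinary function call, so its right-hand argument is evaluated before the body runs.

```cpp
int evaluations = 0;  // counts how often a right-hand operand is computed

struct Flag {         // hypothetical minimal wrapper around bool
    bool value;
};

// Overloaded &&: both arguments are evaluated before this body runs.
bool operator&&(const Flag& lhs, const Flag& rhs) {
    return lhs.value && rhs.value;  // built-in && on bool
}

Flag make_flag() {    // hypothetical stand-in for an expensive computation
    ++evaluations;
    return Flag{true};
}

bool make_bool() {    // same, but yielding a plain bool
    ++evaluations;
    return true;
}

// Right-hand evaluations when the overloaded operator is used.
int count_overloaded() {
    evaluations = 0;
    Flag a{false};
    bool r = a && make_flag();  // make_flag() runs although a is false
    (void)r;
    return evaluations;
}

// Right-hand evaluations when the built-in operator is used.
int count_builtin() {
    evaluations = 0;
    bool a = false;
    bool r = a && make_bool();  // make_bool() is skipped
    (void)r;
    return evaluations;
}
```

Here `count_overloaded()` returns 1 and `count_builtin()` returns 0: only the built-in operator skips the right-hand expression.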

Jarod42

The question is whether this would still be considered losing short-circuit behaviour of the operation.

Yes, you lose the short-circuiting. In the line

c = a&&b;

both a and b are evaluated. This could be important if there is a possibility that evaluating b might be invalid. (Perhaps instead of b, the second operand could be *p where p was initialized as Value<int> * p = nullptr. Do you want to evaluate *nullptr?)

As I see it, it should be at least effectively preserved,

You do get the majority of the performance benefit in your scenario, yes. I'd be wary of calling it "short-circuiting" though (at least without additional qualification), as others would likely misunderstand the intent and infer more than you intend.


Would a use case such as this warrant a "good-enough" practice for overloading these operators?

First of all, this question is tangential to the performance question. Whether or not you should overload && should be based more on how surprising the behavior is to programmers than on how efficient the implementation is. See point 1 of The Three Basic Rules of Operator Overloading in C++. In this respect, I would be surprised to have operator&& return an object that has to be again tested to get a boolean value.

So I would go with: no, this is not a good way to overload the operator. A better approach would be to define a named function that returns this proxy. Also, perhaps instead of having to call a function to evaluate the proxy as a bool, the proxy could define a user-defined conversion function?
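
A rough sketch of that suggestion, under my own assumptions: `lazy_and` and `LazyAnd` are hypothetical names, and `Value` is a trimmed-down version of yours with a constructor, a virtual destructor and a const `getValue()` added so that the example compiles on its own.

```cpp
// Simplified Value<T> (assumption: constructor and const getValue() added).
template <typename T>
struct Value {
    T m_val;
    explicit Value(T v) : m_val(v) {}
    virtual ~Value() = default;
    virtual T getValue() const { return m_val; }
};

// Hypothetical proxy that converts to bool on demand.
template <typename U, typename V>
class LazyAnd {
public:
    LazyAnd(const Value<U>& a, const Value<V>& b) : m_a(&a), m_b(&b) {}

    // User-defined conversion: the built-in && runs here, so
    // m_b->getValue() is skipped when m_a->getValue() converts to false.
    explicit operator bool() const {
        return m_a->getValue() && m_b->getValue();
    }

private:
    const Value<U>* m_a;  // non-owning: operands must outlive the proxy
    const Value<V>* m_b;
};

// Named function instead of operator&&: nobody expects it to short-circuit.
template <typename U, typename V>
LazyAnd<U, V> lazy_and(const Value<U>& a, const Value<V>& b) {
    return LazyAnd<U, V>(a, b);
}
```

With this, `auto c = lazy_and(a, b);` followed by `if (c) { ... }` re-evaluates both operands each time `c` is tested. Because the conversion is `explicit`, the proxy converts in conditions but not silently in arithmetic.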

Another consideration is over-engineering. You've written an auxiliary template and changed how operator&& is used; for what benefit? Your stated justification is "I can get the lazy evaluation I want by calling getValue() of this proxy instance". However, you can get the exact same functionality without the overhead of a proxy. You just need to make your original operator&& non-recursive.

#include <iostream>  // for std::cout

// overload && operator
template <typename U, typename V>
bool operator&&(Value<U>& a, Value<V>& b) {
    return  a.getValue()  &&  b.getValue();  // CHANGE: Call getValue to avoid recursion
}

int main()
{
    Value<double> a{1.0};
    Value<int> b{0};       // CHANGE: initialize with an integer

    std::cout << (a&&b) << '\n'; // 0 means false
    b.m_val = 1; // update the value of b
    std::cout << (a&&b) << '\n'; // 1 means true!!!
}

There might be a reason for your proxy in your full code, but there is no reason to introduce it for this question. You've made the question more complex than necessary, hence less accessible to those who might benefit from it. With the proxy or without, the result is the evaluation of a.getValue() && b.getValue() at the given line. When getValue() returns a type for which operator&& is not overloaded, the result is bool and b.getValue() is invoked only when a.getValue() evaluates to true.

If getValue() returns a type for which operator&& is overloaded, then the result might be something other than bool, but at the same time you lose the short-circuiting you've been striving for. So ignore this case? If you need to handle this case, just change the return type in the operator&& template from bool to auto. The end result is exactly the same as with your proxy approach, but with more familiar syntax and less typing.
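
A minimal sketch of the `auto` variant, assuming C++14 return type deduction (which your own `auto operator&&` already relies on). `Mask`, `int_case` and `mask_case` are hypothetical names used only to show a type with its own operator&&, and `Value` again has a constructor added so the example compiles on its own.

```cpp
#include <type_traits>

// Simplified Value<T> (assumption: constructor and virtual destructor added).
template <typename T>
struct Value {
    T m_val;
    explicit Value(T v) : m_val(v) {}
    virtual ~Value() = default;
    virtual T getValue() { return m_val; }
};

// Hypothetical type that overloads && itself (bitwise AND of its bits).
struct Mask {
    unsigned bits;
};
Mask operator&&(const Mask& l, const Mask& r) { return Mask{l.bits & r.bits}; }

// Return type deduced from whatever && means for U and V.
template <typename U, typename V>
auto operator&&(Value<U>& a, Value<V>& b) {
    return a.getValue() && b.getValue();
}

bool int_case() {
    Value<int> a{1};
    Value<int> b{0};
    static_assert(std::is_same<decltype(a && b), bool>::value,
                  "built-in && yields bool");
    return a && b;  // built-in && inside: b.getValue() only runs if a is nonzero
}

unsigned mask_case() {
    Value<Mask> a{Mask{0xC}};
    Value<Mask> b{Mask{0xA}};
    static_assert(std::is_same<decltype(a && b), Mask>::value,
                  "overloaded && yields Mask");
    return (a && b).bits;  // both getValue() calls run: no short-circuit
}
```

In this sketch `int_case()` returns false and `mask_case()` returns 8, that is 0xC & 0xA.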

As far as this question goes, your approach is over-engineered. So, no, it is not good practice for overloading.

JaMiT