18

I'm new to programming and have a question about using multiple operators on a single line.

Say, I have

int x = 0;
int y = 1;
int z = 2;

In this example, I can use a chain of assignment operators: x = y = z;

Yet how come I can't use: x < y < z;?

  • 31
    You **can** use `x < y < z`. – Marc Glisse Oct 12 '19 at 20:12
  • FWIW, changing the behaviour of `x < y < z` to be more like Python has been proposed ("Consistent Comparisons" is the name that comes to mind). You could look more into that to see if there are public notes on why it was rejected. – chris Oct 12 '19 at 20:13
  • 1
    [Is (4 > y > 1) a valid statement in C++? How do you evaluate it if so?](https://stackoverflow.com/q/8889522/995714), [Why does (0 < 5 < 3) return true?](https://stackoverflow.com/q/4089284/995714) – phuclv Oct 13 '19 at 05:11
  • 4
    There is an arbitrary choice to make here. You can either decide to make the rules consistent between all your operators in your language, so `x < y < z` is treated just like `x+y+z` so you compute `x+y` and then you `+z` to its result, or you define that syntax to be consistent with the usual mathematical notation which means you treat comparison operators inconsistently with respect to other operators and you make `x < y < z` mean `tmp = y; x < tmp and tmp < z` (I add `tmp` to signify that the `y` expression is only evaluated once). This is an arbitrary choice. – Bakuriu Oct 13 '19 at 08:52
  • @Bakuriu: A language where comparison operators would yield a Boolean result which is not usable with other comparison operators could be extended without creating ambiguity by having them yield a `__ComparisonThanResult` type which is convertible to Boolean but encapsulates both the comparison result and a const-ref to the right-hand operand, and overload comparison operators to accept that type [so `{result,&rhs} < value` would be equivalent to `{result && (rhs < value) : &rhs}`. – supercat Oct 13 '19 at 18:14
  • One can do `(x < y) == (y < z)` or use `!=` etc – Jay Oct 27 '19 at 23:04

8 Answers

22

You can do that, but the results will not be what you expect.

bool can be implicitly converted to int, in which case false becomes 0 and true becomes 1.

Let's say we have the following:

int x = -2;
int y = -1;
int z = 0;

Expression x < y < z will be evaluated as such:

x < y < z
(x < y) < z
(-2 < -1) < 0
(true) < 0
1 < 0
false
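The evaluation steps above can be checked at compile time. The following is a minimal sketch; the helper names `chained_less` and `both_less` are made up for illustration, and the explicit cast only spells out the promotion that the language performs implicitly.

```cpp
// Hypothetical helper names, for illustration only.

// What the compiler computes for `x < y < z`: the bool produced by (x < y)
// is promoted to int (0 or 1) and that number is compared with z.
constexpr bool chained_less(int x, int y, int z)
{
    return static_cast<int>(x < y) < z;
}

// What the mathematical notation x < y < z means.
constexpr bool both_less(int x, int y, int z)
{
    return x < y && y < z;
}

// The values from the answer: (true) < 0 is 1 < 0, which is false.
static_assert(!chained_less(-2, -1, 0), "chained form is false here");
static_assert(both_less(-2, -1, 0), "mathematical meaning is true here");

// The two forms can also disagree the other way round:
// (5 < 1) is 0, and 0 < 3 is true, although 5 < 1 < 3 is mathematically false.
static_assert(chained_less(5, 1, 3), "chained form is true here");
static_assert(!both_less(5, 1, 3), "mathematical meaning is false here");
```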

Operator = behaves differently: it is right-to-left associative and returns its left-hand operand (after the assignment), so you can chain it:

x = y = z
x = (y = z)
//y holds the value of z now
x = (y)
//x holds the value of y now
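A small sketch of both properties, assuming nothing beyond the built-in rules for int; the function names are hypothetical. The second function shows that the value of an assignment expression is the left operand after the assignment, so a double right-hand side is converted to int first.

```cpp
// Hypothetical demo functions, for illustration only.

// x = y = z groups as x = (y = z), so both x and y end up equal to z.
inline bool chain_assign_makes_all_equal(int z)
{
    int x = 0;
    int y = 1;
    x = y = z;
    return x == z && y == z;
}

// The assignment yields the left operand (an int here), not the original
// right-hand value, so the fractional part is already gone.
inline double value_of_assignment(double source)
{
    int y = 0;
    double result = (y = source);  // y receives the truncated value
    return result;
}
```

With these definitions, `value_of_assignment(3.5)` returns 3.0 rather than 3.5.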

gcc gives me the following warning after trying to use x < y < z:

prog.cc:18:3: warning: comparisons like 'X<=Y<=Z' do not have their mathematical meaning [-Wparentheses]
   18 | x < y < z;
      | ~~^~~

Which is pretty self-explanatory. It works, but not as one may expect.



Note: A class can define its own operator=, which may also do unexpected things when chained (nothing says "I hate you" better than an operator which doesn't follow basic rules and idioms). Fortunately, this cannot be done for primitive types like int.

#include <iostream>

class A
{
public:
    A& operator= (const A& other) 
    {
        n = other.n + 1;
        return *this;
    }

    int n = 0;
};

int main()
{
    A a, b, c;
    a = b = c;
    std::cout << a.n << ' ' << b.n << ' ' << c.n; //2 1 0, these objects are not equal!
}

Or even simpler:

class A
{
public:
    void operator= (const A& other) 
    {
    }

    int n = 0;
};

int main()
{
    A a, b, c;
    a = b = c; //doesn't compile
}
Yksisarvinen
  • 3
    The `=` operator does not return the same type and same value that was “provided” to it. The result of an assignment expression is an lvalue referring to the left operand. When the left operand is not of class type, the value assigned is the right operand converted to the type of the left operand. For example, given `int y`, the value of `y = 3.5` is 3, not 3.5. – Eric Postpischil Oct 12 '19 at 22:28
  • 1
    @EricPostpischil Changed the wording. – Yksisarvinen Oct 13 '19 at 10:00
14

x = y = z

You can think of the built-in assignment operator, =, for fundamental types as returning a reference to the object being assigned to. That's why it's not surprising that the above works.

y = z returns a reference to y, then
x = y

x < y < z

The "less than" operator, <, returns true or false which would make one of the comparisons compare against true or false, not the actual variable.

x < y returns true or false, then
true or false < z where the boolean gets promoted to int which results in
1 or 0 < z


Workaround:

x < y < z should be written:
x < y && y < z

If you do this kind of manual BinaryPredicate chaining a lot, or have a lot of operands, it's easy to make mistakes and forget a condition somewhere in the chain. In that case, you can create helper functions to do the chaining for you. Example:

// matching exactly two operands
template<class BinaryPredicate, class T>
inline bool chain_binary_predicate(BinaryPredicate p, const T& v1, const T& v2)
{
    return p(v1, v2);
}

// matching three or more operands
template<class BinaryPredicate, class T, class... Ts>
inline bool chain_binary_predicate(BinaryPredicate p, const T& v1, const T& v2,
                                   const Ts&... vs)
{
    return p(v1, v2) && chain_binary_predicate(p, v2, vs...);
}

And here's an example using std::less:

// bool r = 1<2 && 2<3 && 3<4 && 4<5 && 5<6 && 6<7 && 7<8
bool r = chain_binary_predicate(std::less<int>{}, 1, 2, 3, 4, 5, 6, 7, 8); // true
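
For the specific case of checking that a list of values is strictly increasing, the standard library can do the pairwise walk instead of a recursive template. This is an alternative sketch, not the helper shown above; `strictly_increasing` is a hypothetical name and the function is restricted to int to keep it short.

```cpp
#include <algorithm>
#include <functional>
#include <initializer_list>

// Hypothetical helper: true if every value is strictly less than the next one.
inline bool strictly_increasing(std::initializer_list<int> values)
{
    // adjacent_find returns the first neighbouring pair (a, b) with a >= b.
    // If no such pair exists, the whole sequence is strictly increasing.
    return std::adjacent_find(values.begin(), values.end(),
                              std::greater_equal<int>{}) == values.end();
}
```

For example, `strictly_increasing({1, 2, 3, 4, 5, 6, 7, 8})` is true, while `strictly_increasing({1, 3, 2})` is false.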
Ted Lyngmo
  • Wait. That is really confusing to me. What do you mean operator= –  Oct 12 '19 at 20:18
  • @JamesMiller When you assign a value to something using `=` in class types, what happens is that the member function `T& operator=(const T&);` (or similar) is called. Fundamental types, like `int`, have this built in, but it helps to think that the assignment will return a reference to the object being assigned to even in those cases. I removed `operator=` from the answer. It's not needed to answer the question. – Ted Lyngmo Oct 12 '19 at 20:22
  • are these member functions built into the C++ library for basic data types? –  Oct 12 '19 at 20:25
  • and the core languages themselves cannot be changed? (not talking about overriding functions) –  Oct 12 '19 at 20:27
  • @JamesMiller Exactly! – Ted Lyngmo Oct 12 '19 at 20:28
  • Don't booleans benefit from integer promotion ?? – Christophe Oct 13 '19 at 13:09
  • @Christophe Yes, I added that step to the explanation too. – Ted Lyngmo Oct 13 '19 at 15:44
5

It is because you see those expressions as "chain of operators", but C++ has no such concept. C++ will execute each operator separately, in an order determined by their precedence and associativity (https://en.cppreference.com/w/cpp/language/operator_precedence).

(Expanded after C Perkins's comment)

James, your confusion comes from looking at x = y = z; as some special case of chained operators. In fact it follows the same rules as every other case.

This expression behaves like it does because the assignment = is right-to-left associative and returns its left-hand operand (as an lvalue holding the assigned value). There are no special rules, so don't expect them for x < y < z.
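
The result types and the grouping can be inspected directly. The snippet below is a sketch that relies only on `<type_traits>` and `<utility>`; the names `left_to_right` and `right_to_left` are hypothetical.

```cpp
#include <type_traits>
#include <utility>

// Built-in assignment to an int lvalue yields an int lvalue (int&) ...
static_assert(std::is_same<decltype(std::declval<int&>() = 0), int&>::value,
              "assignment yields an lvalue of the left operand's type");

// ... whereas a built-in comparison yields a plain bool.
static_assert(std::is_same<decltype(1 < 2), bool>::value,
              "comparison yields bool");

// Binary - is left-to-right associative: 10 - 4 - 3 is (10 - 4) - 3.
constexpr int left_to_right = 10 - 4 - 3;
static_assert(left_to_right == 3, "grouped as (10 - 4) - 3");

// = is right-to-left associative: x = y = z is x = (y = z).
inline int right_to_left()
{
    int x = 1;
    int y = 2;
    int z = 3;
    x = y = z;
    return x + y;  // both x and y now hold 3
}
```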

By the way, x == y == z will not work the way you might expect either.

See also this answer.

g.kertesz
  • Words matter. And if we really consider what "chain of operators" means, then what about a construct present in every C++ hello world program: `cout << "This " << " is a " << "chain of stream insertion operators"`. `x = y = z` already pointed out in the question is a "chain of assignment operators". **C++ definitely has that concept**, just that it is perhaps different than other modern languages and/or many people's intuitive idea of what it means. – C Perkins Oct 13 '19 at 07:24
  • C Perkins, looks like we disagree on the meaning of "concept". Obviously you can have a chain of identical operations. But C++ just does not care. The overloaded << obeys the same precedence and associativity rules, see the same link, "Operator precedence is unaffected by operator overloading." Chaining works in this case because the overloaded << returns the left hand argument, not because of some hidden "chain rule" for successive identical operators. – g.kertesz Oct 13 '19 at 08:31
  • Like I said, words matter. Your explanation in the comment using the phrase "chain rule" was much clearer and more direct than what's in the answer, even after the edit. "Chain of operators" still accurately describes the syntactical use, which exists in many (most?) languages. A programming language involves more than technical compiler implementations. C++ operators and standard libraries were designed to support such chaining, so the concept is integral despite operators being implemented according to precedence and associativity rules. – C Perkins Oct 13 '19 at 16:23
  • 1
    @CPerkins They are saying that it is a chain of operators, but that it is not a specific concept in C++ in the way that they think (as a special case). The way the compiler reads your code is not a technical detail by any means, it's a fundamental part of the language. If OP were to understand the basics of how expressions are parsed, which is beginner concept, they would know the answer to their question. – Burak Oct 28 '19 at 17:04
5

C and C++ don't actually have the idea of "chained" operations. Each operation has a precedence, and they just follow the precedence using the results of the last operation like a math problem.

Note: I go into a low level explanation which I find to be helpful.

If you want to read a historical explanation, Davislor's answer may be helpful to you.

I also put a TL;DR at the bottom.


For example, std::cout isn't actually chained:

std::cout << "Hello!" << std::endl;

Is actually using the property that << evaluates from left to right and reusing a *this return value, so it actually does this:

std::ostream &tmp = std::operator<<(std::cout, "Hello!");
tmp.operator<<(std::endl);

(This is one reason printf can be faster than std::cout for non-trivial output: a single printf call replaces several operator<< calls.)
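
The same mechanism works for any user-defined type. As a sketch, here is a hypothetical `Collector` type (not part of the standard library) whose `operator<<` returns `*this`; the "chained" spelling and the explicit call spelling do exactly the same thing.

```cpp
#include <string>

// Hypothetical type for illustration: appends every piece it is given.
struct Collector {
    std::string text;

    // Returning *this is the only thing that makes "chaining" possible.
    Collector& operator<<(const std::string& piece)
    {
        text += piece;
        return *this;
    }
};

// Operator syntax: groups as ((c << "a") << "b") << "c".
inline std::string chained()
{
    Collector c;
    c << "a" << "b" << "c";
    return c.text;
}

// The same calls written out as ordinary member function calls.
inline std::string spelled_out()
{
    Collector c;
    c.operator<<("a").operator<<("b").operator<<("c");
    return c.text;
}
```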

You can actually see this in the generated assembly (with the right flags):

#include <iostream>

int main(void)
{
    std::cout << "Hello!" << std::endl;
}

clang++ --target=x86_64-linux-gnu -Oz -fno-exceptions -fomit-frame-pointer -fno-unwind-tables -fno-PIC -masm=intel -S

I am showing x86_64 assembly below, but don't worry, I documented it explaining each instruction so anyone should be able to understand.

I demangled and simplified the symbols. Nobody wants to read std::basic_ostream<char, std::char_traits<char> > 50 times.

    # Logically, read-only code data goes in the .text section. :/
    .globl main
main:
    # Align the stack by pushing a scratch register.
    # Small ABI lesson:
    # Functions must have the stack 16 byte aligned, and that
    # includes the extra 8 byte return address pushed by
    # the call instruction.
    push   rax

    # Small ABI lesson:
    # On the System-V (non-Windows) ABI, the first two
    # function parameters go in rdi and rsi. 
    # Windows uses rcx and rdx instead.
    # Return values go into rax.

    # Move the reference to std::cout into the first parameter (rdi)

    # "offset" yields the address of a symbol as an immediate value.
    # Because of -fno-PIC the address fits in 32 bits, so the
    # compiler can load it into edi instead of rdi.
    mov    edi, offset std::cout

    # Move the pointer to our string literal into the second parameter (rsi/esi)
    mov    esi, offset .L.str

    # rax = std::operator<<(rdi /* std::cout */, rsi /* "Hello!" */);
    call   std::operator<<(std::ostream&, const char*)

    # Small ABI lesson:
    # In almost all ABIs, member function calls are actually normal
    # functions with the first argument being the 'this' pointer, so this:
    #   Foo foo;
    #   foo.bar(3);
    # is actually called like this:
    #   Foo::bar(&foo /* this */, 3);

    # Move the returned reference to the 'this' pointer parameter (rdi).
    mov     rdi, rax

    # Move the address of std::endl to the first 'real' parameter (rsi/esi).
    mov     esi, offset std::ostream& std::endl(std::ostream&)

    # rax = rdi.operator<<(rsi /* std::endl */)
    call    std::ostream::operator<<(std::ostream& (*)(std::ostream&))

    # Zero out the return value.
    # On x86, `xor dst, dst` is preferred to `mov dst, 0`.
    xor     eax, eax

    # Realign the stack by popping to a scratch register.
    pop     rcx

    # return eax
    ret

    # Bunch of generated template code from iostream

    # Logically, text goes in the .rodata section. :/
    .section .rodata
.L.str:
    .asciz "Hello!"

Anyways, the = operator is a right to left operator.

struct Foo {
    Foo();
    // Don't forget Foo(const Foo&); (Rule of 3)
    Foo& operator=(const Foo& other);
    int x; // avoid any cheating
};

void set3Foos(Foo& a, Foo& b, Foo& c)
{
    a = b = c;
}

This is equivalent to:

void set3Foos(Foo& a, Foo& b, Foo& c)
{
    // a = (b = c)
    Foo& tmp = b.operator=(c);
    a.operator=(tmp);
}

Note: This is why the Rule of 3/Rule of 5 is important, and why inlining these is also important:

set3Foos(Foo&, Foo&, Foo&):
    # Align the stack *and* save a preserved register
    push    rbx
    # Backup `a` (rdi) into a preserved register.
    mov     rbx, rdi
    # Move `b` (rsi) into the first 'this' parameter (rdi)
    mov     rdi, rsi
    # Move `c` (rdx) into the second parameter (rsi)
    mov     rsi, rdx
    # rax = rdi.operator=(rsi)
    call    Foo::operator=(const Foo&)
    # Move `a` (rbx) into the first 'this' parameter (rdi)
    mov     rdi, rbx
    # Move the returned Foo reference `tmp` (rax) into the second parameter (rsi)
    mov     rsi, rax
    # rax = rdi.operator=(rsi)
    call    Foo::operator=(const Foo&)
    # Restore the preserved register
    pop     rbx
    # Return
    ret

These "chain" because they all return the same type.

But < returns bool.

bool isInRange(int x, int y, int z)
{
    return x < y < z;
}

It evaluates from left to right:

bool isInRange(int x, int y, int z)
{
    bool tmp = x < y;
    bool ret = (tmp ? 1 : 0) < z;
    return ret;
}

isInRange(int, int, int):
    # ret = 0 (we need manual zeroing because setl doesn't zero for us)
    xor    eax, eax
    # (compare x, y)
    cmp    edi, esi
    # ret = ((x < y) ? 1 : 0);
    setl   al
    # (compare ret, z)
    cmp    eax, edx
    # ret = ((ret < z) ? 1 : 0);
    setl   al
    # return ret
    ret

TL;DR:

x < y < z is pretty useless.

You probably want the && operator if you want to check x < y and y < z.

bool isInRange(int x, int y, int z)
{
    return (x < y) && (y < z);
}

Because && short-circuits, this behaves like:

bool isInRange(int x, int y, int z)
{
    if (!(x < y))
        return false;
    return y < z;
}
EasyasPi
  • The point about chaining stream operation is inaccurate. The correct interpretation of the expression according to the standard should be chained function calls: `std::ostream::operator<<(std::ostream::operator<<(std::cout, "Hello!"), std::endl);` – Christophe Oct 13 '19 at 13:49
  • `std::ostream& std::operator<< >(std::ostream&, char const*)` and `std::ostream::operator<<(std::ostream& (*)(std::ostream&))` are the functions called in the assembly – EasyasPi Oct 13 '19 at 14:12
  • The rules of the standard are the only thing that matter here. Any reference to an assembly is necessarily implementation specific and only true for sure for this specific compiler and version. Another compiler could generate a different code, as long as it produces the same results than the standard. – Christophe Oct 13 '19 at 14:59
2

The historical reason for this is that C++ inherited these operators from C, which inherited them from an earlier language named B, which was based on BCPL, based on CPL, based on Algol.

Algol introduced “assignations” in 1968, which made assignments into expressions that returned a value. This allowed an assignment statement to pass its result along to the right-hand side of another assignment statement. This allowed chaining assignments. The = operator had to be parsed from right to left for this to work, which is the opposite of every other operator, but programmers had been used to that quirk since the ’60s. All the C-family languages inherited this, and C introduced a few others that work the same way.

The reason that serious bugs like if (euid = 0) or a < b < c compile at all is because of a simplification made in BCPL: truth values and numbers have the same type and can be used interchangeably. The B in BCPL stood for “Basic,” and the way it made itself so simple was to ditch the type system. All expressions were weakly-typed and the size of a machine register. Just one set of operators &, |, ^ and ~ did double duty for both integer and Boolean expressions, which let the language eliminate the Boolean type. Thus, a < b < c converts a < b into the numeric value of true or false, and compares that to c. In order for ~ to work as both bitwise and logical not, BCPL needed to define true as ~false, which is ~0. On most machines, that represents -1, but on some, it could be INT_MIN, a trap value, or -0. So, you could pass the “rvalue” of true to an arithmetic expression, but it wouldn’t be meaningful.

B, the predecessor of C, decided to keep the general idea, but go back to the Algol value of 1 for TRUE. This meant that ~ no longer changed TRUE to FALSE or vice versa. Since B didn’t have strong typing that could determine at compile time whether to use logical or bitwise not, it needed to create a separate ! operator. It also defined all nonzero integer values as truthy. It kept using bitwise & and |, even though these were now broken (1&2 is false even though both operands are truthy).

C added the && and || operators, to allow short-circuit optimization and, secondarily, to fix that problem with AND. It chose not to add a logical-xor, true to their philosophy of letting us shoot ourselves in the foot, so ^ breaks if we use it on a pair of different truthy numbers. (If you want a robust logical-xor, !!p ^ !!q.) Then, the designers made the very dubious choice not to add back a Boolean type, even though they had completely undone every benefit of eliminating it in the first place, and not having one now made the language more complicated, not less. Both C++ and the C standard library would later define bool, but by then it was too late. They were stuck with three more operators than they’d started with, and they had made typing = when you meant == into a deadly trap that has caused many security bugs.
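
The two pitfalls mentioned above (bitwise & on truthy values, and ^ on different truthy values) can be reproduced in a few lines. This is a sketch with hypothetical function names; each function converts its result to bool the way an if condition would.

```cpp
// Hypothetical helpers, for illustration only.

// Treating the result of bitwise & as a truth value.
constexpr bool bitwise_and_truth(int p, int q) { return (p & q) != 0; }

// The logical operator that C added.
constexpr bool logical_and(int p, int q) { return p && q; }

// Treating the result of bitwise ^ as a truth value.
constexpr bool bitwise_xor_truth(int p, int q) { return (p ^ q) != 0; }

// The robust logical xor from the text: normalize to 0/1 first.
constexpr bool logical_xor(int p, int q) { return !!p ^ !!q; }

// 1 and 2 are both truthy, yet 1 & 2 is 0.
static_assert(!bitwise_and_truth(1, 2), "1 & 2 == 0");
static_assert(logical_and(1, 2), "1 && 2 is true");

// 1 ^ 2 is 3 (truthy), although both operands are truthy.
static_assert(bitwise_xor_truth(1, 2), "1 ^ 2 == 3");
static_assert(!logical_xor(1, 2), "both truthy, so logical xor is false");
static_assert(logical_xor(0, 2), "exactly one truthy operand");
```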

Modern compilers try to mitigate the problems by assuming that any use of =, < and so on that violates most coding standards is probably a typo, and at least warning you about it. If you really meant to do that—one common example is if (errcode = library_call()) to both check if the call failed and save the error code in case it did—the convention is that an extra pair of parentheses tells the compiler you really meant it. So, a compiler would accept if ( 0 != (errcode = library_call()) ) without complaint. In C++17, you could also write if ( const auto errcode = library_call() ) or if ( const auto errcode = library_call(); errcode != 0 ). Similarly, the compiler would accept (foo < bar) < baz, but what you probably meant is foo < bar && bar < baz.
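
As a sketch of the parenthesized form, assuming a made-up `library_call` that returns 0 on success (the real function in the text is unspecified, and the `succeed` parameter and error code 42 exist only so the example can be exercised):

```cpp
#include <string>

// Hypothetical stand-in for a library function: returns 0 on success
// and a nonzero error code otherwise.
inline int library_call(bool succeed)
{
    return succeed ? 0 : 42;
}

// Hypothetical caller that both saves and checks the error code.
inline std::string report(bool succeed)
{
    int errcode = 0;
    // The extra parentheses and the explicit comparison signal that the
    // assignment is intentional, not a mistyped ==.
    if (0 != (errcode = library_call(succeed)))
        return "failed with code " + std::to_string(errcode);
    return "ok";
}
```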

Davislor
1

Even though it looks like you are assigning to multiple variables at the same time, it is actually a chain of sequential assignments. Specifically, y = z is evaluated first. The built-in = operator assigns the value of z to y and then returns an lvalue reference to y (source). That reference is then used to assign to x. So the code is basically equivalent to this

y = z;
x = y;

Applying the same logic to the comparison statement, with the difference that this one is evaluated left to right (source), we get the equivalent of

const bool first_comparison = x < y;
first_comparison < z;

Now, bool can be cast to int, but that is not what you want most of the time. As to why the language doesn't do what you want, it's because these operators are only defined as binary operators. Chained assignment just works because it can spare the return value so it was designed to return a reference to enable these semantics, but comparisons are required to return a bool and therefore they cannot be chained in a meaningful way without introducing new potentially breaking features to the language.

patatahooligan
  • I believe for the record `x = (y = z)` is valid and so is `x < (y < z)` but the latter will decay to a bool so you lose precision in the expression (y < z) becomes 0 or 1 but as a BOOL, yet one could do something like `x < (y|z)` or better yet `x &= (y &= z)` which is also valid with other binary ops and will not decay unless the operator is overridden to do so. – Jay Oct 27 '19 at 22:48
  • And one could emulate this with `(x < y) == (y < z)` – Jay Oct 27 '19 at 23:02
1

You can use x<y<z, but it does not give the result that you expect!

x<y<z is evaluated as (x<y)<z. Then x<y results in a boolean that will be either true or false. When you try to compare a boolean with the integer z, it gets integer promotion, with false being 0 and true being 1 (this is clearly defined by the C++ standard).

Demonstration:

int x=1,y=2,z=3;
cout << "x<y:   "<< (x<y) << endl;  // 1 since 1 is smaller than 2
cout << "x<y<z: "<< (x<y<z) <<endl; // 1 since boolean (x<y) is true, which is 
                                    //   promoted to 1, which is smaller than 3

z=1; 
cout << "x<y<z: "<< (x<y<z) <<endl; // 0 since boolean (x<y) is true, which is
                                    //   promoted to 1, which is not smaller than 1 

You can use x=y=z, but it might not be what you expect either!

Be aware that = is the assignment operator and not the comparison for equality! = works right to left, copying the value on the right into the "lvalue" on the left. So here, it copies the value of z into y, then copies the value in y into x.

If you use this expression in a conditional (if, while, ...), it will be true if x ends up nonzero and false otherwise; since x ends up holding the value of z, the initial values of x and y do not matter.

Demonstration:

int x=1,y=2,z=3;  

if (x=y=z) 
    cout << "Ouch! it's true and now all variables are 3" <<endl; 

z=0; 
if (x=y=z)
    cout <<"Whatever"<<endl; 
else 
    cout << "Ouch! it's false and now all the variables are 0"<<endl; 

You can use x==y==z, but it might still not be what you expect!

Same as for x<y<z, except that the comparison is for equality. So you'll end up comparing a promoted boolean with an integer value, and not checking at all that all values are equal!
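
A compile-time sketch of that behaviour, with hypothetical helper names; the cast only makes the implicit promotion visible.

```cpp
// Hypothetical helpers, for illustration only.

// What `x == y == z` computes: (x == y) gives 0 or 1, which is compared with z.
constexpr bool chained_equal(int x, int y, int z)
{
    return static_cast<int>(x == y) == z;
}

// What was probably intended.
constexpr bool all_equal(int x, int y, int z)
{
    return x == y && y == z;
}

// All three are equal, yet (2 == 2) is 1, and 1 == 2 is false.
static_assert(!chained_equal(2, 2, 2), "1 == 2 is false");
static_assert(all_equal(2, 2, 2), "all three values are equal");

// Not all equal, yet (5 == 5) is 1, and 1 == 1 is true.
static_assert(chained_equal(5, 5, 1), "1 == 1 is true");
static_assert(!all_equal(5, 5, 1), "the third value differs");
```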

Conclusions

If you want to compare more than 2 items in a chained way, just rewrite the expression, comparing the terms two by two:

(x<y && y<z)     // same truth value as the mathematical x<y<z  
(x==y && y==z)   // true if and only if all three terms are equal

Chaining the assignment operator is allowed, but tricky. It is sometimes used to initialize several variables at once. But it's not to be recommended as a general practice.

int i, j; 
for (i=j=0; i<10 && j<5; j++)      // trick !! 
    j+=2;  

for (int i=0, j=0; i<10 && j<5; j++)  // declaring both in the init-statement is cleaner
    j+=2;  
Christophe
  • `(x < y) == (y < z)` or a variation with `!=` would also work; there is also `(x < y && y < z)`. Which is cleaner, better or faster depends on the situation. – Jay Oct 28 '19 at 04:14
0

I can use x = y = z. Why not x < y < z?

You're essentially asking about syntax-idiomatic consistency here.

Well, just take consistency in the other direction: You should just avoid using x = y = z. After all, it is not an assertion that x, y and z are equal; it is two consecutive assignments. And because it is reminiscent of a statement of equality, this double assignment is a bit confusing.

So, just write:

y = z;
x = y;

instead, unless there's a very particular reason to push everything into a single statement.

einpoklum