25

I've read that lvalues are "things with a defined storage location".

And also that literals and temporary variables are not lvalues, but no reason is given for this statement.

Is it because literals and temporary variables do not have defined storage location? If yes, then where do they reside if not in memory?

I suppose there is some significance to "defined" in "defined storage location", if there is (or is not) please let me know.

roschach
pasha
  • 1
    It would be helpful if there is some good reference explaining lvalue and rvalues – pasha Feb 12 '19 at 15:20
  • 7
    I find an easier rule of thumb is that lvalues can be named and their address can be taken. Maybe that's what they meant by "things with a defined storage location". They have an address that can be obtained. – François Andrieux Feb 12 '19 at 15:20
  • 1
    @pasha [Value categories](https://en.cppreference.com/w/cpp/language/value_category). – François Andrieux Feb 12 '19 at 15:21
  • 1
    temporaries do not need a defined storage location by definition. Consider `a = foo();` there is no need to store the result of `foo()` anywhere but in `a` – 463035818_is_not_an_ai Feb 12 '19 at 15:21
  • 17
    _"I've read that lvalues are 'things with a defined storage location.'"_ That's an oversimplification, and the problem with oversimplifications is that they are, sometimes, wrong. You should find the proper definition instead. – Lightness Races in Orbit Feb 12 '19 at 15:21
  • @Blaze you could argue that this is covered by the as-if rule. You can still get that behavior with LValues. –  Feb 12 '19 at 15:22
  • 1
    @pasha, [amen!](https://stackoverflow.com/questions/3601602/what-are-rvalues-lvalues-xvalues-glvalues-and-prvalues). – Duck Dodgers Feb 12 '19 at 15:24
  • @Frank true that. It was more of an answer to the "if it's not in memory, where else can it be" bit, which is of course only touching this tangentially. – Blaze Feb 12 '19 at 15:24
  • @Blaze, agreed 5 is stored in the operation, but operation itself should be stored in memory, right? (this makes sense if I think in terms of what Francois commented, they have an address that can be taken) – pasha Feb 12 '19 at 15:25
  • @FrançoisAndrieux thanks for the reference, it might take some time for me to read and understand. – pasha Feb 12 '19 at 15:26
  • 1
    @pasha If the operand is cooked into the assembly, then it doesn't really have any storage. [Example](https://godbolt.org/z/VNfo8t). Edit : I mean, it doesn't have any storage that's accessible from c++. – François Andrieux Feb 12 '19 at 15:28
  • @FrançoisAndrieux It was clear until I saw the comment in one of the answers by Jarod42 mentioning, const int& v = 5;, Does 5 have a storage in this case? – pasha Feb 12 '19 at 15:47
  • 1
    *Value category* is a property of an **expression**, not of a "variable", "literal" or some other entity, as has been stated many times already. There's no such thing as "temporary variables and literals are not lvalues" in C++. This wording makes no sense at all. – AnT stands with Russia Feb 12 '19 at 15:47
  • 1
    @pasha When you initialize a `const` reference with `5`, that reference has a name and using that name is an lvalue expression. Edit : Taking a reference to something is not the same as taking its address. – François Andrieux Feb 12 '19 at 15:49
  • @AnT Is it possible to craft an expression where a literal would be an lvalue? Same question with a temporary. If not, then "temporary variables and literals are not lvalues" is a simplification that makes sense to me. – YSC Feb 12 '19 at 15:58

5 Answers

22

And also that literals and temporary variables are not lvalues, but no reason is given for this statement.

This is true for all temporaries and literals except for string literals. Those are actually lvalues (which is explained below).

Is it because literals and temporary variables do not have a defined storage location? If yes, then where do they reside if not in memory?

Yes. The literal 2 doesn't actually exist; it is just a value in the source code. Since it's a value, not an object, it doesn't have to have any memory associated with it. It can be hard-coded into the assembly that the compiler creates, or it could be put somewhere, but since it doesn't have to be, all you can do is treat it as a pure value, not an object.

There is an exception though, and that is string literals. Those actually have storage, since a string literal is an array of type const char[N]. You can take the address of a string literal and a string literal can decay into a pointer, so it is an lvalue, even though it doesn't have a name.
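A minimal sketch of the difference between the two kinds of literal, assuming a C++17 compiler. The helper names `Hello`, `address_of_hello` and `first_char_of_hello` are made up for illustration. It relies on the rule that `decltype((e))` yields `T&` for an lvalue expression and plain `T` for a prvalue.

```cpp
#include <type_traits>

// decltype((e)) is T& when e is an lvalue and plain T when e is a prvalue.
static_assert(std::is_same_v<decltype(("hello")), const char (&)[6]>);  // lvalue
static_assert(std::is_same_v<decltype((2)), int>);                      // prvalue

// Because "hello" is an lvalue, its address can be taken.
using Hello = const char[6];  // 5 characters plus the terminating '\0'
inline Hello* address_of_hello() { return &"hello"; }

// Array-to-pointer decay also works on it.
inline const char* first_char_of_hello() { return "hello"; }
```

Writing `&2` in place of `&"hello"` would be rejected by the compiler, which is the practical consequence of `2` being a prvalue.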

Temporaries are also rvalues. Even if they exist as objects, their storage location is ephemeral. They only last until the end of the full expression they are in. You are not allowed to take their address and they also do not have a name. They might not even exist: for instance, in

Foo a = Foo();

The Foo() can be removed and the code semantically transformed to

Foo a(); // you can't actually do this since it declares a function with that signature.

so now there isn't even a temporary object in the optimized code.
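One way to observe this, sketched under the assumption of C++17 (where this elision is guaranteed; earlier standards only permit it). The `copied` flag and the function `init_made_a_copy` are illustrative additions, not part of the original `Foo`.

```cpp
// Foo records whether it was produced by the copy constructor.
struct Foo {
    bool copied = false;
    constexpr Foo() {}
    constexpr Foo(const Foo&) : copied(true) {}
};

// Returns true if initializing `a` from Foo() went through the copy constructor.
constexpr bool init_made_a_copy() {
    Foo a = Foo();  // C++17: the prvalue Foo() initializes `a` directly
    return a.copied;
}

// No temporary is created and copied; `a` is default-constructed in place.
static_assert(!init_made_a_copy());
```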

NathanOliver
  • why do string literals have storage but not other? Is it because they can't be embedded in single assembly instruction? – pasha Feb 12 '19 at 16:15
  • 10
    @pasha Because C. In C, you can do nothing with a string literal other than working with its address. It has to be in memory. It has to have an address. So C++ adapted. This is really an edge case. – YSC Feb 12 '19 at 16:16
  • 3
    @pasha Unlike other literals, strings don't have a single value. Each character is a value, and since you need to combine all of them together they are stored in an array. Because it is an array, it is no longer an rvalue because it is an object. – NathanOliver Feb 12 '19 at 16:32
10

Why are literals and temporary variables not lvalues?

I have two answers: because it wouldn't make sense (1) and because the Standard says so (2). Let's focus on (1).

Is it because literals and temporary variables do not have a defined storage location?

This is a simplification that doesn't fit here. A simplification that would: literals and temporaries are not lvalues because it wouldn't make sense to modify them¹.

What is the meaning of 5++? What is the meaning of rand() = 0? The Standard says that temporaries and literals are not lvalues so those examples are invalid. And every compiler developer is happier.
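These rejections can be checked at compile time. The following is a sketch assuming C++17; `can_post_increment` is a hypothetical detection trait written for this illustration. Note that `std::declval<int>()` is an xvalue rather than a literal, so it stands in for "some rvalue of type int" here.

```cpp
#include <type_traits>
#include <utility>

// True when `e++` compiles for an expression e of the type and category given by T
// (T = int models an rvalue, T = int& models an lvalue).
template <typename T, typename = void>
struct can_post_increment : std::false_type {};

template <typename T>
struct can_post_increment<T, std::void_t<decltype(std::declval<T>()++)>>
    : std::true_type {};

static_assert(!can_post_increment<int>::value);  // like 5++
static_assert(can_post_increment<int&>::value);  // like n++ for some int n

static_assert(!std::is_assignable_v<int, int>);  // like rand() = 0
static_assert(std::is_assignable_v<int&, int>);  // like n = 0
```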


1) You can define and use user-defined types in a way where the modification of a temporary makes sense. This temporary would live until the evaluation of the full-expression. François Andrieux makes a nice analogy between calling f(MyType{}.mutate()) on one hand and f(my_int + 1) on the other. I think the simplification holds still as MyType{}.mutate() can be seen as another temporary as MyType{} was, like my_int + 1 can be seen as another int as my_int was. This is all semantics and opinion-based. The real answer is: (2) because the Standard says so.

YSC
  • 1
    Your reasoning is a little flawed. Yes, modifying 5 is nonsense but you are allowed to modify temporaries of non built in types. – NathanOliver Feb 12 '19 at 15:37
  • @NathanOliver Yes, so side effects can happen. I think the reasoning (a simplification, let us not forget) holds still: except for specific cases (user-defined types with side effects on modification, string literals, ...) it means nothing to modify a pr-value. – YSC Feb 12 '19 at 15:43
  • 2
    In C, which yes is a different language, you can indeed write `(int){ 5 }++` and that means something; yes there is a particular logic behind that, but it's a possible example of how the two reasons are interlinked - a literal not making sense as an lvalue is essentially down to design decisions made for the Standard - if the Standard *wanted* all values to be lvalues, they would be and it would make sense in that language as-designed. – Alex Celeste Feb 12 '19 at 20:20
  • Another reason this reasoning is flawed is that a string literal *is* an lvalue, but "a" = 0 doesn't make sense. – prl Feb 13 '19 at 04:42
  • @prl Well, yes... like any other _constant_ lvalue (`"a"` has type `char const[2]`)... and this is completely outside of the scope of this answer. In the same vein, `const int n = 0; n = 1;` is illegal, but frankly I don't see where it plays within my explanation. – YSC Feb 13 '19 at 12:53
8

There are a lot of common misconceptions in the question and in the other answers; my answer hopes to address that.

The terms lvalue and rvalue are expression categories. They are terms that apply to expressions, not to objects. (A bit confusingly, the official term for expression categories is "value categories"!)

The term temporary object refers to objects. This includes objects of class type, as well as objects of built-in type. The term temporary (used as a noun) is short for temporary object. Sometimes the standalone term value is used to refer to a temporary object of built-in type. These terms apply to objects, not to expressions.

The C++17 standard is more consistent in object terminology than past standards, e.g. see [conv.rval]/1. It now tries to avoid saying value other than in the context value of an expression.


Now, why are there different expression categories? A C++ program is made up of a collection of expressions, joined to each other with operators to make larger expressions; and fitting within a framework of declarative constructs. These expressions create, destroy, and do other manipulations on objects. Programming in C++ could be described as using expressions to perform operations with objects.

The reason that expression categories exist is to provide a framework for using expressions to express operations that the programmer intends. For example way back in the C days (and probably earlier), the language designers figured that 3 = 5; did not make any sense as part of a program so it was decided to limit what sort of expression can appear on the left-hand side of =, and have the compiler report an error if this restriction wasn't followed.

The term lvalue originated in those days, although now with the development of C++ there is a vast range of expressions and contexts where expression categories are useful, not just the left-hand side of an assignment operator.

Here is some valid C++ code: std::string("3") = std::string("5");. This is conceptually no different from 3 = 5;, however it is allowed. The effect is that a temporary object of type std::string and content "3" is created, and then that temporary object is modified to have content "5", and then the temporary object is destroyed. The language could have been designed so that the code 3 = 5; specifies a similar series of events (but it wasn't).


Why is the string example legal but the int example not?

Every expression has to have a category. The category of an expression might not seem to have an obvious reason at first, but the designers of the language have given each expression a category according to what they think is a useful concept to express and what isn't.

It's been decided that the sequence of events in 3 = 5; as described above is not something anyone would want to do, and if someone did write such a thing then they probably made a mistake and meant something else, so the compiler should help out by giving an error message.

Now, the same logic might conclude that std::string("3") = std::string("5") is not something anyone would ever want to do either. However another argument is that for some other class type, T(foo) = x; might actually be a worthwhile operation, e.g. because T might have a destructor that does something. It was decided that banning this usage could be more harmful to a programmer's intentions than good. (Whether that was a good decision or not is debatable; see this question for discussion).
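The asymmetry can be sketched with type traits, assuming C++17. `LvalueOnly` is a hypothetical class, not something from the standard library: it shows that a class author who does want the `3 = 5;` style of restriction can opt in by ref-qualifying the assignment operator with `&`.

```cpp
#include <string>
#include <type_traits>

// Assigning to an rvalue std::string compiles, because its operator= is an
// ordinary member function that may be called on a temporary.
static_assert(std::is_assignable_v<std::string, std::string>);

// Hypothetical class whose assignment only accepts an lvalue on the left.
struct LvalueOnly {
    LvalueOnly& operator=(const LvalueOnly&) & { return *this; }
};

static_assert(std::is_assignable_v<LvalueOnly&, const LvalueOnly&>);  // a = b
static_assert(!std::is_assignable_v<LvalueOnly, const LvalueOnly&>);  // LvalueOnly{} = b
```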


Now we are getting closer to finally addressing your question :)

Whether or not there is memory or a storage location associated is not the rationale for expression categories any more. In the abstract machine (more explanation of this below), every temporary object (this includes the one created by 3 in x = 3;) exists in memory.

As described earlier in my answer, a program consists of expressions that manipulate objects. Each expression is said to designate or refer to an object.

It's very common for other answers or articles on this topic to make the incorrect claim that an rvalue can only designate a temporary object, or even worse, that an rvalue is a temporary object, or that a temporary object is an rvalue. An expression is not an object; it is something that occurs in source code for manipulating objects!

In fact a temporary object can be designated by an lvalue or an rvalue expression; and a non-temporary object can be designated by an lvalue or an rvalue expression. They are separate concepts.
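The four combinations can be spelled out as a sketch, assuming C++17. The names `Cat`, `cat_v`, `Chain` and `x` are invented for this illustration. The classification uses `decltype((e))`, which gives `T&` for an lvalue, `T&&` for an xvalue and plain `T` for a prvalue (xvalues and prvalues are both rvalues).

```cpp
#include <type_traits>
#include <utility>

enum class Cat { prvalue, xvalue, lvalue };

// Map the type produced by decltype((e)) to the category of e.
template <typename T> constexpr Cat cat_v = Cat::prvalue;
template <typename T> constexpr Cat cat_v<T&&> = Cat::xvalue;
template <typename T> constexpr Cat cat_v<T&> = Cat::lvalue;

// Hypothetical type whose member function returns a reference to *this.
struct Chain {
    Chain& self() { return *this; }
};

int x = 0;  // a non-temporary object

static_assert(cat_v<decltype((x))> == Cat::lvalue);               // lvalue, non-temporary
static_assert(cat_v<decltype((std::move(x)))> == Cat::xvalue);    // rvalue, non-temporary
static_assert(cat_v<decltype((Chain{}.self()))> == Cat::lvalue);  // lvalue, temporary
static_assert(cat_v<decltype((Chain{}))> == Cat::prvalue);        // rvalue, temporary
```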

Now, there's an expression category rule that you can't apply & to an expression of the rvalue category. The purpose of this rule and these categories is to avoid errors where a temporary object is used after it is destroyed. For example:

int *p = &5;    // not allowed due to category rules
*p = 6;         // oops, dangling pointer

But you could get around this:

template<typename T> auto f(T&&t) -> T& { return t; }
// ...
int *p = &f(5); // Allowed: f(5) is an lvalue of type int, so & applies
*p = 6;        // Oops, dangling pointer, no compiler error message.

In this latter code, f(5) and *p are both lvalues that designate a temporary object. This is a good example of why the expression category rules exist; by following the rules without a tricky workaround, then we would get an error for the code that tries to write through a dangling pointer.

Note that you can also use this f to find the memory address of a temporary object, e.g. std::cout << &f(5);
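A compile-time sketch of what `f` does, assuming C++17. The `constexpr` on `f` and the function `read_within_full_expression` are additions made so the checks can run at compile time; the body of `f` is as in the answer. Using the address is only safe inside the full-expression that created the temporary.

```cpp
#include <type_traits>

// Same f as above, marked constexpr for the checks below.
template <typename T>
constexpr auto f(T&& t) -> T& { return t; }

// f(5) is an lvalue of type int, so the address-of operator applies to it.
static_assert(std::is_same_v<decltype(f(5)), int&>);
static_assert(std::is_same_v<decltype(&f(5)), int*>);

// The temporary lives until the end of the full-expression, so reading it
// through its address here is fine. Storing the address for later would dangle.
constexpr int read_within_full_expression() { return *&f(41) + 1; }

static_assert(read_within_full_expression() == 42);
```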


In summary, the questions you actually ask all mistakenly conflate expressions with objects. So they are non-questions in that sense. Temporaries are not lvalues, because objects are not expressions.

A valid but related question would be: "Why is the expression that creates a temporary object an rvalue (as opposed to being an lvalue?)"

To which the answer is as was discussed above: having it be an lvalue would increase the risk of creating dangling pointers or dangling references; and as in 3 = 5;, would increase the risk of specifying redundant operations that the programmer probably didn't intend.

I repeat again that the expression categories are a design decision to help with programmer expressiveness; not anything to do with memory or storage locations.


Finally, to the abstract machine and the as-if rule. C++ is defined in terms of an abstract machine, in which temporary objects have storage and addresses too. I gave an example earlier of how to print the address of a temporary object.

The as-if rule says that the output of the actual executable the compiler produces must only match the output that the abstract machine would. The executable doesn't actually have to work in the same way as the abstract machine, it just has to produce the same result.

So for code like x = 5;, even though a temporary object of value 5 has a memory location in the abstract machine, the compiler doesn't have to allocate physical storage on the real machine. It only has to ensure that x ends up having 5 stored in it, and there are much easier ways to do this that don't involve extra storage being created.

The as-if rule applies to everything in the program, even though my example here only refers to temporary objects. A non-temporary object could equally well be optimized out, e.g. int x; int y = 5; x = y; // other code that doesn't use y could be changed to int x = 5;.

The same applies for class types without side-effects that would alter the program output. E.g. std::string x = "foo"; std::cout << x; can be optimized to std::cout << "foo"; even though the lvalue x denoted an object with storage in the abstract machine.

M.M
  • +1 I always assumed the l and r in lvalue/rvalue stood for left and right in the context of an assignment. Is this correct? If so then I would write this very conspicuously in the answer since it's pretty much a very short and to-the-point answer to the question. – user541686 Feb 13 '19 at 03:17
  • @Mehrdad, that is where the names come from, and that is what they originally meant, but then X3J11 introduced the notion of a modifiable lvalue and things got murkier. – prl Feb 13 '19 at 04:38
  • @Mehrdad my second section addresses that (and I don't feel your suggestion would be a correct answer, let alone a complete one) – M.M Feb 13 '19 at 05:32
  • Another example of why it makes no sense to allow taking the address of a value: `assert(&5 == &5);` – assuming that `&5` was a legal expression, would or should this hold? Would it even be useful in some way to ask whether `&5 == &5`? I for one cannot possibly imagine what the point would be. – Arne Vogel Feb 13 '19 at 14:55
  • Nice answer (+1) but very easy to get lost (maybe it's just me) so I have the following observations 1)`lvalue`s and `rvalue`s are expression categories, not strictly related to temporary or non-temporary objects and `In fact a temporary object can be designated by an lvalue or an rvalue expression; and a non-temporary object can be designated by an lvalue or an rvalue expression.` It is clear from your answer what `lvalues` and `rvalues` are not, but could you point out (more clearly maybe) what they are then? 2) Can you explain the syntax `-> T& { return t; }`? – roschach Feb 18 '19 at 15:28
  • 3) You showed how to get an `lvalue` from a temporary object (the less obvious case) but can you make explicit also the other cases: how to get an `lvalue` (and `rvalue`) from temporary (and non-temporary) objects? Some of them are obvious, but in this way the whole picture is clear – roschach Feb 18 '19 at 15:31
  • @FrancescoBoi (1) lvalues and rvalues are expressions, (2) trailing return type, (3) lvalue designating temporary is already shown; rvalue designating temporary: `int()`, lvalue designating non-temporary: `x`, rvalue designating non-temporary: `std::move(x)` – M.M Feb 18 '19 at 23:20
  • 1) Yeah I got they are expressions categories and that they are defined apriori (aren't they?) by the language designers but how were they defined? I mean they went through every possible expression and decided what expression is what or there are some common "characteristics" ? And what are the consequences of an expression being an `rvalue` rather than an `lvalue`? If you have an external reference that explains clearly it is good too. – roschach Feb 19 '19 at 00:30
  • @FrancescoBoi the c++ standard might be a good place to read, the Expressions chapter starts off with this topic – M.M Feb 19 '19 at 00:39
  • I'll check it but I think it requires an advanced level even beyond a good C++ programmer. – roschach Feb 19 '19 at 09:35
5

lvalue stands for locator value and represents an object that occupies some identifiable location in memory.

The term locator value is also used here:

C

The C programming language followed a similar taxonomy, except that the role of assignment was no longer significant: C expressions are categorized between "lvalue expressions" and others (functions and non-object values), where "lvalue" means an expression that identifies an object, a "locator value"[4].

Everything that is not an lvalue is by exclusion an rvalue. Every expression is either an lvalue or an rvalue.

Originally, the term lvalue was used in C for values that can appear on the left side of the assignment operator. However, this changed with the const keyword: not all lvalues can be assigned to. Those that can are called modifiable lvalues.
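A short sketch of an lvalue that is not modifiable, assuming C++17. The variables `n` and `m` are placeholders for this illustration.

```cpp
#include <type_traits>

const int n = 0;  // n names an object, so the expression n is an lvalue
int m = 0;        // m is a modifiable lvalue

// Both are lvalues: decltype((e)) yields a reference type.
static_assert(std::is_same_v<decltype((n)), const int&>);
static_assert(std::is_same_v<decltype((m)), int&>);

// Only the non-const one can appear on the left of an assignment.
static_assert(!std::is_assignable_v<decltype((n)), int>);  // n = 1 is ill-formed
static_assert(std::is_assignable_v<decltype((m)), int>);   // m = 1 is fine
```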

And also that literals and temporary variables are not lvalues, but no reason is given for this statement.

According to this answer literals can be lvalues in some cases.

  • Literals of scalar types are rvalues because they are of known size and are very likely to be embedded directly into the machine instructions on the given hardware architecture. What would be the memory location of 5?
  • On the contrary, strangely enough, string literals are lvalues since they have unpredictable size and there is no other way to represent them apart from as objects in memory.

An lvalue can be converted to an rvalue. For example in the following instructions

int a = 5;
int b = 3;
int c = a+b;

the operator + takes two rvalues. So a and b are converted to rvalues before getting summed. Another example of conversion:

int c = 6;
&c = 4; //ERROR: &c is an rvalue

On the contrary you cannot convert an rvalue to an lvalue.

However you can produce a valid lvalue from an rvalue for example:

int arr[] = {1, 2};
int* p = &arr[0];
*(p + 1) = 10;   // OK: p + 1 is an rvalue, but *(p + 1) is an lvalue
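The categories in these snippets can be verified with `decltype((e))`, which yields `T&` for an lvalue and plain `T` for a prvalue. This is a sketch assuming C++17; the function name `deref_demo` is made up.

```cpp
#include <type_traits>

constexpr int deref_demo() {
    int arr[] = {1, 2};
    int* p = &arr[0];

    static_assert(std::is_same_v<decltype((&arr[0])), int*>);   // &... is a prvalue
    static_assert(std::is_same_v<decltype((p + 1)), int*>);     // p + 1 is a prvalue
    static_assert(std::is_same_v<decltype((*(p + 1))), int&>);  // *(p + 1) is an lvalue

    *(p + 1) = 10;  // assignment through the lvalue produced by dereferencing
    return arr[1];
}

static_assert(deref_demo() == 10);
```

Dereferencing does not convert the rvalue `p + 1` into an lvalue; it produces a new expression that designates the object the pointer value points to.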

In C++11, rvalue references are related to the move constructor and move assignment operator.

You can find more details in this clear and well-explained post.

roschach
  • 1
    I added another link in the answer; check the `C` paragraph: https://en.cppreference.com/w/cpp/language/value_category – roschach Feb 12 '19 at 16:29
  • 1
    `"Hello world\n"` has a definite address. – NathanOliver Feb 12 '19 at 16:34
  • 2
    It's `&"Hello world\n"` see: http://coliru.stacked-crooked.com/a/c19f1a7fa216cdc5 – NathanOliver Feb 12 '19 at 16:36
  • "On the contrary you cannot convert an rvalue to an lvalue" - yes you can: write `template<typename T> auto f(T&& t) -> T& { return t; }` , then calling `f` on an rvalue will yield an lvalue referring to the same object. E.g. `cout << &f(5);` – M.M Feb 12 '19 at 23:05
  • @M.M Honestly those lines are not really clear to me so I cannot modify my answer to explain them. If you have some reference I will gladly read it and try to modify my answer accordingly. Also feel free to modify my answer yourself if worth it. – roschach Feb 13 '19 at 08:40
  • @NathanOliver not really, That is the address of a (possibly different) copy of the same string. At least in C? – Lorraine Feb 13 '19 at 11:25
  • I think it just confuses the language-design issue to use the term "hidden lvalue". The fact that most ISAs lack floating-point immediates is just an implementation detail. That might make a fun fact footnote, but doesn't belong in your bullet list. In C++, `5.5` is purely an rvalue, even though (if it doesn't get optimized away or changed by constant-propagation) it will probably end up in read-only static storage. But `x * 2.0` will be compiled as `x+x` by most good compilers on most ISAs. Where's your "hidden lvalue" now? It's just an asm implementation detail, unrelated to C++ rules. – Peter Cordes Feb 13 '19 at 12:15
  • You're still bolding this bogus term "hidden `lvalues`" without any double-quotes to indicate that it's not a real term, and doesn't mean anything in C++ terms. It's *just* an implementation detail, and doesn't even apply to all FP constants. (e.g. the `2.0 * x` compiling as `x+x` example). – Peter Cordes Feb 13 '19 at 13:11
  • @Wilson The Q is tagged as C++. – NathanOliver Feb 13 '19 at 13:55
  • @NathanOliver You mean to say it is not the case in C++ also? That's news to me – Lorraine Feb 13 '19 at 13:56
  • @Wilson In C++ `"test"` is required to be an lvalue. I'm not sure what C has to say about it (but I expect it would be the same). Since the Q is tagged as C++ though it doesn't really matter, since the answers need to be correct for C++ – NathanOliver Feb 13 '19 at 13:59
  • @Wilson C doesn't have references, let alone rvalue references, there's only one `T` object there. All `f` is doing is changing the value category of its argument – Caleth Feb 13 '19 at 14:01
  • done. AnT's answer is pretty much fine, though: it already says "(even though the language does not expose them as such)" right after that. Unlike your answer where you had that term highlighted and without quotes around the "hidden", making it sound more like a real thing. – Peter Cordes Feb 13 '19 at 14:15
3

Where do they reside if not in memory?

Of course they reside in memory*, there's no way around it. The question is: can your program determine where exactly in memory they reside? In other words, is your program allowed to take the address of the thing in question?

In a simple example a = 5 the value of five, or an instruction representing an assignment of the value of five, is somewhere in memory. However, you cannot take the address of five, because int *p = &5 is illegal.

Note that string literals are an exception from the "not an lvalue" rule, because const char *p = "hello" produces an address of a string literal.


* However, it may not necessarily be data memory. In fact, they may not even be represented as a constant in the program memory: for example, an assignment short a; a = 0xFF00 could be represented as an assignment of 0xFF to the upper octet, and clearing out the lower octet in memory.
Sergey Kalinichenko