Let's compile the following top-level declaration

const int& ri = 5;

with clang++. With -std=c++14 and below it places the temporary object (and the pointer representing the reference) into the .rodata section:

        .type   _ZGR2ri_,@object        # @_ZGR2ri_
        .section        .rodata,"a",@progbits
        .p2align        2
_ZGR2ri_:
        .long   5                       # 0x5
        .size   _ZGR2ri_, 4

        .type   ri,@object              # @ri
        .globl  ri
        .p2align        3
ri:
        .quad   _ZGR2ri_
        .size   ri, 8

But if we change the standard version to -std=c++17 (or above), the object is placed into the .data section instead (the pointer is still in .rodata, though):

        .type   _ZGR2ri_,@object        # @_ZGR2ri_
        .data
        .p2align        2
_ZGR2ri_:
        .long   5                       # 0x5
        .size   _ZGR2ri_, 4

        .type   ri,@object              # @ri
        .section        .rodata,"a",@progbits
        .globl  ri
        .p2align        3
ri:
        .quad   _ZGR2ri_
        .size   ri, 8

What is the reason for this behavior? Is it a bug? The fact that it still replaces all uses of ri in the same TU by its initial value 5 suggests that it is a bug.
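For anyone who wants to reproduce this, here is a minimal sketch of a test TU. The file name `ref.cpp` and the helpers `read_ri` and `check_ri` are my own additions for illustration; the commands in the comment only use the standard `-std`, `-S` and `-o` options, and the exact assembly will depend on the clang version and optimization level.

```cpp
// ref.cpp: a minimal sketch for inspecting where the temporary ends up.
// Compare the output of, for example:
//   clang++ -std=c++14 -S -o - ref.cpp
//   clang++ -std=c++17 -S -o - ref.cpp
#include <cassert>

const int& ri = 5;

// A use of ri in the same TU. Whether the compiler folds this to the
// constant 5 is an implementation detail; the observable value is 5 either way.
int read_ri() { return ri; }

// Hypothetical self-check, not part of the original question.
void check_ri() { assert(read_ri() == 5); }
```

The interesting part is not the value returned, which is 5 under every standard version, but the section directives emitted for `_ZGR2ri_`.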

My hypothesis is that in [dcl.init.ref]/5.2

If the converted initializer is a prvalue, its type T4 is adjusted to type “cv1 T4” ([conv.qual]) and the temporary materialization conversion is applied.

clang naïvely discards the cv1-qualifier from the prvalue type (or rather, does not add it).

The funny thing is that if we replace the initializer expression with a prvalue of a non-reference-related, but convertible, type

const int& ri = 5.0;

it starts to put the object with the value 5 into the .rodata section again.
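To make it visible that in this convertible-type case the reference binds to a separate int temporary, here is a small sketch. It uses a named `double` instead of the literal 5.0 so that the source can be changed afterwards; the function name `demo_converted_binding` is made up for this example.

```cpp
#include <cassert>

// Sketch: binding const int& to a double creates a temporary int.
int demo_converted_binding() {
    double d = 5.0;
    const int& r = d;  // d is converted to int; r binds to that temporary, not to d
    d = 6.0;           // modifying d therefore does not affect r
    assert(r == 5);
    return r;
}
```

This says nothing about which section the temporary is placed in; it only shows that a distinct int object exists.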

Is there anything in the standard that now requires such mutability? In other words:

  • is the object designated by ri modifiable by conforming code? (obviously code involving UB could try to change it, and the compiler isn't required to make any effort to allow that)
  • is the storage of that object reusable by conforming code, that is, can another object be created in it whose size is no bigger than that of the temporary "aliased" ("references are aliases") by ri, i.e. no bigger than sizeof (int)?
curiousguy
Language Lawyer
  • There's really nothing in the standard that would answer this question. It doesn't say anything about `.rodata` or `.data`. – Barry Jan 29 '19 at 03:42
  • @Barry so the behavior for `-std=c++17` was changed just for fun? – Language Lawyer Jan 29 '19 at 03:43
  • I do not see why you think that you would follow from what I just said – Barry Jan 29 '19 at 03:51
  • Since you obviously know about it, I’ll ask before voting: how is this not a [duplicate](https://stackoverflow.com/q/54381791/8586227)? – Davis Herring Jan 29 '19 at 04:25
  • "The fact that it still replaces all uses of ri in the same TU by its initial value 5 suggests that it is a bug." That seems irrelevant to me. The way the reference is used in the TU has nothing to do with these storage decisions, which stem from the fact that you are exporting the `ri` symbol. –  Jan 29 '19 at 07:16
  • @Barry "_It doesn't say anything about .rodata or .data_" The standard says a lot of constness though. – curiousguy Jan 30 '19 at 01:57
  • @LanguageLawyer: The standard says nothing about where in a linked file some object goes. You can infer something about what options a compiler has for some things, within the structure the compiler uses, but that doesn't mean the standard has any direct say over what data section an object goes into. [tag:language-lawyer] questions are supposed to be for questions about standards, not about how a particular implementation chooses to implement them (unless the standard is known to be materially involved in that). – Nicol Bolas Jan 30 '19 at 02:41
  • @NicolBolas the essence of the question is what have changed between C++14 and C++17, so the tag apply here – Language Lawyer Jan 30 '19 at 02:46
  • @LanguageLawyer: No, the essence of your question is, "When I use different compiler options, I get slightly different results in implementation-specific ways. What was the language change that resulted in this?" My point is that your question assumes that there is such a language change. Unless you can show that there is different *behavior* with regard to these constructs (well-defined behavior in accord with the standard) in the two standards, then your question puts the cart before the horse. – Nicol Bolas Jan 30 '19 at 02:52
  • @NicolBolas _My point is that your question assumes that there is such a language change_ "Nothing" is also an answer to "what have changed between C++14 and C++17?". I don't have to prove that there is a change to have the right to ask "what have changed between C++14 and C++17?". – Language Lawyer Jan 30 '19 at 03:01
  • @curiousguy Yes? But that's half the thought - yes, the standard says a lot about const, but what do the standards words about const mean about the differentiation between `.data` and `.rodata`? – Barry Jan 30 '19 at 03:24
  • @NicolBolas - You're essentially asking the OP to answer the question first, and then decide if it's worth asking. *-Wpedantic* – Brett Hale Jan 30 '19 at 03:51
  • "what changed between C++14 and C++17?" is way too broad. As is "what might have changed to prompt clang developers to change their compiler?". The only people that would know the answer to this are clang developers so it seems to me that LLVM mailing list would be the place to ask , or perhaps go straight to filing a bug report (although the latter is easier said than done as it requires account creation which is currently disabled) – M.M Jan 30 '19 at 04:08
  • @M.M C and C++ makes some stuff immutable (like string literals). What changed in term of mutability of objects in C++ is **not** "way too broad". – curiousguy Jan 30 '19 at 04:16
  • @Barry F.ex. C++ specifies that using a cast to alter an object defined as `const` has UB. Some objects are still modified during ction/dtion but scalars aren't. Some objects have a lifetime that doesn't allow placement in read-only segment. The intent was always that global const objects, with no mutable members and compiler generated trivial c/dtors would be possible candidate for read-only segment. – curiousguy Jan 30 '19 at 04:19
  • @curiousguy That still has **nothing** to do with where the data is placed? Modifying `ri` is UB period - that's totally irrespective of this. This fundamentally is not a language question. It's a: "why did clang make this (perfectly conforming!) implementation choice" question. Which might itself be better directed to the clang list – Barry Jan 30 '19 at 04:24
  • @curiousguy the mutability of the object referred to by `const int& ri = 5;` did not change though – M.M Jan 30 '19 at 04:28
  • @Barry Are you sure there is nothing in the std that makes it possible to perform any modification in the representation of `ri`? That would be a zero answer. Zero answers ("well, duh") with a justification are good answers (well, there are sometimes deleted but still) – curiousguy Jan 30 '19 at 04:34
  • Holy shit, I _just_ realized that the question you're _actually intending_ on asking is: "Is the underlying object that `ri` is bound to created as an `int` or a `const int`?" That is... almost impossible to discern from the wording of the title and the entire body of the question. – Barry Jan 30 '19 at 04:45
  • @Barry was so in hurry to assert himself teaching me that there is no such thing as `.rodata` in the standard that not bothered to carefully compare assembly output. BTW, what is "underlying object"? – Language Lawyer Jan 30 '19 at 05:12
  • @LanguageLawyer "underlying object" = the result of "temporary materialization conversion" (I guess) – curiousguy Jan 30 '19 at 05:32
  • @Barry "_underlying object that ri is bound to created as an int or a const int?_" My answer (which really is a question because I can't follow the std completely) answers that. – curiousguy Jan 30 '19 at 05:33
  • @LanguageLawyer: "*was so in hurry to assert himself teaching me that there is no such thing as .rodata in the standard that not bothered to carefully compare assembly output.*" Well, `rodata` and so forth are what your question is literally asking about, not the question of whether the object being referenced by the temporary is `const` or not. Those are two different and separate issues. – Nicol Bolas Jan 30 '19 at 06:26
  • @Barry "_Modifying ri is UB period_" the object called `ri` or the bytes in `ri`? – curiousguy Jan 30 '19 at 08:21
  • "_The fact that it still replaces all uses of ri in the same TU by its initial value 5 suggests that it is a bug._" It does suggest that the compiler believes that a const object indeed isn't modifiable and that `ri == 5` is an invariant whenever `ri` is used as an object (f.ex. converted to rvalue) and `&ri` is used not "as a `void*` pointer". That optimization does **not** mean that the compiler writer thinks that reusing storage is UB. – curiousguy Feb 03 '19 at 03:59
  • @NicolBolas "_`rodata` and so forth are what your question is literally asking about_" Let's be kind and say that the immutability according to the std was there *as the subtext*. The Q was not textually very "language lawyer"-like but the underlying issue is purely an std issue. So I added a "language lawyer" 100% part to the Q. – curiousguy Feb 03 '19 at 04:04
  • @curiousguy now this question looks like a duplicate (TBH, that question is a duplicate of my questions which were voted to delete) – Language Lawyer Feb 03 '19 at 07:33
  • @LanguageLawyer 1) Every technical Q is voted to delete now. 2) It isn't a duplicate, the question of constness of the object and whether the compiler can put it in read only memory is different. – curiousguy Feb 03 '19 at 08:45

2 Answers

1

Let's analyse

const int& ri = 5;

From the C++ draft: initialization of references [dcl.init.ref]/5

A reference to type “cv1 T1” is initialized by an expression of type “cv2 T2” as follows:

Here cv1 = const, T1 = int, cv2 = "", T2 = int

Skipping the inapplicable clauses, we arrive at [dcl.init.ref]/5.3:

Otherwise, if the initializer expression (5.3.1) is an rvalue (but not a bit-field) (...) and “cv1 T1” is reference-compatible with “cv2 T2”, or (...) then the value of the initializer expression (...) is called the converted initializer.

The converted initializer is 5, a prvalue.

If the converted initializer is a prvalue, its type T4 is adjusted to type “cv1 T4” ([conv.qual]) and the temporary materialization conversion ([conv.rval]) is applied. In any case, the reference is bound to the resulting glvalue (...)

cv1 T4 = const int

So an object of type const int is created and the reference is bound to it.

"Temporary materialization conversion" is a new concept (as of C++17), explained in [conv.rval]:

A prvalue of type T can be converted to an xvalue of type T. This conversion initializes a temporary object ([class.temporary]) of type T from the prvalue by evaluating the prvalue with the temporary object as its result object, and produces an xvalue denoting the temporary object. T shall be a complete type.

So we have a conversion prvalue -> xvalue -> lvalue.
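The three value categories in that chain can be observed with `decltype((e))`, which yields `T` for a prvalue, `T&&` for an xvalue and `T&` for an lvalue. This is only a sketch; the variable `x` is introduced purely for the xvalue case and is not part of the question.

```cpp
#include <type_traits>
#include <utility>

const int& ri = 5;
int x = 0;  // hypothetical variable, used only to form an xvalue below

// The literal 5 is a prvalue of type int.
static_assert(std::is_same<decltype((5)), int>::value,
              "prvalue: decltype((e)) is T");

// std::move(x) is an xvalue of type int.
static_assert(std::is_same<decltype((std::move(x))), int&&>::value,
              "xvalue: decltype((e)) is T&&");

// The name ri, used as an expression, is an lvalue of type const int.
static_assert(std::is_same<decltype((ri)), const int&>::value,
              "lvalue: decltype((e)) is T&");
```

Note that this shows the type through which `ri` accesses the temporary, not the type the temporary object itself was created with.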

The lifetime of the temporary is described in [class.temporary]/6:

The temporary object to which the reference is bound or (...) persists for the lifetime of the reference if the glvalue to which the reference is bound was obtained through one of the following:

(6.1) a temporary materialization conversion ([conv.rval]), (...)

So this is the case and the lifetime of the temporary "persists for the lifetime of the reference".
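The lifetime extension itself is easy to observe with a class type that logs its destruction. A minimal sketch, assuming a made-up helper type `Tracer` and function `run_scope` (neither comes from the question), using a block-scope reference rather than a namespace-scope one so that the end of the lifetime is visible:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper type: records its own destruction in a log.
struct Tracer {
    std::vector<std::string>* log;
    ~Tracer() { log->push_back("destroyed"); }
};

std::vector<std::string> run_scope() {
    std::vector<std::string> log;
    {
        const Tracer& r = Tracer{&log};  // temporary bound to a reference
        (void)r;
        log.push_back("in scope");       // the temporary is still alive here
    }   // the reference goes out of scope, and the temporary is destroyed with it
    assert(log.back() == "destroyed");
    log.push_back("after scope");
    return log;
}
```

For the namespace-scope `ri` of the question, the same rule means the temporary lives as long as `ri` does, i.e. until program termination.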

[basic.life]/5

A program may end the lifetime of any object by reusing the storage which the object occupies

but not every object's storage can be reused that way: [basic.life]/10

Creating a new object within the storage that a const complete object with static, thread, or automatic storage duration occupies, or within the storage that such a const object used to occupy before its lifetime ended, results in undefined behavior.
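As a sketch of what "reusing the storage" means in code: the following does it for a non-const automatic object, which is allowed. The function name `reuse_nonconst_storage` is made up, and the const counterpart is left in a comment only, because executing it would be undefined behavior per the paragraph quoted above.

```cpp
#include <cassert>
#include <new>

// The const case is deliberately NOT executed; it would look like this:
//   const int c = 5;
//   new (const_cast<int*>(&c)) int(7);  // undefined behavior

int reuse_nonconst_storage() {
    int x = 5;                 // non-const object with automatic storage duration
    int* p = new (&x) int(7);  // ends the old object's lifetime, creates a new int
    assert(*p == 7);
    return *p;
}
```

Whether the temporary bound to `ri` falls under the const rule depends on its type and storage duration, which is what the rest of this answer is about.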

Storage duration is defined here [basic.stc]

The storage duration is the property of an object that defines the minimum potential lifetime of the storage containing the object. The storage duration is determined by the construct used to create the object and is one of the following:

  • (1.1) static storage duration
  • (1.2) thread storage duration
  • (1.3) automatic storage duration
  • (1.4) dynamic storage duration

(2) Static, thread, and automatic storage durations are associated with objects introduced by declarations ([basic.def]) and implicitly created by the implementation.

But then the text only mentions variables, not objects. I don't see where the storage duration of a temporary is defined!
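For reference, here is how the four storage durations map to ordinary declarations. This is only an illustrative sketch with made-up names (`g`, `t`, `sum_all`, `check_storage_demo`); the last declaration is the one from the question, for which the wording leaves the temporary's storage duration unstated.

```cpp
#include <cassert>
#include <memory>

int g = 1;               // static storage duration
thread_local int t = 2;  // thread storage duration

int sum_all() {
    int a = 3;                           // automatic storage duration
    std::unique_ptr<int> d(new int(4));  // the int has dynamic storage duration
    return g + t + a + *d;
}

// The reference ri has static storage duration. The storage duration of the
// temporary it is bound to is exactly what the quoted text does not spell out.
const int& ri = 5;

// Hypothetical self-check.
void check_storage_demo() { assert(sum_all() == 10); }
```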

EDIT: @LanguageLawyer points me to this core defect:

1634. Temporary storage duration

The apparent intent of the reference to 15.2 [class.temporary] is that a temporary whose lifetime is extended to be that of a reference with one of those storage durations is considered also to have that storage duration.

(...) the specification of lifetime extension of temporaries (also in 15.2 [class.temporary] paragraph 5) does not say anything about storage duration. Also, nothing is said in either of these locations about the storage duration of a temporary whose lifetime is not extended.

So there is indeed a gap in the specification: the storage duration of these objects created by the implementation is not well specified. Specifying lifetime in C++ is difficult, as you can see from the many additions on lifetime, unions, subobjects, and "nested" objects in the more recent standards; some of these new clauses even apply to code that uses no new C++ feature, code that was intended to be supported (but not well described) in the pre-standard era, at the time of the ARM, such as code doing nothing more than changing the "active member" of a union.

If the specification is interpreted the way the DR claims is the intent, the const int temporary with value 5 would have static storage duration; its memory wouldn't be legally modifiable and could be placed in a read-only section.

(Another solution: the committee could also make up a specific storage class for temporaries.)

curiousguy
  • _But then the text only mentions variables, not objects_ The wording is not perfect, but this can be justified. Storage duration is determined by variables. _I don't see where the storage duration of a temporary is defined!_ It is [CWG1634](http://wg21.link/1634) – Language Lawyer Jan 30 '19 at 17:49
  • "It is CWG1634" -> 404 error link is http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#1634 – curiousguy Jan 30 '19 at 23:58
  • _"So an object of type const int is created and the reference is bound to it."_ No the temporary object that has been created is of type `int` not `const int` because after adjusting the converted initializer to `const int` (`cv1 T4`), temporary materialization has to be applied; but before applying it, the const gets dropped from the prvalue per ([expr.type]/2). So you've entered [conv.rval] with `T` equals to `int` not `const int`. And the initialized temporary is of type `int` not `const int`. – mada Aug 31 '22 at 15:39
0

Given this example

const int& ri = 5;

The applicable wording from the standard is [dcl.init.ref]/5

A reference to type “cv1 T1” is initialized by an expression of type “cv2 T2” as follows:

  • (5.1) [..]
  • (5.2) [..]
  • (5.3) Otherwise, if the initializer expression
    • (5.3.1) is an rvalue (but not a bit-field) or function lvalue and “cv1 T1” is reference-compatible with “cv2 T2”, or
    • (5.3.2) [..]

then the value of the initializer expression in the first case and the result of the conversion in the second case is called the converted initializer. If the converted initializer is a prvalue, its type T4 is adjusted to type “cv1 T4” ([conv.qual]) and the temporary materialization conversion ([conv.rval]) is applied. In any case, the reference is bound to the resulting glvalue (or to an appropriate base class subobject).

It's already known that const int (cv1 T1) is reference-compatible with int (cv2 T2). And the converted initializer here is of type int (T4); then it's adjusted to const int (cv1 T4); then temporary materialization ([conv.rval]) gets applied. But before applying it, [expr.type]/2 kicks in:

If a prvalue initially has the type “cv T”, where T is a cv-unqualified non-class, non-array type, the type of the expression is adjusted to T prior to any further analysis.

Further analysis here includes temporary materialization conversion.
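The general effect of [expr.type]/2 can be checked with `decltype`. This is a sketch with hypothetical declarations `make_int`, `make_s` and `S`, which are only used in unevaluated operands. It shows the adjustment for an ordinary prvalue; it cannot by itself show whether the same adjustment is applied again after the “cv1 T4” step in [dcl.init.ref].

```cpp
#include <type_traits>

struct S {};

// Declared only; never called, so no definitions are needed.
const int make_int();
const S make_s();

// Non-class prvalue: the const is dropped, so the expression has type int.
static_assert(std::is_same<decltype(make_int()), int>::value,
              "cv-qualifier removed from a non-class prvalue");

// Class prvalue: [expr.type]/2 does not apply, so the const is kept.
static_assert(std::is_same<decltype(make_s()), const S>::value,
              "cv-qualifier kept on a class prvalue");
```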

So the adjusted prvalue has type int not const int. At this point you can enter [conv.rval]:

A prvalue of type T can be converted to an xvalue of type T. This conversion initializes a temporary object ([class.temporary]) of type T [..]

So you have an xvalue denoting a temporary of type int, and the reference ri is bound to the resulting glvalue (i.e., ri is bound to an xvalue denoting a temporary of type int, not const int).


Let's try to analyze the second example:

const int& ri2 = 5.0;

A reference to type “cv1 T1” is initialized by an expression of type “cv2 T2” as follows:

  • (5.1) [..]
  • (5.2) [..]
  • (5.3) [..]
  • (5.4) Otherwise,
    • (5.4.1) [..]
    • (5.4.2) Otherwise, the initializer expression is implicitly converted to a prvalue of type “T1”. The temporary materialization conversion is applied, considering the type of the prvalue to be “cv1 T1”, and the reference is bound to the result.

Here, the initializer expression 5.0 is implicitly converted to a prvalue of type int via the floating-integral standard conversion ([conv.fpint]). Temporary materialization is then applied, but in this case the wording says "considering the type of the prvalue to be “cv1 T1”". This means that the temporary materialization conversion results in an xvalue denoting a temporary of type const int, and the reference ri2 is bound to that result.
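The floating-integral conversion step can be observed at run time. A minimal sketch, assuming a made-up function `bind_truncated` that takes the `double` as a parameter instead of using the literal from the example:

```cpp
#include <cassert>

// Sketch: d is converted to int ([conv.fpint] truncates toward zero), a
// temporary is materialized from the result, and r is bound to that temporary.
int bind_truncated(double d) {
    const int& r = d;
    return r;
}

// Hypothetical self-check.
void check_bind_truncated() {
    assert(bind_truncated(5.0) == 5);
    assert(bind_truncated(5.7) == 5);
}
```

The value is all that can be checked this way; whether the temporary is `int` or `const int` is not observable without invoking undefined behavior.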

mada
  • The whole point of the `cv1 T4` adjustment and the “initially” in [expr.type]/2 is to materialize a cv-qualified temporary. – Davis Herring Aug 31 '22 at 19:55
  • @DavisHerring - So? – mada Sep 01 '22 at 12:21
  • @DavisHerring The `cv1` in `cv1 T4` is discarded before applying [conv.rval] that's if I understood what you mean. – mada Sep 01 '22 at 12:21
  • No, it’s not—what would be the point of giving it *cv*-qualifiers only to always remove them right away? – Davis Herring Sep 01 '22 at 14:09
  • @DavisHerring - I will assume that you agree that the temporary is of type `const int` (not `int`) in the first example. If my assumption is correct, how do you enter [conv.rval] with `T = const int`? You can't because of the rule [expr.type]/2: it tells you that the prvalue is converted to cv-unqualified version of its type prior any analysis. Check the second comment in [this answer](https://stackoverflow.com/a/73548957/19729321) – mada Sep 01 '22 at 16:13
  • @DavisHerring - _"what would be the point of giving it cv-qualifiers only to always remove them right away?"_ I have noticed that also; but as far as I know I don't know what's the reason for this; what do you think? – mada Sep 01 '22 at 16:41
  • The reason is that, whatever you think about the quality of the wording here, [expr.type]/2 **does not apply to this case**. – Davis Herring Sep 01 '22 at 18:24
  • @John: That question is based on an understandable but false premise, and the answer is focused on correcting a misunderstanding of what *would* happen if the qualifiers were discarded. – Davis Herring Sep 01 '22 at 23:52
  • @DavisHerring - "_[expr.type]/2 does not apply to this case._" I still need to understand your point of view. Why [expr.type]/2 doesn't apply? – mada Sep 02 '22 at 10:38
  • @DavisHerring - "_The whole point of the cv1 T4 adjustment and the “initially” in [expr.type]/2 is to materialize a cv-qualified temporary._" Why you've said that? and why did you double-quote the word "initially"? Can you clarify this to me if possible? – mada Sep 02 '22 at 10:52
  • @John: The _reason_ it doesn't apply is quite weak (which is why I mentioned "the quality of the wording"): we interpret the word "initially" (which I quoted, and quote now, because it's a word from that paragraph) as applying _before_ the adjustment in [dcl.init.ref]/5.2. The reason we _know_ it doesn't apply is rather stronger: that very adjustment would be completely useless if it did. – Davis Herring Sep 02 '22 at 16:19
  • @DavisHerring - "_.. as applying before the adjustment in [dcl.init.ref]/5.2_" Do you mean [dcl.init.ref]/5.3? – mada Sep 02 '22 at 17:03
  • @DavisHerring - As far as I understood, the rule [expr.type]/2 doesn't apply because **initially** (before adjustment) the type of the prvalue is `T4` (`int`), i.e, it's cv-unqualified type. After the adjustment to `cv1 T4`, the rule [expr.type]/2 also doesn't apply because `cv1 T4` is not the initial type of the prvalue. Am I correct? – mada Sep 02 '22 at 17:10
  • @John: Yes, it's /5.3 in the current draft. I think we agree that [expr.type]/2 applies first; the question is whether it _also_ applies after that adjustment ("prior to any further analysis"), and the answer is that it mustn't. – Davis Herring Sep 03 '22 at 04:25
  • @DavisHerring - Thanks a lot. I'm just facing a problem in understanding this line `const int &&r = static_cast(0); `. Here, the same rule /5.3 gets applied _but_ the converted initializer here is an xvalue (not prvalue) which means that the condition "_If the converted initializer is a prvalue, its type `T4` is adjusted to type `cv1 T4`_" does not satisfy: This results in a reference-to-const being bound to a non-const object. Do you think that this is an issue in the wording (like [CWG 2481](https://cplusplus.github.io/CWG/issues/2481.html)) or an issue in my understanding? – mada Sep 03 '22 at 06:50
  • @DavisHerring I said `As far as I understood, the rule [expr.type]/2 doesn't apply` **because initially (before adjustment) the type of the prvalue is T4 (int) i.e, it's cv-unqualified type.** Then said `After the adjustment to cv1 T4, the rule [expr.type]/2 also doesn't apply` **because cv1 T4 is not the initial type of the prvalue** -- I need to know whether those are _correct reasons_ or not? – mada Sep 03 '22 at 07:06
  • @John: The `static_cast` case is just like `int __i=0; const int &&r=__i;`; your interpretation of "initially" is as good as any. – Davis Herring Sep 03 '22 at 18:26
  • @DavisHerring - "_your interpretation of "initially" is as good as any._" So it's considered to be correct? Am just asking you instead of posting a whole question on that. I need a more restricted answer from you: `I need to know whether those are correct reasons or not?` Just Correct or Incorrect. Thanks again. – mada Sep 03 '22 at 19:31
  • @John: "as good as any" and "correct" are not meaningfully different. – Davis Herring Sep 03 '22 at 21:42
  • @DavisHerring Thanks for your patience with me. _""as good as any" and "correct" are not meaningfully different."_ My main language is German, that's why I tell you to tell me "Correct" or "Incorrect". – mada Sep 04 '22 at 06:59