Dangers of using reference-to-pointer as function argument

Question

Can someone tell me if this is safe, because I think it isn't:

class A
{
public:
    A(int*& i) : m_i(i)
    {}

    int*& m_i;
};

class B
{
public:
    B(int* const& i) : m_i(i)
    {}

    int* const & m_i;
};


int main()
{
    int i = 1;
    int *j = &i;

    A a1(j);   // this works (1)
    A a2(&i);  // compiler error (2)
    B b(&i);   // this works again (3)
}

I understand why (1) works. We are passing a pointer, the function accepts it as a reference.

But why doesn't (2) work? From my perspective, we are passing the same pointer, just without assigning it to a pointer variable first. My guess is that &i is an rvalue and has no memory of its own, so the reference cannot be valid. I can accept that explanation (if it's true).

But why the heck does (3) compile? Wouldn't that mean that we allow the invalid reference so b.m_i is essentially undefined?

Am I completely wrong in how this works? I am asking because I am getting weird unit test fails that I can only explain by pointers becoming invalid. They only happen for some compilers, so I was assuming this must be something outside the standard.

So my core question basically is: Is using int* const & in a function argument inherently dangerous and should be avoided, since an unsuspecting caller might always call it with &i like with a regular pointer argument?

Addendum: As @franji1 pointed out, the following is an interesting thought to understand what happens here. I modified main() to change the inner pointer and then print the members m_i:

int main()
{
    int i = 1;
    int *j = &i;  // j points to 1

    A a1(j);
    B b(&i);

    int re = 2;
    j = &re;  // j now points to 2

  std::cout << *a1.m_i << "\n";  // output: 2
  std::cout << *b.m_i << "\n";  // output: 1
}

So, clearly a1 works as intended. However, since b cannot know that j has been modified, it seems to hold a reference to a "personal" pointer, but my worry is that it is not well defined in the standard, so there might be compilers for which this "personal" pointer is undefined. Can anyone confirm this?

In a1, m_i reference is of j. If further down, you re-assigned j to point to, say &k, then a1.m_i would be pointing to k. What specific pointer is a2 referencing? — franji1, May 06 '20 at 18:04
@franji1 I think you are trying to point me towards why (2) does not work and the explanation is straightforward to understand. Thanks. But it still does not explain why (3) works. I think this is the core of my question. My assumption is that according to your explanation a2.m_i is not really referencing anything, or at least something undefined, and that is why I wanted confirmation that this construct is essentially dangerous. — Cerno, May 06 '20 at 18:18
B b(&i); is no more dangerous than A a1(j); Both (1) and (3) can become dangling if the lifetime of i and j is less than a1 and b. Difference between (2) and (3) is that `&i` is not a valid reference to int*, but it is valid const reference to int*, that's all. — Riad Baghbanli, May 06 '20 at 19:00

score 2 · Accepted Answer · answered May 06 '20 at 18:27

2

A's constructor takes a non-const reference to an int* pointer. A a1(j); works, because j is an int* variable, so the reference is satisfied. And j outlives a1, so the A::m_i member is safe to use for the lifetime of a1.

A a2(&i); fails to compile, because although &i is an int*, operator& returns a temporary value, which cannot be bound to a non-const reference.

B b(&i); compiles, because B's constructor takes a reference to a const int*, which can be bound to a temporary. The temporary's lifetime will be extended by being bound to the constructor's i parameter, but will then expire once the constructor exits, thus the B::m_i member will be a dangling reference and not be safe to use at all after the constructor has exited.

answered May 06 '20 at 18:27

Remy Lebeau

555,201
31
458
770

That's exactly what I thought. Thanks for providing the terminology and clearing it up. Could you shed some light on whether using a function parameter of ```int* const&``` is inherently bad (and only accepted in the language for completeness) or if there are actual practical uses? Because I think it's asking quite a bit from the caller to assume that calling the function with a ```&i``` is a bad idea while it would not be a problem if the function only accepted a pointer without the reference. – Cerno May 06 '20 at 18:36
There is nothing inherently bad with using an `int* const &` parameter itself (though, the real question is, why are you using pointers by reference to begin with, and not `int&` instead?). But you are taking it a step further by trying to save what the reference refers to, and that is dangerous when what is being referred to is a temporary value. A reference does not know what it is referring to. Saving a reference of another reference is *legal* from the compiler's perspective, but is questionable depending on the context. – Remy Lebeau May 06 '20 at 18:40
The architecture I am using forces me to provide references to structures that are filled at a later point in time. Since one of these structures is in fact a pointer, I need to pass a reference to a pointer (although a pointer to pointer would work as well). I have to admit that I was a bit out of my depth about const-ness of pointers so I tried different ways until it compiled. That's what I get for not fully understanding the mechanism before using it I guess. – Cerno May 06 '20 at 18:47
My current solution is to get rid of the const which makes sure I don't accidentally pass the value as ```&i```. That way, I make sure that the pointer is an lvalue lives in the scope of the caller and remains valid after construction. I guess it can still be dangerous in certain cases, but I guess it works for me at this point. – Cerno May 06 '20 at 18:50
I would just get rid of the unnecessary pointer altogether: `class A { public: A(int &i) : m_i(i) {} int& m_i; };` `int main() { int i = 1; A a1(i); ... }` – Remy Lebeau May 06 '20 at 18:54
Remy, it is somewhat misleading. It is not reference to a const int*, it is const reference to int*. There will be no temporary rvalue in (3). Compiler considers &i as a valid const reference to int*. – Riad Baghbanli May 06 '20 at 19:07
I wish I could. The architecture goes like this: I have to set up a number of tasks that shall be executed at a later point in time. These tasks are dependent on one another, so the output of one task is used by the next. Since the tasks are executed later, I don't have the output of the first task yet, so give the second task a reference to the output structure of the first task, to be filled in the execution phase. Assuming the first task's job is to select an object from a list and return a pointer to it. Then the second task needs a reference to that pointer. – Cerno May 06 '20 at 19:08
@RiadBaghbanli `const` applies to the thing on its left, or if nothing is there then to its right. In this case, the thing to its left is an `int*` pointer, thus `i` is a reference to a const `int*` pointer, not a const reference to an `int*`. The `const` is not applied to the reference (which is already const by its very nature) – Remy Lebeau May 06 '20 at 19:12
@RemyLebeau, const rule you described is correct. The reason for it being somewhat misleading is `int* const& i` means reference itself is not modifiable, though value pointer points to can be modified. Constant reference to volatile value, as opposed to volatile reference to constant value. So (3) is no more dangling than (1). – Riad Baghbanli May 06 '20 at 19:24
1

@RiadBaghbanli you are incorrect for the reasons shown by both me and Remy Lebeau: `&i` from `B b(&i);` is a prvalue - a temporary. This temporary ends on the constructor end. The `m_i` is now left a dangling reference. – bolov May 06 '20 at 19:39
1

@ceno why not just put the tasks into a queue, then start the 1st task and let it start the 2nd task before exiting, passing its completed data (ie, the selected pointer) as input at that time? Then that 2nd task can start the 3rd task, giving it input as needed. And so on. You don't need to hook up the individual tasks to each other, or references to some external shared data. Just pass the relevant data down the chain, one task at a time. – Remy Lebeau May 06 '20 at 19:51
@RemyLebeau The reason this is so complex is that we are running in an embedded system where the tasks are planned at the beginning of a cycle. Then the tasks are executed by a runtime manager. Once runtime is exceeded, no further tasks are processed, but once the next cycle starts, the remaining tasks will be finished before the next round of tasks are scheduled. This is due to the fact that the number of tasks can vary for each cycle, so one cycle may be overbooked, but the next may have enough time left to complete the unfinished tasks. That's why we have external shared data. – Cerno May 07 '20 at 04:45
@RemyLebeau I appreciate your insight, and we have been thinking about refactoring this mechanism for some time now, because it is giving everyone headaches, but it's the backbone of our system, so change is not just done on a whim, even though potentially necessary. – Cerno May 07 '20 at 04:47

bolov · Answer 2 · 2020-05-06T18:42:06.847

1

j is an lvalue and as such it can be bound to a non-const lvaue reference.

&i is a prvalue and it cannot be bound to non-const lvalue reference. That's why (2) doesn't compile

&i is a prvalue (a temporary) and it can be bound to a const lvalue reference. Bounding a prvalue to a reference extends the lifetime of the temporary to the lifetime of the reference. In this case this temporary lifetime is extended to the lifetime of the constructor parameter i. You then initialize the reference m_i to i (constructor parameter) (which is a reference to the temporary) but because i is an lvalue the lifetime of the temporary is not extended. In the end you end up with a reference member m_i bound to an object which is not alive. You have a dangling reference. Accessing m_i from now on (after the constructor has finished) is Undefined Behavior.

Simple table of what can references bind to: C++11 rvalue reference vs const reference

edited May 06 '20 at 18:42

answered May 06 '20 at 18:27

bolov

72,283
15
145
224

Thanks for the very good explanation. This would explain why using the reference to a const pointer in a function may work without problems, but this case fails because I copy it to a member afterwards. So is it possible that some compilers extend the lifetime of the constructor parameter to the lifetime of the member? Because I only had problems in one specific case. In C++ shell for example, this worked, so I assume their compiler implements this according to my original intuition (which turned out to be wrong in general). – Cerno May 06 '20 at 18:41
@Cerno no. Undefined Behavior is ... well.. undefined. Anything can happen. It can appear to work, it can crash it can do any crazy stuff. – bolov May 06 '20 at 18:42
1

"*So is it possible that some compilers extend the lifetime of the constructor parameter to the lifetime of the member?*" - no, because that is not how lifetime extension is defined to work. – Remy Lebeau May 06 '20 at 18:43
@RemyLebeau Thanks, so, I take it's just coincidence that it worked, as in, some freed memory section may still contain the temporary pointer address so I can still access it but it's not guaranteed, or something similar? – Cerno May 06 '20 at 18:44
@Cerno No it is not coincidence. Difference between (2) and (3) is that &i is not a valid reference to int*, but it is valid const reference to int*. This answer is not accurate. – Riad Baghbanli May 06 '20 at 19:02
@RiadBaghbanli how is my answer not accurate? – bolov May 06 '20 at 19:32
@bolov, there is no more of reference dangling in (3) than in (1). (3) is not an undefined behavior. In (1) variable `j` defined as `int*` is non-const reference passed to constructor, whereas in (3) rvalue `&i` is interpreted as a const reference passed to constructor. (3) is no more dangerous than (1). – Riad Baghbanli May 06 '20 at 19:47
@RiadBaghbanli you are wrong. I will repost my comment from Remy Lebeau's answer: `&i` from `B b(&i);` is a prvalue - a temporary. This temporary ends on the constructor end. The `m_i` from `B` which is a reference to that temporary is now left a dangling reference. – bolov May 06 '20 at 19:50
@RiadBaghbanli the fact that is a pointer or const or non-const is irrelevant to the temporary lifetime rules. – bolov May 06 '20 at 19:51
@RiadBaghbanli https://stackoverflow.com/questions/17362673/temporary-lifetime-extension – bolov May 06 '20 at 19:53
@bolov const vs non-const is the reason why (3) compiles and (2) does not. Lifetime rules are applied to `j = &i` as much as they are to `b(&i)`. The are not dangling references in the code as provided. To make references dangling, lifetime `a1` and `b` should exceed `j` and `i` respectively. `b` is reference to `i`, as simple as that. – Riad Baghbanli May 06 '20 at 20:04
@RiadBaghbanli no. `b` is not a reference. `b` is of type `B`. `j` is not a reference `j` is of type `int *` - not a reference. There is dangling reference because the reference `b.m_i` outlives the temporary created in the expression `&i`. – bolov May 06 '20 at 20:11
I meant `b.m_i` is referencing `i`. In all C++ compilers. As per spec. Any change in value for `i` will result in change in value of `b.m_i`. – Riad Baghbanli May 06 '20 at 20:16
@RemyLebeau no `b.m_i` is not referencing `i` it is referencing the temporary created by the expression `&i` , temporary which is no longer alive – bolov May 06 '20 at 20:21
@bolov I'm not the person who said `b.m_i` is referencing `i`. That was RiadBaghbanli. – Remy Lebeau May 06 '20 at 20:37
@RemyLebeau my apologies, @R autocomplete, didn't pay attention – bolov May 06 '20 at 20:40

Riad Baghbanli · Answer 3 · 2020-05-06T19:09:32.940

-1

Pointer is a memory address. For simplicity, think of a pointer as uint64_t variable holding a number representing the memory address of whatever. Reference is just a alias for some variable.

In example (1) you are passing a pointer to constructor expecting a reference to pointer. It works as intended, as compiler gets the address of memory where the value of pointer is stored and passes it to constructor. The constructor gets that number and creates an alias pointer. As a result you are getting an alias of j. If you modify j to point to something else then m_i will also be modified. You can modify m_i to point to something else too.

In example (2) you are passing a number value to the constructor expecting a reference to pointer. So, instead of an address of an address, constructor gets an address and compiler has no way to satisfy the signature of the constructor.

In example (3) you are passing a number value to constructor expecting a constant reference to pointer. Constant reference is a fixed number, just a memory address. In this case compiler understands the intent and provides the memory address to set in the constructor. As a result you are getting fixed alias of i.

EDIT (for clarity): Difference between (2) and (3) is that &i is not a valid reference to int*, but it is a valid const reference to int*.

edited May 06 '20 at 19:09

answered May 06 '20 at 18:14

Riad Baghbanli

3,105
1
12
20

I am not sure this is correct. If (3) is valid then why does my unit test fail? Any why only for certain compilers? This smells like undefined behaviour to me. – Cerno May 06 '20 at 19:23
Your unit test demonstrates the exact behavior expected as per C++ specs. `b.m_i` is a reference to `i`, not `j`. Modifying `j` will have no effect on `b.m_i`. If you are to modify the value of `i', then `b.m_i` will be modified also. – Riad Baghbanli May 06 '20 at 19:29
are you saying that (3) does **not** end up with a dangling reference? – bolov May 06 '20 at 19:37
(3) is no more dangling than (1). `b.m_i` references `i`, not temporary variable. It passes const value `&i` as a constant reference. Check it for yourself in debugger. – Riad Baghbanli May 06 '20 at 19:57
from you answer in case (3): "but it is a valid const reference to `int*`" no, it's not not. It's a reference to the `&i` temporary created on `B b(&i);`. This temporary ends on when the constructor of `b` ends and `b.m_i` is now a reference to an expired temporary. A dangling referece. – bolov May 06 '20 at 19:57
`b.m_i` doesn't reference `i` it references the temporary created by the expression `&i` from `B b(&i);` – bolov May 06 '20 at 19:58
the fact that seems to work doesn't prove anything. By definition Undefined Behavior can "seem" to work – bolov May 06 '20 at 19:58
@bolov Try it with as many C++ compilers as you wish, result will be the same. This is one of the quirks of C++ semantics. Const references take values as per C++ specs. – Riad Baghbanli May 06 '20 at 20:14

Dangers of using reference-to-pointer as function argument

3 Answers3