173

Consider the following program:

struct ghost
{
    // ghosts like to pretend that they don't exist
    ghost* operator&() const volatile { return 0; }
};

int main()
{
    ghost clyde;
    ghost* clydes_address = &clyde; // darn; that's not clyde's address :'( 
}

How do I get clyde's address?

I'm looking for a solution that will work equally well for all types of objects. A C++03 solution would be nice, but I'm interested in C++11 solutions too. If possible, let's avoid any implementation-specific behavior.

I am aware of C++11's std::addressof function template, but am not interested in using it here: I'd like to understand how a Standard Library implementor might implement this function template.

einpoklum
James McNellis
  • 41
    @jalf: That strategy is acceptable, but now that I've punched said individuals in the head, how do I work around their abominable code? :-) – James McNellis Jun 27 '11 at 14:54
  • 5
    @jalf Uhm, sometimes you *need* to overload this operator, and return a proxy object. Though I can’t think of an example just now. – Konrad Rudolph Jun 27 '11 at 15:06
  • 5
    @Konrad: me either. If you need that, I'd suggest that a better option might be to rethink your design, because overloading that operator just causes too many problems. :) – jalf Jun 27 '11 at 15:20
  • See also [this answer](http://stackoverflow.com/questions/2719832/why-is-overloading-operator-prohibited-for-classes-stored-in-stl-containers/2719880#2719880). – sbi Jun 27 '11 at 18:14
  • 2
    @Konrad: In roughly 20 years of C++ programming I have _once_ attempted to overload that operator. That was at the very beginning of those twenty years. Oh, and I failed to make that usable. Consequently, the [operator overloading FAQ entry](http://stackoverflow.com/questions/4421706/operator-overloading/4421719#4421719) says "The unary address-of operator should never be overloaded." You'll get a free beer the next time we meet if you can come up with a convincing example for overloading this operator. (I know you're leaving Berlin, so I can safely offer this `:)`) – sbi Jun 27 '11 at 18:20
  • 5
    `CComPtr<>` and `CComQIPtr<>` have an overloaded `operator&` – Simon Richter Jun 27 '11 at 23:42
  • @Simon: but the important question is **should** they have an overloaded `operator&`? – jalf Jun 28 '11 at 07:00
  • 1
    Well, it allows pointers to them to be passed to functions that expect a pointer to the contained type... But indeed, I'd return a proxy object that is convertible to `T **` and `CComPtr *`. – Simon Richter Jun 28 '11 at 09:00
  • @Simon Richter: I still remember spending a day or so debugging and fixing a problem triggered by this. GAAAH! --- the `operator &` should use an `interface ** OutPtr()` / `interface ** InOutPtr()` instead, that would make it explicit in the call (with acceptable overhead) – peterchen Jun 30 '11 at 12:45
  • Here're two very similar questions http://stackoverflow.com/q/1142607/57428 and http://stackoverflow.com/q/2333321/57428 – sharptooth Jul 06 '11 at 06:47
  • 1
    @curiousguy: Many interesting questions in life tend to be about unpractical things. That said, this question is certainly a practical one for anyone writing a C++ Standard Library implementation. – James McNellis Dec 03 '11 at 07:21
  • @JamesMcNellis "_That said, this question is certainly a practical one for anyone writing a C++ Standard Library implementation_" for what? – curiousguy Dec 03 '11 at 08:05
  • 2
    @curiousguy: `std::addressof` must be able to obtain the address of an object, even if the object is of a type that overloads arbitrary operators, including conversion operators and the unary `&`. Further, the Standard Library containers must be instantiable and usable with those perverse types as well (this requirement is new in C++11; it was not present in C++98/03). – James McNellis Dec 03 '11 at 08:09
  • 1
    OTOH: "Numeric type requirements" [numeric.requirements] "it does not overload unary operator&." – curiousguy Dec 05 '11 at 17:43
  • Don't do it like this. It will trigger an `operator char&()`. – Johannes Schaub - litb Jun 29 '11 at 20:45
  • 1
    @SimonRichter how is CCom*** to be considered something that doesn't need its design rethought??? – Stefano Falasca Oct 15 '14 at 13:17

5 Answers

105

Use std::addressof.

You can think of it as doing the following behind the scenes:

  1. Reinterpret the object as a reference-to-char
  2. Take the address of that (won’t call the overload)
  3. Cast the pointer back to a pointer of your type.

Existing implementations (including Boost.Addressof) do exactly that, just taking additional care of const and volatile qualification.
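
The three steps above can be sketched as a small function template. This is only an illustration, not the code of any particular Standard Library: the name address_of_sketch and the self() member are assumptions added here so the result can be checked, and the sketch does not handle functions.

```cpp
// Hypothetical helper name; this is not the real std::addressof.
template <class T>
T* address_of_sketch(T& v)
{
    // 1. View the object as a (cv-qualified) char; no operator& is involved.
    const volatile char& as_char = reinterpret_cast<const volatile char&>(v);

    // 2. Take the address with the built-in & (char cannot overload it),
    //    dropping the qualifiers that step 1 may have added.
    char* raw = &const_cast<char&>(as_char);

    // 3. Cast the pointer back to a pointer of the original type.
    return reinterpret_cast<T*>(raw);
}

struct ghost
{
    // ghosts like to pretend that they don't exist
    ghost* operator&() const volatile { return 0; }

    // Extra member, added only so the sketch can be checked against `this`.
    const ghost* self() const { return this; }
};
```

For a ghost named clyde, address_of_sketch(clyde) should compare equal to clyde.self(), while &clyde still yields a null pointer.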

Konrad Rudolph
  • 19
    I like this explanation better than the selected one, as it can be readily understood. – Sled Jun 29 '11 at 19:37
102

Update: in C++11, one may use std::addressof instead of boost::addressof.


Let us first copy the code from Boost, minus the compiler workaround bits:

template<class T>
struct addr_impl_ref
{
  T & v_;

  inline addr_impl_ref( T & v ): v_( v ) {}
  inline operator T& () const { return v_; }

private:
  addr_impl_ref & operator=(const addr_impl_ref &);
};

template<class T>
struct addressof_impl
{
  static inline T * f( T & v, long ) {
    return reinterpret_cast<T*>(
        &const_cast<char&>(reinterpret_cast<const volatile char &>(v)));
  }

  static inline T * f( T * v, int ) { return v; }
};

template<class T>
T * addressof( T & v ) {
  return addressof_impl<T>::f( addr_impl_ref<T>( v ), 0 );
}

What happens if we pass a reference to function ?

Note: addressof cannot be used with a pointer to function

In C++ if void func(); is declared, then func is a reference to a function taking no argument and returning no result. This reference to a function can be trivially converted into a pointer to function -- from @Konstantin: According to 13.3.3.2 both T & and T * are indistinguishable for functions. The 1st one is an Identity conversion and the 2nd one is Function-to-Pointer conversion both having "Exact Match" rank (13.3.3.1.1 table 9).

The reference to function passes through addr_impl_ref, so there is an ambiguity in overload resolution for the choice of f. It is resolved by the dummy argument 0, which is an int first and could only become a long through an Integral Conversion.

Thus we simply return the pointer.
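
To see the tie-break in isolation, here is a minimal sketch that writes the two overloads out by hand for T = void(). The names pick and chosen are made up for this illustration and the addr_impl_ref wrapper is left out; only the shape of the parameter lists mirrors the Boost code.

```cpp
void func() {}

// Same shape as Boost's f(T&, long) and f(T*, int), spelled out for T = void().
int pick(void (&)(), long) { return 1; } // reference overload
int pick(void (*)(), int)  { return 2; } // pointer overload

// For the first argument, binding the reference and the function-to-pointer
// conversion are indistinguishable, so the second argument decides:
// 0 is an int, which matches the pointer overload exactly.
int chosen() { return pick(func, 0); }
```

Here chosen() should report the pointer overload. Dropping the second parameter from both overloads would make the call ambiguous, as described in the comments on this page.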

What happens if we pass a type with a conversion operator ?

If the conversion operator yields a T* then we have an ambiguity: for f(T&,long) an Integral Conversion is required for the second argument, while for f(T*,int) the conversion operator is called on the first (thanks to @litb).

That's when addr_impl_ref kicks in. The C++ Standard mandates that a conversion sequence may contain at most one user-defined conversion. By wrapping the type in addr_impl_ref, and thereby already using up that one user-defined conversion, we "disable" any conversion operator that the type comes with.

Thus the f(T&,long) overload is selected (and the Integral Conversion performed).
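
The effect of the wrapper can be sketched with a stripped-down version of the same machinery. The names sneaky, ref_wrap, pick and which are all hypothetical stand-ins introduced for this illustration (ref_wrap plays the role of addr_impl_ref), and the overloads return an int tag instead of an address.

```cpp
struct sneaky
{
    // Hypothetical troublemaker: converts to a pointer to its own type.
    operator sneaky*() const { return 0; }
};

template <class T>
struct ref_wrap // stands in for Boost's addr_impl_ref
{
    T& v_;
    ref_wrap(T& v) : v_(v) {}
    operator T&() const { return v_; }
};

template <class T>
struct pick
{
    static int f(T&, long) { return 1; } // reference overload
    static int f(T*, int)  { return 2; } // pointer overload
};

template <class T>
int which(T& v)
{
    // pick<T>::f(v, 0) would be ambiguous for sneaky. Going through ref_wrap
    // spends the single allowed user-defined conversion on reaching T&, so
    // sneaky's own operator sneaky*() can no longer take part.
    return pick<T>::f(ref_wrap<T>(v), 0);
}
```

With a sneaky object s, which(s) should select the reference overload.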

What happens for any other type ?

The f(T&,long) overload is selected, because the type does not match the T* parameter.

Note: from the remarks in the file regarding Borland compatibility, arrays do not decay to pointers, but are passed by reference.

What happens in this overload ?

We want to avoid applying operator& to the type, as it may have been overloaded.

The Standard guarantees that reinterpret_cast may be used for this work (see @Matteo Italia's answer: 5.2.10/10).

Boost adds some niceties with const and volatile qualifiers to avoid compiler warnings (and properly use a const_cast to remove them).

  • Cast T& to char const volatile&
  • Strip the const and volatile
  • Apply the & operator to take the address
  • Cast back to a T*

The const/volatile juggling is a bit of black magic, but it does simplify the work (rather than providing 4 overloads). Note that the qualifiers are carried by T itself: if we pass a ghost const&, then T is deduced as ghost const and T* is ghost const*, thus the qualifiers have not really been lost.
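
A small check of that last point, as a sketch only: address_of_sketch repeats the cast chain under a made-up name, and the kind overloads are a hypothetical device used just to report which pointer type comes back.

```cpp
// Hypothetical name; the body is the cast chain discussed above.
template <class T>
T* address_of_sketch(T& v)
{
    return reinterpret_cast<T*>(
        &const_cast<char&>(reinterpret_cast<const volatile char&>(v)));
}

struct ghost
{
    ghost* operator&() const volatile { return 0; }
};

// Overloads used only to report which pointer type was returned.
inline int kind(ghost*)       { return 0; }
inline int kind(const ghost*) { return 1; }
```

Passing a const ghost deduces T as const ghost, so the call returns a const ghost* and the second kind overload is chosen; a non-const ghost gives a plain ghost*.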

EDIT: the pointer overload is used for pointer to functions, I amended the above explanation somewhat. I still do not understand why it is necessary though.

The following ideone output sums this up, somewhat.

Matthieu M.
  • 2
    "What happens if we pass a pointer ?" part is incorrect. If we pass a pointer to some type U to the addressof function, the type 'T' is inferred to be 'U*' and addressof_impl will have two overloads: 'f(U*&, long)' and 'f(U**,int)'; obviously the first one will be selected. – Konstantin Oznobihin Jun 27 '11 at 16:01
  • @Konstantin: right, I had thought that the two `f` overloads were function templates, whereas they are regular member functions of a template class, thanks for pointing it out. (Now I just need to figure out what is the use of the overload, any tip ?) – Matthieu M. Jun 27 '11 at 16:50
  • This is a great, well-explained answer. I kind of figured there was a bit more to this than just "cast through `char*`." Thank you, Matthieu. – James McNellis Jun 28 '11 at 13:59
  • @James: I have had much help from @Konstantin who would strike my head with a stick any time I made a mistake :D – Matthieu M. Jun 28 '11 at 17:10
  • @Matthieu: Did I? :D Probably we are just interested in similar questions here, nothing personal. :) – Konstantin Oznobihin Jun 28 '11 at 18:26
  • 3
    Why would it need to work around types that have a conversion function? Would it not prefer the exact match over invoking any conversion function to `T*`? EDIT: Now I see. It would, but with the `0` argument it would end up in a *criss-cross*, so would be ambiguous. – Johannes Schaub - litb Jun 29 '11 at 20:28
  • @James: :D @litb: there are two conversions we wish to avoid. The conversion to `T*` leads to an ambiguity and the conversion to `T&` may point to another object. The latter would really bite us, unnoticed (at compile-time). – Matthieu M. Jun 30 '11 at 06:24
  • @Matthieu, no the conversion to `T&` can never happen because the argument is a `T` already. This is only to avoid the *criss-cross*. – Johannes Schaub - litb Jun 30 '11 at 14:58
  • @James http://stackoverflow.com/questions/3519282/why-is-this-ambiguity-here/3525172#3525172 – Johannes Schaub - litb Jun 30 '11 at 15:16
  • "_then func is a reference to a function_" Huh? There is no reference here! – curiousguy Dec 06 '11 at 15:21
  • In C++11 we can now just use std::addressof – paulm Jan 05 '15 at 14:00
  • @paulm: Right! Edited as the first line. – Matthieu M. Jan 05 '15 at 16:09
  • Can int and long be switched? `static inline T * f( T & v, int ) { return reinterpret_cast<T*>( &const_cast<char&>(reinterpret_cast<const volatile char &>(v))); } static inline T * f( T * v, long ) { return v; }` – zpeng Apr 13 '16 at 13:35
  • @zpeng: I am not quite sure, to be honest, since a pointer to reference is invalid and a reference to pointer is valid it seems to me it makes sense to privilege the `T*` function and thus force a conversion before access to the `T&`... but maybe I am just paranoid because I cannot think of a counter-example right now. – Matthieu M. Apr 13 '16 at 13:47
49

The trick behind boost::addressof and the implementation provided by @Luc Danton relies on the magic of the reinterpret_cast; the standard explicitly states at §5.2.10 ¶10 that

An lvalue expression of type T1 can be cast to the type “reference to T2” if an expression of type “pointer to T1” can be explicitly converted to the type “pointer to T2” using a reinterpret_cast. That is, a reference cast reinterpret_cast<T&>(x) has the same effect as the conversion *reinterpret_cast<T*>(&x) with the built-in & and * operators. The result is an lvalue that refers to the same object as the source lvalue, but with a different type.

Now, this allows us to convert an arbitrary object reference to a char & (with a cv qualification if the reference is cv-qualified), because any pointer can be converted to a (possibly cv-qualified) char *. Now that we have a char &, the operator overloading on the object is no longer relevant, and we can obtain the address with the builtin & operator.
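
The equivalence quoted from the standard can be observed on a type whose & is certainly the built-in one. A minimal sketch, with same_address being a name invented for this illustration:

```cpp
// Returns true when the reference cast and the pointer cast
// designate the same first byte of x.
inline bool same_address(int& x)
{
    char* via_reference = &reinterpret_cast<char&>(x); // cast the lvalue, then take the address
    char* via_pointer   = reinterpret_cast<char*>(&x); // take the address, then cast the pointer
    return via_reference == via_pointer;
}
```

For a type with an overloaded operator& only the first form remains usable, which is why the trick goes through the reference cast.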

The boost implementation adds a few steps to work with cv-qualified objects: the first reinterpret_cast is done to const volatile char &, otherwise a plain char & cast wouldn't work for const and/or volatile references (reinterpret_cast cannot remove const). Then the const and volatile are removed with const_cast, the address is taken with &, and a final reinterpret_cast to the "correct" type is done.

The const_cast is needed to remove the const/volatile that could have been added to non-const/volatile references, but it does not "harm" what was a const/volatile reference in the first place, because the final reinterpret_cast will re-add the cv-qualification if it was there in the first place (reinterpret_cast cannot remove the const but can add it).

As for the rest of the code in addressof.hpp, it seems that most of it is for workarounds. The static inline T * f( T * v, int ) seems to be needed only for the Borland compiler, but its presence introduces the need for addr_impl_ref, otherwise pointer types would be caught by this second overload.

Edit: the various overloads have a different function, see @Matthieu M.'s excellent answer.

Well, I'm no longer sure of this either; I should further investigate that code, but now I'm cooking dinner :) , I'll have a look at it later.

Community
Matteo Italia
  • Matthieu M. explanation regarding passing pointer to addressof is incorrect. Don't spoil your great answer with such edits :) – Konstantin Oznobihin Jun 27 '11 at 16:04
  • "good appetit", further investigation shows that the overload is called for reference to functions `void func();` `boost::addressof(func);`. However removing the overload does not prevent gcc 4.3.4 from compiling the code and producing the same output, so I still don't understand why it is *necessary* to have this overload. – Matthieu M. Jun 27 '11 at 17:13
  • @Matthieu: It looks to be a bug in gcc. According to 13.3.3.2 both T & and T * are indistinguishable for functions. The 1st one is an Identity conversion and the 2nd one is Function-to-Pointer conversion both having "Exact Match" rank (13.3.3.1.1 table 9). So it's necessary to have additional argument. – Konstantin Oznobihin Jun 27 '11 at 18:21
  • @Matthieu: Just tried it with gcc 4.3.4 (http://ideone.com/2f34P) and got ambiguity as expected. Did you try overloaded member functions like in the addressof implementation, or free function templates? The latter (like http://ideone.com/vjCRs) will result in the 'T *' overload being chosen due to template argument deduction rules (14.8.2.1/2). – Konstantin Oznobihin Jun 27 '11 at 18:57
  • @Konstantin: I had not thought of using template functions. I did think that function pointers and function references were indistinguishable but did not dig up the Standard, thanks for the reference. – Matthieu M. Jun 28 '11 at 06:09
  • @KonstantinOznobihin "_The 1st one is an Identity conversion and the 2nd one is Function-to-Pointer conversion_" so the compiler should prefer the first one. – curiousguy Dec 06 '11 at 20:51
  • 2
    @curiousguy: Why do you think it should? I've referenced specific C++ standard parts prescribing what should compiler do and all compilers I have access to (including but not limited to gcc 4.3.4, comeau-online, VC6.0-VC2010) report ambiguity just as I've described. Could you please elaborate your reasoning regarding this case? – Konstantin Oznobihin Dec 07 '11 at 12:30
  • @KonstantinOznobihin "_Why do you think it should?_" Because I did not check the issue completely. My bad. "_I've referenced specific C++ standard parts prescribing what should compiler_" Actually, you only mentioned part of the story. There is more than the "rank", and the table 9 you mentioned has more than one column. – curiousguy Dec 07 '11 at 14:28
  • @curiousguy: Well, I think, comments are just not suitable enough for fully elaborated discussion of such stuff. If you like you could ask corresponding question here and I'm sure you'll get elaborated and detailed answers. Still, the most relevant part of standard is 13.3.3.2 describing ordering relation for standard conversion sequences. Everything else should be easily found using references provided in this part. – Konstantin Oznobihin Dec 07 '11 at 16:17
12

I've seen an implementation of addressof do this:

char* start = &reinterpret_cast<char&>(clyde);
ghost* pointer_to_clyde = reinterpret_cast<ghost*>(start);

Don't ask me how conforming this is!

Luc Danton
  • 6
    Legal. `char*` is the listed exception to type aliasing rules. – Puppy Jun 27 '11 at 15:23
  • 6
    @DeadMG I'm not saying this is not conforming. I'm saying that you should not ask me :) – Luc Danton Jun 27 '11 at 15:28
  • 2
    @DeadMG There is no aliasing problem here. The question is: is `reinterpret_cast` well defined. – curiousguy Dec 07 '11 at 15:05
  • 3
    @curiousguy and the answer is yes, it's always allowed to cast any pointer type to `[unsigned] char *` and thereby read the object representation of the pointed-at object. This is another area where `char` has special privileges. – underscore_d Jul 16 '16 at 18:09
  • @underscore_d Just because a cast is "always allowed" doesn't mean you can do anything with the result of the cast. – curiousguy Jul 17 '16 at 07:34
  • 1
    @curiousguy I should've been more clear about things that are implicit in my comment, in order to avoid ambiguity or pedantry: What is specifically allowed is to _dereference the resulting pointer_, which is of course a prerequisite to read the object representation. In most other cases, dereferencing a `reinterpret_cast`ed pointer (unless it's since been cast _back_) is implementation-defined behaviour _if we're lucky_... or worse. – underscore_d Jul 17 '16 at 07:48
5

Take a look at boost::addressof and its implementation.

houbysoft
Konstantin Oznobihin
  • 1
    The Boost code, while interesting, does not explain how its technique works (nor does it explain why two overloads are needed). – James McNellis Jun 27 '11 at 14:47
  • Do you mean the 'static inline T * f( T * v, int )' overload? It looks like it is needed for the Borland C workaround only. The approach used there is pretty straightforward. The only subtle (nonstandard) thing there is the conversion of 'T&' to 'char&'. Although the standard allows a cast from 'T*' to 'char*', there seem to be no such requirements for reference casting. Nevertheless, one might expect it to work exactly the same on most compilers. – Konstantin Oznobihin Jun 27 '11 at 15:10
  • @Konstantin: the overload is used because for a pointer, `addressof` returns the pointer itself. It's arguable whether it's what the user wanted or not, but it's how it's specified. – Matthieu M. Jun 27 '11 at 15:28
  • @Matthieu: are you sure? As far as I can tell, any type (including pointer types) is wrapped inside an `addr_impl_ref`, so the pointer overload should never be called... – Matteo Italia Jun 27 '11 at 15:36
  • @Matthieu: ok, now that you explained it in your answer it makes sense. – Matteo Italia Jun 27 '11 at 15:47
  • @Matteo, @Konstantin: I got it wrong (I thought, don't know why, that the two overloads within `addressof_impl` had different template parameters (auto detected)... Looking further... – Matthieu M. Jun 27 '11 at 16:49
  • 1
    @KonstantinOznobihin this doesn't really answer the question, as all you say is *where to* look for the answer, not *what the answer is*. –  Sep 21 '15 at 00:30