[namespace.std] disallows taking the address of, or forming a reference to, most functions in the std namespace. This is a big pitfall: passing a standard-library function as an argument often seems to work, even though the same code could stop compiling, or worse, behave differently, on a different compiler.

Presumably, this was done to allow implementations to optimize the standard library specially. This restriction makes C++ harder to use.

Can you give explicit examples of how C++ implementations benefit from this restriction on the std namespace?

If these optimizations are so important as to warrant making C++ harder to use, why don't some non-system libraries need the same thing?

Remy Lebeau
user3188445
  • Related: [Can a pointer be formed to a non-addressable function from STD in an unevaluated context?](https://stackoverflow.com/q/66522399/11082165). The answer there touches upon one reason for the rule. Also mentioned in [Holding or passing around non-addressable-functions since C++20](https://stackoverflow.com/q/62251497/11082165) – Brian61354270 Jul 01 '22 at 16:55
  • The rationale as I've understood it is to allow implementations freedom. I have myself been bitten by this in the past when I moved "working" code and tried compiling it on a different platform where the "functions" were compiler built-ins that one couldn't possibly take the addresses of since they were not even real functions. – Ted Lyngmo Jul 01 '22 at 16:56
  • Most probably it's not about optimization, but about allowing the library to silently overload or template some functions. – HolyBlackCat Jul 01 '22 at 16:56
  • Some non-addressable functions have traditionally been implemented as assembly instructions emitted in place. Kind of a "hard coded" inline, a kind of *intrinsic*. Other compilers may implement them as "proper" functions; however, relying on that will be non-portable. (I recall one compiler implemented some of them as C-preprocessor *macros*.) As a legit workaround, you can wrap them in a lambda to make them addressable (via the lambda). – Eljay Jul 01 '22 at 16:57
  • I once read something along the lines of what Eljay wrote. In my words from memory: The standard only specifies what happens when you call a function with certain parameters, e.g. `foo(1,1)`, but that does not necessarily imply that there is such a function `void (int,int)`; it could be `void(int,int,int=0)` or something else entirely – 463035818_is_not_an_ai Jul 01 '22 at 17:01
  • "harder to use" is a very generic term and also very subjective. I understand your curiosity and the question is completely reasonable. However, your reasoning that "this restriction makes C++ harder to use" is a personal opinion, and the resulting question that is built on that opinion, "warrant making C++ harder to use", has no real objective answer. Hence no one really tackled that part of your question. – vnagy Jul 01 '22 at 17:27
  • btw I would suggest removing the `language-lawyer` tag and using `language-design` instead. This isn't an answer you can find by reading and lawyering about the standard – 463035818_is_not_an_ai Jul 01 '22 at 17:29
  • @vnagy It makes the language specification longer (one more rule) and it makes the language less consistent (instead of one concept, a function, we now have two, an addressable and a non-addressable function). I think it's hard to argue that eliminating non-addressable functions wouldn't make the language easier to use. My question is at what cost? – user3188445 Jul 01 '22 at 17:33
  • @Eljay given that modern compilers can inline incredibly complex things--e.g., I've even seen gcc inline calls to function pointers in a `constexpr` array of function pointers --I wonder if this rationale of compiler intrinsics being easier to inline was previously valid but is now just out of date. – user3188445 Jul 01 '22 at 17:36
  • @HolyBlackCat that is a good theory, so I'm curious if there are examples. – user3188445 Jul 01 '22 at 17:38
  • An *intrinsic* is not the same as an *inline*. When you use the AVX *intrinsic* `_mm256_zeroall()`, it's injecting the AVX assembly instructions in place. There is no function; it's not inlining a function. It's more like (very old school) a bunch of `emit` code that blindly outputs arbitrary bytes into the function (presumably those bytes are carefully curated machine code bytes). – Eljay Jul 01 '22 at 17:53
  • @Eljay Yes, but given how good inlining has gotten, there's really no penalty for throwing an inline function around an intrinsic. E.g., suppose that `std::copy` can be implemented more efficiently as an intrinsic. Why not have `__builtin_std_copy` as the intrinsic and `std::copy` as an inline wrapper around that? – user3188445 Jul 01 '22 at 18:02
  • Ahh, I see. Yes, that *could* be done, and would allow *intrinsic* NAFs and *macro* NAFs as implementation details to be made **addressable** while still easily allowing for inlining (because optimizers are amazingly powerful these days) without any loss of expressivity and performance. Due to backwards compatibility, I hold no hope that WG21 would be on board with that kind of change for the NAFs called out in the standard. Alas. (I've been burned by NAFs. Likely all of us here have been burned by them, at one time or another. Learning opportunity.) – Eljay Jul 01 '22 at 18:15

1 Answer

Firstly, it's worth noting that this design did not originate in C; it's entirely new in C++.

For the same syntactic reason, it is permitted to take the address of a library function even if it is also defined as a macro. [89]

[89] This means that an implementation must provide an actual function for each library function, even if it also provides a macro for that function.

- C89 Standard, 4.1.6 Use of library functions

These strict guarantees would be very restrictive in C++ though, for a number of reasons. As a disclaimer, I haven't been able to find quotes from Bjarne himself, so everything that I'm about to say is a collection of community consensus and personal experience.

1. Adding overloads may break source compatibility

Say a library provides a function:

bool is_even(int x) { return x % 2 == 0; }

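Initially, it may be safe to call std::partition(begin, end, is_even). But if overloads for long and long long were later added alongside the int version, the name is_even would denote an overload set, the type of the predicate argument could no longer be deduced from it, and the call would become ill-formed.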

Essentially, an addressable function can never receive extra overloads in the future, because doing so breaks existing code. This is why [namespace.std] specifically says "possibly ill-formed".
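To make the breakage concrete, here is a minimal sketch. The namespace lib and the helper count_even are hypothetical names invented for illustration; lib stands in for a library that has already gained a second overload.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

namespace lib {  // hypothetical library, standing in for namespace std
bool is_even(int x) { return x % 2 == 0; }
bool is_even(long x) { return x % 2 == 0; }  // overload added in a later release
}  // namespace lib

// Moves the even elements to the front and returns how many there are.
std::size_t count_even(std::vector<int> v) {
    // std::partition(v.begin(), v.end(), lib::is_even);
    // ^ compiled while only the int version existed; with two overloads the
    //   predicate's type cannot be deduced, so this line is now an error.

    // Wrapping the call in a lambda keeps working, because overload
    // resolution happens at the call inside the lambda.
    auto mid = std::partition(v.begin(), v.end(),
                              [](int x) { return lib::is_even(x); });
    return static_cast<std::size_t>(mid - v.begin());
}
```

The lambda is the same workaround that is usually recommended for standard library functions: the caller only relies on the call expression being valid, not on there being exactly one function of a particular type.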

2. Signatures are more prone to change than in C

Another way to break compatibility is to make an existing function more generic. The restriction gives the standard library room to do exactly that, for example by turning:

// possible historical implementation in <math.h> until C++11
#define isnan(x) implementation-defined

into

bool isnan(long double); //possibly with overloads for float and double

and subsequently into

bool isnan(std::floating_point auto);

With features such as function overloading and templates, the implementation of a function can change drastically over time.

Of course, no one could have foreseen these drastic changes in the math library, but the restrictions on non-addressable functions have made them possible without breaking any conforming code.
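As a rough illustration of what "conforming code" looks like here, the sketch below only ever calls std::isnan and never names it as a value, so it should not care whether the implementation provides a set of overloads or a template. The helper count_nans is a hypothetical name of mine, not part of any library.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Counts the NaN values in v. std::isnan is only called, never passed by
// name, so the code does not depend on its exact declaration.
template <typename T>
std::size_t count_nans(const std::vector<T>& v) {
    return static_cast<std::size_t>(std::count_if(
        v.begin(), v.end(), [](T x) { return std::isnan(x); }));
}
```

By contrast, std::count_if(v.begin(), v.end(), std::isnan) is exactly the kind of use that [namespace.std] leaves unspecified and possibly ill-formed.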

3. Functions may not have an address

There are two possible reasons why a function might not have any address:

  • it is an intrinsic function
    • this means a function call just tells the compiler to produce some IR instructions, and an actual function might not even exist
  • it is an immediate function (i.e. consteval, C++20)

The former reason may have been a significant contributor to the decision. Nowadays, there is usually an inline function wrapper around any intrinsics used in the standard library, but it was not obvious back in the day that this would become common practice.

A more modern example of intrinsic standard library functions is std::move, which was made "kinda intrinsic" in the MSVC STL. See Improving the State of Debug Performance in C++.
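The wrapper pattern mentioned above might look roughly like the following sketch. It assumes GCC or Clang for the __builtin_popcount path and falls back to a plain loop elsewhere; popcount_wrapper and apply are hypothetical names, and this is not how any particular standard library is actually written.

```cpp
// Counts the set bits in x.
#if defined(__GNUC__)
// As far as I know, GCC and Clang reject &__builtin_popcount, since the
// builtin must be called directly. The inline wrapper is an ordinary
// function, so it does have an address.
inline int popcount_wrapper(unsigned x) { return __builtin_popcount(x); }
#else
inline int popcount_wrapper(unsigned x) {
    int n = 0;
    for (; x != 0; x &= x - 1) ++n;  // clears the lowest set bit each round
    return n;
}
#endif

// Accepts a plain function pointer, which the wrapper can provide.
int apply(int (*f)(unsigned), unsigned x) { return f(x); }
```

A direct call to popcount_wrapper can still be inlined down to the builtin, so the wrapper only costs something when it is really called through a pointer.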

4. "Functions" may be function objects

In C++, it is also possible to implement a function as a function object, such as:

// inline has only been added in C++17, but compilers could have supported its
// functionality before that
inline const struct {
    float operator()(float x) const;
} sqrt;

If a function is actually a function object, then there would be a difference in behavior when taking its address, as you wouldn't get a function pointer. However, calling it would behave the same (except ADL doesn't take place).

This is yet another form of flexibility that is made possible by making functions non-addressable.
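The difference is easy to observe with type traits. In this sketch, lib, square_fn, square and cube are all hypothetical names; square is a function object and cube is an ordinary function. I use a plain constexpr variable instead of a C++17 inline variable so that the example also compiles as C++14.

```cpp
#include <type_traits>

namespace lib {  // hypothetical library
struct square_fn {
    constexpr double operator()(double x) const { return x * x; }
};
constexpr square_fn square{};  // a "function" that is really an object

constexpr double cube(double x) { return x * x * x; }  // ordinary function
}  // namespace lib

// &lib::cube is a pointer to a function...
static_assert(
    std::is_function<std::remove_pointer_t<decltype(&lib::cube)>>::value,
    "cube is a real function");

// ...but &lib::square is a pointer to a const object, not to a function.
static_assert(
    std::is_same<decltype(&lib::square), const lib::square_fn*>::value,
    "square is an object");
```

Both are called with the same syntax, lib::square(3.0) and lib::cube(2.0), so code that only calls them cannot tell which kind it got.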

Conclusion

Making it possible to take the address of standard library functions would have significantly reduced the flexibility of implementers. Almost any change, such as adding overloads, would break compatibility, effectively freezing the library's evolution.

This is not a big issue in C, where function signatures are frozen anyway, but would have significant negative consequences in C++.

Jan Schultke