53

Per 7.5,

[errno] expands to a modifiable lvalue(175) that has type int, the value of which is set to a positive error number by several library functions. It is unspecified whether errno is a macro or an identifier declared with external linkage. If a macro definition is suppressed in order to access an actual object, or a program defines an identifier with the name errno, the behavior is undefined.

(175) The macro errno need not be the identifier of an object. It might expand to a modifiable lvalue resulting from a function call (for example, *errno()).

It's not clear to me whether this is sufficient to require that &errno not be a constraint violation. The C language has lvalues (such as register-storage-class variables; however these can only be automatic so errno could not be defined as such) for which the & operator is a constraint violation.

If &errno is legal C, is it required to be constant?

R.. GitHub STOP HELPING ICE
  • Not necessarily. It's an lvalue, so for example you can assign to it. Being able to take the address is not so far-fetched from there. – R.. GitHub STOP HELPING ICE Oct 18 '12 at 00:40
  • Perhaps this is relevant: what type is the lvalue `(union {signed int x:32;}){0}.x`? (assuming 32 is the width of `int`; replace it as needed). – R.. GitHub STOP HELPING ICE Oct 18 '12 at 00:43
  • It must be legal to take the address of it if it's modifiable and has type `int` right? Now whether that address is valid later is debatable; somehow an implementation might make it valid only in the expression where it's used or something. – Seth Carnegie Oct 18 '12 at 00:44
  • *The operand of the unary & operator shall be either a function designator, the result of a [] or unary * operator, or an lvalue that designates an object that is not a bit-field and is not declared with the register storage-class specifier.* (6.5.3.2) – R.. GitHub STOP HELPING ICE Oct 18 '12 at 00:45
  • @SethCarnegie: I don't think so. You're expressly forbidden from taking the address of variables with `register` storage class, but they can still be used as lvalues. – Kerrek SB Oct 18 '12 at 00:45
  • @R.. yes, but the type is `int`, not bit-field, so `(union {signed int x:32;}){0}.x` is not a valid implementation for `errno`, right? – Seth Carnegie Oct 18 '12 at 00:46
  • @netcoder: I did not edit anything. `errno` is absolutely a modifiable lvalue. The only question is whether taking its address is valid. There has historically been some nonsense of folks tiptoeing around the possibility that it might not be an lvalue or might not be modifiable, but the C language has never allowed these possibilities. – R.. GitHub STOP HELPING ICE Oct 18 '12 at 00:46
  • @KerrekSB ah, I guess it could be `register` and that'd make taking the address wrong. That's the answer then isn't it? – Seth Carnegie Oct 18 '12 at 00:47
  • `register` is not legal for objects of storage durations other than automatic, and automatic storage duration is not really possible for `errno`. – R.. GitHub STOP HELPING ICE Oct 18 '12 at 00:47
  • @R.. Why couldn't an implementation do something funky? Also, if it's not expressly forbidden, then it doesn't matter if it's difficult, right? – Seth Carnegie Oct 18 '12 at 00:49
  • I guess that's my question. When the standard says it's a modifiable lvalue, should that be interpreted as meaning "a modifiable lvalue that can arise from the finitely many ways to construct modifiable lvalues specified in this standard", or "a modifiable lvalue constructed in any implementation-defined way, possibly outside the scope of the standard"? – R.. GitHub STOP HELPING ICE Oct 18 '12 at 00:56
  • @netcoder: I never said "might be". You were the one who said that. – R.. GitHub STOP HELPING ICE Oct 18 '12 at 01:29
  • @R.. I believe your suggested implementation using the GCC extension `register int errno asm ("r37");` is exactly the right counterexample. Much like the macro `offsetof` cannot be portably implemented within the bounds of C itself, so too is `errno` a black box macro whose functioning is entirely up to the compiler. The Standard only requires it to be a modifiable thread-local lvalue of type `int`. It imposes no restrictions on _how_ or even _if_ `errno` is to be declared; only that `#include <errno.h>` must successfully provide it for use. A pinned register is a valid implementation of such. – Iwillnotexist Idonotexist May 11 '15 at 01:09

5 Answers

19

So §6.5.3.2p1 specifies

The operand of the unary & operator shall be either a function designator, the result of a [] or unary * operator, or an lvalue that designates an object that is not a bit-field and is not declared with the register storage-class specifier.

Which I think can be taken to mean that &lvalue is fine for any lvalue that is not in those two categories. And as you mentioned, errno cannot be declared with the register storage-class specifier, and I think (although am not chasing references to check right now) that you cannot have a bitfield that has type of plain int.

So I believe that the spec requires &(errno) to be legal C.

If &errno is legal C, is it required to be constant?

As I understand it, part of the point of allowing errno to be a macro (and the reason it is in e.g. glibc) is to allow it to be a reference to thread-local storage, in which case it will certainly not be constant across threads. And I don't see any reason to expect it must be constant. As long as the value of errno retains the semantics specified, I see no reason a perverse C library could not change &errno to refer to different memory addresses over the course of a program -- e.g. by freeing and reallocating the backing store every time you set errno.

You could imagine maintaining a ring buffer of the last N errno values set by the library, and having &errno always point to the latest. I don't think it would be particularly useful, but I can't see any way it violates the spec.

nelhage
  • Regarding the topic of whether the address is constant, I know it can differ between threads; my thought is about whether you can rely on storing to and retrieving from `errno` via its address. For example, if you had a function `void print_int_at(int *p);`, would it be valid to call `print_int_at(&errno)` to print the value of `errno`, or could `errno` move to a different address before the function gets to read it (in which case you'd need to store it in a temp, or do something like `print_int_at((int[]){errno});`) – R.. GitHub STOP HELPING ICE Oct 18 '12 at 01:34
  • A bit-field can be a plain `int` type; it just isn't defined by the language whether that is a signed or unsigned type (when applied to a bit-field). Thus, given `struct b { int b0 : 1; }`, it is not clear what the acceptable values are for that bit field. – Jonathan Leffler Oct 18 '12 at 05:22
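
The two calls contrasted in the `print_int_at` comment above can be put side by side. This is only a sketch: `print_int_at` is the hypothetical function from that comment, changed here to take a `const int *` and return the value it printed so the result can be checked, and the two wrapper names are made up for illustration. On mainstream implementations both calls print the same number; whether the first is guaranteed to do so is the point under discussion.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical helper from the comment above: prints the int that p points to.
   It also returns that int so callers can check what was printed. */
int print_int_at(const int *p)
{
    int value = *p;          /* read before printf, which may itself touch errno */
    printf("%d\n", value);
    return value;
}

/* Passes the live address of errno. This relies on &errno being a valid
   expression whose result still designates errno inside the callee. */
int print_errno_by_address(void)
{
    return print_int_at(&errno);
}

/* Passes the address of a snapshot instead: the compound literal copies the
   current value of errno, so nothing depends on where errno itself lives. */
int print_errno_by_snapshot(void)
{
    return print_int_at((int[]){errno});
}
```

The snapshot form only needs `errno` to be readable as an `int`, which the standard clearly guarantees, so it is the conservative choice if the address question is considered open.
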
16

I am surprised nobody has cited the C11 spec yet. Apologies for the long quote, but I believe it is relevant.

7.5 Errors

The header <errno.h> defines several macros...

...and

errno

which expands to a modifiable lvalue(201) that has type int and thread local storage duration, the value of which is set to a positive error number by several library functions. If a macro definition is suppressed in order to access an actual object, or a program defines an identifier with the name errno, the behavior is undefined.

The value of errno in the initial thread is zero at program startup (the initial value of errno in other threads is an indeterminate value), but is never set to zero by any library function.(202) The value of errno may be set to nonzero by a library function call whether or not there is an error, provided the use of errno is not documented in the description of the function in this International Standard.

(201) The macro errno need not be the identifier of an object. It might expand to a modifiable lvalue resulting from a function call (for example, *errno()).

(202) Thus, a program that uses errno for error checking should set it to zero before a library function call, then inspect it before a subsequent library function call. Of course, a library function can save the value of errno on entry and then set it to zero, as long as the original value is restored if errno’s value is still zero just before the return.

"Thread local" means register is out. Type int means bitfields are out (IMO). So &errno looks legal to me.

Persistent use of words like "it" and "the value" suggests the authors of the standard did not contemplate &errno being non-constant. I suppose one could imagine an implementation where &errno was not constant within a particular thread, but to be used the way the footnotes say (set to zero, then check after calling library function), it would have to be deliberately adversarial, and possibly require specialized compiler support just to be adversarial.

In short, if the spec does permit a non-constant &errno, I do not think it was deliberate.
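
The usage pattern that footnote 202 describes (set errno to zero, call the library function, inspect errno before the next library call) can be sketched as follows. The helper name `parse_long` is an assumption made for this example; `strtol` and `ERANGE` are standard. The save-and-restore step mirrors what the footnote allows a library function to do, and is optional for ordinary callers.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parses text as a base-10 long into *out.
   Returns 0 if strtol reported no error, otherwise the errno value it set
   (ERANGE when the number does not fit in a long). */
int parse_long(const char *text, long *out)
{
    int saved = errno;       /* remember the caller's value */
    errno = 0;               /* no library function sets errno to zero for us */
    *out = strtol(text, NULL, 10);
    int err = errno;         /* inspect before any other library call */
    if (err == 0)
        errno = saved;       /* nothing was reported: put the old value back */
    return err;
}
```

Nothing in this pattern requires taking the address of errno; it reads and writes through the lvalue only.
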

[update]

R. asks an excellent question in the comments. After thinking about it, I believe I now know the correct answer to his question, and to the original question. Let me see if I can convince you, dear reader.

R. points out that GCC allows something like this at the top level:

register int errno asm ("r37");  // line R

This would declare errno as a global value held in register r37. Obviously, it would be a thread-local modifiable lvalue. So, could a conforming C implementation declare errno like this?

The answer is no. When you or I use the word "declaration", we usually have a colloquial and intuitive concept in mind. But the standard does not speak colloquially or intuitively; it speaks precisely, and it aims only to use terms that are well-defined. In the case of "declaration", the standard itself defines the term; and when it uses the term, it is using its own definition.

By reading the spec, you can learn precisely what a "declaration" is and precisely what it is not. Put another way, the standard describes the language "C". It does not describe "some language that is not C". As far as the standard is concerned, "C with extensions" is just "some language that is not C".

Thus, from the standard's point of view, line R is not a declaration at all. It does not even parse! It might as well read:

long long long __Foo_e!r!r!n!o()blurfl??/**

As far as the spec is concerned, this is just as much a "declaration" as line R; i.e., not at all.

So, when C11 spec says, in section 6.5.3.2:

The operand of the unary & operator shall be either a function designator, the result of a [] or unary * operator, or an lvalue that designates an object that is not a bit-field and is not declared with the register storage-class specifier.

...it means something very precise that does not refer to anything like Line R.

Now, consider the declaration of the int object to which errno refers. (Note: I do not mean the declaration of the errno name, since of course there might be no such declaration if errno is, say, a macro. I mean the declaration of the underlying int object.)

The above language says you can take the address of an lvalue unless it designates a bit-field or it designates an object "declared" register. And the spec for the underlying errno object says it is a modifiable int lvalue with thread-local duration.

Now, it is true that the spec does not say that the underlying errno object must be declared at all. Maybe it just appears via some implementation-defined compiler magic. But again, when the spec says "declared with the register storage-class specifier", it is using its own terminology.

So either the underlying errno object is "declared" in the standard sense, in which case it cannot be both register and thread-local; or it is not declared at all, in which case it is not declared register. Either way, since it is an lvalue, you may take its address.

(Unless it is a bit-field, but I think we agree that a bit field is not an object of type int.)

Nemo
  • Well `register` as defined by the standard is out anyway because it can only be used with automatic storage duration, but GCC has `register` globals as an extension and they are always (inherently) thread-local. Would that be a legal implementation of `errno`? – R.. GitHub STOP HELPING ICE Oct 31 '12 at 03:29
  • Anyway, I still think this is one of the most informative answers so far. – R.. GitHub STOP HELPING ICE Oct 31 '12 at 03:30
  • @R: Wow, that is a great question. On the one hand, global "register" declarations are technically not well-formed under the standard. On the other hand, the standard does not say `errno` must itself be implemented in the language described by the standard. On the gripping hand, the address-of `register` exemption must be referring to well-formed programs... Mustn't it? This entire question is quite nice. – Nemo Oct 31 '12 at 19:18
  • Normally `errno` is not declared at all. The canonical implementation has the macro expanding to an expression that evaluates to an lvalue by using the `*` operator on an address returned by a function. So I don't see how declarations are involved. – R.. GitHub STOP HELPING ICE Oct 31 '12 at 23:22
  • @R: "...and is not declared with the register storage-class specifier" is the wording that might prevent you from taking the address of the lvalue. (It is not the declaration of `errno`, but the declaration of the errno _object_ that is relevant here.) The word "declared" implies a "declaration". I will rephrase my update to make this clear. – Nemo Oct 31 '12 at 23:26
  • C11 6.2.4 may be relevant: *The lifetime of an object is the portion of program execution during which storage is guaranteed to be reserved for it. An object exists, has a constant address,(33) and retains its last-stored value throughout its lifetime.* Interestingly, by this language, even register variables have addresses; it's just a constraint violation to try to find the address. :-) – R.. GitHub STOP HELPING ICE Nov 01 '12 at 04:08
  • @R. Curious to know your opinion, if any, on my own language lawyer question (http://stackoverflow.com/questions/13150449/). (I will delete this comment eventually) – Nemo Nov 03 '12 at 17:07
  • Your premise that it's not portable to do the cast directly is correct, but I'm not really a C++ expert or even a C++ programmer. Do you think it's equivalent to the C question? Assuming so, the answer is `(x<=INT_MAX ? x : -(int)(UINT_MAX-x)-1)`. Assuming the usual implementation-defined conversion, both branches of the conditional operator simplify to `x` and any sane compiler will optimize out the branch. – R.. GitHub STOP HELPING ICE Nov 03 '12 at 21:23
4

The original implementation of errno was as a global int variable that various Standard C Library components used to indicate an error value if they ran into an error. However even in those days one had to be careful about reentrant code or with library function calls that could set errno to a different value as you were handling an error. Normally one would save the value in a temporary variable if the error code was needed for any length of time due to the possibility of some other function or piece of code setting the value of errno either explicitly or through a library function call.

So with this original implementation of a global int, using the address-of operator and depending on the address to remain constant was pretty much built into the fabric of the library.

However, with multi-threading there was no longer a single global, because a single global is not thread safe. So the idea arose of using thread-local storage, perhaps through a function that returns a pointer to an allocated area. You might see a construct something like the following entirely imaginary example:

#define errno (*myErrno())

typedef struct {
    // various memory areas for thread local stuff
    int  myErrNo;
    // more memory areas for thread local stuff
} ThreadLocalData;

ThreadLocalData *getMyThreadData (void) {
    // stand-in for the real thing: one data area per thread (C11 _Thread_local)
    static _Thread_local ThreadLocalData threadData;
    // a real C run time would locate the thread local data for the current
    // thread through some means, then return a pointer to it
    return &threadData;
}

int *myErrno (void) {
    return &(getMyThreadData()->myErrNo);
}

Then errno would be used as if it were a single global rather than a thread-safe int variable: setting it with `errno = 0;`, checking it with `if (errno == 22) { /* handle the error */ }`, and even taking its address with `int *pErrno = &errno;`. This all works because the thread-local data area is allocated once and stays put, and the macro definition, which makes errno look like an extern int, hides the plumbing of its actual implementation.

The one thing that we do not want is to have the address of errno suddenly shift between time slices of a thread with some kind of a dynamic allocate, clone, delete sequence while we are accessing the value. When your time slice is up, it is up and unless you have some kind of synchronization involved or some way to keep the CPU after your time slice expires, having the thread local area move about seems a very dicey proposition to me.

This in turn implies that you can depend on the address-of operator giving you a constant value for a particular thread, though that value will differ between threads. I can well see the library using the address of errno in order to reduce the overhead of doing some kind of thread-local lookup every time a library function is called.
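
The expectation described here, that a pointer obtained from &errno keeps designating errno across a library call within one thread, can be checked on a given implementation with a small probe. The function name `errno_address_is_stable` is made up for this sketch. A result of 1 describes the implementation it runs on; it does not settle what the standard requires.

```c
#include <errno.h>
#include <stdlib.h>

/* Returns 1 if a pointer taken from &errno before a library call still
   designates errno after the call on this implementation, else 0. */
int errno_address_is_stable(void)
{
    int *before = &errno;    /* address taken once, up front */
    errno = 0;
    /* out-of-range input: strtol is specified to set errno to ERANGE */
    (void)strtol("99999999999999999999999999999999999999", NULL, 10);
    int *after = &errno;     /* address taken again after the call */
    /* same address, and the old pointer sees the value the library stored */
    return before == after && *before == ERANGE;
}
```
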

Having the address of errno as constant within a thread also provides backwards compatibility with older source code which used the errno.h include file as they should have done (see this man page from linux for errno which explicitly warns to not use extern int errno; as was common in the old days).

The way I read the standard, it allows for this kind of thread-local storage while keeping semantics and syntax similar to the old extern int errno; wherever errno is used. It also still allows the old implementation, for example in a cross compiler for an embedded device that does not support multi-threading. However, the syntax is similar only because of the macro definition, so the old-style shortcut declaration should not be used: that declaration is not what the actual errno really is.

Richard Chambers
0

We can find a counterexample: because a bit-field can have type int, errno could be a bit-field, in which case &errno would be invalid. The standard does not explicitly say that you can write &errno, so the definition of undefined behavior applies here.

C11 (n1570), § 4. Conformance
Undefined behavior is otherwise indicated in this International Standard by the words ‘‘undefined behavior’’ or by the omission of any explicit definition of behavior.

md5
0

This seems like a valid implementation where &errno would be a constraint violation:

struct __errno_struct {
    signed int __val:12;
} *__errno_location(void);

#define errno (__errno_location()->__val)

So I think the answer is probably no...
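
To see what such an implementation would look like from the user's side, here is a stand-alone sketch of the same shape. All names (`fake_errno_struct`, `fake_errno_location`, `fake_errno`, `fake_errno_roundtrip`) are hypothetical and chosen to avoid reserved identifiers, and a single static object stands in for per-thread storage. Whether such an lvalue really counts as having type int is exactly what the other answers dispute.

```c
/* Hypothetical stand-in for the structure shown above. */
struct fake_errno_struct {
    signed int val : 12;     /* bit-field: holds -2048..2047 */
};

/* One static instance is enough for a single-threaded sketch. */
struct fake_errno_struct *fake_errno_location(void)
{
    static struct fake_errno_struct storage;
    return &storage;
}

#define fake_errno (fake_errno_location()->val)

/* Stores a value through the macro and reads it back. */
int fake_errno_roundtrip(int value)
{
    fake_errno = value;      /* fine: a bit-field member is a modifiable lvalue */
    /* int *p = &fake_errno;    rejected: & cannot be applied to a bit-field */
    return fake_errno;
}
```

Assignment and reads behave as they would for a plain int within the field's range; only the address-of expression is refused by the compiler.
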

R.. GitHub STOP HELPING ICE