
It's common in C (and other languages) to use prefixes and suffixes in the names of variables and functions. In particular, one occasionally sees underscores before or after a "proper" identifier, e.g. variables _x and _y, or a function _print. But there's also the common wisdom of avoiding names that start with an underscore, so as not to clash with the C standard library implementation.

So, where and when is it OK to use underscores?

einpoklum
  • http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2409.pdf. There is a lot more in POSIX (https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html) than in the mentioned glibc. – KamilCuk Sep 07 '21 at 08:25
  • Nobody asked about POSIX or glibc though, so there's no need to drag those off-topic things into the discussion. I program in C all day but I rarely ever use POSIX or glibc. – Lundin Sep 07 '21 at 08:40
  • @KamilCuk: I'll update my answer. – einpoklum Sep 07 '21 at 08:50

3 Answers


Good-enough rule of thumb

Don't start your identifier with an underscore.

That's it. You might still have a conflict with some file-specific definitions (see below), but those will just get you an error message which you can take care of.

Safe, slightly restrictive, rule of thumb

Don't start your identifier with:

  • An underscore.
  • Any 1-3 letter prefix, followed by an underscore, which isn't a proper word (e.g. a_, st_)
  • memory_ or atomic_.

and don't end your identifier with either _MIN or _MAX.

These rules forbid a bit more than what is actually reserved, but are relatively easy to remember.

More detailed rules

This is based on the C2x standard draft (and thus covers previous standards' reservations) and the glibc documentation.

Don't use:

  • The prefix __ (two underscores).
  • A prefix of one underscore followed by a capital letter (e.g. _D).
  • For identifiers visible at file scope - the prefix _.
  • The following prefixes with underscores, when followed by a lowercase letter: atomic_, memory_, memory_order_, cnd_, mtx_, thrd_, tss_
  • The following prefixes with underscores, when followed by an uppercase letter: LC_, SIG_, ATOMIC_, TIME_
  • The suffix _t (that's a POSIX restriction; for C proper, you can use this suffix unless your identifier begins with int or uint)
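As a rough illustration of the rules above, this C sketch lists a few names that fall into the reserved categories (in a comment only) next to one way of avoiding them. The `shape_` prefix, the trailing-underscore style and every identifier here are hypothetical, picked just for the example.

```c
/* Names like these fall under the reservations listed above, so they
 * appear only in this comment:
 *   __count      two leading underscores
 *   _Count       underscore followed by an uppercase letter
 *   _count       leading underscore at file scope
 *   mtx_pool     mtx_ followed by a lowercase letter
 *   TIME_BUDGET  TIME_ followed by an uppercase letter
 *   shape_t      _t suffix (reserved by POSIX)
 */

/* A trailing underscore is not reserved, and neither is a longer,
 * word-like prefix such as the hypothetical "shape_". */
static int shape_count_ = 0;

/* Registers one shape and returns the running total. */
int shape_register(void)
{
    return ++shape_count_;
}
```

Calling `shape_register()` twice returns 1 and then 2; the point is only that none of the names involved touch a reserved pattern.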

Additional restrictions are per-library-header-file rather than universal (some of these are POSIX restrictions):

If you use header file...   You can't use identifiers with...
dirent.h                    Prefix d_
fcntl.h                     Prefixes l_, F_, O_, and S_
grp.h                       Prefix gr_
limits.h                    Suffix _MAX (also probably _MIN)
pwd.h                       Prefix pw_
signal.h                    Prefixes sa_ and SA_
sys/stat.h                  Prefixes st_ and S_
sys/times.h                 Prefix tms_
termios.h                   Prefix c_
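To make one row of the table concrete, here is a minimal sketch for the limits.h case. It assumes you want a size limit of your own in a file that includes that header; the names `MAX_QUEUE_DEPTH` and `queue_has_room` are made up for illustration.

```c
#include <limits.h>

/* With <limits.h> included, a name such as QUEUE_DEPTH_MAX would use the
 * reserved _MAX suffix. Putting the qualifier first avoids the suffix. */
enum { MAX_QUEUE_DEPTH = 64 };

/* Returns nonzero while another element still fits in the queue. */
int queue_has_room(int depth)
{
    return depth < MAX_QUEUE_DEPTH;
}
```

The same reordering trick works for the _MIN suffix; for the prefix rows, a longer word-like prefix of your own is the usual way out.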

And there are additional restrictions not involving underscores of course.

einpoklum

The C standard, library chapter, reserves certain identifiers (emphasis mine):

C17 7.1.3 Reserved identifiers

— All identifiers that begin with an underscore and either an uppercase letter or another underscore are always reserved for any use.
— All identifiers that begin with an underscore are always reserved for use as identifiers with file scope in both the ordinary and tag name spaces.

— Each macro name in any of the following subclauses (including the future library directions) is reserved for use as specified if any of its associated headers is included; unless explicitly stated otherwise (see 7.1.4).
— All identifiers with external linkage in any of the following subclauses (including the future library directions) and errno are always reserved for use as identifiers with external linkage.
— Each identifier with file scope listed in any of the following subclauses (including the future library directions) is reserved for use as a macro name and as an identifier with file scope in the same name space if any of its associated headers is included.

Here, "reserved for any use" means reserved for the compiler/standard library; see What's the meaning of "reserved for any use"? "Reserved for the implementation" likewise means reserved for the compiler/standard library.

Furthermore, the Future library directions chapter (C17 7.31) reserves a lot of identifiers. It's a big chapter, so I'll only quote the most notable parts:

7.31.10 Integer types <stdint.h>
Typedef names beginning with int or uint and ending with _t may be added to the types defined in the <stdint.h> header. Macro names beginning with INT or UINT and ending with _MAX, _MIN, or _C may be added to the macros defined in the <stdint.h> header.

7.31.12 General utilities <stdlib.h>
Function names that begin with str and a lowercase letter may be added to the declarations in the <stdlib.h> header.

7.31.13 String handling <string.h>
Function names that begin with str, mem, or wcs and a lowercase letter may be added to the declarations in the <string.h> header.
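A small sketch of what the quoted 7.31.12 and 7.31.13 wording means in practice: a helper named, say, strtrim or memswap matches "str/mem followed by a lowercase letter" and could clash with a future library function. The name `text_trim_right` below is a hypothetical alternative, not anything from the standard.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Removes trailing whitespace in place and returns the new length.
 * Deliberately not named strtrim(): "str" + lowercase letter is reserved
 * for future additions to <string.h> and <stdlib.h>. */
size_t text_trim_right(char *s)
{
    size_t len = strlen(s);
    while (len > 0 && isspace((unsigned char)s[len - 1]))
        s[--len] = '\0';
    return len;
}
```

For a buffer holding "abc" followed by spaces and a tab, the call leaves "abc" in the buffer and returns 3.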


To answer your question directly:

So, where and when is it OK to use underscores?

Strictly speaking: nowhere. You should never declare identifiers starting with an underscore, since they may clash with the standard library, language keywords, etc. Though, as the "with file scope" wording in the second quoted rule hints, you may use one underscore followed by a lowercase letter at local (block) scope.
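A minimal sketch of that last exception, based on my reading of the quoted 7.1.3 text: an underscore followed by a lowercase letter is only reserved at file scope, so block-scope variables may use it. The function and variable names are made up for the example.

```c
/* At file scope a leading underscore is reserved, so this stays a comment:
 *   static int _total;
 */

/* Sums 1..n using underscore-prefixed names only at block scope. */
int sum_to(int n)
{
    int _acc = 0;                       /* block scope: not reserved */
    for (int _i = 1; _i <= n; ++_i)
        _acc += _i;
    return _acc;
}
```

This is legal, but since the distinction is easy to get wrong, simply avoiding the leading underscore remains the lower-risk habit.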

Lundin
  • Also: the user can write macros that begin with an underscore, followed by a lowercase letter. – pmor Jul 24 '22 at 18:02

In the absence of reserved identifiers, adding any new reserved words or predefined identifiers would risk breaking existing code that might happen to use such identifiers. The purpose of having reserved categories of names is to allow the addition of such features by the standard, compiler extensions, standard-library-header internals, or other-library-header internals, without having to ascertain whether some existing code might already be using the identifiers.

Note, however, that while there's no guarantee that the Standard won't define an identifier __WOOZLELIB_FNORD, it would be unlikely that the authors of the Standard would choose a name containing "WOOZLELIB" for any purpose other than to standardize the functionality that Woozle library assigned to it. While it's theoretically possible the Standard might add a function to determine whether a string was feeling incensed and call it "stranger" ["string anger"], the Standard isn't intended to promote paranoia about such things.

supercat