
I recently started learning networking in C, and I saw some functions whose names start with an underscore, like `_function()`. What does that mean exactly? I also saw this:

    struct sockaddr_in {
        __SOCKADDR_COMMON (sin_);
        in_port_t sin_port;
        struct in_addr sin_addr;
        unsigned char sin_zero[sizeof (struct sockaddr) -
                               __SOCKADDR_COMMON_SIZE -
                               sizeof (in_port_t) -
                               sizeof (struct in_addr)];
    };

What do these parts of the code mean:

    __SOCKADDR_COMMON (sin_);

    unsigned char sin_zero[sizeof (struct sockaddr) -
                           __SOCKADDR_COMMON_SIZE -
                           sizeof (in_port_t) -
                           sizeof (struct in_addr)];

pedro santos
  • See also [What does double-underscore (`__const`) mean in C?](http://stackoverflow.com/questions/1449181/) which quotes the C standard on the subject of names starting with underscores. – Jonathan Leffler Sep 21 '16 at 20:51

2 Answers


The underscore prefix is reserved for functions and types used by the compiler and standard library. The standard library can use these names freely because they will never conflict with correct user programs.

The other side to this is that you are not allowed to define names that begin with an underscore.

Well, that is the gist of the rule. The actual rule is this:

  • You cannot define any identifiers in global scope whose names begin with an underscore, because these may conflict with hidden (private) library definitions. So this is invalid in your code:

    #ifndef _my_header_h_
    #define _my_header_h_ // wrong
    int _x; // wrong
    float _my_function(void); // wrong
    #endif
    

    But this is valid:

    #ifndef my_header_h
    #define my_header_h // ok
    int x; // ok
    float my_function(void) { // ok
        int _x = 3; // ok in function
        return (float)_x;
    }
    struct my_struct {
        int _x; // ok inside structure
    };
    #endif
    
  • You cannot define any identifiers in any scope whose names begin with two underscores, or one underscore followed by a capital letter. So this is invalid:

    struct my_struct {
        int _Field; // Wrong!
        int __field; // Wrong!
    };
    void my_function(void) {
        int _X; // Wrong!
        int __y; // Wrong!
    }
    

    But this is okay:

    struct my_struct {
        int _field; // okay
    };
    void my_function(void) {
        int _x; // okay
    }
    

There are actually a few more rules, just to make things complicated, but the ones above are the most often violated and the easiest to remember.
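
As a rough illustration of the two rules above, a header that stays clear of the reserved names might look like the sketch below. All names in it (`MY_HEADER_H`, `my_struct`, `my_function`, `_count`, `_x`) are placeholders made up for this example. The underscore-prefixed member and local variable are only there to show what the rules permit. Avoiding leading underscores altogether is usually simpler.

```c
#ifndef MY_HEADER_H
#define MY_HEADER_H   /* no leading underscore, so not a reserved name */

/* The tag name lives at file scope, so it must not start with an underscore. */
struct my_struct {
    int _count;       /* permitted: a member name, underscore + lowercase letter */
};

/* The function name is at file scope as well: no leading underscore. */
static inline int my_function(int start)
{
    int _x = start;   /* permitted: block scope, underscore + lowercase letter */
    struct my_struct s = { ._count = _x + 1 };
    return s._count;  /* returns start + 1 */
}

#endif /* MY_HEADER_H */
```

Changing `_count` to `_Count` or `__count` would move it into the set of names that are reserved in every scope, even though most compilers would accept it without complaint.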

Dietrich Epp
  • Thank you. In this part of the code __SOCKADDR_COMMON (sin_); what does the (sin_) mean? – pedro santos Sep 21 '16 at 20:12
  • It's part of the implementation details of how the library works, it's not really important. In this particular case it's probably the prefix used to construct the common socket field names, so you get `sin_family` for `struct sockaddr_in` and `sun_family` for `struct sockaddr_un`. This, I believe, is either a historical quirk, or an allowance for defining header fields with macros. – Dietrich Epp Sep 21 '16 at 20:16
  • Part 1 okay [but has typo: `int x; // ok inside structure` --> `int _x; // ok inside structure`. But, I don't see how a _scoped_ `__x` or `_X` is worse than an `_x` one. Also, global scope with `static` should be fine. If we define _literally_ `__my_func` no lib will use that, so obscure names are fine. It's a convention. It's up to the programmer to assess collision risk. If pgmr uses `_access` and there is no conflict, but lib later [in time] defines this, pgmr must resolve issue. But, this is a pgmr choice as to such risk. 99.44% of all such names will _never_ conflict – Craig Estey Sep 21 '16 at 20:28
  • @CraigEstey `__x` and `_X` may be macros. *But, this is a pgmr choice as to such risk. 99.44% of all such names will never conflict* And good luck figuring out the problem if there is a conflict. Given how easy it is to put bugs into code, why on God's good Earth would you *ever* do anything that you know could introduce impossible-to-figure-out bugs? – Andrew Henle Sep 21 '16 at 20:31
  • @AndrewHenle As a practical matter, I've been using `_whatever` for 35+ years in C and have _never_, not even _once_ experienced a conflict. The only conflicts I've experienced were rebuilding old(er) C code using C++ and having to change `for (try = 1; try <= 10; ++try)` into `for (trycount = 1; trycount <= 10; ++trycount)`. Same for `new`, etc. that became keywords. People even try to apply the no underscore stuff to `#define X(_x,_y)` where it doesn't even apply. The "_" is a POSIX lib convention [IIRC], so R/T, kernel non-issue. Even with a macro, it's easy to detect/fix/workaround. – Craig Estey Sep 21 '16 at 20:50
  • @CraigEstey: you're lucky, then. I've only been coding about 33 years in C, and I've run into some nasty problems because of internal code using names starting with underscore that conflicted with a system function of the same name. Things go badly wrong when your code defines `int _bind(int x, char y);` and the system defines and calls `char *_bind(void *p, char *a, int f);` or something similar — and the system library code ends up calling your `_bind()` instead of the intended one. That was a good few years ago (another millennium, in fact), but I remember that and a few similar cases. – Jonathan Leffler Sep 21 '16 at 20:56
  • @CraigEstey *The "_" is a POSIX lib convention* Please read **7.1.3 Reserved identifiers** of [the C Standard](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf): *All identifiers that begin with an underscore and either an uppercase letter or another underscore are always reserved for any use.* and *All identifiers that begin with an underscore are always reserved for use as identifiers with file scope in both the ordinary and tag name spaces.* Logically, your argument is the same as "I swam with sharks while covered in blood and didn't get eaten." 35 years you say? – Andrew Henle Sep 21 '16 at 21:02
  • @JonathanLeffler I "make my own luck" by using `_obscure`. Like other conventions masquerading as "rules", my main point is the _programmer_ should decide. Follow the "rule" most of the time, but... IMO, it is blatant _hubris_ for POSIX to usurp the C namespace. If they want, they can usurp `_posix_*` instead of `_*`. I've looked at [a lot] of modern glibc source. It would handle `_bind` and call its internal version even if the pgm defined it, using linker script trickery, `__bind` (nee `_bind`) has a symbol bind type of LOCAL in `libc.so` – Craig Estey Sep 21 '16 at 21:28
  • @CraigEstey: that was why I mentioned 'another millennium' — the problems are often different these days, but the choice of names can still cause problems. That is one of a few (less than half-a-dozen, I think) cases of interference with leading-underscore names. (I'll add, the local use of `_bind()` as a name wasn't my choice, and it didn't cause a problem until ported to a new platform.) I've had more problems with `_t` suffixes on types which POSIX reserves. In theory, you have to actively activate POSIX functionality — pace `-std=gnu11` which does it automatically. _[…continued…]_ – Jonathan Leffler Sep 21 '16 at 21:34
  • _[…continuation…]_ The POSIX rules are there so that if you violate them and it hurts, you don't have a comeback. To a large extent, that's true of the C rules too. They also guide the developers of systems — if they follow the rules, they won't hurt their users if those users also follow the rules. The tricky bit is that newbies look at system headers and think "oh, that must be how I should write my headers too", and immediately break all the rules set up to prevent the users and the system providers from hurting each other. Ultimately, that's an education problem. SO can help. – Jonathan Leffler Sep 21 '16 at 21:36
  • @CraigEstey: Regarding your first comment: The reason why it is still *not okay* to use `static ... _x` is because the library may be using `_x` in a header file you are using, and then you would get an error for multiple incompatible declarations of `_x`. You say it is up to the programmer to assess risk, which is correct, but only part of the truth—the programmer should also be assessing benefits. The benefits of using leading underscores in names, assuming you don't have an entrenched convention, are nil. I would expect code which uses such names to fail code review. – Dietrich Epp Sep 21 '16 at 22:05
  • @AndrewHenle IIRC, It's not in the _real_ C standard: K&R :-) I was _already_ aware of this [which originated in the POSIX lib spec, IIRC]. Don't be slavish to a _convention_ like this. Would `_Andrew_Henle_whatever` be likely to collide? It's intended for extern/global conflicts, but it's worded such that _scoped_ are an issue. But, they are no more an issue than `int fnc(void) { int fopen = 37; return fopen; }` or `struct foobar { int fopen; };`. And, why is a _language_ spec saying _anything_ about support libs? It's a separate document, IMO. kernel has _no_ libc but uses C. – Craig Estey Sep 21 '16 at 22:22
  • @CraigEstey: If I understand you correctly, you're saying that (1) it's okay, usually, to use underscores at the beginning of identifiers if you make a judgment call about it, and (2) something about support libraries and what happens if you don't use the standard library? Not sure what the last three sentences of the comment are addressing. – Dietrich Epp Sep 21 '16 at 22:47
  • @JonathanLeffler You got the operative word: _newbies_. They shouldn't do what I'm talking about and should listen to ... _Experts_ (e.g. you and [maybe ;-)] me). We can do so _if_ we choose, because of our experience/judgement [to adhere or not]. (e.g.) I wouldn't do `_bind` but I'd do `_qbind` and take my chances. I've never had to, but if I _did_ have to fix a conflict, I would simply do it. I have an automated [perl] script to change `foo_t` because sometimes I've created `abc_t` and then realized I wanted `def_t`. Also, other scripts (250,000 lines) to make this a 10 second exercise. – Craig Estey Sep 21 '16 at 23:17
  • @JonathanLeffler Regarding headers, I'm assuming you mean the `#ifndef _STDIO_H` lock? For me, if I had a file `.../xyzlib/xyzdef.h`, I'd use: `#ifndef _xyzlib_xyzdef_h_`, so I'd be less likely to get a conflict. Also, who came up with the use `#include ` for standard headers and use `#include "xyzdef.h"` for project specific. I use `<>` for all and just add `-I.` when compiling. It _is_ education. SO does explain the _why_. But, it shouldn't have to. If a spec is going to mandate `_whatever` as reserved, _it_ should explain why. ISO/C is notorious for _not_ doing this. POSIX does. – Craig Estey Sep 21 '16 at 23:30

Leading underscores usually indicate one of three things:

  1. The definition is not part of the C standard, so it's not portable
  2. The definition is internal to a library or compiler, and should not be used from outside
  3. The definition should not be used lightly, as it implies some risk or necessary configuration that requires extra knowledge.

In this case, `__SOCKADDR_COMMON` is (2): an internal macro used inside the definition of `struct sockaddr_in`, which is the type you normally work with from userland.
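
To make the two puzzling lines from the question more concrete, here is a minimal sketch of how a prefix-taking macro and the `sin_zero` size expression could work. It is not the library's actual source: every `demo_`/`DEMO_` name is a hypothetical stand-in invented for this example (deliberately without leading underscores), and the field types and sizes are assumptions chosen to keep the arithmetic easy to follow.

```c
#include <stddef.h>   /* size_t, offsetof */
#include <stdint.h>   /* uint16_t, uint32_t */

/* Hypothetical stand-in for an internal macro: it pastes the given prefix
 * onto "family", so DEMO_SOCKADDR_COMMON(sin_) declares a member sin_family. */
#define DEMO_SOCKADDR_COMMON(prefix) uint16_t prefix##family
#define DEMO_SOCKADDR_COMMON_SIZE    (sizeof(uint16_t))

/* Generic address type: a family field plus 14 bytes of payload (assumed). */
struct demo_sockaddr {
    DEMO_SOCKADDR_COMMON(sa_);       /* expands to: uint16_t sa_family */
    char sa_data[14];
};

struct demo_in_addr {
    uint32_t addr;
};

/* IPv4-style address type, padded to the size of the generic one. */
struct demo_sockaddr_in {
    DEMO_SOCKADDR_COMMON(sin_);      /* expands to: uint16_t sin_family */
    uint16_t sin_port;
    struct demo_in_addr sin_addr;
    /* Padding = size of generic struct minus everything declared above. */
    unsigned char sin_zero[sizeof(struct demo_sockaddr)
                           - DEMO_SOCKADDR_COMMON_SIZE
                           - sizeof(uint16_t)
                           - sizeof(struct demo_in_addr)];
};

/* Number of padding bytes the size expression works out to. */
static size_t demo_padding_bytes(void)
{
    struct demo_sockaddr_in a;
    return sizeof a.sin_zero;
}
```

Under these assumptions the padding array exists only to make the specific address struct the same size as the generic one, so a pointer to one can be passed where the other is expected. User code sets the `sin_family`, `sin_port` and `sin_addr` members and never needs to touch the internal macro itself.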

salezica