
For example, why does the language use argv instead of argumentVector or argVector (or argument_vector, depending on your preference), or malloc instead of allocateMemory or allocMem? Is there a justification? The abbreviations chosen also seem fairly opaque to me. malloc is a case in point: putting "m" before "alloc" is particularly unintuitive. Is there a way of thinking about this that makes the names clearer and more predictable, or is this just a barrier to entry that I'll need to memorize?

Also, I've only been able to find answers about why people who program in C abbreviate extensively. This question is about abbreviation in C as a language, not abbreviation as a stylistic convention.

  • Historical reasons. Back in the 70s computers were very limited, so the first C compilers had a restriction of 8 characters for any identifier. – UnholySheep Mar 20 '22 at 21:35
  • **M**emory **Alloc**ator. And if my memory serves, some systems had limits on the length of symbol names. – StoryTeller - Unslander Monica Mar 20 '22 at 21:35
  • You are perfectly free to define `int main(int argumentCount, char *argumentVector[])` if you wish, but I find names that are overly long are harder to read. The right balance is needed: the function `malloc` isn't confusing, to me, but if you want to quibble about the use of `int j` then I'm with you. – Weather Vane Mar 20 '22 at 22:01
  • @UnholySheep: 8? Nah, C89/C90 had a limit of 6 characters, case-insensitive, on external names. See also [Why must the first 31 characters of an identifier be unique?](https://stackoverflow.com/questions/19905944/why-must-the-first-31-characters-of-an-identifier-be-unique) – Jonathan Leffler Mar 20 '22 at 22:20
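
(Illustrating Weather Vane's comment above: argc and argv are ordinary parameter names rather than keywords, so the long-form spelling is equally valid C. A minimal sketch:)

#include <stdio.h>

int
main(int argumentCount, char *argumentVector[])
{
    /* Prints each command-line argument on its own line, exactly
       as the conventional argc/argv spelling would. */
    for (int i = 0; i < argumentCount; i++)
        printf("%s\n", argumentVector[i]);

    return 0;
}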

2 Answers


Historical note:

Compilers of the '70s and early '80s accepted symbol names longer than 8 characters (meaning the compiler would parse them), but only the first 8 characters were stored in the symbol table and treated as significant when comparing symbol names.

So you could write longer symbol names, but the extra characters were ignored; automatic (function-scope) variables therefore had to be unique in their first 8 characters:

int
main()
{
    int abcdefg0;  // unique
    int abcdefg1;  // unique

    int abcdefgh0;  // not unique
    int abcdefgh1;  // not unique

    int bcdefghiXXX;  // unique but truncated to [treated as] bcdefghi

    return 0;
}

The linker/loader likewise allowed only 8-character names. But for a global such as:

int foo;

int
bar()
{
    return 0;
}

the compiler would change the names by adding a `_` prefix [to distinguish C symbols from assembly symbols], so we'd get _foo and _bar respectively in the .o file.

So C global names had to be unique in their first 7 characters. Otherwise abcdefg0/abcdefg1 (in the source file and compiler symbol table) --> _abcdefg0/_abcdefg1 (after the compiler adds the prefix) --> _abcdefg/_abcdefg (in the .o), i.e. a collision when linking.
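
(A concrete sketch of that failure, using hypothetical file names. It assumes the historical 8-character linker limit described above; a modern toolchain links these two files without complaint.)

/* file1.c (hypothetical) */
int abcdefg0 = 1;   /* compiler symbol abcdefg0 -> written to the .o
                       as _abcdefg0, truncated to _abcdefg */

/* file2.c (hypothetical) */
int abcdefg1 = 2;   /* compiler symbol abcdefg1 -> written to the .o
                       as _abcdefg1, truncated to _abcdefg */

/* Linking file1.o and file2.o on such a system fails with a
   multiply-defined symbol: two globals that are distinct in the C
   source collapse to the same 8-character name _abcdefg. */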

Because of this, IIRC, most programmers kept even automatic/function-scope variables unique in their first 7 characters (even though 8 were significant there), because it kept things simple when changing a function-scope variable into a global.

At least that's what I did back then.

Craig Estey
  • C90 had a minimum length of 6 characters, monocase, for external identifiers (e.g. function names). – Jonathan Leffler Mar 20 '22 at 22:21
  • @JonathanLeffler I'm doing this from memory for compilers I [personally] used from 1981-1989 that were on Unix v7, System III, and System V. I can't recall if Solaris (nee SunOS) or Microport Unix were any better. I used my first Linux system in 1993 but I can't recall if the early versions still had the restriction or not. I _think_ linux (using `gcc`) had things lifted by then. – Craig Estey Mar 20 '22 at 22:27
  • I'm working from memory too — starting 1983 for C. "The Annotated C Standard" by H Schildt is 50% standard text and 50% nonsense contributed by the author. However, it is mostly good when quoting the standard, and section 5.2.4.1 _Translation limits_ lists two immediately relevant ones: • _31 significant initial characters in an internal identifier or macro name. • 6 significant initial characters in an external identifier._ I seem to remember reading somewhere that the C90 standard authors struggled with the 6 character limit, but there were (mainframe) systems where that was unavoidable. – Jonathan Leffler Mar 20 '22 at 22:41
  • @JonathanLeffler The compilers I was using [up to 1986] were derived/ported from the original Unix v7 compiler. They weren't standard conforming [AFAIK] because the code predated it. I know that 8 chars for the linker was the limit. I don't know about uppercase->lowercase because I _always_ used lowercase for all symbols. I was looking for longer names, so I _tested_ an 8 char C global and verified the 7 char limit in the linker. I don't know of any Unix port to a mainframe (e.g. IBM S/370) in that timeframe. So, a C compiler would be using the native OS/VS1 or VM/370/CMS linker – Craig Estey Mar 20 '22 at 22:57
  • Fortunately, it isn't a current problem. The first version of Unix I used was derived from 7th Edition Unix (with some System III extensions, at least later in the life cycle), and AFAICR it had the 8-character limit you remember. I worked on a project where the 6-character rule was enforced because of the systems used by collaborators — this was years before C89/C90 was published. – Jonathan Leffler Mar 20 '22 at 23:02
  • @JonathanLeffler Just curious. Was the SysIII system ported by Unisoft? They did a lot of SysIII/SysV ports to mc68000 systems for startups that manufactured them (I worked for one). At the time, Sun ported BSD (from the VAX port?). And, Altos [Computer Systems] ported Unix v7 to 8086 and mc68000 multiuser systems. – Craig Estey Mar 20 '22 at 23:12
  • I was introduced to Unix on the [ICL Perq](https://en.wikipedia.org/wiki/PERQ) (aka Three Rivers Corporation Perq). The first ones used a 16-bit microprogrammed chip (one side effect of which was that the `char *` address for a given memory location was numerically different from the `anything_bigger *` address of the same location — you didn't risk not declaring `malloc()` properly!). And casts to/from `char *` were necessary; there was no `void *` yet. The Unix o/s was called PNX (Perq uNiX, or thereabouts; also not a good abbreviation for I18N, though the ICL One-Per-Desk or OPD was worse). – Jonathan Leffler Mar 20 '22 at 23:19
  • *The C Programming Language* (1978) says (in Appendix A): "An identifier is a sequence of letters and digits... No more than the first eight characters are significant... External identifiers, which are used by various assemblers and loaders, are more restricted: DEC PDP-11, 7 characters, 2 cases; Honeywell 6000, 6 characters, 1 case; IBM 360/370, 7 characters, 1 case; Interdata 8/32, 8 characters, 2 cases." (My first exposure was on a PDP-11; on the IBM 360, I mostly used Fortran (WATFOR) but sometimes SNOBOL, which was really cool.) Anyway, that's where 6 characters, one case comes from. – rici Mar 21 '22 at 02:41
  • I just checked the future directions in §6.9.1 in the C90 standard, and it says _6.9.1 External names — Restriction of the significance of an external name to fewer than 31 characters or to only one case is an obsolescent feature that is a concession to existing implementations._ – Jonathan Leffler Mar 21 '22 at 05:11
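
(A sketch of the C90 minimum guarantees discussed in this thread: only 6 significant, single-case characters for external names. On such an implementation the pairs below are not guaranteed to be distinct, even though any modern compiler treats them as different.)

/* Only the first 6 characters of each external name are guaranteed
   significant under C90's minimum translation limits: */
int enable_x;   /* significant part: enable */
int enable_y;   /* significant part: enable -- potential collision */

/* Case may also be ignored for external names: */
int counter;    /* may collide with Counter below */
int Counter;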

I'm quoting my teacher here. He said that, back in the day, monitors were much smaller, so if you used long names your code wouldn't fit on the screen anymore. C is an old language (the '70s), so it's basically a bunch of legacy we have to deal with now. Changing it would break too much existing code (Linux/Unix is written in C, for example).

Also, some of the people using it are from that era, so they're used to this style.

Jupiter