I have a question regarding section 5.2.4.1 Translation Limits in the first American National Standard for Programming languages - C, also known as ANSI/ISO 9899-1990, ISO/IEC 9899:1990 (E), C89, etc. Simply put, the first ANSI C standard.
What does the standard say that is so strange?
It infamously states that a conforming C compiler is only required to handle, and I quote:
5.2.4.1 Translation Limits
- 6 significant initial characters in an external identifier
Now, it is painfully obvious that this is unreasonably short, especially considering that C does not have anything similar to namespaces. It is especially important to allow for descriptive names when dealing with external identifiers, seeing how they will "pollute" everything you link against.
Even the standard library mandates functions with longer names: longjmp, tmpfile, strncat. The last of these, strncat, shows that they had to work a bit to invent library names whose initial six characters were unique, instead of using the arguably more logical strcatn, which would have collided with strcat.
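To make the collision concrete, here is a minimal sketch, assuming a linker that keeps exactly six case-sensitive characters of each external name. The names `names_collide` and `SIGNIFICANT` are made up for illustration and are not part of any standard library.

```c
#include <string.h>

/* Assumption: the linker keeps only this many leading characters
   of an external identifier (the C89 minimum). */
#define SIGNIFICANT 6

/* Hypothetical helper: returns 1 when two external identifiers would be
   indistinguishable to such a linker, 0 otherwise.
   It is declared static, so its own long name is an internal identifier,
   for which C89 guarantees 31 significant characters. */
static int names_collide(const char *a, const char *b)
{
    /* strncmp stops at the first difference, at a terminating NUL,
       or after SIGNIFICANT characters, whichever comes first. */
    return strncmp(a, b, SIGNIFICANT) == 0;
}
```

With this helper, `names_collide("strcat", "strcatn")` yields 1, because both names begin with the same six characters, while `names_collide("strcat", "strncat")` yields 0, because they already differ at the fourth character.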
Why is it still a problem to me?
I enjoy oldish computers. I'm trying to write programs that will compile and work well on pre-C99 platforms, because a C99 compiler sometimes does not exist for my beloved targets. Perhaps I also enjoy trying to really follow the standard. I have learned a lot about C99 and C11 by just digging through the older standards, trying to trace the reasons for certain limitations and implementation issues.
So, even though I know of no compiler or linker actually enforcing or imposing this limitation, it still nags me that I cannot claim to have written strictly conforming code if I also want to use legible and non-colliding external identifiers.
Why would they impose such a thing?
They began work on the standardization some time during the early eighties, and finalized it in 1988 or 1989. Even in the seventies and sixties, it would not have been any problem whatsoever to handle longer identifiers.
Considering that any compiler wanting to conform to the new standard had to be modified anyway, if only to update the documentation, I don't see how it would have been unreasonable for ANSI to put its foot down and say something like "It is 1989 already. You must handle 31 significant initial characters". It would not have been a problem for any platform, even ancient ones.
Backwards compatibility?
From what I've read when searching for this, the problem might come from FORTRAN. In an answer to the question What's the exact role of "significant characters" in C (variables)?, Jonathan Leffler writes:
Part of the trouble may have been Fortran; it only required support for 6 character monocase names, so linkers on systems where Fortran was widely used did not need to support longer names.
To me, this seems like the most reasonable answer to the direct question "Why?". But considering that this restriction bugs me every time I want to write a program that could theoretically be built on old systems, I would like to know some more details.
Questions
- After having searched a bit along the FORTRAN track, I've only come up with theories and hand-waving. Which popular platforms actually imposed a limit of only 6 characters? Was there a particularly popular linker that forced the standards committee to budge?
- I'm not old enough to have been interested in these kinds of details when they were discussed. Has this limit and its rationale been publicly discussed and defended? Was there a public outcry, or was it just silently ignored? Pitchforks outside the ANSI headquarters?
Ultimately, the answers to these questions will make it easier for me to decide how badly I should sleep at night for giving reasonable names to my functions.