17

Why is it sensible for a language to allow implicit declarations of functions and typeless variables? I get that C is old, but allowing declarations to be omitted, so that functions default to returning int (and variables default to int), doesn't seem so sane to me, even back then.

So, why was it originally introduced? Was it ever really useful? Is it actually (still) used?

Note: I realise that modern compilers give you warnings (depending on which flags you pass them), and you can suppress this feature. That's not the question!


Example:

int main() {
  static bar = 7; // defaults to "int bar"
  return foo(bar); // defaults to a "int foo()"
}

int foo(int i) {
  return i;
}
bitmask
  • That feature has been removed from the language. Unfortunately, many compilers still accept it by default. – Daniel Fischer Aug 06 '12 at 19:57
  • possible duplicate of [Concept of "auto" keyword in c](http://stackoverflow.com/questions/4688816/concept-of-auto-keyword-in-c) –  Aug 06 '12 at 19:58
  • It doesn't as of C99. Modern compilers don't emit a warning as you say; they emit an error. You aren't using a "modern" compiler. I'm guessing visual studio. – Ed S. Aug 06 '12 at 19:59
  • @DanielFischer: Thanks, I wasn't aware of that (I'm behind on my standards reading). Anyway, I was mostly concerned why it would have been introduced in the first place. – bitmask Aug 06 '12 at 19:59
  • Programmers are lazy, and `int` is a powerfully universal type. Being able to omit it lets you write lots of quick-and-dirty code very succinctly -- "brevity is the soul of wit", as William Shatner said. It's only recently that space (both disk and screen) has become so abundant and programs so large that clean, readable style is preferable to brevity. – Kerrek SB Aug 06 '12 at 20:00
  • @0A0D: Using `auto` here is only to get the compiler to understand that `bar` is supposed to be a variable. I could have used `static` as well. – bitmask Aug 06 '12 at 20:00
  • My understanding is that "back then" C was "a better assembly", heavily borrowing from the underlying PDP architecture, and mapping to it very neatly. Unfortunately, the language borrowed the scarce type structure as well. – Sergey Kalinichenko Aug 06 '12 at 20:01
  • How is this not a real question? Seriously, what's going on? – bitmask Aug 06 '12 at 20:02
  • I agree. It will be closed by those over-eager fellows who maintain that this is subjective. It is not; there was obviously a reason behind making these rules a part of the standard. They would likely answer this in the form of "because it said so in the standard". IMO Stack Overflow is worse off for this sort of moderation. Oh well, voted to reopen – Ed S. Aug 06 '12 at 20:03
  • I agree, it's a real (and not uninteresting) question. But it's off topic and/or not constructive, I think. We could all only guess why. – Daniel Fischer Aug 06 '12 at 20:04
  • @EdS., Daniel: I'm a terrible judge on what belongs on programmers.SE (I'm almost invariably wrong), but would this be a candidate? – bitmask Aug 06 '12 at 20:06
  • I think it's a better fit here. Just wait for it to be opened again. – Ed S. Aug 06 '12 at 20:07
  • I'm not familiar with Programmers, but it may well be. Somebody should ask a mod over there. – Daniel Fischer Aug 06 '12 at 20:07
  • It's been opened again, post away. – Ed S. Aug 06 '12 at 20:07
  • The processor itself does not deal with types, only values and memory locations. – pmg Aug 06 '12 at 20:10
  • "We could all only guess why." -- Those who are familiar with the history or are willing to do research can do far better than guess. – Jim Balter Aug 06 '12 at 20:18
  • "heavily borrowing from the underlying PDP architecture, and mapping to it very neatly" -- This is largely a myth. See "More History" in http://cm.bell-labs.com/who/dmr/chist.html – Jim Balter Aug 06 '12 at 20:55
  • @bitmask: very good question. +1 from me. – Destructor Mar 12 '16 at 16:58

5 Answers

15

See Dennis Ritchie's "The Development of the C Language": http://web.archive.org/web/20080902003601/http://cm.bell-labs.com/who/dmr/chist.html

For instance,

In contrast to the pervasive syntax variation that occurred during the creation of B, the core semantic content of BCPL—its type structure and expression evaluation rules—remained intact. Both languages are typeless, or rather have a single data type, the 'word', or 'cell', a fixed-length bit pattern. Memory in these languages consists of a linear array of such cells, and the meaning of the contents of a cell depends on the operation applied. The + operator, for example, simply adds its operands using the machine's integer add instruction, and the other arithmetic operations are equally unconscious of the actual meaning of their operands. Because memory is a linear array, it is possible to interpret the value in a cell as an index in this array, and BCPL supplies an operator for this purpose. In the original language it was spelled rv, and later !, while B uses the unary *. Thus, if p is a cell containing the index of (or address of, or pointer to) another cell, *p refers to the contents of the pointed-to cell, either as a value in an expression or as the target of an assignment.

This typelessness persisted in C until the authors started porting it to machines with different word lengths:

The language changes during this period, especially around 1977, were largely focused on considerations of portability and type safety, in an effort to cope with the problems we foresaw and observed in moving a considerable body of code to the new Interdata platform. C at that time still manifested strong signs of its typeless origins. Pointers, for example, were barely distinguished from integral memory indices in early language manuals or extant code; the similarity of the arithmetic properties of character pointers and unsigned integers made it hard to resist the temptation to identify them. The unsigned types were added to make unsigned arithmetic available without confusing it with pointer manipulation. Similarly, the early language condoned assignments between integers and pointers, but this practice began to be discouraged; a notation for type conversions (called `casts' from the example of Algol 68) was invented to specify type conversions more explicitly. Beguiled by the example of PL/I, early C did not tie structure pointers firmly to the structures they pointed to, and permitted programmers to write pointer->member almost without regard to the type of pointer; such an expression was taken uncritically as a reference to a region of memory designated by the pointer, while the member name specified only an offset and a type.

Programming languages evolve as programming practices change. In modern C and the modern programming environment, where many programmers have never written assembly language, the notion that ints and pointers are interchangeable may seem nearly unfathomable and unjustifiable.
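
To make that concrete, here is a small sketch (my illustration, not Ritchie's) of the style the quotes describe. Pre-standard C permitted these assignments without any casts; the casts below are the minimum a modern compiler demands, and the code still assumes int is wide enough to hold a pointer, which was true on the PDP-11 but is false on most 64-bit systems today. Illustration only.

int main(void)
{
    char buf[4] = "abc";
    int p = (int)buf;                  /* an address stored in a plain int */
    return *(char *)p == 'a' ? 0 : 1;  /* the int used as a pointer again */
}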

Jim Balter
  • SRB suggested to DMR adding void as a type in C. "It saved the instruction loading register for return value" Plus when SRB arrived "C types were like PL/1 namely offsets from any base you liked". C changed to Algol68 types and a few other "A68isms" ended up in C... [Note: Algol68 had strong typing] – NevilleDNZ Aug 11 '12 at 11:08
  • @NevilleDNZ And there was his algol-influenced shell (http://research.swtch.com/shmacro). It's interesting that Algol60 -> CPL -> BCPL -> B -> C, and {Algol60, CPL} -> Algol68 -> C. Strachey's CPL was a very sophisticated language, and it's a pity that many of its excellent features were lost along the way (to reappear elsewhere, such as in Haskell and other functional languages). – Jim Balter Aug 12 '12 at 05:04
  • I had spotted the "switch" stmts in C being similar to PL/I's "switch", thought that was all. And I was surprised to find C previously had "PL/I's type offsets from any base". Interesting how C (a language designed to be tiny) managed to include influences from so many languages. Certainly BCPL warrants a peek. – NevilleDNZ Aug 12 '12 at 11:39
  • @NevilleDNZ CPL much more than BCPL ... it was designed by this guy: http://en.wikipedia.org/wiki/Christopher_Strachey – Jim Balter Aug 12 '12 at 11:45
  • Link is broken. See http://web.archive.org/web/20080902003601/http://cm.bell-labs.com/who/dmr/chist.html – phlummox Aug 02 '22 at 06:27
14

It's the usual story — hysterical raisins (aka 'historical reasons').

In the beginning, the big computers that C ran on (DEC PDP-11) had 64 KiB for data and code (later 64 KiB for each). There was a limit to how complex you could make the compiler and still have it run. Indeed, there was scepticism that you could write an O/S using a high-level language such as C, rather than needing to use assembler. So, there were size constraints. Also, we are talking a long time ago, in the early to mid 1970s. Computing in general was not as mature a discipline as it is now (and compilers specifically were much less well understood). Also, the languages from which C was derived (B and BCPL) were typeless. All these were factors.

The language has evolved since then (thank goodness). As has been extensively noted in comments and down-voted answers, in strict C99, implicit int for variables and implicit function declarations have both been removed from the language. However, most compilers still recognize the old syntax and permit its use, with more or fewer warnings, to retain backwards compatibility, so that old source code continues to compile and run as it always did. C89 largely standardized the language as it was, warts (gets()) and all. This was necessary to make the C89 standard acceptable.
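
As a concrete illustration (reusing the question's example rather than code from this answer), the old-style code becomes valid under strict C99 once everything is declared explicitly:

int foo(int i);            // declare foo before use

int main(void) {
  static int bar = 7;      // explicit int, no longer implied
  return foo(bar);
}

int foo(int i) {
  return i;
}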

There is still old code around using the old notations — I spend quite a lot of time working on an ancient code base (circa 1982 for the oldest parts) which still hasn't been fully converted to prototypes (and that annoys me intensely, but there's only so much one person can do on a code base with multiple millions of lines of code). Very little of it still has 'implicit int' for variables, but there are too many places where functions are not declared before use, and a few places where the return type of a function is still implicitly int. If you don't have to work with such messes, be grateful to those who have gone before you.

Jonathan Leffler
  • Wouldn't the compiler get even bigger if you included these two features? Or were there no "proper" function declarations in C in those days? – bitmask Aug 06 '12 at 20:17
  • Don't forget that the prototype notation was not formally added to C until the C89 standard was released (and was adapted from C++ which had demonstrated that it was beneficial). That added to the complexity of C compilers. Prior to that, you might have written: `main(){ static bar = 7; return foo(bar); } foo(i) { return i; }` without the namby-pamby molly-coddling presented by prototypes and headers and so on. – Jonathan Leffler Aug 06 '12 at 20:20
  • Ah yes, the size constraints... not only the size of the compiled code but the size of the source files as well. – Jeff Mercado Aug 06 '12 at 20:21
  • I'd say the typelessness of BCPL (was B also typeless?) is likely the biggest reason. And it's definitely worth mentioning that "implicit int" was removed from the language by the C99 standard; `static bar = 7;` is now a syntax error. – Keith Thompson Aug 06 '12 at 20:22
  • In the beginning, C ran on the PDP-7, which had 8K of 18-bit words. The PDP-11 that they switched to only had 24K bytes, and they had to remain compatible with such machines even after they got PDP-11/45's with split I and D space. – Jim Balter Aug 06 '12 at 21:00
8

Probably the best explanation for "why" comes from Dennis Ritchie's "The Development of the C Language" (linked in Jim Balter's answer):

Two ideas are most characteristic of C among languages of its class: the relationship between arrays and pointers, and the way in which declaration syntax mimics expression syntax. They are also among its most frequently criticized features, and often serve as stumbling blocks to the beginner. In both cases, historical accidents or mistakes have exacerbated their difficulty. The most important of these has been the tolerance of C compilers to errors in type. As should be clear from the history above, C evolved from typeless languages. It did not suddenly appear to its earliest users and developers as an entirely new language with its own rules; instead we continually had to adapt existing programs as the language developed, and make allowance for an existing body of code. (Later, the ANSI X3J11 committee standardizing C would face the same problem.)

Systems programming languages don't necessarily need types; you're mucking around with bytes and words, not floats and ints and structs and strings. The type system was grafted onto it in bits and pieces, rather than being part of the language from the very beginning. As C has moved from being primarily a systems programming language to a general-purpose programming language, it has become more rigorous in how it handles types. But, even though paradigms come and go, legacy code is forever. There's still a lot of code out there that relies on that implicit int, and the standards committee is reluctant to break anything that's working. That's why it took almost 30 years to get rid of it.

John Bode
6

A long, long time ago, back in the K&R, pre-ANSI days, functions looked quite different than they do today.

// K&R style: the return type and the parameter types all default to int
add_numbers(x, y)
{
    return x + y;
}

int ansi_add_numbers(int x, int y); // modern, ANSI C

When you call a function like add_numbers, there is an important difference in the calling conventions: all types are "promoted" when the function is called. So if you do this:

// no prototype for add_numbers
short x = 3;
short y = 5;
short z = add_numbers(x, y);

What happens is x is promoted to int, y is promoted to int, and the return type is assumed to be int by default. Likewise, if you pass a float it is promoted to double. These rules ensured that prototypes weren't necessary, as long as you got the right return type, and as long as you passed the right number and type of arguments.
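
Here is a sketch (my illustration, not part of the original answer) of how that constraint shapes an old-style definition: since a float argument is promoted to double at the call, the K&R definition must declare its parameter as double. Compile this as C17 or earlier; K&R-style definitions were finally removed in C23.

double half();       // K&R declaration: return type known, arguments unchecked

double half(x)
double x;            // double, not float: any float argument is promoted
{                    // to double before the call ever reaches us
    return x / 2;
}

int main(void)
{
    float f = 3.0f;
    return half(f) == 1.5 ? 0 : 1;   // f is promoted to double; returns 0
}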

Note that the syntax for prototypes is different:

// K&R style function
// number of parameters is UNKNOWN, but fixed
// return type is known (int is default)
add_numbers();

// ANSI style function
// number of parameters is known, types are fixed
// return type is known
int ansi_add_numbers(int x, int y);
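
A quick sketch of the practical consequence (my example, not the answer's): with only a K&R-style declaration in scope, the compiler cannot check calls at all, so a call with the wrong number of arguments compiles silently. (Again, compile as C17 or earlier; C23 changed the meaning of empty parentheses.)

int add_numbers();                  // K&R declaration: arguments unchecked

void demo(void)
{
    int ok  = add_numbers(1, 2);    // correct call
    int bad = add_numbers(1, 2, 3); // wrong arity: no diagnostic required,
                                    // behavior is undefined at run time
    (void)ok;
    (void)bad;
}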

A common practice back in the old days was to avoid header files for the most part, and just stick the declarations directly in your code:

void *malloc();

char *buf = malloc(1024);
if (!buf) abort();

Header files are accepted as a necessary evil in C these days, but just as modern C derivatives (Java, C#, etc.) have gotten rid of header files, old-timers didn't really like using header files either.

Type safety

From what I understand about the old old days of pre-C, there wasn't always much of a static typing system. Everything was an int, including pointers. In this old language, the only point of function prototypes would be to catch arity errors.

So if we hypothesize that functions were added to the language first, and then a static type system was added later, this theory explains why prototypes are optional. This theory also explains why arrays decay to pointers when used as function arguments -- since in this proto-C, arrays were nothing more than pointers which get automatically initialized to point to some space on the stack. For example, something like the following may have been possible:

function()
{
    auto x[7];
    x += 1;
}

Citations

On typelessness:

Both languages [B and BCPL] are typeless, or rather have a single data type, the 'word,' or 'cell,' a fixed-length bit pattern.

On the equivalence of integers and pointers:

Thus, if p is a cell containing the index of (or address of, or pointer to) another cell, *p refers to the contents of the pointed-to cell, either as a value in an expression or as the target of an assignment.

Evidence for the theory that prototypes were omitted due to size constraints:

During development, he continually struggled against memory limitations: each language addition inflated the compiler so it could barely fit, but each rewrite taking advantage of the feature reduced its size.

Dietrich Epp
  • Originally, you might well have written: `add_numbers(x,y){return x+y;}` (with no unnecessary spaces in that transcription; using 4 lines of code in source files). – Jonathan Leffler Aug 06 '12 at 20:33
  • `char *malloc();` — `void *` was also a late addition (as was plain `void`), making its way into the language only a little before the C89 standard. Horrible, aren't I? – Jonathan Leffler Aug 06 '12 at 20:38
  • @JonathanLeffler: Fascinating. I take `void *` for granted these days, but it makes sense that it used to be `char *`, since the standard now says that `void *` and `char *` have to have the same representation. – Dietrich Epp Aug 06 '12 at 20:39
2

Some food for thought. (It's not an answer; we actually know the answer — it's permitted for backward compatibility.)

And people should look at a COBOL code base or f66 libraries before asking why it hasn't been cleaned up in 30 years or so!
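
For reference, the main.c referred to in the diagnostics below is the example from the question:

int main() {
  static bar = 7; // defaults to "int bar"
  return foo(bar); // defaults to a "int foo()"
}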

gcc with its default settings does not emit any warnings.

With -Wall, or with gcc -std=c99, it does emit the expected diagnostics:

main.c:2: warning: type defaults to ‘int’ in declaration of ‘bar’
main.c:3: warning: implicit declaration of function ‘foo’

The lint functionality built into modern gcc is showing its colours.

Interestingly, the modern clone of lint, the secure lint (I mean splint), gives only one warning by default.

main.c:3:10: Unrecognized identifier: foo
  Identifier used in code has not been declared. (Use -unrecog to inhibit
  warning)

The LLVM C compiler, clang, which like gcc has a static analyser built in, emits both warnings by default.

main.c:2:10: warning: type specifier missing, defaults to 'int' [-Wimplicit-int]
  static bar = 7; // defaults to "int bar"
  ~~~~~~ ^
main.c:3:10: warning: implicit declaration of function 'foo' is invalid in C99
      [-Wimplicit-function-declaration]
  return foo(bar); // defaults to a "int foo()"
         ^

People used to think we wouldn't need backward compatibility for 1980s-era code, and that all of it would be cleaned up or replaced. It turns out that's not the case: a lot of production code remains stuck in prehistoric, pre-standard times.

EDIT:

I didn't look through the other answers before posting mine, and I may have misunderstood the intention of the poster. But the thing is, there was a time when you hand-compiled your code and used toggle switches to put the binary pattern into memory. Those programmers didn't need a "type system", and neither did the PDP machine in front of which Ritchie and Thompson posed like this:

Don't look at the beard, look at the "toggles", which I heard were used to bootstrap the machine.

[photo: Ken Thompson and Dennis Ritchie at the PDP-11]

Also look at how they used to boot UNIX, described in this paper from the Unix 7th edition manual:

http://wolfram.schneider.org/bsd/7thEdManVol2/setup/setup.html

The point of the matter is that they didn't need much of a software layer to manage a machine with kilobytes of memory. Knuth's MIX has 4000 words; you don't need all these types to program a MIX computer. You can happily compare an integer with a pointer on a machine like that.

I thought the reason they did this was quite self-evident, so I focused on how much is left to be cleaned up.

Aftnix
  • "we actually know the answer, its permitted for backward compatibility" -- That's only an answer if one gets no further than the title of the question rather than reading and understanding what the OP really wants to know. – Jim Balter Aug 06 '12 at 20:48
  • I didn't notice his EDIT. I guess I could expand my post reflecting the issue behind it. – Aftnix Aug 06 '12 at 20:50
  • The edit was irrelevant to the point. Apparently you also didn't notice the other answers that actually address the issue. – Jim Balter Aug 06 '12 at 21:05
  • The PDP-11 toggles allowed entering binary values into memory; they served the same purpose as an EPROM loader. That it had toggle switches has no bearing *at all* on the question or on the value of type systems. I programmed the PDP-11 in C both before and after type safety was added, and type safety had all the benefits it has now. The amount of memory is irrelevant: the PDP-11 C compiler and UNIX operating system were complex software that greatly benefit from type safety. This non-answer reflects deep misunderstandings about types, programming languages, software systems, and computers. – Jim Balter Mar 12 '16 at 21:05