114

I ask because my compiler seems to think so, even though I don’t.

echo 'int main;' | cc -x c - -Wall
echo 'int main;' | c++ -x c++ - -Wall

Clang issues no warning or error with this, and gcc issues only the meek warning: 'main' is usually a function [-Wmain], but only when compiled as C. Specifying a -std= doesn’t seem to matter.

Otherwise, it compiles and links fine. But on execution, it terminates immediately with SIGBUS (for me).

Reading through the (excellent) answers at What should main() return in C and C++? and a quick grep through the language specs, it would certainly seem to me that a main function is required. But the verbiage from gcc’s -Wmain (‘main’ is usually a function) (and the dearth of errors here) seems to possibly suggest otherwise.

But why? Is there some strange edge-case or “historical” use for this? Anyone know what gives?

My point, I suppose, is that I really think this should be an error in a hosted environment, eh?

Community
Geoff Nixon
  • 6
    To make gcc a (mostly) standard compliant compiler you need `gcc -std=c99 -pedantic ...` – pmg Jan 05 '15 at 10:01
  • 3
    @pmg It's the same warning, with or without `-pedantic` or any `-std`. My system `c99` also compiles this without warning or error... – Geoff Nixon Jan 05 '15 at 10:04
  • 3
    Unfortunately, if you are "clever enough", you can create things that are accepted by the compiler but don't make sense. In this case, you are linking the C runtime library to call a variable called `main`, which is unlikely to work. If you initialize main with the "right" value, it may actually return... – Mats Petersson Jan 05 '15 at 10:08
  • I would say that the difference between C and C++ is probably a case of "functions are treated (more) differently to variables" in C++, compared to C. – Mats Petersson Jan 05 '15 at 10:11
  • Wrong question: a C or C++ compiler doesn't know about *programs* but about *translation units* (several of them being linked together to make an executable program) – Basile Starynkevitch Jan 05 '15 at 10:16
  • 7
    And even if it is valid, it is an awful thing to do (unreadable code). BTW, it might be different in hosted implementations and in free-standing implementations (which do not know about `main`) – Basile Starynkevitch Jan 05 '15 at 10:17
  • It's also `::main()` which is reserved, IIRC `my::main` is OK. – MSalters Jan 05 '15 at 10:35
  • 1
    For more fun times, try [`main=195;`](http://codegolf.stackexchange.com/a/23397/16922) – geometrian Jan 06 '15 at 02:19
  • 1
    For even more fun, try [this 1984 IOCCC entry](http://www.ioccc.org/1984/mullender.c) on a VAX or PDP-11. – dan04 Jan 06 '15 at 04:48
  • @imallett that code is also discussed in [How can a program with a global variable called main instead of a main function work?](http://stackoverflow.com/q/32851184/1708801) – Shafik Yaghmour Aug 19 '16 at 16:09

9 Answers

97

Since the question is double-tagged as C and C++, the reasoning for C++ and C would be different:

  • C++ uses name mangling to help the linker distinguish between textually identical symbols of different types, e.g. a global variable xyz and a free-standing global function xyz(int). However, the name main is never mangled.
  • C does not use mangling, so it is possible for a program to confuse the linker by providing a symbol of one kind in place of a different symbol, and have the program link successfully.

That is what's going on here: the linker expects to find the symbol main, and it does. It "wires" that symbol as if it were a function, because it does not know any better. The portion of the runtime library that passes control to main asks the linker for main, so the linker gives it the symbol main, letting the link phase complete. Of course this fails at runtime, because main is not a function.

Here is another illustration of the same issue:

file x.c:

#include <stdio.h>
int foo(); // <<== main() expects this
int main(){
    printf("%p\n", (void*)&foo);
    return 0;
}

file y.c:

int foo; // <<== external definition supplies a symbol of a wrong kind

compiling:

gcc x.c y.c

This compiles, and it would probably run, but it's undefined behavior, because the type of the symbol promised to the compiler is different from the actual symbol supplied to the linker.

As far as the warning goes, I think it is reasonable: C lets you build libraries that have no main function, so the compiler frees up the name main for other uses if you need to define a variable main for some unknown reason.

Waynn Lue
Sergey Kalinichenko
  • 3
    Though the C++ compiler treats the main function differently. Its name is not mangled even without extern "C". I guess it is because otherwise it would need to emit its own extern "C" main, to ensure linking. – UldisK Jan 05 '15 at 10:20
  • @UldisK Yes, I noticed this myself, and found it quite interesting. It makes sense, but I'd never thought about that. – Geoff Nixon Jan 05 '15 at 10:25
  • @UldisK: Its name may or may not be mangled, or it may not exist at all. (In C++, `main` cannot be called, which eliminates the major reason for a function name) – MSalters Jan 05 '15 at 10:33
  • 2
    Actually, the results for C++ and C *are not* different, as pointed out here — `main` is not subject to name mangling (so it seems) in C++, whether or not it is a function. – Geoff Nixon Jan 05 '15 at 10:35
  • This is all irrelevant. The question is about what is valid and what is not. Compilers, linkers, name mangling and everything else is probably very interesting, but answers to questions of validity lie in language standards. – n. m. could be an AI Jan 05 '15 at 19:14
  • 4
    @n.m. I think that your interpretation of the question is too narrow: in addition to asking the question in the title of the post, OP clearly seeks an explanation of why his program compiled in the first place ("my compiler seems to think so, even though I don't") as well as a suggestion of why it could be useful to define `main` as anything other than a function. The answer offers an explanation for both parts. – Sergey Kalinichenko Jan 05 '15 at 19:21
  • I think `y.x` in the `gcc` command should be `y.c`, unless I'm missing something. – Ryan Dougherty Jan 06 '15 at 01:00
  • @Ryan That's right, thanks! I mistyped it (which isn't surprising given that it was around 5 AM). Thanks! – Sergey Kalinichenko Jan 06 '15 at 01:14
  • 1
    That the symbol main isn't subject to name mangling is irrelevant. There is no mention of name mangling in the C++ standard. Name mangling is an implementation issue. – David Hammen Jan 06 '15 at 14:16
  • @UldisK: I would expect that many C++ implementations have a library function called "main", which performs any C++-specific actions that need to happen before the first statement of the user-supplied "main", and then calls the user-supplied "main" function, which would be given some other name. Some implementations may instead have the compiler generate a call to a startup routine before executing any of the user-supplied code within the "main" function, but there's not really any reason to activate the stack frame for the user-supplied "main" function before such code executes. – supercat Sep 29 '15 at 19:21
30

main isn't a reserved word; it's just a predefined identifier (like cin, endl, npos...), so you could declare a variable called main, initialize it and then print out its value.

Of course:

  • the warning is useful since this is quite error prone;
  • you can have a source file without the main() function (libraries).

EDIT

Some references:

  • main is not a reserved word (C++11):

    The function main shall not be used within a program. The linkage (3.5) of main is implementation-defined. A program that defines main as deleted or that declares main to be inline, static, or constexpr is ill-formed. The name main is not otherwise reserved. [ Example: member functions, classes and enumerations can be called main, as can entities in other namespaces. — end example ]

    C++11 - [basic.start.main] 3.6.1.3

    [2.11/3] [...] some identifiers are reserved for use by C++ implementations and standard libraries (17.6.4.3.2) and shall not be used otherwise; no diagnostic is required.

    [17.6.4.3.2/1] Certain sets of names and function signatures are always reserved to the implementation:

    • Each name that contains a double underscore __ or begins with an underscore followed by an uppercase letter (2.12) is reserved to the implementation for any use.
    • Each name that begins with an underscore is reserved to the implementation for use as a name in the global namespace.
  • Reserved words in programming languages.

    Reserved words may not be redefined by the programmer, but predefineds can often be overridden in some capacity. This is the case of main: there are scopes in which a declaration using that identifier redefines its meaning.

manlio
  • I guess I'm rather beguiled by why, given that it **is** so error prone, this is only a warning (not an error), and why it is a warning only when compiled as C. Sure, you can compile without a `main()` function, but you can't link it as a program. What's happening here is that a "valid" program is being linked without a `main()`, just a `main`. – Geoff Nixon Jan 05 '15 at 10:20
  • 7
    `cin` and `endl` aren't in the default namespace -- they're in the `std` namespace. `npos` is a member of `std::basic_string`. – AnotherParker Jan 05 '15 at 20:57
  • 1
    `main` *is* reserved as a global name. None of the other things you mentioned, nor `main`, are predefined. – Potatoswatter Jan 07 '15 at 06:36
  • 1
    See C++14 §3.6.1 and C11 §5.1.2.2.1 for limitations on what `main` is allowed to be. C++ says "An implementation shall not predefine the main function" and C says "The implementation declares no prototype for this function." – Potatoswatter Jan 07 '15 at 10:49
  • @manlio: please clarify what you are quoting from. For plain C the citations are wrong, so I guess it is one of the C++ standards, isn't it? – dhein Jan 07 '15 at 11:42
19

Is int main; a valid C/C++ program?

It is not entirely clear what a C/C++ program is.

Is int main; a valid C program?

Yes. A freestanding implementation is allowed to accept such a program. main doesn't have to have any special meaning in a freestanding environment.

It is not valid in a hosted environment.

Is int main; a valid C++ program?

Ditto.

Why does it crash?

The program doesn't have to make sense in your environment. In a freestanding environment the program startup and termination, and the meaning of main, are implementation-defined.

Why does the compiler warn me?

The compiler may warn you about whatever it pleases, as long as it doesn't reject conforming programs. On the other hand, a warning is all that's required to diagnose a non-conforming program. Since this translation unit cannot be a part of a valid hosted program, a diagnostic message is justified.

Is gcc a freestanding environment, or is it a hosted environment?

Yes.

gcc documents the -ffreestanding compilation flag. Add it, and the warning goes away. You may want to use it when building e.g. kernels or firmware.

g++ doesn't document such a flag. Supplying it seems to have no effect on this program. It is probably safe to assume that the environment provided by g++ is hosted. The absence of a diagnostic in this case is a bug.

n. m. could be an AI
17

It is a warning because it is not technically disallowed. The startup code will use the symbol location of "main" and jump to it with the three standard arguments (argc, argv and envp). It does not, and at link time cannot, check that it's actually a function, nor even that it takes those arguments. This is also why int main(int argc, char **argv) works: the compiler doesn't know about the envp argument, it just happens not to be used, and the calling convention is caller-cleanup.

As a joke, you could do something like

int main = 0xCBCBCBCB;

on an x86 machine and, ignoring warnings and similar stuff, it will not just compile but actually work too.

Somebody used a technique similar to this to write an executable (sort of) that runs on multiple architectures directly - http://phrack.org/issues/57/17.html#article . It was also used to win the IOCCC - http://www.ioccc.org/1984/mullender/mullender.c .

dascandy
9

Is it a valid program?

No.

It is not a program as it has no executable parts.

Is it valid to compile?

Yes.

Can it be used with a valid program?

Yes.

Not all compiled code is required to be executable to be valid. Examples are static and dynamic libraries.

You have effectively built an object file. It is not a valid executable; however, another program could link to the object main in the resultant file by loading it at runtime.

Should this be an error?

Traditionally, C++ allows the user to do things that may seem like they have no valid use but that fit with the syntax of the language.

I mean that sure, this could be reclassified as an error, but why? What purpose would that serve that the warning does not?

So long as there is a theoretical possibility of this functionality being used in actual code, it is very unlikely that having a non-function object called main would result in an error according to the language.

Michael Gazonda
  • It creates an externally visible symbol named `main`. How can a valid program, which must have an externally visible *function* named `main`, link to it? – Keith Thompson Jan 05 '15 at 18:30
  • @KeithThompson Load at runtime. Will clarify. – Michael Gazonda Jan 05 '15 at 18:38
  • It can because it is not able to tell the difference between symbol types. Linking works just fine - execution (except in the carefully crafted case) does not. – Chris Stratton Jan 05 '15 at 18:43
  • 1
    @ChrisStratton: I think Keith's argument is that linking fails because the symbol is multiply defined... because the "valid program" would not be a valid program unless it defines a `main` function. – Ben Voigt Jan 05 '15 at 21:34
  • @BenVoigt But if it appears in a library, then linking won't (and probably cannot) fail, because at program link-time, the `int main;` definition won't be visible. –  Jan 06 '15 at 10:55
  • @hvd: Why wouldn't it be? It doesn't have static visibility, neither explicitly nor implicitly (as `const int main;` would in C++). The linker might be able to discard the entire compilation unit containing the definition, especially if no other symbols are provided by it. – Ben Voigt Jan 06 '15 at 16:55
  • @BenVoigt At the very least, it certainly won't be visible at link-time if you don't specify the library at link-time. Like this answer hints at, `dlopen` or equivalent can load libraries at run-time. –  Jan 06 '15 at 18:09
  • @hvd: Dynamic loading is an interesting discussion, but there's no way to interpret Keith's comment to mean anything except linking. "if you don't specify the library at link-time" does not qualify under "How can... link to it?" Moreover, this question asks about programs, not libraries. – Ben Voigt Jan 06 '15 at 18:44
  • Visibility is different than existing. Existing makes linking to something possible, visibility makes it easy. – Michael Gazonda Jan 06 '15 at 18:45
  • @BenVoigt Yes, I think Keith Thompson missed the fact that this answer did mention loading libraries at run-time, so that his comment isn't really appropriate on this answer. But you may have a point when you say that the question is about programs. –  Jan 06 '15 at 18:55
  • "it has no executable parts." -You mean it has no executable parts inside the main function, not in general – Khaled.K Jan 07 '15 at 05:00
6

I would like to add to the answers already given by citing the actual language standards.

Is ‘int main;’ a valid C program?

Short answer (my opinion): only if your implementation uses a "freestanding execution environment".

All following quotes from C11

5. Environment

An implementation translates C source files and executes C programs in two data-processing-system environments, which will be called the translation environment and the execution environment [...]

5.1.2 Execution environments

Two execution environments are defined: freestanding and hosted. In both cases, program startup occurs when a designated C function is called by the execution environment.

5.1.2.1 Freestanding environment

In a freestanding environment (in which C program execution may take place without any benefit of an operating system), the name and type of the function called at program startup are implementation-defined.

5.1.2.2 Hosted environment

A hosted environment need not be provided, but shall conform to the following specifications if present.

5.1.2.2.1 Program startup

The function called at program startup is named main. [...] It shall be defined with a return type of int and with no parameters [...] or with two parameters [...] or equivalent or in some other implementation-defined manner.

From these, the following is observed:

  • A C11 program can have a freestanding or a hosted execution environment and be valid.
  • If it has a freestanding one, there need not exist a main function.
  • Otherwise, there must be one with a return value of type int.

In a freestanding execution environment, I would argue that it is a valid program that does not allow startup to happen, because there is no function present for that as required in 5.1.2. In a hosted execution environment, while your code introduces an object named main, it cannot provide a return value, so I would argue that it is not a valid program in this sense. One could also argue, as before, that if the program is not meant to be executed (one might want to provide data only, for example), then it simply does not allow execution.

Is ‘int main;’ a valid C++ program?

Short answer (my opinion): only if your implementation uses a "freestanding execution environment".

Quote from C++14

3.6.1 Main function

A program shall contain a global function called main, which is the designated start of the program. It is implementation-defined whether a program in a freestanding environment is required to define a main function. [...] It shall have a return type of type int, but otherwise its type is implementation-defined. [...] The name main is not otherwise reserved.

Here, as opposed to the C11 standard, fewer restrictions apply to the freestanding execution environment, as no startup function is mentioned at all, while for a hosted execution environment the case is pretty much the same as for C11.

Again, I would argue that for the hosted case, your code is not a valid C++14 program, but I am sure that it is for the freestanding case.

Since my answer only considers the execution environment, I think the answer by dasblinkenlicht comes into play, as name mangling, which occurs in the translation environment, happens beforehand. Here, I am not so sure that the quotes above are observed so strictly.

Ingo Schalk-Schupp
4

My point, I suppose, is that I really think this should be an error in a hosted environment, eh?

The error is yours. You didn't specify a function named main that returns an int and tried to use your program in a hosted environment.

Suppose you have a compilation unit that defines a global variable named main. This might well be legal in a freestanding environment because what constitutes a program is left up to the implementation in freestanding environments.

Suppose you have another compilation unit that defines a global function named main that returns an int and takes no arguments. This is exactly what a program in a hosted environment needs.

Everything's fine if you only use the first compilation unit in a freestanding environment and only use the second in a hosted environment. What if you use both in one program? In C++, you've violated the one definition rule. That is undefined behavior. In C, you've violated the rule that dictates that all references to a single symbol must be consistent; if they aren't it's undefined behavior. Undefined behavior is a "get out of jail, free!" card to developers of an implementation. Anything an implementation does in response to undefined behavior is compliant with the standard. The implementation doesn't have to warn about, let alone detect, undefined behavior.

What if you use only one of those compilation units, but you use the wrong one (which is what you did)? In C, the situation is clear-cut. Failure to define the function main in one of the two standard forms in a hosted environment is undefined behavior. Suppose you didn't define main at all. The compiler/linker doesn't have to say a thing about this error. That they do complain is a nicety on their behalf. That the C program compiled and linked without error is your fault, not the compiler's.

It's a bit less clear in C++ because failure to define the function main in a hosted environment is an error rather than undefined behavior (in other words, it must be diagnosed). However, the one definition rule in C++ means linkers can be rather dumb. The linker's job is resolving external references, and thanks to the one definition rule, the linker doesn't have to know what those symbols mean. You provided a symbol named main, the linker is expecting to see a symbol named main, so all is good as far as the linker is concerned.

David Hammen
4

For C, so far, it is implementation-defined behavior.

As ISO/IEC 9899 says:

5.1.2.2.1 Program startup

1 The function called at program startup is named main. The implementation declares no prototype for this function. It shall be defined with a return type of int and with no parameters:

int main(void) { /* ... */ }

or with two parameters (referred to here as argc and argv, though any names may be used, as they are local to the function in which they are declared):

int main(int argc, char *argv[]) { /* ... */ }

or equivalent; or in some other implementation-defined manner.

dhein
3

No, this is not a valid program.

For C++ this was recently explicitly made ill-formed by defect report 1886: Language linkage for main() which says:

There does not appear to be any restriction on giving main() an explicit language linkage, but it should probably be either ill-formed or conditionally-supported.

and part of the resolution included the following change:

A program that declares a variable main at global scope or that declares the name main with C language linkage (in any namespace) is ill-formed.

We can find this wording in the latest C++ draft standard, N4527, which is the C++1z draft.

The latest versions of both clang and gcc now make this an error (see it live):

error: main cannot be declared as global variable
int main;
^

Before this defect report, it was undefined behavior, which does not require a diagnostic. Ill-formed code, on the other hand, requires a diagnostic; the compiler can make this either a warning or an error.

Shafik Yaghmour
  • Thanks for the update! Great to see this is now getting picked up with compiler diagnostics. However, I must say I find the changes in the C++ standard perplexing. (For background, see comments above regarding name mangling of `main()`.) I understand the rationale for disallowing `main()` from having an explicit linkage-specification, but I *don't* understand it mandating that `main()` have _C++ linkage_. Of course the standard does not directly address how to handle ABI linkage/name mangling, but in practice (say, with Itanium ABI) this would mangle `main()` to `_Z4mainv`. What am I missing? – Geoff Nixon Oct 11 '15 at 15:56
  • I think [supercat's comment](http://stackoverflow.com/questions/27777071/is-int-main-a-valid-c-c-program/32970511#comment53537723_27777332) covers that. If the implementation is doing its own thing before calling the user defined main then it could easily choose to call a mangled name instead. – Shafik Yaghmour Oct 23 '15 at 19:08