I recently saw something curious. In the HHVM source code, the very first 3 lines of the main() function read as follows:

if (!argc) {
  return 0;
}

It's a bit silly, but still, I just can't help wondering... why return 0!? It's not that I think there's some correct way to handle this, but returning 0, usually associated with success, seems particularly inappropriate.

Besides not crashing, is there ever a case where there's an appropriate response to argc being 0? (Or even less than 0?) Does it ever matter?

The only way I know of to end up in a case with argc of 0 is with exec() and friends. If for some reason that does happen, it's almost certainly a bug in the caller and the callee can't do much about it.
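
For illustration, here is a minimal sketch of how a caller could hand a program an empty argument vector on a POSIX system. The helper name `spawn_with_empty_argv` is hypothetical and not taken from HHVM. Whether the callee really sees `argc == 0` depends on the platform: as far as I know, Linux kernels from 5.18 onward substitute a single empty string for an empty `argv`, so the callee there would see `argc == 1` instead.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: run the program at `path` with an empty argument
 * vector (no argv[0] at all). Returns the child's exit status, 127 if
 * execve itself failed, or -1 if fork/waitpid failed or the child was
 * terminated by a signal. */
int spawn_with_empty_argv(const char *path)
{
    char *const empty_argv[] = { NULL };  /* only the terminating null */

    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        execve(path, empty_argv, environ);
        _exit(127);  /* reached only when execve fails */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A callee that does `return 0;` on `argc == 0` would make this helper report success, which is exactly the behavior being questioned here.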

(tagged as C and C++ because I expect that the answer is the same for the two languages)

Edit: To try and make the question less vague and philosophical, I'll offer an alternative.

if (!argc) {
  puts("Error: argc == 0");
  return 1;
}

The key points are that the error is reported and a non-zero value is returned. It's extremely unlikely this would ever be needed, but if it were, you might as well try to indicate the error. On the other hand, if the detected error is as serious as argc being 0, maybe there's a reason it would be bad to try to access stdout or the C standard library.
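
A slightly more conventional variant of the same idea, as a sketch only: report on stderr instead of stdout so the message does not mix with regular output, and return `EXIT_FAILURE`. The function name `check_argc` is made up for this example, and the sketch assumes stdio is still usable at this point.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns EXIT_SUCCESS when argc is usable,
 * otherwise prints a diagnostic to stderr and returns EXIT_FAILURE. */
int check_argc(int argc)
{
    if (argc < 1) {
        fputs("Error: empty argument vector (argc < 1)\n", stderr);
        return EXIT_FAILURE;
    }
    return EXIT_SUCCESS;
}
```

main() would call it first and return its result whenever it is not `EXIT_SUCCESS`.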

Praxeolitic
  • `argc` is required to be nonnegative by standard. – Columbo Jan 18 '15 at 20:34
  • Well, why not. It was asked to do nothing, it successfully did nothing. – Hans Passant Jan 18 '15 at 20:45
  • @HansPassant What makes you say it was asked to do nothing? It strikes me as it was asked a nonsensical question. – Praxeolitic Jan 18 '15 at 20:53
  • So, could we say then that this check doesn't make any sense, unless you are debugging the OS? – Chiel Jan 18 '15 at 21:13
  • I think the check itself does make sense. If you do get `argc` equal to 0 something went wrong but the program shouldn't crash so the only thing to do is exit. What gets me is the `return 0;`. – Praxeolitic Jan 18 '15 at 21:18
  • For completeness' sake: Which platforms does HHVM target? As a real suggestion: Are there any `exec*` calls? – mafso Jan 18 '15 at 21:31
  • @mafso HHVM primarily targets Linux and also targets OS X. There are plenty of occurrences of `exec*(` in the source being called from C++, PHP, and shell script. I haven't yet found one that would cause `argc == 0`. – Praxeolitic Jan 18 '15 at 22:03
  • There is an interesting difference between C and C++: In C, you are allowed to call `main` from your program. So maybe they call it somewhere and pass it 0 as the first parameter as some mechanism to break the recursion. In that case, silently returning success seems appropriate. I'm not familiar with the software you are talking about so I don't know whether it actually does this but you might want to look for it. – 5gon12eder Jan 19 '15 at 00:22
  • @Columbo `0` is non-negative – M.M Jan 19 '15 at 07:34
  • @MattMcNabb ... have you read the question? I'm referring to "(Or even less than 0?)". – Columbo Jan 19 '15 at 10:49

2 Answers

Note that the C11 standard explicitly allows for argc == 0:

5.1.2.2.1 Program startup

¶1 The function called at program startup is named main. The implementation declares no prototype for this function. It shall be defined with a return type of int and with no parameters:

int main(void) { /* ... */ }

or with two parameters (referred to here as argc and argv, though any names may be used, as they are local to the function in which they are declared):

int main(int argc, char *argv[]) { /* ... */ }

or equivalent;10) or in some other implementation-defined manner.

¶2 If they are declared, the parameters to the main function shall obey the following constraints:

  • The value of argc shall be nonnegative.
  • argv[argc] shall be a null pointer.
  • If the value of argc is greater than zero, the array members argv[0] through argv[argc-1] inclusive shall contain pointers to strings, which are given implementation-defined values by the host environment prior to program startup. The intent is to supply to the program information determined prior to program startup from elsewhere in the hosted environment. If the host environment is not capable of supplying strings with letters in both uppercase and lowercase, the implementation shall ensure that the strings are received in lowercase.
  • If the value of argc is greater than zero, the string pointed to by argv[0] represents the program name; argv[0][0] shall be the null character if the program name is not available from the host environment. If the value of argc is greater than one, the strings pointed to by argv[1] through argv[argc-1] represent the program parameters.
  • The parameters argc and argv and the strings pointed to by the argv array shall be modifiable by the program, and retain their last-stored values between program startup and program termination.

10) Thus, int can be replaced by a typedef name defined as int, or the type of argv can be written as char ** argv, and so on.

The two bullet points saying 'if the value of argc is greater than zero' clearly allow argc == 0, though it would be unusual for that to be the case.
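
As a rough illustration, the checkable parts of those constraints can be written as a predicate. This is only a sketch: the name `argv_obeys_constraints` is invented here, and a conforming implementation already guarantees these properties, so a real program would rarely need such a check.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: checks the constraints from 5.1.2.2.1 that can be
 * tested at run time: argc is nonnegative, argv[0] .. argv[argc-1] are
 * non-null, and argv[argc] is a null pointer. */
bool argv_obeys_constraints(int argc, char **argv)
{
    if (argc < 0 || argv == NULL) {
        return false;
    }
    for (int i = 0; i < argc; ++i) {
        if (argv[i] == NULL) {
            return false;  /* a string pointer is missing */
        }
    }
    return argv[argc] == NULL;  /* the vector must be null-terminated */
}
```

Note that `argc == 0` with `argv[0] == NULL` passes the check, which matches the reading above.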

In theory, therefore, a program could take precautions against it. Since argv[argc] must be a null pointer, argv[0] == 0 when argc == 0, so the program is fine as long as it doesn't dereference that null pointer. Many programs, perhaps even most, take no such precautions; they assume that argv[0] will not be a null pointer.
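
One way such a precaution might look, as a minimal sketch: the helper `program_name` and its `"(unknown)"` fallback string are assumptions made for this example, not something the standard prescribes.

```c
#include <stddef.h>

/* Hypothetical helper: returns a printable program name without ever
 * dereferencing a null argv[0]. Falls back to a placeholder when argc is
 * zero or when the name is unavailable (argv[0][0] is the null character). */
const char *program_name(int argc, char **argv)
{
    if (argc > 0 && argv != NULL && argv[0] != NULL && argv[0][0] != '\0') {
        return argv[0];
    }
    return "(unknown)";
}
```

Code that would otherwise use argv[0] directly, for example in a usage message, can call this instead.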

Jonathan Leffler

I think it's just a case of defensive programming, given the following snippet in HHVM's source code (file hphp/hhvm/main.cpp):

int main(int argc, char** argv) {
  if (!argc) {
    return 0;
  }
  HPHP::checkBuild();
  int len = strlen(argv[0]);

In the line:

int len = strlen(argv[0]);

If argc == 0, then argv[0] == NULL, so strlen(argv[0]) would dereference a null pointer. That is undefined behavior and in practice usually a segmentation fault.

I'm not familiar with HHVM, but the authors may simply be allowing for the possibility that some caller runs the program with no arguments at all (not even the program name).

whoan
  • Which makes me wonder: Why should `if(!argc) return;` be any better than the segmentation fault? In either case the process is stopped immediately, and the caller has to handle the failure. In the case of a segmentation fault, we get a non-zero exit status returned to the caller, which is better than pretending everything's fine by `return 0;`. Is this just this religious thing about avoiding segfaults? – cmaster - reinstate monica Oct 19 '15 at 22:33
  • @cmaster `strlen(argv[0])` causes undefined behaviour. This has many worse potential consequences than a segmentation fault. – M.M Oct 19 '15 at 22:44
  • @M.M Well, if `argv[0]` is `NULL` (which it must be since `argv` is null-terminated), the only way `strlen(argv[0])` can do something other than segfaulting is if you are on a braindead system where someone has actually mapped something to address zero. And even if there *were* a page and some data to read, all `strlen()` implementations would still either segfault or just return the number of nonzero bytes at address zero. That a language lawyer says that something is undefined behavior does not mean that the behavior is unpredictable. – cmaster - reinstate monica Oct 19 '15 at 23:22
  • @cmaster anything can happen when it is undefined behaviour. Relying on a particular compiler's happenstance treatment of UB is just setting yourself up for trouble for no reason. One example: there are well known instances of bugs where programmers expected a segfault but the optimizer removed the line entirely. – M.M Oct 19 '15 at 23:30
  • @M.M The optimizer can't remove anything in `strlen(argv[0])` because the call has to work in the non-null case. The UB enters the program via input data. The segfault happens within the implementation of `strlen()`. – cmaster - reinstate monica Oct 20 '15 at 08:24