92

Given the standard definition for the main program:

int main(int argc, char *argv[]) {
   ...
}

Under which circumstances can argc be zero on a POSIX system?

dbush
  • 205,898
  • 23
  • 218
  • 273
Sylvain Leroux
  • 50,096
  • 7
  • 103
  • 125
  • 13
    I'd try `int execv(const char *path, char *const argv[]);` with an `argv` containing only a `NULL` pointer to find out ;) –  Apr 13 '18 at 12:46
  • 4
    the C standard allows the `argc` to be `< 1`, if it is exactly `0` I found it here https://port70.net/~nsz/c/c11/n1570.html#5.1.2.2.1p2 – Achal Apr 13 '18 at 12:49
  • 1
    As for the language-lawyer tag: As already mentioned, the **language** allows it. POSIX isn't exactly a language definition, but I doubt it explicitly forbids it, see my first comment for what I *assume* might lead to `argc == 0`. –  Apr 13 '18 at 12:52
  • 13
    Given the widely used `if (argc < x) { fprintf(stderr, "Usage: %s ...", argv[0]); }`, I definitely see the **practical** relevance of this question :) –  Apr 13 '18 at 12:59
  • 3
    would be an effective dupe of https://stackoverflow.com/questions/8113786/executing-a-process-with-argc-0 if not for the precise stipulation that it has to be about POSIX – underscore_d Apr 13 '18 at 13:01
  • 2
    Another dup candidate could be: [When can argv\[0\] have null?](https://stackoverflow.com/q/2794150/1275169) – P.P Apr 13 '18 at 13:05
  • 1
    The irony is that the idiom can be reasonably adjusted to handle the problematic case by adding as little as six characters as follows: `if (argc < x) { fprintf(stderr, "Usage: %s ...", argv[0] || ""); }`. (The choice of blank string vs some other default, of course, is a matter of taste.) – mtraceur Apr 13 '18 at 19:47
  • 5
    @mtraceur `argv[0] || ""` is unfortunately an int... –  Apr 13 '18 at 20:44
  • 1
    @WumpusQ.Wumbley Good catch, you're absolutely right. This is what I get for firing code snippets "from the hip", so to speak. – mtraceur Apr 13 '18 at 21:54
  • Thank you all for those great comments and answers. @FelixPalmen This exactly why I was asking that! – Sylvain Leroux Apr 13 '18 at 21:58
  • 1
    @FelixPalmen At least with glibc that would actually still work and produce output `Usage: (null) ...`. But nevertheless it's certainly going to come as a surprise to many developers. And maybe somewhere there is a vulnerable SUID executable that can be exploited by having `argv[0]` be `NULL`. – kasperd Apr 14 '18 at 15:52
  • @FelixPalmen POSIX states *The argument arg0 should point to a filename that is associated with the process being started by one of the exec functions.* (Note: arg0 is the second argument of execl, same statement for execv). – Jean-Baptiste Yunès Apr 16 '18 at 08:27
  • @Jean-BaptisteYunès sure, but I don't read a "*should*" as a hard requirement ;) –  Apr 16 '18 at 08:57
  • @FelixPalmen Right! POSIX says **should** *For an implementation that conforms to POSIX.1-2017, describes a feature or behavior that is recommended but not mandatory. An application should not rely on the existence of the feature or behavior. An application that relies on such a feature or behavior cannot be assured to be portable across conforming implementations. For an application, describes a feature or behavior that is recommended programming practice for optimum portability.* – Jean-Baptiste Yunès Apr 16 '18 at 10:15
  • CVE-2021-4034 (polkit local privilege escalation) is one example of this going wrong. I too wasn't aware that argc could be 0. I thought it was a "shall" instead of a should. – ListsOfArrays Jan 26 '22 at 21:50

6 Answers

107

Yes, it is possible. If you call your program as follows:

execl("./myprog", NULL, (char *)NULL);

Or alternately:

char *args[] = { NULL };
execv("./myprog", args);

Then in "myprog", argc will be 0.

The standard also specifically allows for a 0 argc as noted in section 5.1.2.2.1 regarding program startup in a hosted environment:

1 The function called at program startup is named main. The implementation declares no prototype for this function. It shall be defined with a return type of int and with no parameters:

int main(void) { /* ... */ } 

or with two parameters (referred to here as argc and argv, though any names may be used, as they are local to the function in which they are declared):

int main(int argc, char *argv[]) { /* ... */ }

or equivalent; or in some other implementation-defined manner.

2 If they are declared, the parameters to the main function shall obey the following constraints:

  • The value of argc shall be nonnegative.
  • argv[argc] shall be a null pointer.

...

Note also that this means that if argc is 0 then argv[0] is guaranteed to be NULL. Passing a null pointer to printf for a %s specifier is not defined by the standard, however. Many implementations print "(null)" in this case, but that is not guaranteed.
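The "(null)" behaviour above can be sidestepped by never handing a possibly-null `argv[0]` to `printf`. The following is a minimal sketch, not part of the original answer; `program_name` and its `fallback` parameter are hypothetical names chosen for illustration.

```c
#include <stddef.h>

/*
 * Return a printable program name.
 * Falls back to `fallback` when argc is 0, argv is NULL, or argv[0] is
 * NULL or an empty string (the C standard lets argv[0][0] be '\0' when
 * the name is not available). `program_name` is an illustrative helper.
 */
const char *program_name(int argc, char *const argv[], const char *fallback)
{
    if (argc < 1 || argv == NULL || argv[0] == NULL || argv[0][0] == '\0')
        return fallback;
    return argv[0];
}
```

A usage message could then be printed with `fprintf(stderr, "Usage: %s ...\n", program_name(argc, argv, "myprog"));`, which stays well defined even when `argc == 0`.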

dbush
  • 205,898
  • 23
  • 218
  • 273
  • 8
    @SylvainLeroux You (and others reading this answer later) may find it interesting to know, in addition to the documentation about the general case, that as of a couple of years ago, *one* system that aims to be POSIX-compliant *has* in fact made it impossible to invoke programs with `argc == 0`: OpenBSD. – mtraceur Apr 13 '18 at 22:28
  • @mtraceur, out of curiosity: what does OpenBSD do, if you call `exec` with only the path and no arguments for the program? (fail? copy the path to the zeroth argument?) – ilkkachu Apr 13 '18 at 22:48
  • 1
    @ilkkachu On OpenBSD `execve` fails with the error `EINVAL` if you invoke it with an empty `argv`. It's easy to miss in the [`execve` manual page](https://man.openbsd.org/execve), as the error behavior for that condition is only mentioned in the list of possible errors at the bottom. – mtraceur Apr 14 '18 at 00:25
  • @mtraceur Interesting, as FreeBSD *does* allow it. – dbush Apr 14 '18 at 20:17
  • @mtraceur Very interesting. I wonder if there are some proprietary POSIX-certified *nix systems having made the same choice as OpenBSD. – Sylvain Leroux Apr 14 '18 at 23:12
  • Note that the `NULL` **must be** cast; it will have wrong representation if `NULL` isn't explicitly cast as `void *`. – Antti Haapala -- Слава Україні Apr 15 '18 at 10:14
  • @AnttiHaapala Correct. Updated to reflect. – dbush Apr 15 '18 at 13:27
  • @AnttiHaapala Could you clarify what you mean? I thought at first you meant that there needs to be an explicit cast between `char *` and/or `void *`, but as far as I know [`char *` and `void *` have the same representation](https://stackoverflow.com/a/39872598/4372452), and [the `NULL` macro is typically either a plain `0` or a `0` cast to `void *`, per definition of "null pointer constant"](https://stackoverflow.com/a/18781550/4372452) so I think I am not understanding what needs to be cast to what here? – mtraceur Apr 16 '18 at 18:43
  • 1
    @mtraceur the NULL in varargs part. 0 would be an int not pointer. – Antti Haapala -- Слава Україні Apr 17 '18 at 01:08
  • Shouldn't the `execl` example be `execl("./myprog", (char *)NULL);`? – Barmar Apr 17 '18 at 20:25
  • @Barmar No, the signature is `int execl(const char *path, const char *arg, ...);`. The second argument is a known `const char *` so it doesn't need to be casted, and the absence of any additional parameters gives the warning "not enough variable arguments to fit a sentinel". – dbush Apr 17 '18 at 20:31
28

To add to the other answers, there is nothing in C (POSIX or not) preventing main() from being called as a function within the program.

#include <stdio.h>

int main(int argc, char *argv[]) {
    if (argc == 0) printf("Hey!\n");
    else main(0, NULL);

    return 0;
}
Possum
  • 537
  • 3
  • 11
  • 12
    ... Huh. I thought that was specifically prohibited, but it turns out only C++ does that and C is fine with it. – Daniel H Apr 13 '18 at 22:33
24

Yes, it can be zero, meaning that argv[0] == NULL.

It's a convention that argv[0] is the name of the program. You can get argc == 0 if you launch the binary yourself, for example with one of the exec family of functions, and pass no arguments. You can even pass a string that has nothing to do with the program name. That's why using argv[0] to get the name of the program is not entirely reliable.

Usually, the shell where you type your command line always adds the program name as the first argument, but again, that's a convention. If argv[0] == "--help" and you use getopt to parse options, you will not detect it because optind is initialized to 1, but you can set optind to 0, call getopt, and the "help" long option will show up.

Long story short: it's perfectly possible to have argc == 0 (argv[0] is not really special by itself). It happens when the launcher passes no arguments at all.
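To see this from the launcher's side, here is a minimal sketch in C, assuming a POSIX system; `run_with_argv` is a hypothetical helper, not a standard function. It passes the caller's argument vector to `execv` untouched, so whatever is in `argv[0]`, or nothing at all, is what the child sees.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run `path` with exactly the argument vector given and return the
 * child's exit status, or -1 on error. `run_with_argv` is a hypothetical
 * helper written for this illustration.
 */
int run_with_argv(const char *path, char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;              /* fork failed */

    if (pid == 0) {
        execv(path, argv);      /* returns only if the exec failed */
        _exit(127);             /* conventional "could not exec" status */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling it with `{ "not-the-real-name", NULL }` gives the child that string as `argv[0]`, and calling it with `{ NULL }` asks for `argc == 0`. Some systems refuse the latter (a comment on another answer reports that OpenBSD's `execve` fails with `EINVAL`), in which case this sketch returns 127.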

bobsburner
  • 117
  • 8
Tom's
  • 2,448
  • 10
  • 22
  • "but you can set optind to 0" -- Not portably. POSIX says "If the application sets *optind* to zero before calling *getopt* (), the behavior is unspecified." –  Apr 13 '18 at 13:48
  • Hu ? Where did you read this ? I usually read the manual from linux.die and there is nothing specified about the undefined behavior, so if I could update my documentation it would be great. – Tom's Apr 13 '18 at 13:55
  • That's because most system's manpages describe the behaviour on that system, so don't necessarily document what is and isn't system-specific. I forget whether it's actually allowed to link to the official spec directly, so I'm only going to link to http://www.unix.org/version4/, with a note that direct links are readily available from many others. –  Apr 13 '18 at 14:16
  • The standard says: "If the value of argc is greater than zero, the string pointed to by argv[0] represents the program name; argv[0][0] shall be the null character if the program name is not available from the host environment." This doesn't sound as if `argv[0] =="--help"` could be allowed. – Gerhardh Apr 13 '18 at 14:20
  • 7
    Well, '--help' is a legit filename, so, why argv[0] can't be '--help' ? – Tom's Apr 13 '18 at 14:24
  • As I understand it in the answer above, `"--help"` is meant to be the first option, not the name of the executable. If the executable is `"--help"`, then of course it is possible. – Gerhardh Apr 13 '18 at 14:44
  • ... There is no link between argv[0] and the real name of the executable. Your program can be "1.bin", if I execute it with execve and providing "2.exe", what do you think will happen ? Nothing. argv[0] will be "2.exe" and your program will be running like a charm. It's just a convenient convention that argv[0] should be the program name. So if the name of the program is "--help", and after the first getopt I reset optind to 0, then "--help" will appear. I just point out that yes, argv[0] is commonly the program name (even Unix C function like getopt take that in count), but – Tom's Apr 13 '18 at 14:59
  • you have no guarantee that it will always be the case. – Tom's Apr 13 '18 at 14:59
  • 2
    The wording "it's a convention" is slightly inaccurate. A program not following that "convention" is not conforming to POSIX (read the "Rationale" part of the documentation). So while it is very possible to do something different, doing so is non-conforming. Whether or not this is something one should expect to happen (because it is possible) or not is up to debate. I'm in the "don't support broken software" team. – Damon Apr 13 '18 at 15:20
  • I agree on this point. It's better to conform than doing some hacking-like thing that will only lead to confusion. But well, that could exist, so be carefull. – Tom's Apr 13 '18 at 15:24
  • Postel's Law applies -- "be conservative in what you do, be liberal in what you accept from others". – Charles Duffy Apr 13 '18 at 16:47
  • 6
    @CharlesDuffy: That's how we got stuck with tag soup (HTML quirks mode), email spam (SMTP abuse), and the regular internet blackholing of random countries (bad BGP pushes). If somebody is violating the standard, and there is no historical reason for allowing them to do so, then you should fail. Loudly. – Kevin Apr 13 '18 at 17:37
  • 1
    @Damon The rationale is not normative and it would not be the first time the rationale makes a claim that is not backed up by the normative text. I am unable to find any normative text that says passing a null pointer for the program name is non-conforming. –  Apr 13 '18 at 22:18
  • @hvd: Try the standard as e.g. linked in Stargateur's answer. It begins wtih saying _"arg0 should point to a filename string"_ which admittedly says _should_, not _must_. But even _should_ suggests that a null pointer isn't as-intended. Three screen pages further down, under "Rationale", you can read _"The wording, in particular the use of the word should, requires a Strictly Conforming POSIX Application to pass at least one argument"_. What one makes of that is up to one's personal opinion/religion. My personal POV is: "Don't _encourage_ non-conformers" (which you do by supporting them). – Damon Apr 14 '18 at 10:01
  • @Damon (Apologies for incorrect earlier comment, deleted and fixed.) I linked to the standard in an earlier comment. "should" is defined as "For an application, describes a feature or behavior that is recommended programming practice for optimum portability." It is not a requirement, it just means it's a recommendation to support some not-100%-POSIX implementations. And the rationale, again, is not normative. –  Apr 14 '18 at 10:10
  • @CharlesDuffy: Standards that need to wrangle dialects that predate them should strongly encourage documents/programs to specify what dialect they're using; programs should be liberal in the *range of dialects* they accept, but encouraged to flag things that are inconsistent with a declared dialect. Unfortunately, the authors of the C Standard have yet to recognize this principle. – supercat May 02 '18 at 15:31
17

Early proposals required that the value of argc passed to main() be "one or greater". This was driven by the same requirement in drafts of the ISO C standard. In fact, historical implementations have passed a value of zero when no arguments are supplied to the caller of the exec functions. This requirement was removed from the ISO C standard and subsequently removed from this volume of POSIX.1-2017 as well. The wording, in particular the use of the word should, requires a Strictly Conforming POSIX Application to pass at least one argument to the exec function, thus guaranteeing that argc be one or greater when invoked by such an application. In fact, this is good practice, since many existing applications reference argv[0] without first checking the value of argc.

The requirement on a Strictly Conforming POSIX Application also states that the value passed as the first argument be a filename string associated with the process being started. Although some existing applications pass a pathname rather than a filename string in some circumstances, a filename string is more generally useful, since the common usage of argv[0] is in printing diagnostics. In some cases the filename passed is not the actual filename of the file; for example, many implementations of the login utility use a convention of prefixing a <hyphen-minus> ( '-' ) to the actual filename, which indicates to the command interpreter being invoked that it is a "login shell".

Also, note that the test and [ utilities require specific strings for the argv[0] argument to have deterministic behavior across all implementations.

Source


Can argc be zero on a POSIX system?

Yes but it would not be strictly conforming to POSIX.

Stargateur
  • 24,473
  • 8
  • 65
  • 91
  • 2
    Does "POSIX system" imply "every single program on the system is a 'Strictly Conforming POSIX Application"? Genuine question - I don't know the pedantic semantics of the standard on this matter. I do know that I can take a system that is certified as a POSIX System and trivially write/run a program on it that isn't a "Strictly Conforming POSIX Application", and I'm not sure if it's the intended meaning of the POSIX standard to say that the system becomes non-conforming the moment any non-strictly-conforming third-party application is installed on it? – mtraceur Apr 13 '18 at 19:26
  • @mtraceur Isn't that a completely separate question? Not sure if you expect it to be answered in this comment thread or if you want the extra information added to the answer. Both seems weird. – pipe Apr 13 '18 at 20:24
  • @mtraceur if your program run on linux, you can expect that you program will run under Posix standard, the responsibility is going to the program that execute you to be Posix conforming, aka your shell in general. That say, nothing prevent a shell to not being posix conforming but I highly doubt that someone will use it ;), if your distribution of linux allow non conforming Posix program in their packets, is not the problem of linux. – Stargateur Apr 13 '18 at 20:37
  • @pipe I'm asking for clarification on that point because I think it would determine whether the implicit reasoning in this answer applies to the question, or is misleading: The answer "Yes but it would not be strictly conforming to POSIX" seems to me to logically imply that the mere presence (or perhaps execution) of any program that can run `char *nothing = { 0 }; execve(*prog, nothing, nothing)` on a system would lead to POSIX declaring that system "not strictly conforming". This strikes me as unlikely to be what the POSIX standard intended? – mtraceur Apr 13 '18 at 21:33
  • @Stargateur Can we really expect that? Let's say we write a C program that is a "strictly conforming POSIX application", and run it on Linux, or FreeBSD, or one of the most recent HP-UX or Solaris systems (on any other Unix that's POSIX-conformant). So we've got our strictly conforming POSIX application on a POSIX system: Now some other developer comes along, writes a small program that calls our program with the `execve` system call - but that developer makes a small mistake and calls us without any arguments: Is it fair to say the *system* as a whole is no longer "strictly conforming POSIX"? – mtraceur Apr 13 '18 at 21:43
  • @Stargateur You have that if statement backwards. Plenty of programs run on Linux, but not more general POSIX systems; almost any distribution will include kernel modules, or programs that use capabilities or process namespaces or something. Almost anything POSIX should run on Linux (although there are some incompatibilities); the reverse is not true. – Daniel H Apr 13 '18 at 22:29
4

Whenever you run an executable from the shell, as in ./a.out, it gets one argument: the program name. But it is possible to run a program with argc equal to zero on Linux by executing it from another program that calls execv with an empty argument list.

For example:

#include <unistd.h>

int main(void) {
    char *buf[] = { NULL };
    execv("./exe", buf); /* exe is the binary to run with zero arguments */
    return 1; /* reached only if execv fails */
}
Achal
  • 11,821
  • 2
  • 15
  • 37
4

TL;DR: Yes, argv[0] can be NULL, but not for any good/sane reason I know of. However, there are reasons not to care if argv[0] is NULL, and to specifically allow the process to crash if it is.


Yes, argv[0] can be NULL on a POSIX system, if and only if it was executed without any arguments.

The more interesting practical question is, should your program care.

The answer to that is "No, your program can assume argv[0] is not NULL", because some system utilities (command-line utilities) either do not work or work in a non-deterministic fashion, when argv[0] == NULL, but more importantly, there is no good reason (other than stupidity or nefarious purposes) why any process would do that. (I'm not sure if the standard usage of getopt() also fails then — but I would not expect it to work.)

A lot of code, and indeed most examples and utilities I write, begin with the equivalent of

int main(int argc, char *argv[])
{
    if (argc < 2 || !strcmp(argv[1], "-h") || !strcmp(argv[1], "--help")) {
        printf("Usage: %s [ -h | --help ]\n", argv[0]);
        /* ... print usage ... */
        return EXIT_SUCCESS;
    }

and this is reasonable and acceptable, because there is no good reason for a process to exec another process without providing at least the command path being executed, i.e. execlp(cmd, cmd, (char *)NULL) rather than execlp(cmd, (char *)NULL).

(However, I can think of a few nefarious reasons, like exploiting timing race windows related to pipe or socket commands: an evil process sends an evil request via an established Unix domain socket, and then immediately replaces itself with an authorized victim command (run without any arguments, to ensure minimum start-up time), so that when the service getting the request checks the peer credentials, it sees the victim command, instead of the original evil process. It is, in my opinion, best for such victim commands to crash hard and fast (SIGSEGV, by dereferencing a NULL pointer), rather than try and behave "nicely", giving the evil process a larger time window.)

In other words, while it is possible for a process to replace itself with another but without any arguments causing argc to be zero, such behaviour is unreasonable, in the strict sense that there is no known non-nefarious reason to do so.

Because of this, and the fact that I love making life hard for nefarious and uncaring programmers and their programs, I personally will never add the trivial check, similar to

static int usage(const char *argv0)
{
    /* Print usage using argv0 as if it was argv[0] */
    return EXIT_SUCCESS;
}

int main(int argc, char *argv[])
{
    if (argc < 1)
        return usage("(this)");
    if (argc < 2 || !strcmp(argv[1], "-h") || !strcmp(argv[1], "--help"))
        return usage(argv[0]);

    /* argv[0] and argv[1] are non-NULL, argc >= 2 */

except if requested by someone with a particular existing use case in mind. And even then I'd be a bit suspicious, wanting to verify the use case myself first.
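For readers who want the early exit argued for here without depending on what a null dereference happens to do (the comments on this answer point out that it is undefined behavior in C), a minimal sketch follows. It is an assumption-laden illustration, not the answer author's code; `have_argv0` and `require_argv0` are hypothetical names.

```c
#include <stdio.h>
#include <stdlib.h>

/* Return 1 if argv[0] can safely be used as a string, 0 otherwise. */
int have_argv0(int argc, char *const argv[])
{
    return argc >= 1 && argv != NULL && argv[0] != NULL;
}

/* Terminate at once, in a defined way, if there is no argv[0]. */
void require_argv0(int argc, char *const argv[])
{
    if (!have_argv0(argc, argv)) {
        fputs("fatal: started with an empty argument vector\n", stderr);
        abort();    /* raises SIGABRT; no reliance on a null dereference */
    }
}
```

Calling `require_argv0(argc, argv);` as the first statement of `main` keeps the process short-lived in the zero-argument case while leaving the rest of the program free to use `argv[0]` unchecked.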

Nominal Animal
  • 38,216
  • 5
  • 59
  • 86
  • Do you consider checking programs to see if they were written with careful attention to corner-cases a good/sane/non-nefarious usecase? Because I regularly invoke every new program I use with zero arguments for a quick estimate of what sort of behavior I can expect from the program in other "shouldn't happen / shouldn't be done" circumstances. I even have a [command-line tool](https://github.com/mentalisttraceur/exec0/) so I can do it quickly and easily without having to write code every time for it (though admittedly I wrote the tool more for controlling the zeroth argument in general). – mtraceur Apr 13 '18 at 19:22
  • @mtraceur: In general yes, but in this particular case no. This is simply because I think the no-argument case has no proper use cases (that I know of), but at least one nefarious use case; and the nefarious use case is best defeated by the program dying as early as possible (and due to SIGSEGV is a good way of doing that). I do not think that checking the no-argument case is any kind of an indication of whether the program otherwise behaves sanely in *"should not occur"* -type of situations. – Nominal Animal Apr 13 '18 at 20:58
  • @mtraceur: One example is `write()` (low-level C function; a syscall wrapper in Linux) returning a negative value other than -1. It simply should not occur, so most programmers do not test for it. Yet, it has occurred in Linux, due to a kernel filesystem bug, for writes over 2 GiB. (Currently, writes are limited to just under 2 GiB at the syscall level, to avoid similar bugs in other filesystem drivers.) My own code is the only one I have seen that checks that case too. (I treat it as an `EIO` error.) Yet, as I said, I have decided not to check for `argc == 0` in my code. – Nominal Animal Apr 13 '18 at 21:01
  • I've decided to +1 your answer, our discussion aside, because I think your perspective and approach is valuable. I wish more people thought so critically and carefully about whether to handle (or not to handle) the full range of corner-cases that might happen. – mtraceur Apr 13 '18 at 22:03
  • Correct me if I'm wrong: Does the C standard actually guarantee any sane behavior, let alone dying-as-early-as-possible, if you dereference `argv[0]` when `argc == 0`? Isn't that dereferencing a null pointer, which in turn means undefined behavior? Are you comfortable trusting that every implementation of C your code will be ran against will perform something reasonable? – mtraceur Apr 13 '18 at 22:11
  • As for whether checking the no-argument case correlates to whether the program otherwise behaves sanely in corner-cases: Do you think *most people* who don't check for `argc == 0` do it from your position, or as part of a general pattern of overlooking corner-cases? Or yet another angle: a coder who approaches things with as much careful consideration as you do may or may not decide to handle that case, but a careless coder will almost certainly not handle it - which type of coder is more common? Given just your code without your rationale, I cannot know which it was. – mtraceur Apr 13 '18 at 22:23
  • @mtraceur: re. dereferencing NULL: no, no guarantees, only practical observation. Yes, I do expect all sane C implementations to abort a process when a NULL pointer is dereferenced. No, I do not think `argc == 0` check is indicative of anything, that's why I don't think it is useful as an indicator/test at all. A better test would be to interpose `read()` and `write()`, and see what happens when real-life errors (short reads/writes, as might happen with pipes or sockets) occur; or when `malloc()`/`realloc()` fails with NULL. Those might be indicative of the general developer attitude. – Nominal Animal Apr 14 '18 at 02:21
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/168962/discussion-between-mtraceur-and-nominal-animal). – mtraceur Apr 14 '18 at 04:22