30

Is it safe to use the argv pointer globally? Or is there a circumstance where it may become invalid?

i.e: Is this code safe?

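#include <stdio.h>

char **largs;   /* global copy of main()'s argv pointer */
void function_1()
{
    /* largs[1] is a null pointer if the program was started without arguments */
    printf("Argument 1: %s\r\n",largs[1]);
}
int main(int argc,char **argv)
{
    largs = argv;
    function_1();
    return 1;
}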
Yu Hao
Steve Dell
  • 2
    No, the code is not safe. The program may be called without arguments, so only argv[0] or largs[0] would contain a string. –  Jul 24 '15 at 12:39
  • 7
    @manni66: Good observation, though not really the point of the question: but you could fix it with something like `if (largs[0] && largs[1])`. – Nate Eldredge Jul 24 '15 at 13:47
  • 1
    @NateEldredge: What makes you think that `largs[1]` would be null if `argc==1` ? The OP just needs to make a global `int largsc;` and assign it `largsc = argc` in main() just like he did for `largs`. Then `function_1()` has access to not only the program's `argv`, but also its `argc`. – phonetagger Jul 24 '15 at 20:03
  • 4
    @phonetagger: C89 2.1.2.2, that's what. (Or C99 5.1.2.2.1.) "`argv[argc]` shall be a null pointer." Of course, you can also do as you suggest, but it's often convenient to take advantage of the fact that `argv` is guaranteed to be null-terminated. – Nate Eldredge Jul 24 '15 at 20:33
  • 1
    @NateEldredge: Ouch, that hurt! :) I never knew that. That might have simplified a few things over the years if I had. Thanks for the info. – phonetagger Jul 24 '15 at 20:36
  • @phonetagger: You're welcome! I didn't mean to sound snappy, I guess I should have added a smiley :-) I have occasionally seen people (including me) write things like `char **p; for (p = argv; *p; p++) process_arg(*p);`. On a side note, it appears that ISO C permits `argc` to be 0, so unless you are in an environment that guarantees that won't happen, you can't even be sure that `argv[0]` contains a string. – Nate Eldredge Jul 24 '15 at 20:42
  • It is perhaps a useful historical note that `getopt` and its many descendants and distant relations take `argc` and `argv` as arguments, so there is a long-standing tradition of accessing the strings pointed to by `argv` in functions other than `main`. – dmckee --- ex-moderator kitten Jul 24 '15 at 22:49
  • On some old non-Unix systems `argv[0]` contains "". I believe Nate Eldredge meant `char **p; for (p = argv+1; *p; p++) process_arg(*p);` – Mikkel Christiansen Jul 25 '15 at 09:27
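
To illustrate the null-termination guarantee discussed in these comments (`argv[argc]` is a null pointer), here is a minimal sketch of walking the array without `argc`. The helper name `count_user_args` is hypothetical and not part of the question; the check on `argv[0]` is there only because ISO C permits `argc` to be 0.

```c
#include <stddef.h>

/* Hypothetical helper: counts the arguments that follow the program name
   by walking argv until the terminating null pointer (argv[argc]). */
size_t count_user_args(char **argv)
{
    size_t n = 0;

    /* argc may be 0, in which case argv[0] is already the null pointer. */
    if (argv == NULL || argv[0] == NULL)
        return 0;

    /* Start at argv + 1 to skip the program name. */
    for (char **p = argv + 1; *p != NULL; p++)
        n++;

    return n;
}
```

Called as `count_user_args(argv)` from `main`, or through a stored copy such as `largs`, this should agree with `argc - 1` whenever `argc` is at least 1.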

6 Answers

41

Yes, it is safe to use argv globally; you can use it as you would use any char** in your program. The C99 standard even specifies this:

The parameters argc and argv and the strings pointed to by the argv array shall be modifiable by the program, and retain their last-stored values between program startup and program termination.

The C++ standard does not have a similar paragraph, but the same is implicit with no rule to the contrary.

Note that C++ and C are different languages and you should just choose one to ask your question about.
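
As a rough sketch of how the global copy could be used defensively, the following stores `argc` alongside `argv` and checks the index before dereferencing. The names `g_argc`, `g_argv`, `save_args` and `get_arg` are assumptions made for illustration; they do not come from the question or from any library.

```c
#include <stddef.h>

/* Hypothetical globals holding copies of main()'s parameters. */
static int    g_argc;
static char **g_argv;

/* Intended to be called once at the top of main(). */
void save_args(int argc, char **argv)
{
    g_argc = argc;
    g_argv = argv;
}

/* Returns argument i, or NULL if it was not supplied
   (or if save_args() has not been called yet). */
const char *get_arg(int i)
{
    if (g_argv == NULL || i < 0 || i >= g_argc)
        return NULL;
    return g_argv[i];
}
```

With this in place, `main` would call `save_args(argc, argv)` first, and a function such as `function_1` could test `get_arg(1)` against `NULL` before printing it.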

haccks
TartanLlama
  • 4
    It's likely that the OP was intending to know the behavior in both C and C++, not one or the other, considering that both languages use `argv` and he explicitly tagged `c` and `c++` when he first posted the question :) – Chris Cirefice Jul 24 '15 at 14:48
  • 1
    To be pedantic, `argv` is really in existence only during the execution of `main()`, so there might be some problems accessing `argv` in something like an `atexit()` handler. – Lee Daniel Crocker Jul 24 '15 at 19:40
  • 1
    @LeeDanielCrocker Is that true? The spec can be tricky, but it looked to me like TartanLlama quoted the spec saying it remained until termination. This would suggest to me that the memory is available through termination, and main() is simply passed a pointer to it. – Cort Ammon Jul 25 '15 at 03:15
19

It should be safe as long as the main() function has not returned. A few examples of things that can happen after main() returns are:

  1. Destructors of global and static variables
  2. Threads running longer than main()

Stored argv must not be used in those.

The reference doesn't say anything which would give a reason to assume that the lifetimes of the arguments to main() function differ from the general rules for lifetimes of function arguments.

As long as the argv pointer itself is valid, the C/C++ runtime must guarantee that the content to which this pointer points is valid (unless, of course, something corrupts memory). So it must be safe to use the pointer and the content for that long. After main() returns, there is no reason for the C/C++ runtime to keep the content valid either. So the above reasoning applies to both the pointer and the content it points to.

Serge Rogatch
  • Could you give references to back this up? – ComicSansMS Jul 24 '15 at 08:39
  • @ComicSansMS, no, this is just common-sense based on general rules for local variable & parameter lifetimes: till the exit of the function to which they belong. – Serge Rogatch Jul 24 '15 at 08:44
  • @ComicSansMS, I've added a link to the reference and edited the answer. The point is that the reference for `main()` doesn't say anything special about the lifetimes of its arguments, so it is reasonable to assume that the arguments of `main()` follow the general rules for lifetimes. – Serge Rogatch Jul 24 '15 at 08:57
  • Yes, it's pretty hard to come up with definitive sources on this issue, that's why I asked. Thanks for trying though, much obliged. – ComicSansMS Jul 24 '15 at 09:00
  • 3
    To some extent your argument seems to apply mainly to the lifetime of the `argv` pointer itself (i.e. if you'd stored a pointer _to_ it from `&argv`) rather than the contents. – Random832 Jul 24 '15 at 12:35
  • One solution to test this theory is to actually spin off a thread that lasts longer than `main()`, then try to access `argv` through a stored pointer, as @Random832 suggested. Maybe it's undefined behavior because `argv` contents are obliterated when `main()` exits (unlikely). Maybe it's perfectly fine because `argv` stays in memory for a while. Maybe you'll get garbage after 30 seconds when that pointer is overwritten by some other process because `main()` no longer controls that segment of memory. Who knows! – Chris Cirefice Jul 24 '15 at 14:53
  • @ChrisCirefice Since threads are not in standard C, it depends on the platform-specific standards or documentation of the particular threading library. AIUI on most systems, the thread will not continue executing at all. – Random832 Jul 24 '15 at 14:55
  • @Random832 That was my bad, I meant *process* :P as far as I remember from my OS course, `fork()/execve()` will have a different memory space entirely, so holding a pointer to the parent's `argv` would likely cause a segmentation fault(?). Now I'm not so sure, but it's one of those cases that would need to be actually demonstrated to know for sure since it's not defined in the standard. – Chris Cirefice Jul 24 '15 at 15:00
  • @Random832, the point is that as long as `argv` pointer itself is valid, the C/C++ runtime must guarantee that the content to which this pointer points is valid (of course, unless something corrupts memory). So it must be safe to use the pointer and the content that long. After `main()` returns, there is no reason for the C/C++ runtime to keep the content valid either. So my reasoning applies to both the pointer and the content it points to. – Serge Rogatch Jul 25 '15 at 08:13
  • @SergeRogatch You can't just assume that - the reasons the runtime _might_ have to keep the content valid are, after all, precisely the ones that this question is about: someone may have made a copy of it that is referenced by atexit handlers, signal handlers, or other threads that execute after main() has returned. And what the standard says is that the values are valid until "program termination". – Random832 Jul 26 '15 at 19:25
8

is it safe to use the argv pointer globally

This requires a little more clarification. As the C11 spec says in chapter §5.1.2.2.1, Program startup

[..].. with two parameters (referred to here as argc and argv, though any names may be used, as they are local to the function in which they are declared)

That means the variables themselves have a scope limited to main(). They are not global themselves.

Again the standard says,

The parameters argc and argv and the strings pointed to by the argv array shall be modifiable by the program, and retain their last-stored values between program startup and program termination.

That means the lifetime of these variables extends until main() finishes execution.

So, if you're using a global variable to hold the value from main(), you can safely use those globals to access the same in any other function(s).

Sourav Ghosh
  • 1
    Does "program termination" consist of the end of main? Or can it be after that? – Yakk - Adam Nevraumont Jul 24 '15 at 20:18
  • @Yakk It is after that, see `atexit()` – Mikkel Christiansen Jul 25 '15 at 09:45
  • @mikk sure, that seems reasonable: but it disagrees with the above answer, *and* I have not seen a standard quote making it explicit. Is there one? Random in another answer provided a casual comment by someone who may be on the committee about intent. – Yakk - Adam Nevraumont Jul 25 '15 at 12:16
  • @Yakk main is called from the initialization part of the program. The top element of the stack is the return address for main to return to the routine that handles the rest of the dying. The string array pointed to by argv still exists to the very end. However, if the pointers in it have been malloc'ed anew, you could face problems. – Mikkel Christiansen Jul 25 '15 at 12:43
5

This thread on the comp.lang.c.moderated newsgroup discusses the issue at length from a C standard point of view, including a citation showing that the contents of the argv arrays (rather than the argv pointer itself, if e.g. you took an address &argv and stored that) last until "program termination", and an assertion that it is "obvious" that program termination has not yet occurred in a way relevant to this while the atexit-registered functions are executing:

The program has not terminated during atexit-registered function processing. We thought that was pretty obvious.

(I'm not sure who Douglas A. Gwyn is, but it sounds like "we" means the C standard committee?)

The context of the discussion was mainly concerning storing a copy of the pointer argv[0] (program name).

The relevant C standard text is 5.1.2.2.1:

The parameters argc and argv and the strings pointed to by the argv array shall be modifiable by the program, and retain their last-stored values between program startup and program termination.

Of course, C++ is not C, and its standard may subtly differ on this issue or not address it.
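
For code that wants to report the program name from an atexit-registered function without relying on either reading of "program termination", one cautious option is to copy `argv[0]` into static storage first. This is only a sketch under that assumption: the names `g_progname`, `remember_progname`, `progname` and `install_exit_report`, the 256-byte buffer, and the "(unknown)" fallback are all made up for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical static buffer: it has static storage duration, so it does
   not depend on how long the argv strings themselves remain valid. */
static char g_progname[256] = "(unknown)";

/* Copies argv[0] if it is present and non-empty; otherwise keeps the
   previous value. snprintf truncates overly long names safely. */
void remember_progname(char **argv)
{
    if (argv != NULL && argv[0] != NULL && argv[0][0] != '\0')
        snprintf(g_progname, sizeof g_progname, "%s", argv[0]);
}

const char *progname(void)
{
    return g_progname;
}

/* Runs during exit processing and reads only the static copy. */
static void report_exit(void)
{
    fprintf(stderr, "%s: exiting\n", g_progname);
}

/* Returns 0 on success, like atexit() itself. */
int install_exit_report(void)
{
    return atexit(report_exit);
}
```

In `main`, this would be used as `remember_progname(argv); install_exit_report();`. The copy costs a few hundred bytes and sidesteps the lifetime question entirely.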

Random832
3

You can either pass them as parameters or store them in global variables. As long as you don't return from main and try to process them in an atexit handler or the destructor of a variable at global scope, they still exist and will be fine to access from any scope.

ameyCU
2

Yes, it is safe for either C or C++, because no thread keeps running after main has finished.

wangli_64604
  • What guarantees this? C doesn't have e.g. any destructors to kill all the additional threads on exit from `main`. – Ruslan May 08 '18 at 09:27