C11 5.1.2.2.1/2 says:

The parameters argc and argv and the strings pointed to by the argv array shall be modifiable by the program, and retain their last-stored values between program startup and program termination.

My interpretation of this is that it specifies:

int main(int argc, char **argv)
{
    if ( argv[0][0] )
        argv[0][0] = 'x';   // OK

    char *q;
    argv = &q;              // OK
}

however it does not say anything about:

int main(int argc, char **argv)
{
    char buf[20];
    argv[0] = buf;
}

Is argv[0] = buf; permitted?

I can see (at least) two possible arguments:

  • The above quote deliberately mentioned argv and argv[x][y] but not argv[x], so the intent was that it is not modifiable
  • argv is a pointer to non-const objects, so in the absence of specific wording to the contrary, we should assume they are modifiable objects.
M.M
  • Related: [this answer](http://stackoverflow.com/a/25126140/1505939) which asserts that `argv[n]` is non-modifiable but does not provide any justification for that assertion – M.M Sep 09 '14 at 05:49
  • 4
    When it says that `argv` can be modified I take that to mean that `argv[n]` can be. (*Of course* the `argv` pointer *itself* can be modified, it's just a function-local argument.) – cdhowie Sep 09 '14 at 05:52
  • 5
    @cdhowie Why do they bother to say that `argc` can be modified, that same "of course" applies to it? I think they're just talking about the local variable, not the pointers. – Barmar Sep 09 '14 at 05:53
  • 4
    @cdhowie But it says "*the strings* pointed to". `argv[n]` is not a string; it's a pointer that points to the first character of a string. – M.M Sep 09 '14 at 05:54
  • @Barmar Fair point, though it seems silly to list `argc` and `argv` explicitly at all... – cdhowie Sep 09 '14 at 05:54
  • @MattMcNabb I am aware of that verbiage; it's also not the part I'm talking about. Nevertheless, my interpretation is not infallible. – cdhowie Sep 09 '14 at 05:55
  • @cdhowie I certainly agree that it seems silly to list `argc` and `argv` explicitly , but maybe there's historical justification that I'm not aware of. Maybe it's relevant in the case of `main` being called recursively. – M.M Sep 09 '14 at 05:56
  • 2
    [Here's a legitimate reason why a particular implementation might require that `argv[n]` not be modified.](http://coding.derkeiler.com/Archive/C_CPP/comp.lang.c/2006-06/msg03551.html) –  Sep 09 '14 at 07:52
  • @hvd good point, write it as an answer perhaps. I was active on other clc threads at the same time so I should have remembered! – M.M Sep 09 '14 at 08:09
  • @MattMcNabb I posted it as a comment because it doesn't answer the question of what the standard actually requires. I don't know if the hypothetical implementation in that message would conform to the (intended) requirements of the standard. –  Sep 09 '14 at 09:35
  • @hvd if we accept that the standard doesn't clearly specify what it requires, then the discussion has to move onto what a likely rationale would be and what situations it's supposed to cover. – M.M Sep 09 '14 at 10:00

5 Answers

9

IMO, code like argv[1] = "123"; is UB (using the original argv).


"The parameters argc and argv and the strings pointed to by the argv array shall be modifiable by the program, and retain their last-stored values between program startup and program termination." C11dr & C17dr1 §5.1.2.2.1 2

Recall that const came into C many years after C's creation.

This is much like how char *s = "abc"; remains valid even though it arguably should be const char *s = "abc";. const could not be made mandatory, or too much existing code would have been broken by its introduction.

Likewise, even if argv today should be considered char * const argv[] or some other signature with const, the lack of const in the char *argv[] does not completely specify the const-ness needs of the argv, argv[], or argv[][]. The const-ness needs would need to be driven by the spec.

From my reading, since the spec is silent on the issue, yet goes into depth about other assignments of main()'s argv = and argv[i][j] = , it is UB.

"Undefined behavior is otherwise indicated in this International Standard by the words ‘‘undefined behavior’’ or by the omission of any explicit definition of behavior" §4 2


[edit]:

main() is a very special function in C. What is allowable in other functions may or may not be allowed in main(). The C spec spells out properties of its parameters that, given the signature int argc, char *argv[], should not need spelling out. main(), unlike other functions in C, can have an alternate signature int main(void) and potentially others. main() is not reentrant. As the C spec goes out of its way to detail what can be modified (argc, argv, argv[][]), it is reasonable to question whether argv[] is modifiable, given that the spec never asserts that it is.

Given the specialty of main() and the omission of specifying that argv[] as modifiable, a conservative programmer would treat this greyness as UB, pending future C spec clarification.
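One conservative way to sidestep the question is to leave the original pointer array alone and work on a copy. The following is a minimal sketch, not something the standard prescribes; the helper name copy_arg_pointers is hypothetical, and it copies only the pointers, so the strings are still shared with the original argv.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper (an assumption for illustration): returns a
   heap-allocated copy of the argv pointer array, including the
   terminating null pointer. Only the pointers are copied; the strings
   are shared with the original argv. Returns NULL if allocation fails. */
char **copy_arg_pointers(int argc, char **argv)
{
    assert(argc >= 0 && argv != NULL);

    char **copy = malloc(((size_t)argc + 1) * sizeof *copy);
    if (copy == NULL)
        return NULL;

    for (int i = 0; i < argc; i++)
        copy[i] = argv[i];
    copy[argc] = NULL;  /* mirror the "argv[argc] shall be a null pointer" rule */
    return copy;
}
```

Inside main(), a caller could write char **args = copy_arg_pointers(argc, argv); and then reassign args[1] freely (after checking that args is not NULL and argc > 1), since args is ordinary program-owned storage.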


If argv[i] is modifiable on a given platform, certainly the range of i should not exceed argc-1.

As "argv[argc] shall be a null pointer", assigning argv[argc] to something other than NULL appears to be a violation.

Although the strings are modifiable, code should not exceed the original string's length.

char *newstr = "abc";
if (strlen(newstr) <= strlen(argv[1])) 
  strcpy(argv[1], newstr);
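The same length check can be wrapped in a small function. This is only a sketch under stated assumptions: the name overwrite_in_place is hypothetical, the two strings must not overlap, and the caller is still responsible for checking argc > 1 before touching argv[1].

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper (an assumption for illustration): overwrite the
   string at dst with src only if src fits in the space that dst
   currently occupies. Returns true on success; returns false and leaves
   dst untouched otherwise. dst and src must not overlap. */
bool overwrite_in_place(char *dst, const char *src)
{
    assert(dst != NULL && src != NULL);

    if (strlen(src) > strlen(dst))
        return false;
    strcpy(dst, src);   /* src, including its terminating '\0', fits in dst */
    return true;
}
```

A call such as overwrite_in_place(argv[1], "abc") then either replaces the argument text or reports that it would not fit, without ever writing past the original string.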

1 No change with C17/18. Since that version was meant to clarify many things, it reinforces the view that this spec is adequate as written and is not missing an "argv array elements shall be modifiable".

Toby Speight
chux - Reinstate Monica
  • 1
    "omission of any explicit definition of behaviour" - well, we would say `int bar = 7;` is defined despite the fact that the text "int bar = 7;" does not appear in the standard. – M.M Sep 09 '14 at 20:49
  • @Matt McNabb Normally one would readily agree with your comment's line of reasoning were it not for C11dr §5.1.2.2.1 2. The spec goes out of the way to say some things are modifiable even though `char *argv[]` _does not need that affirmation_. Since the spec specifically indicates modifiability for `argv`, and `argv[][]`, (2 out of 3) but not `argv[]`, that absence is significant - hence UB by omission. IMO, it is a weakness for the spec to imply modifiability for `argv[][]` and to be silent on `argv[]`. – chux - Reinstate Monica Sep 10 '14 at 00:38
  • @Matt McNabb One use of a modifiable `argv[][]` is calling `strtok()` directly on `argv[]`, since command line arguments often need to be parsed. – chux - Reinstate Monica Sep 10 '14 at 18:30
  • I disagree (see my answer). Can you point to any actual implementation in which `argv[n]=buf` behaves badly? Or any implementation which warns against doing it? – david.pfx Sep 12 '14 at 10:13
  • @david.pfx Advertising your answer here? - hmmm. No, I cannot come up with an example, just as the spec does not specifically say it can be done - else OP would have seen that and there would have been no question. We are left with a greyness in the spec and are trying to call it black or white. In the end, until the spec is made more clear, compiler makers and C programmers will do their best. Certainly you do not see a greyness, but if you did, how would you code: the way you think it should be or conservatively? For me, I would code conservatively because of the possibility of UB. – chux - Reinstate Monica Sep 12 '14 at 14:38
  • In this I'm a provider rather than consumer. I would write my compiler to allow it. If I wanted to use it I would read the compiler source to find if it's safe. Sometimes for things like this there is no other way. – david.pfx Sep 12 '14 at 14:54
  • @david.pfx Good point on what a compiler (provider) should do! If I (consumer) wanted to use a dubious spec defined ability though, I would change my needs (not use it), even if the compiler I was presently using did allow it. This avoids getting caught by compiler providers that employ [embrace, enhance, extinguish](http://en.wikipedia.org/wiki/Embrace,_extend_and_extinguish). – chux - Reinstate Monica Sep 12 '14 at 15:09
  • I won't labour the point, but sometimes requirements push you into places you'd rather not be, particularly if you only own/control part of the code. I've relied on worse things than this. – david.pfx Sep 12 '14 at 23:23
  • I came to the opposite conclusion wrt. basically the same question ([here](http://stackoverflow.com/a/35109299/1475978)), because 1) leaving `const` out of definition of `argv` does not seem like an accidental omission to me, and 2) the standard consistently refers to `argv` as an array, and modifying an array ⇒ modifying its members, as the array itself cannot be modified as a whole; it is the adjustment done to array parameters (turning them into qualified pointers to the first element) that allows modifying `argv` itself. Is that explicit definition enough? I think so, but I could be wrong. – Nominal Animal Jan 31 '16 at 03:36
  • @Nominal Animal Note: `int main(argc, argv)` as `int argc` and `char *argv[]` had lots of history before `const` was invented. – chux - Reinstate Monica Jan 31 '16 at 04:12
  • Quite true, chux. Like I mentioned in that other question, there is quite a lot of code (in olden Unix-land, and in GNU land) that expects both the pointers in the `argv` array as well as the contents of the pointed-to strings to be modifiable. I see the standard as *implying* the pointers are modifiable, and that the omission of an unambiguous explicit statement (in C99 5.1.2.2.1p2) declaring it so, is just an oversight. On the other hand, I cannot see any fault in your answer here either; I just cannot help but draw different conclusions. – Nominal Animal Jan 31 '16 at 04:47
  • @Nominal Animal Suggest adding your answer here along with your additional insights. – chux - Reinstate Monica Jan 31 '16 at 05:26
3

The argv array is not required to be modifiable (but may be in actual implementations). This wording is intentional, as reaffirmed in WG14 document N849 from 1998:

https://www.open-std.org/jtc1/sc22/wg14/www/docs/n849.htm

PUBLIC REVIEW COMMENT #7

[...]

Comment 10.
Category: Request for information/clarification
Committee Draft subsection: 5.1.2.2.1
Title: argc/argv modifiability, part 2
Detailed description:

Is the array of pointers to char pointed to by argv modifiable?

Response Code: Q
    This is currently implictly unspecified and the committee 
    has chosen to leave it that way.

In addition, two separate proposals were made to, respectively, change and augment the wording. Both were rejected. Interested readers can find them by searching for "argv".


Trivia: an example in the Kernighan and Ritchie The C Programming Language, 2nd ed, ("K&R2") runs afoul of this. It is on page 117, and the relevant line of code is:

while (c = *++argv[0])

which increments the pointer inside the argument vector itself to step through the characters of the string.
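For comparison, the same scan can be done with a local cursor, so that no element of the argument vector is ever written. This is a sketch only: the function name scan_flags, the struct, and the flag letters are illustrative assumptions, loosely modelled on that kind of option loop rather than copied from the book.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative result type (an assumption, not from K&R). */
struct flags {
    int except;     /* set when 'x' is seen */
    int number;     /* set when 'n' is seen */
    int unknown;    /* set when any other letter is seen */
};

/* Walks the option letters of one argument such as "-xn" using a local
   pointer, leaving the caller's argv[i] pointer unchanged. */
void scan_flags(const char *arg, struct flags *out)
{
    assert(arg != NULL && arg[0] == '-' && out != NULL);

    const char *p = arg;        /* local cursor instead of ++argv[0] */
    int c;
    while ((c = *++p) != '\0') {
        switch (c) {
        case 'x': out->except = 1;  break;
        case 'n': out->number = 1;  break;
        default:  out->unknown = 1; break;
        }
    }
}
```

Calling scan_flags(argv[i], &f) for each argument that starts with '-' reads the characters without storing anything into the argv array, so the code does not depend on that array being modifiable.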

Kaz
  • This looks to me like this answer should be accepted as the right one. – Lover of Structure Apr 27 '23 at 16:23
  • @Kaz, thank you for posting the informative link. It is telling that the requests were rejected, yet I do not see this as clearly supporting "argv array is not required to be modifiable", only that the ambiguity remains, possibly dependent on the spec as a whole. – chux - Reinstate Monica Jun 11 '23 at 05:17
0

argc is just an int and is modifiable without any restriction.

argv is a modifiable char **. It means that argv[i] = x is valid. But it does not say anything about argv[i] being itself modifiable. So argv[i][j] = c leads to undefined behaviour.

The getopt function (specified by POSIX, not by the C standard library) does modify argc and argv but never modifies the actual char arrays.

Serge Ballesta
  • 6
    Saying "`x` is modifiable" means that you are allowed to change `x`; you seem to be interpreting it as meaning "you are allowed to change what `x` points to, if `x` is a pointer, or if `x` is not a pointer then you are allowed to change `x`" – M.M Sep 09 '14 at 20:45
  • 1
    As quoted in the question, “the strings pointed to by the `argv` array shall be modifiable by the program”, and `argv[i][j]` is part of a string pointed to by `argv`, so it is modifiable by the program. – Eric Postpischil Sep 09 '22 at 13:31
-1

The answer is that argv is an array and yes, its contents are modifiable.

The key is earlier in the same section:

If the value of argc is greater than zero, the array members argv[0] through argv[argc-1] inclusive shall contain pointers to strings, which are given implementation-defined values by the host environment prior to program startup.

From this it is clear that argv is to be thought of as an array of a specific length (argc). Then *argv is a pointer to that array, having decayed to a pointer.

Read in this context, the statement to the effect that 'argv shall be modifiable...and retain its contents' clearly intends that the contents of that array be modifiable.

I concede that there remains some ambiguity in the wording, particularly as to what might happen if argc is modified.


Just to be clear, what I'm saying is that I read this language as meaning:

[the contents of the] argv [array] and the strings pointed to by the argv array shall be modifiable...

So both the pointers in the array and the strings they point to are in read-write memory, no harm is done by changing them, and both preserve their values for the life of the program. I would expect that this behaviour is to be found in all the major C/C++ runtime library implementations, without exception. This is not UB.

The ambiguity is the mention of argc. It is hard to imagine any purpose or any implementation in which the value of argc (which appears to be simply a local function parameter) could not be changed, so why mention it? The standard clearly states that a function can change the value of its parameters, so why treat argc specially in this respect? It is this unexpected mention of argc that has triggered this concern about argv, which would otherwise pass without remark. Delete argc from the sentence and the ambiguity disappears.

david.pfx
  • `argv` is a pointer which points to the first element of an array, that's what `char **argv` or `char *argv[]` means in a parameter list. `argv[0]` is a member of that array, but `argv` itself isn't. In any case , the relevant text is "the strings pointed to by the argv array", so it doesn't matter whether or not argv is called an array; as the thing being defined is *the strings pointed to*, not the array. – M.M Sep 11 '14 at 20:10
  • @MattMcNabb: I still read "the argv [array] ... shall be modifiable" regardless of how you slice and dice it. Seems pretty obvious to me, hard to see why anyone would disagree. – david.pfx Sep 12 '14 at 10:11
  • 1
    I don't see your justification for ignoring "the strings pointed to by". another example of the same language construct: "the man shot by the policeman died". Would you say the policeman died? – M.M Sep 12 '14 at 11:08
  • I see no ambiguity in the wording, though it's possible the wording doesn't match the intent. `argv` is a pointer, not an array, and is that pointer object that the standard says is modifiable. The phrase "the `argv` array" can only refer to the array to whose first element `argv` points. The standard doesn't say whether that array is modifiable. That does seem like an odd omission, and I suspect it was not deliberate. – Keith Thompson Sep 12 '14 at 15:52
  • 2
    But it's not plausible that the standard would be so sloppy as to use the unqualified `argv` (the name of a pointer object) to refer to an array object -- *especially* when it uses the more precise and correct phrase "the `argv` array" in the same sentence. The meaning of "The parameters `argc` and `argv` ... shall be modifiable ..." is clear, and it doesn't refer to the array. – Keith Thompson Sep 12 '14 at 15:55
  • "why treat argc specially in this respect?" is because `argc` and `argv` are the parameters of `main()`. `main()` itself is a special function detailed in the spec. IMO, the peculiarities of the starting point for code necessitated these details for `main()`, its parameters, return value, ability to be recursively called. – chux - Reinstate Monica Sep 13 '14 at 17:34
  • @KeithThompson: I'm not sure why you say "It's not plausible that the standard would be so sloppy as to..." when there's a lot of sloppiness in the Standard. It was written in a very different era from today, before language lawyers took over everything. – supercat Nov 01 '15 at 02:48
-2

It is clearly mentioned that argv and argv[x][y] are modifiable. If argv is modifiable then it can point to the first element of another array of pointers to char, and hence argv[x] can point to the first element of some other string. Ultimately argv[x] is modifiable too, and that could be the reason that there is no need to mention it explicitly in the standard.

haccks
  • 4
    Can you provide a reference in the spec that backs this up? – templatetypedef Sep 09 '14 at 06:13
  • 5
    I don't think that logic holds. Any modifiable pointer to non-modifiable storage can be reassigned to point to modifiable storage too. (So we can't make the logical step that "the pointer can be reassigned to point to modifiable storage" implies "it must have originally been pointing to modifiable storage"). – M.M Sep 09 '14 at 06:14
  • @MattMcNabb; Although storage is either modifiable or non-modifiable, `argv[x]` is modifiable here. The standard says the second argument to `main` should be `char *argv[]`. This means that `argv[x]` is not a `const` pointer. – haccks Sep 09 '14 at 06:28
  • 4
    @haccks `const int i = 3; int *p = (int *) &i;` is perfectly valid. Now `p` is not a `const` pointer, but using that pointer to modify what it points to is not allowed. Another example without casting is `char *s = "hello";`. That's why the standard needed to explicitly state that the strings themselves are modifiable. And the standard doesn't explicitly state so for the array of string pointers. –  Sep 09 '14 at 07:46