6

Take, for example, execve(2), which according to POSIX has this prototype [1]:

int execve(const char *path, char *const argv[], char *const envp[]);

To me, it seems as if

int execve(const char *path, const char *const argv[], const char *const envp[]);

would have been an obvious improvement.

So, does anyone know why this is as it is? What can explain the need to possibly manipulate the given argv/envp strings?

[1] http://pubs.opengroup.org/onlinepubs/009695399/functions/exec.html

  • [In response to your comments on my deleted answer] Oh, I see what you mean. Yes, my answer addresses the arguments of `main`, not the arguments of `execve`. That's really a C API question, not a Unix system design question. I think the answer is simply historical compatibility with the original, pre-const APIs. – Gilles 'SO- stop being evil' Oct 15 '14 at 08:07
  • 2
    Since questions about the C API (as opposed to general system design) are off-topic here, I'm voting to migrate to [so]. (Do not repost, the question will be moved soon.) – Gilles 'SO- stop being evil' Oct 15 '14 at 08:08
  • Presumably some programs mutate these buffers – David Heffernan Oct 15 '14 at 10:46
  • That buffer may indeed be mutated in the new process but that would be on the other side of the exec, with a new memory layout which has nothing to do with the process sending off those arguments to the kernel before the exec happens. – Peter Rosin Oct 15 '14 at 20:06
  • Just stumbled over the same thing when looking at posix_spawn. Changing the API to const char ** or const char * const * shouldn't even break existing implementations. To use these functions with the given signature I either have to copy the argv and envp arrays or just cast them and hope nothing goes wrong... Too bad! – MofX Jul 14 '15 at 12:21

3 Answers

2

The argv and envp arguments to execve() are not pointers to const in order to preserve backwards compatibility with valid C code that was written before const was added to C.

C first appeared in 1972. Later on, const was added to C in 1987.

In order to maintain compatibility with pre-const code, the updated-with-const declaration of execve() must be able to accept non-const inputs. C does not allow assignments from char *[] to const char *[]. This can be verified by trying to compile the following program, which a conforming compiler must diagnose (many reject it outright):

/* foo(argv) passes char ** where const char ** is expected: a constraint violation */
void foo(const char *argv[]) { (void)argv; }
int main(int argc, char *argv[]) { (void)argc; foo(argv); return 0; }

Therefore, char *const[] is the strictest type that is backwards compatible with pre-const code.
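As a minimal sketch of that compatibility argument, the snippet below declares a function with the same parameter type POSIX uses and calls it with a pre-const style array. The names `count_args` and `count_legacy` are hypothetical stand-ins chosen for illustration, not part of any real API.

```c
#include <stddef.h>

/* Hypothetical stand-in for execve(): same parameter type as POSIX uses. */
size_t count_args(char *const argv[])
{
    size_t n = 0;
    while (argv[n] != NULL)
        n++;
    return n;
}

/* Hypothetical caller written in the pre-const style: a plain char *[]. */
size_t count_legacy(void)
{
    char arg0[] = "ls";
    char arg1[] = "-l";
    char *legacy_argv[] = { arg0, arg1, NULL };

    /* char ** converts to char *const * implicitly, so no cast is needed. */
    return count_args(legacy_argv);
}
```

Changing the parameter to `const char *const argv[]` would make the call in `count_legacy` a constraint violation in C unless a cast were added.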

The question provides a link to the POSIX documentation for execve(). The linked POSIX documentation does explain this as follows:

The statement about argv[] and envp[] being constants is included to make explicit to future writers of language bindings that these objects are completely constant. Due to a limitation of the ISO C standard, it is not possible to state that idea in standard C. ...

Actually, to be precise: it is not possible to state that idea in a way that is backwards compatible with pre-const code.

... Specifying two levels of const-qualification for the argv[] and envp[] parameters for the exec functions may seem to be the natural choice, given that these functions do not modify either the array of pointers or the characters to which the function points, but this would disallow existing correct code. Instead, only the array of pointers is noted as constant. The table of assignment compatibility for dst= src derived from the ISO C standard summarizes the compatibility:

dst:                  char *[]   const char *[]   char *const[]   const char *const[]
src:
char *[]               VALID           -              VALID               -
const char *[]           -           VALID              -               VALID
char *const[]            -             -              VALID               -
const char *const[]      -             -                -               VALID

Since all existing code ...

Meaning all code that existed prior to const being added to C.

... has a source type matching the first row, the column that gives the most valid combinations is the third column. The only other possibility is the fourth column, but using it would require a cast on the argv or envp arguments. It is unfortunate that the fourth column cannot be used, because the declaration a non-expert would naturally use would be that in the second row.

Sources: 2018 edition, 2004 edition.
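Callers who already hold `const char *const[]` data face the mirror image of the problem: they need a cast to call the real `execve()`. The following is a hedged sketch, not an established idiom. `as_exec_argv` and `exec_const` are hypothetical names, and the cast is only defensible because the POSIX rationale quoted above says the exec functions modify neither the pointer array nor the characters.

```c
#include <unistd.h>

/* Hypothetical helper: drop the inner const so the array fits execve().
 * Relies on the POSIX statement that exec does not modify the strings. */
char *const *as_exec_argv(const char *const argv[])
{
    return (char *const *)argv;
}

/* Hypothetical wrapper with the signature the question proposes. */
int exec_const(const char *path,
               const char *const argv[],
               const char *const envp[])
{
    return execve(path, as_exec_argv(argv), as_exec_argv(envp));
}
```

Keeping the cast in one small helper at least confines it to a single place that can be reviewed.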

mpb
0

This is basically due to the hole in the C standard that prevents the implicit conversion from T ** to const T * const *. Such a conversion would be safe (a conversion from T** to const T ** would be problematic), but the standard has not been updated to allow it.
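To illustrate the distinction, here is a small sketch of the safe direction. The function names are hypothetical. The comment describes why the other conversion, to `const T **`, would be unsafe if it were implicit.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical read-only consumer: it can modify neither the pointer
 * array nor the characters the pointers refer to. */
size_t total_length(const char *const *strings)
{
    size_t total = 0;
    for (size_t i = 0; strings[i] != NULL; i++)
        total += strlen(strings[i]);
    return total;
}

size_t total_length_of_mutable(void)
{
    char a[] = "exec";
    char b[] = "ve";
    char *words[] = { a, b, NULL };

    /* If char ** converted implicitly to const char **, code could store
     * the address of a const char through that pointer and then write to
     * it via the original char *. With const at both levels no such store
     * is possible, yet C still requires the explicit cast here (C++
     * accepts the call without it). */
    return total_length((const char *const *)words);
}
```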

mpb quotes from the POSIX standard

The statement about argv[] and envp[] being constants is included to make explicit to future writers of language bindings that these objects are completely constant. Due to a limitation of the ISO C standard, it is not possible to state that idea in standard C.

which implies that the POSIX standard will be changed when/if the C standard is ever updated.

Chris Dodd
-2

Some programs manipulate the argv strings so that ps output shows some state information. For example:

root      6550 10809  0 13:10 ?        00:00:00 pure-ftpd (IDLE)
root     32216     1  0 Apr05 ?        00:00:00 vtund[s]: waiting for connections on port 5000
1023     30448  9847  0 09:01 ?        00:00:01 imap [username 192.168.1.135]

Hence the argv values are not constant and should not be declared as such.

wurtel
  • 4
    Presumably, however, those programs modify the copy of the argv strings in their *own* address space, not the address space of the process which called `execve`. – Greg Hewgill Oct 15 '14 at 19:24
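A rough sketch of what such a program might do with its own copy of an argv string is shown below. `set_status` is a hypothetical helper, and whether ps actually reflects the change depends on the platform, which is an assumption here. The write touches only memory owned by the new process, never the arrays the caller passed to `execve`.

```c
#include <string.h>

/* Hypothetical helper: overwrite an argv string in place without ever
 * writing past its original length (plus the existing terminator). */
void set_status(char *arg, const char *status)
{
    size_t room = strlen(arg);      /* bytes available before the '\0' */
    size_t n = strlen(status);

    if (n > room)
        n = room;                   /* truncate rather than overflow */

    memcpy(arg, status, n);
    memset(arg + n, '\0', room - n + 1);  /* clear the tail and terminate */
}
```

A process would typically call something like `set_status(argv[0], "idle")` from inside its own `main`, after the exec has already completed.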