Memory map of what happens when we use command line arguments?

Question

What I understand is that argc holds the total number of arguments. Suppose my program takes one argument apart from the program name. What does argv hold then? Two pointers (e.g. 123 and 130), or the strings themselves (./hello\0 and 5)? If it holds 123, how does it know it has read one argument? Does it know because of the \0?

If all of the above is wrong, can someone help me understand using a memory map?

Why not testing by yourself? Show what you have done to answer your question. — Xvolks, Mar 09 '16 at 08:38
It's `char**`, why do you expect it to hold anything else than char*, i.e. pointers? — stijn, Mar 09 '16 at 08:38
I don't know much about Stack Overflow, but I think people love to say 'Oh! It's a duplicate question' and to downvote. It just breaks the backbone of a newbie, who after one downvote thinks a thousand times before asking another question. And a newbie, being a newbie, sometimes decides against posting a genuine question. I understand that this behavior is essential to keep up quality here. But can there be a middle ground, something like a separate Stack Overflow where a newbie could post, and if the question earns a certain score there, it can be transferred here? Of course, that will take time, but... — imox, Mar 09 '16 at 09:24
@imox: Such a site would not draw experts to answer your questions. Who is willing to answer the same question over and over? StackOverflow achieves its goals because of the rules. And those goals are to have the best answers for every _unique_ question. This does include newbie questions, but not duplicate newbie questions. — MSalters, Mar 09 '16 at 11:48
@MSalters I fully agree with you. But a newbie finds it very difficult to know that his question is a duplicate. His question will be a paraphrase (maybe like mine) and he will keep saying it's not a duplicate. I only suggested finding a middle ground. I myself will try to do better homework before posting a question; maybe then I can learn even more. Anyway, cheers to everyone involved for the top-notch work. — imox, Mar 09 '16 at 12:59

Michael Aaron Safyan · Accepted Answer · 2016-03-09T09:20:36.410

2

The argv array is an array of strings: each entry is of type char* and points to the first character of a NUL-terminated string. The argv array itself is also terminated: the C standard guarantees that argv[argc] is a null pointer. The separate argc parameter gives the length of the array directly, so the program does not have to scan for that terminator.

How those arrays get constructed in the first place depends on the calling program. Typically the caller is a shell (such as Bash), which splits the command line on whitespace (with various quoting options that let an argument contain whitespace). However the argument list is built, the operating system provides routines for executing a program with it as input (e.g. on UNIX, one of the variants of exec, often paired with a call to fork).

To make this a bit more concrete, suppose you ran:

./myprog "arg"

Here is an example of how this might look in memory (using completely fake addresses):

Address  | Value | Comment
==========================
0058     | 2     | argc
0060     | 02100 | argv (value is the memory address of "argv[0]")
...
02100    | 02116 | argv[0] (value is the memory address of "argv[0][0]")
02104    | 02300 | argv[1] (value is the memory address of "argv[1][0]")
...
02116    | '.'   | argv[0][0]
02117    | '/'   | argv[0][1]
02118    | 'm'   | argv[0][2]
02119    | 'y'   | argv[0][3]
02120    | 'p'   | argv[0][4]
02121    | 'r'   | argv[0][5]
02122    | 'o'   | argv[0][6]
02123    | 'g'   | argv[0][7]
02124    | '\0'  | argv[0][8]
...
02300    | 'a'   | argv[1][0]
02301    | 'r'   | argv[1][1]
02302    | 'g'   | argv[1][2]
02303    | '\0'  | argv[1][3]

edited Mar 09 '16 at 09:20

answered Mar 09 '16 at 08:57

Michael Aaron Safyan


Really informative answer. Please clear one more doubt. How does argc know how many argument? Is it on getting \n? And probably 02124 should be \0 not 02125. – imox Mar 09 '16 at 09:10
@imox on call to `exec()` OS can traverse the array of pointers you passed until it finds NULL; that's how it deduces argc, right before sending it to your program. – Yam Marcovic Mar 09 '16 at 09:12
@imox See `man (2) execvp` : "The array of pointers must be terminated by a null pointer." -- For just that reason. – Yam Marcovic Mar 09 '16 at 09:13
@Yam that is informative. Now I can say that I have somewhat understood it . – imox Mar 09 '16 at 09:15
Also, "argv (points to memory address of argv[0]" would mean `*argv == &argv[0]` which isn't the case. `argv` points directly to `argv[0]`, because `argv[0] == *(argv + 0) == *argv` – Yam Marcovic Mar 09 '16 at 09:16
@YamMarcovic my notation here was intended to convey the fact that the variable "argv" has its own storage (on the function stack). The value of that variable is the address of "argv[0]". I've replaced "points to memory address" with "value is the memory address" to address your concern, though I think it was clear enough given the context. – Michael Aaron Safyan Mar 09 '16 at 09:17
Yep. I think your edit is better, because the phrase "x points to y" is reserved and thus problematic to change in this specific context (which deals with pointer variables). – Yam Marcovic Mar 09 '16 at 09:22
Thank you @Michael and Yam for making it clear and the question was not exactly duplicate. – imox Mar 09 '16 at 09:32
