Why does using char argv instead of char **argv as the argument of main cause the following output?

Question

When I do this:

#include <stdio.h>

int main(int argc, char argv)   /* non-standard second parameter, as asked about */
{
    printf("%d", argv);
    return 0;
}

I get this output when I run the program from the command line:

$ prog_name 0
0

$ prog_name (from 0-7 characters)
48

$ prog_name 12345678
56

$ prog_name 1234567812345678
64

// and so on...

So where do these values come from, and why do they increment by 8?

What happens when I have this instead:

int main(int argc, char argv[])

?

`gcc filename.c -Wall -o obj` Doesn't your compiler throw any warning or error? The valid prototype for `main()` is `int main(int argc, char *argv[])` or `int main(void)`. Let me not bring up the other valid one, which might confuse you for now. – Gopi, Jun 20 '15 at 13:51
That is not a valid signature for `main` so the result is undefined behavior. – Captain Obvlious, Jun 20 '15 at 13:52
The operators are not just for decoration or that the code looks more mythical. Reading a book on programming in C will answer that question clearly. – too honest for this site, Jun 20 '15 at 13:52
@Olaf yes I understand what those operators do, but I didn't understand why I get that output. As Captain Oblivious pointed out I now understand that the behavior of this is unpredictable. I thought I would get some sort of error with gcc, but I only got a warning. – Firmus, Jun 20 '15 at 13:56
@Olaf Understanding those operators and understanding the result of ignoring the requirements placed on `main` are two different things. – Captain Obvlious, Jun 20 '15 at 14:00
@LightnessRacesinOrbit, Capt'n: I withdrew my comment as it did not state clearly that it is my opinion. I actually think that knowing about pointers/arrays, and that using wrong argument types results in UB, is vital knowledge for programming in C. That is one of the first things I tell my students. But that's me; other tutors or resources might put less emphasis on such issues. – too honest for this site, Jun 20 '15 at 14:08

score 5 · Answer 1 · edited May 23 '17 at 12:15

Your output is likely the address held in the "ordinary" argv parameter, interpreted as a char (rather than implicitly converted; see the comments below). In other words, I suspect that what you have is equivalent to:

#include <stdio.h>

int main(int argc, char **argv)
{
    printf("%d", (char) argv);   /* keeps only the low byte of the pointer */
    return 0;
}

On my machine (CentOS 6 32-bit) the disassembled object code is as follows:

   0x080483c4 <+0>:     push   %ebp
   0x080483c5 <+1>:     mov    %esp,%ebp
   0x080483c7 <+3>:     and    $0xfffffff0,%esp
   0x080483ca <+6>:     sub    $0x10,%esp
   0x080483cd <+9>:     mov    0xc(%ebp),%eax
   0x080483d0 <+12>:    movsbl %al,%eax
   0x080483d3 <+15>:    mov    %eax,0x4(%esp)
   0x080483d7 <+19>:    movl   $0x80484b4,(%esp)
   0x080483de <+26>:    call   0x80482f4 <printf@plt>

and for the original code that you posted:

   0x080483c4 <+0>:     push   %ebp
   0x080483c5 <+1>:     mov    %esp,%ebp
   0x080483c7 <+3>:     and    $0xfffffff0,%esp
   0x080483ca <+6>:     sub    $0x20,%esp
   0x080483cd <+9>:     mov    0xc(%ebp),%eax
   0x080483d0 <+12>:    mov    %al,0x1c(%esp)
   0x080483d4 <+16>:    movsbl 0x1c(%esp),%eax
   0x080483d9 <+21>:    mov    %eax,0x4(%esp)
   0x080483dd <+25>:    movl   $0x80484b4,(%esp)
   0x080483e4 <+32>:    call   0x80482f4 <printf@plt>

In both cases, the address $0x80484b4 holds the "%d" format specifier as a string literal, and 0xc(%ebp) is the stack slot that supplies the value actually used by printf():

(gdb) x/db 0xbffff324
0xbffff324: -60
(gdb) p $al
$3 = -60

Notice that AL (the one-byte accumulator, i.e. the low part of EAX) "fetches" only the first byte (my CPU is little-endian, so it's actually the LSB) at address $ebp+0xc. This means that the (char) conversion truncates the argv address to its lowest byte.

As a consequence, you may observe that each of these numbers has its log2(n) least significant bits unset, where n is the alignment. This is due to the alignment requirement for objects of pointer type. Typically, for a 32-bit x86 machine, alignof(char **) == 4.

As already pointed out in the comments, you violated the C Standard, so this is an example of UB.

edited May 23 '17 at 12:15

Community

answered Jun 20 '15 at 14:09

Grzegorz Szpetkowski

1

I would have thought it more likely to be a byte-wise reinterpretation, not a logical conversion? This may especially make a difference since `sizeof(char**)` and `sizeof(char)` are unlikely to be the same. – Lightness Races in Orbit Jun 20 '15 at 14:10
@LightnessRacesinOrbit: I checked it by examining various input with `gdb` sessions, but I am not 100% sure. This is all implementation-dependent. – Grzegorz Szpetkowski Jun 20 '15 at 14:13
Of course it's implementation-dependent. It's a question about what the implementation is doing. (OP should specify which impl though) – Lightness Races in Orbit Jun 20 '15 at 14:13
@Lightness Races in Orbit What info exactly should I post? I use MiniGW on Windows 7 (64bit) GCC version: 4.8.1 – Firmus Jun 20 '15 at 14:21
2

For OP's code, I get: `-12`, `100`, `-124`, `-28`, `-108`, `4`, `-76`, `84`, `68`, `-12`, `-44`, `20`, `52` across multiple runs of the *same binary* with `./a.out 12345678`. My platform is Linux (built on a custom kernel from the Linux source tree 3.19), using gcc 5.1.1 and glibc 2.21. I hope I have given enough *implementation details* for anyone who cares to explain. – P.P Jun 20 '15 at 14:57
@Firmus: Also your processor details (BTW it's called "MinGW", not "MiniGW" lol :P) – Lightness Races in Orbit Jun 20 '15 at 16:02
@GrzegorzSzpetkowski: Nice :) +1 – Lightness Races in Orbit Jun 20 '15 at 16:03
2

This is not a conversion, implicit or otherwise (and there is no implicit conversion from `char**` to `char`). Most likely the `char**` value that's actually passed to `main` is interpreted as a `char` value, or part of it is. Or, depending on the implementation, some uninitialized memory location might be interpreted as a `char` value, if `char` and `char**` are passed differently as arguments. – Keith Thompson Jun 20 '15 at 18:11

Sourav Ghosh · Answer 2 · 2015-06-20T14:06:57.903

-1

From the C standard, regarding the signature of main():

The implementation declares no prototype for this function.

So the compiler will not reject main() if you declare it with different parameter types.

In your code,

int main(int argc, char argv)

is not a standard signature for main(). It should either be

int main(int argc, char* argv[])

or, equivalently,

int main(int argc, char** argv)

Otherwise, in a hosted environment, the behavior is not defined. You can read more on this in the C11 standard, chapter 5.1.2.2.1.

In your case, as you see, you are making the second parameter a char type. As per the standard specification,

If the value of argc is greater than zero, the array members argv[0] through argv[argc-1] inclusive shall contain pointers to strings,....

So here, main() is passed a pointer to an array of string pointers (one of which points to the supplied "0"), but receives it in a char, which is not defined behavior.

edited Jun 20 '15 at 14:06

answered Jun 20 '15 at 13:54

Sourav Ghosh

3

Yeah, he knows. He wanted us to explain the output he got from an implementation perspective. You did not answer the question. -1 – Lightness Races in Orbit Jun 20 '15 at 13:58
@LightnessRacesinOrbit I don't think so. If i know I am replacing a `char * []` with a `char` and asking why it gives weird output...then.... – Sourav Ghosh Jun 20 '15 at 13:59
2

He's not asking why it gives weird output. He's asking why it gives _this_ output. "Since the compiler doesn't give me an error, where do these numbers come from?" – Lightness Races in Orbit Jun 20 '15 at 14:01
1

Lightness Races in Orbit is right... I did that out of curiosity and I noticed the pattern. I thought there was a particular reason for this behavior ... F*ck me, right? – Firmus Jun 20 '15 at 14:03
@LightnessRacesinOrbit Well, I updated. Please review? – Sourav Ghosh Jun 20 '15 at 14:10
1

@SouravGhosh: No, you're still not getting it. _"It's not a defined behaviour"_ is your answer. Well, damn, _he knows that_. But the implementation and the runtime are still doing something that results in these values and, whether that behaviour is well-defined by any language standard or not, this OP would like to know what that "something" is. – Lightness Races in Orbit Jun 20 '15 at 14:11
@LightnessRacesinOrbit I really don't understand now. So basically OP is asking to define the UB, right? Sorry, that is out of my knowledge, I don't have compiler specific idea. – Sourav Ghosh Jun 20 '15 at 14:15
@LightnessRacesinOrbit yes, I would, but I am not reading the question the way you do. So...peace. :-) – Sourav Ghosh Jun 20 '15 at 14:17
1

also, tagging both `C` and `C++`, writing `Why does omitting * and [] `...that makes me doubtful that `Yeah, he knows. He wanted us to explain the output he got from an implementation perspective.` That is rather a random experiment, with the good (or ugly) looking `*` and `[]`s. – Sourav Ghosh Jun 20 '15 at 14:18

donjuedo · Answer 3 · 2015-06-20T14:09:12.797

-3

There is a string pointer on the stack, but you declared main with a char there, and then printed it as a decimal. The memory address of that string is not predictable, so you get unpredictable output.

Try this:

#include <stdio.h>

int main( int argc, char* argv[] )
{
    if ( argc > 1 )   /* argv[1] is a null pointer when no argument is given */
        printf( "%s", argv[1] );
    return 0;
}

I think that will give you what you intended.

edited Jun 20 '15 at 14:09

answered Jun 20 '15 at 14:05

donjuedo

1

This doesn't answer the question at all. The poster wants to know why they get the result they do, not how to properly define `main`. – Captain Obvlious Jun 20 '15 at 14:07
Based on his reputation, I inferred that he really wanted it to just work, so I provided that. However, I have also edited the answer to address his literal question. – donjuedo Jun 20 '15 at 14:12
Sorry, no, you didn't. – Lightness Races in Orbit Jun 20 '15 at 14:14
1

What part of `The memory address of that string is not predictable` do you not understand? It says where the values come from and that there is not a deterministic reason to increment by 8. – donjuedo Jun 20 '15 at 14:17
