#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

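int main(int argc, char *argv[]) {
    if (argc < 3) return 1;  /* expects: <string> <byte count> */
    /* Deliberately writes atoi(argv[2]) bytes starting at argv[1], even
       past its terminating NUL, to show what follows it in memory. */
    write(STDOUT_FILENO, argv[1], atoi(argv[2]));
    return 0;
}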

The following output (run in bash on Ubuntu) shows that 'xyz' is not passed in argv. I want to make sure that this is a limitation of the OS rather than the shell, meaning it is not possible to pass a null character in the middle of a string in argv. Could anybody confirm? Thanks.

$ ./main.exe $'abc\000xyz' 7 | xxd
00000000: 6162 6300 3700 53                        abc.7.S
user1424739
  • try with python as a shell. Looks like undefined/implementation defined to me – Jean-François Fabre Mar 01 '19 at 05:50
  • The question is where the limitation comes from. Is it from C, OS or shell? – user1424739 Mar 01 '19 at 05:50
  • can't even run it from python "embedded null character" – Jean-François Fabre Mar 01 '19 at 05:51
  • duplicate?: https://stackoverflow.com/questions/7316232/how-to-pass-x00-as-argument-to-program – Jean-François Fabre Mar 01 '19 at 05:53
  • this answer makes it work: https://stackoverflow.com/a/31739648/6451573 – Jean-François Fabre Mar 01 '19 at 05:54
  • So the limitation is from the OS. But OS is built on C. So the ultimate limitation is from C? But the direct limitation is from the OS? – user1424739 Mar 01 '19 at 05:56
  • Maybe related: [How to find the main function's entry point of elf executable file without any symbolic information?](https://stackoverflow.com/q/9885545/608639) The question leads you to `__libc_start_main` for Linux ELF binaries. Also see [How main() is executed on Linux](http://tldp.org/LDP/LG/issue84/hawk.html). – jww Mar 01 '19 at 07:20
  • Does this answer your question? [What if there's '\0' character in command line input?](https://stackoverflow.com/questions/6560779/what-if-theres-0-character-in-command-line-input) – user202729 Feb 16 '21 at 16:22

1 Answer

On POSIX systems every program is started by an exec call, whatever the OS implementation, so let's see what the POSIX standard says about the exec functions.

int execve(const char *path, char *const argv[], char *const envp[]);

argv holds the arguments passed to the program; envp holds its environment variables.

The argv and environ arrays are each terminated by a null pointer. The null pointer terminating the argv array is not counted in argc.

The argument argv is an array of character pointers to null-terminated strings.

So it is easy to conclude:

  1. An empty (zero-length) string can be passed in argv.
  2. A NULL pointer can't be passed as an argument: it terminates argv, and all following arguments are discarded.
  3. No string in argv can contain a \x0 character: \x0 terminates the current argument string.

To explain what happens when you run

python -c 'print "\x30\x00\x31"' | xargs --null prog

Try:

python -c 'print "\x30\x00\x31"' | strace -f xargs --null echo

We can find:

...
...
...
read(0, "0\0001\n", 4096)               = 4
...
...
...
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f09f46d9850) = 21232
strace: Process 21232 attached
...
...
...
[pid 21232] execve("/bin/echo", ["echo", "0", "1\n"], 0x7ffe5ccd7638 /* 74 vars */ <unfinished ...>

This shows that xargs --null treats \x0 as the item delimiter: it splits the input into two separate arguments, "0" and "1\n", before calling execve. The split is not done by the shell, the OS, or libc, but by xargs itself.

Zang MingJie
  • So the limitation is from exec? But could the OS support non-zero-terminated strings if there were another way to call external programs? Why is xargs relevant here? – user1424739 Mar 01 '19 at 11:27