It is well known that a script my-script-file which starts with

#!/path/to/interpreter -arg1 val1 -arg2 val2

is executed by exec calling /path/to/interpreter with 2(!) arguments:

  1. -arg1 val1 -arg2 val2
  2. my-script-file

(and not, as one might naively expect, with 5 arguments

  1. -arg1
  2. val1
  3. -arg2
  4. val2
  5. my-script-file

as has been explained in many previous questions, e.g., https://stackoverflow.com/a/4304187/850781).

My question is from the point of view of an interpreter developer, not a script writer.

How do I detect, from inside the interpreter executable, that I was invoked via a shebang line as opposed to directly from the command line?

Then I will be able to decide whether I need to split my first argument by space to go from "-arg1 val1 -arg2 val2" to ["-arg1", "val1", "-arg2", "val2"] or not.
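
For concreteness, the splitting step could look roughly like the sketch below. The name split_shebang_arg is hypothetical, and splitting only on spaces and tabs with no quoting support is an assumption on my part, not something the kernel prescribes.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: split `packed` on runs of spaces/tabs.
   Returns a NULL-terminated, malloc'ed array of malloc'ed words and
   stores the word count in *count.  Returns NULL if allocation fails.
   The caller frees each word and then the array.  */
char **
split_shebang_arg (const char *packed, size_t *count)
{
  /* A string of length L holds at most (L + 1) / 2 words; add room
     for the terminating NULL.  */
  size_t cap = strlen (packed) / 2 + 2;
  char **words = malloc (cap * sizeof *words);
  size_t n = 0;
  const char *p = packed;

  if (words == NULL)
    return NULL;
  while (*p != '\0')
    {
      p += strspn (p, " \t");             /* skip separators */
      size_t len = strcspn (p, " \t");    /* length of the next word */
      if (len == 0)
        break;
      char *w = malloc (len + 1);
      if (w == NULL)
        {
          while (n > 0)
            free (words[--n]);
          free (words);
          return NULL;
        }
      memcpy (w, p, len);
      w[len] = '\0';
      words[n++] = w;
      p += len;
    }
  words[n] = NULL;
  *count = n;
  return words;
}
```

Applied to "-arg1 val1 -arg2 val2" this yields the four words listed above. Applied unconditionally to argv[1], it would also break a script name that contains spaces, which is exactly the problem described next.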

The main issue here is script files with spaces in their names. If I always split the first argument, I will fail like this:

$ my-interpreter "weird file name with spaces"
my-interpreter: "weird": No such file or directory
sds

1 Answer

On Linux, with GNU libc or musl libc, you can use the auxiliary vector (auxv) to distinguish the two cases.

Here is some sample code:

#define _GNU_SOURCE 1
#include <stdio.h>
#include <errno.h>
#include <sys/auxv.h>
#include <sys/stat.h>

int
main (int argc, char* argv[])
{
  printf ("argv[0] = %s\n", argv[0]);
  /* https://www.gnu.org/software/libc/manual/html_node/Error-Messages.html */
  printf ("program_invocation_name = %s\n", program_invocation_name);
  /* http://man7.org/linux/man-pages/man3/getauxval.3.html */
  printf ("auxv[AT_EXECFN] = %s\n", (const char *) getauxval (AT_EXECFN));
  /* Determine whether the last two are the same. */
  struct stat statbuf1, statbuf2;
  if (stat (program_invocation_name, &statbuf1) >= 0
      && stat ((const char *) getauxval (AT_EXECFN), &statbuf2) >= 0)
    printf ("same? %d\n", statbuf1.st_dev == statbuf2.st_dev && statbuf1.st_ino == statbuf2.st_ino);
}

Result for a direct invocation:

$ ./a.out 
argv[0] = ./a.out
program_invocation_name = ./a.out
auxv[AT_EXECFN] = ./a.out
same? 1

Result for an invocation through a script that starts with #!/home/bruno/a.out:

$ ./a.script 
argv[0] = /home/bruno/a.out
program_invocation_name = /home/bruno/a.out
auxv[AT_EXECFN] = ./a.script
same? 0

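This approach is, of course, highly unportable: the getauxval function and the AT_EXECFN entry are Linux-specific. And there are surely cases where it does not work well, for example when argv[0] is a bare command name found via PATH, so that stat cannot resolve it.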
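
The check could be wrapped in a helper along the following lines. This is only a sketch under stated assumptions: started_via_shebang is a hypothetical name, and it compares AT_EXECFN against /proc/self/exe rather than program_invocation_name. That substitution is mine, made so the check does not depend on argv[0]. It assumes /proc is mounted and that the process has not changed its working directory since exec.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/auxv.h>
#include <sys/stat.h>

/* Hypothetical helper.  Returns 1 if the file the kernel was asked to
   exec differs from the running binary (which suggests a #! script),
   0 if they are the same file, and -1 if this cannot be determined.  */
int
started_via_shebang (void)
{
  /* getauxval returns 0 when the entry is absent.  */
  const char *execfn = (const char *) getauxval (AT_EXECFN);
  struct stat exe, target;

  if (execfn == NULL)
    return -1;
  /* Assumption: /proc/self/exe resolves to the running executable.  */
  if (stat ("/proc/self/exe", &exe) < 0 || stat (execfn, &target) < 0)
    return -1;
  return !(exe.st_dev == target.st_dev && exe.st_ino == target.st_ino);
}
```

An interpreter could call this early in main and split argv[1] only when the result is 1, treating -1 as "do not split" so that file names with spaces keep working.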

Bruno Haible
  • this mess probably explains why unix has single character options and they can be coalesced into a single word like `rm -rf` – sds Nov 17 '17 at 13:51