Is there a way to know how the user invoked a program from bash?

Question

Here's the problem: I have this script foo.py, and if the user invokes it without the --bar option, I'd like to display the following error message:

Please add the --bar option to your command, like so:
    python foo.py --bar

Now, the tricky part is that there are several ways the user might have invoked the command:

They may have used python foo.py like in the example
They may have used /usr/bin/foo.py
They may have a shell alias frob='python foo.py', and actually ran frob
Maybe it's even a git alias flab=!/usr/bin/foo.py, and they used git flab

In every case, I'd like the message to reflect how the user invoked the command, so that the example I'm providing would make sense.

sys.argv always contains foo.py, and /proc/$$/cmdline doesn't know about aliases. It seems to me that the only possible source for this information would be bash itself, but I don't know how to ask it.

Any ideas?

UPDATE: How about if we limit possible scenarios to only those listed above?

UPDATE 2: Plenty of people wrote very good explanations of why this is not possible in the general case, so I would like to limit my question to this:

Under the following assumptions:

The script was started interactively, from bash
The script was started in one of these 3 ways:
1. foo <args> where foo is a symbolic link /usr/bin/foo -> foo.py
2. git foo where alias.foo=!/usr/bin/foo in ~/.gitconfig
3. git baz where alias.baz=!/usr/bin/foo in ~/.gitconfig

Is there a way to distinguish between 1 and (2,3) from within the script? Is there a way to distinguish between 2 and 3 from within the script?

I know this is a long shot, so I'm accepting Charles Duffy's answer for now.

UPDATE 3: So far, the most promising angle was suggested by Charles Duffy in the comments below. If I can get my users to have

trap 'export LAST_BASH_COMMAND=$(history 1)' DEBUG

in their .bashrc, then I can use something like this in my code:

import os
import re

like_so = None
# .get() returns None when the DEBUG trap has not exported the variable
cmd = os.environ.get('LAST_BASH_COMMAND')
if cmd is not None:
    # Remove the history counter (leading spaces, the number, an optional '*')
    cmd = re.sub(r'^\s*\d+\*?\s+', '', cmd)
    for prefix in ("git foo", "git baz", "foo"):
        # Match the bare command as well as the command followed by arguments
        if cmd == prefix or cmd.startswith(prefix + " "):
            like_so = (prefix + " --bar " + cmd[len(prefix):].strip()).strip()
            break
if like_so is not None:
    print("Please add the --bar option to your command, like so:")
    print("    " + like_so)
else:
    print("Please add the --bar option to your command.")

This way, I show the general message if I don't manage to get their invocation method. Of course, if I'm going to rely on changing my users' environment I might as well ensure that the various aliases export their own environment variables that I can look at, but at least this way allows me to use the same technique for any other script I might add later.
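
The environment-variable alternative mentioned in the last paragraph could be sketched roughly as follows. This is only an illustration: the variable name FOO_INVOKED_AS is hypothetical, and it assumes every alias or wrapper exports it before starting the script. Nothing sets it by default, so the generic message remains the fallback.

```python
import os

def suggested_command(environ=None, option="--bar"):
    """Return an example command line, or None if the invocation is unknown.

    FOO_INVOKED_AS is a hypothetical variable that an alias or wrapper
    would have to export; nothing sets it by default.
    """
    environ = os.environ if environ is None else environ
    invoked_as = environ.get("FOO_INVOKED_AS")
    if not invoked_as:
        return None
    return "{} {}".format(invoked_as, option)

def error_message(environ=None):
    """Build the full message, falling back to the generic wording."""
    example = suggested_command(environ)
    if example is None:
        return "Please add the --bar option to your command."
    return ("Please add the --bar option to your command, like so:\n"
            "    " + example)

if __name__ == "__main__":
    print(error_message())
```

Passing the environment in as a parameter keeps the lookup easy to exercise without touching the real os.environ.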

If the `--bar` is obligatory, how about always adding it internally yourself, and providing a `--no-bar` for more skilled users who don't want it and do know how to add arguments? — Mark Setchell, Jul 12 '18 at 09:18
@MarkSetchell my attempt at generality backfired here. My current use case is about a script that exits, but wants to tell the user they can continue with --continue. Kind of like when `git rebase` hits conflicts. So your idea won't work for me. — itsadok, Jul 12 '18 at 09:38
@Aaron yes! But how can I access the parent bash process history? — itsadok, Jul 12 '18 at 09:40
I'm not sure what you mean, the history records what the user enters in the current shell (and other shells whose history has been persisted to ~/.bash_history). This won't record your script's execution in quite a few situations, such as the script being called from another one (only the script the user called will be recorded) or from cron or other services. It can also be disabled by the user. — Aaron, Jul 12 '18 at 09:46
@Aaron the history usually gets persisted only when bash exits, unless you use some tricks with $PROMPT_COMMAND. I was wondering if there was a way to query the history of a running bash process. I don't care about cases when something other than interactive bash spawns the process, with the exception of git, which itself was spawned from interactive bash. — itsadok, Jul 12 '18 at 09:56
And what if the user puts it in the crontab? You can't show a user message there. Shouldn't you add a message to /var/log/messages instead? — Dominique, Jul 18 '18 at 09:29
Instead of burning up a lot of cycles on how to find the name of the invocation (which IMHO there are a lot of corner cases where it will fail) -- Why not work on the message to users? `Please add the --bar option to your command, like so: 'cmd --bar'` Most people are smart enough (again, IMHO) to know that `cmd` is the fill-in for whatever they typed. — dawg, Jul 18 '18 at 15:42
Note that it *is* potentially possible to make the shell export a copy of `BASH_COMMAND` to the environment from a `DEBUG` trap, but if you relied on that, your program would only have the behavior at hand when invoked by a shell so prepared... so it hardly seems useful. — Charles Duffy, Jul 18 '18 at 22:58
@dawg this is of course the current solution I have. I was wondering if there was a way to make it more convenient. — itsadok, Jul 19 '18 at 08:34
Re: edits -- if you wanted to walk your process tree and look at parents' command lines until you find a `git` command, yes, you can do that. *Shell* aliases aren't found anywhere in the history, but since invoking a *git* alias goes through a real external command, it'll show up. — Charles Duffy, Jul 19 '18 at 16:34
(Re: symlinks, the standard `argv[0]` approach works for them, mediated through `/proc/self/cmdline` or otherwise). — Charles Duffy, Jul 19 '18 at 16:58
BTW, [full command line as it was typed](https://stackoverflow.com/questions/667540/full-command-line-as-it-was-typed) is a closely related question. — Charles Duffy, Jul 19 '18 at 17:01
See my updated answer below: wrap the python script in a bash script, which provides the appropriate error messages, based on nonzero exit codes, for example. — philwalk, Jul 19 '18 at 17:07

Charles Duffy · Accepted Answer · 2018-07-19T16:55:00.063

No, there is no way to see the original text (before aliases/functions/etc).

Starting a program in UNIX is done as follows at the underlying syscall level:

int execve(const char *path, char *const argv[], char *const envp[]);

Notably, there are three arguments:

The path to the executable
An argv array (the first item of which -- argv[0] or $0 -- is passed to that executable to reflect the name under which it was started)
A list of environment variables

Nowhere in here is there a string that provides the original user-entered shell command from which the new process's invocation was requested. This is particularly true since not all programs are started from a shell at all; consider the case where your program is started from another Python script with shell=False.
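
To illustrate that the executable path and argv[0] are separate values chosen by the caller, here is a small sketch using Python's subprocess module. It assumes /bin/sh is a POSIX shell such as dash or bash; the helper name run_with_argv0 is made up for this example.

```python
import subprocess

def run_with_argv0(argv0):
    """Start /bin/sh but tell it that its name (argv[0]) is `argv0`.

    `executable` is the path that is actually executed; the first list
    item is only what the child sees as argv[0]. With `sh -c` and no
    extra arguments, $0 expands to that argv[0].
    """
    result = subprocess.run(
        [argv0, "-c", 'printf "%s" "$0"'],
        executable="/bin/sh",
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout.decode()

if __name__ == "__main__":
    print(run_with_argv0("foobar"))
```

The child only ever receives the path, the argv list and the environment; whatever the user typed to cause the call is not among them.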

It's completely conventional on UNIX to assume that your program was started through whatever name is given in `argv[0]`; this works for symlinks.

You can even see standard UNIX tools doing this:

$ ls '*.txt'         # sample command to generate an error message; note "ls:" at the front
ls: *.txt: No such file or directory
$ (exec -a foobar ls '*.txt')   # again, but tell it that its name is "foobar"
foobar: *.txt: No such file or directory
$ alias somesuch=ls             # this **doesn't** happen with an alias
$ somesuch '*.txt'              # ...the program still sees its real name, not the alias!
ls: *.txt: No such file or directory
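
Following the same argv[0] convention from Python might look like the sketch below. The helper name usage_hint is an assumption made for this example. Note that when the script is run as `python foo.py`, sys.argv[0] holds the script path rather than the interpreter's name, so the hint is only directly runnable when the script itself was the command.

```python
import sys
try:
    from shlex import quote  # Python 3
except ImportError:
    from pipes import quote  # Python 2

def usage_hint(argv0, option="--bar"):
    """Build the example line from the name the program was started under."""
    # quote() keeps the suggestion copy-pasteable if the path has spaces
    return "    {} {}".format(quote(argv0), option)

if __name__ == "__main__":
    sys.stderr.write("Please add the --bar option to your command, like so:\n")
    sys.stderr.write(usage_hint(sys.argv[0]) + "\n")
```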

If you do want to generate a UNIX command line, use `pipes.quote()` (Python 2) or `shlex.quote()` (Python 3) to do it safely.

try:
    from shlex import quote # Python 3
except ImportError:
    from pipes import quote # Python 2

# /proc/self/cmdline holds our own NUL-separated argv (Linux only)
with open('/proc/self/cmdline', 'r') as f:
    cmd = ' '.join(quote(s) for s in f.read().split('\0')[:-1])
print("We were called as: {}".format(cmd))

Again, this won't "un-expand" aliases, revert to the code that was invoked to call a function that invoked your command, etc; there is no un-ringing that bell.

The same /proc interface can also be used to look for a git instance in your parent process tree, and discover its argument list:

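import os
import re
try:
    from shlex import quote  # Python 3
except ImportError:
    from pipes import quote  # Python 2

def find_cmdline(pid):
    # NUL-separated argv of the given process
    with open('/proc/%d/cmdline' % (pid,), 'r') as f:
        return f.read().split('\0')[:-1]

def find_ppid(pid):
    with open('/proc/%d/stat' % (pid,), 'r') as f:
        stat_data = f.read()
    # The command name in parentheses may contain spaces; mask it before splitting
    stat_data_sanitized = re.sub('[(]([^)]+)[)]', '_', stat_data)
    return int(stat_data_sanitized.split(' ')[3])

def all_parent_cmdlines(pid):
    while pid > 0:
        yield find_cmdline(pid)
        pid = find_ppid(pid)

def find_git_parent(pid):
    for cmdline in all_parent_cmdlines(pid):
        # cmdline can be empty, and git may have been started with a full path
        if cmdline and os.path.basename(cmdline[0]) == 'git':
            return ' '.join(quote(s) for s in cmdline)
    return None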
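
Once a git ancestor's argument list is known, it could be turned into the suggested command along these lines. This is a sketch, not part of the answer's code: git_alias_hint is a hypothetical helper that takes the argv list (not the joined string) and assumes the alias name is the first argument after git.

```python
try:
    from shlex import quote  # Python 3
except ImportError:
    from pipes import quote  # Python 2

def git_alias_hint(git_cmdline, option="--bar"):
    """Turn a git argv such as ['git', 'foo', 'x'] into 'git foo --bar x'.

    Assumes the alias name is the first argument after 'git'; global git
    options placed before it (e.g. `git -C dir foo`) are not handled.
    """
    if len(git_cmdline) < 2:
        return None
    head, rest = git_cmdline[:2], git_cmdline[2:]
    return " ".join(quote(s) for s in head + [option] + rest)
```

For example, `git_alias_hint(['git', 'baz', 'a b'])` yields `git baz --bar 'a b'`, which distinguishes the `git foo` and `git baz` cases from the question.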

@philwalk, if I correctly understand your proposal (retrieving history from a script), it's *extremely* sensitive to runtime configuration -- many shell configurations won't write history to file until exit at all. If you controlled the user's shell configuration enough to make sure it would work, you could just have them install a `DEBUG` trap that exports a copy of `$BASH_COMMAND` and be done with it. — Charles Duffy, Jul 19 '18 at 19:45
@philwalk, ...moreover, it's quite untrue to say that I "provide as proof some approaches that don't work" -- I'm providing an approach that works in the git alias case (the `find_git_parent` function), and an approach that works in the symlink case (`sys.argv[0]`, which -- as I demonstrate -- is the approach used by standard UNIX tools). Those *are* the cases that permit a solution that doesn't depend on details of shell runtime configuration. — Charles Duffy, Jul 19 '18 at 19:52
I was referring to the title of your answer "There's no way to see the original text (before aliases/functions/etc).". My solution does just that. — philwalk, Jul 19 '18 at 21:28
@philwalk, eh? It's no "solution"; the OP wants to retrieve that text *from a Python interpreter*. You can only retrieve it *from the same shell that ran the command* -- not even from a subshell interpreting a script, without the interactive parent shell's active participation. — Charles Duffy, Jul 19 '18 at 21:28
@philwalk, ...when your `aliasTest` script can reliably tell if *it was itself invoked* through an alias, without requiring the shell doing that invocation to have any non-default configuration beforehand, **that's** when I'll be impressed -- and not before. — Charles Duffy, Jul 19 '18 at 21:34
the original question specifically wondered if it was possible to get the information from the bash environment. And the question as currently worded wants a way to customize error messages, and my answer provides the needed information to do that. — philwalk, Jul 19 '18 at 23:28
@philwalk, *from* the bash environment, *into* Python (so the Python script can modify its error message). — Charles Duffy, Jul 20 '18 at 03:18

philwalk · Answer 2 · 2018-12-20T18:27:14.740

4

See the Note at the bottom regarding the originally proposed wrapper script.

A new more flexible approach is for the python script to provide a new command line option, permitting users to specify a custom string they would prefer to see in error messages.

For example, if a user prefers to call the python script 'myPyScript.py' via an alias, they can change the alias definition from this:

  alias myAlias='myPyScript.py $@'

to this:

  alias myAlias='myPyScript.py --caller=myAlias $@'

If they prefer to call the python script from a shell script, it can use the additional command line option like so:

  #!/bin/bash
  exec myPyScript.py "$@" --caller=${0##*/}

Other possible applications of this approach:

  bash -c myPyScript.py --caller="bash -c myPyScript.py"

  myPyScript.py --caller=myPyScript.py
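
On the Python side, the option might be consumed roughly like this. It is a minimal sketch assuming argparse; the --bar flag and the message wording come from the question, and the function names are made up for the example.

```python
import argparse
import os
import sys

def build_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument("--bar", action="store_true")
    # --caller is the option proposed above; default to the script's own name
    parser.add_argument("--caller", default=os.path.basename(sys.argv[0]))
    return parser

def missing_bar_message(argv):
    """Return the error text if --bar is absent, else None."""
    # parse_known_args() tolerates the script's other arguments
    args, _unknown = build_parser().parse_known_args(argv)
    if args.bar:
        return None
    return ("Please add the --bar option to your command, like so:\n"
            "    {} --bar".format(args.caller))

if __name__ == "__main__":
    message = missing_bar_message(sys.argv[1:])
    if message is not None:
        sys.exit(message)
```

Falling back to the script's own name keeps the message sensible for callers that never pass --caller.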

For listing expanded command lines, here's a script 'pytest.py', based on feedback by @CharlesDuffy, that lists cmdline for the running python script, as well as the parent process that spawned it. If the new --caller argument is used, it will appear in the command line, although aliases will have been expanded, etc.

#!/usr/bin/env python

# Read our own pid and parent pid from /proc/self/stat (Linux only).
# Naive space-splitting breaks if the process name itself contains spaces.
with open("/proc/self/stat", "r") as myfile:
  data = myfile.read().split(' ')

pid = data[0]
ppid = data[3]

def commandLine(pid):
  # /proc/<pid>/cmdline holds the NUL-separated argv of the process
  with open("/proc/" + pid + "/cmdline", "r") as myfile:
    return myfile.read().split('\x00')[:-1]

pid_cmdline = commandLine(pid)
ppid_cmdline = commandLine(ppid)

# print() with a single argument works on both Python 2 and Python 3
print("%r" % (pid_cmdline,))
print("%r" % (ppid_cmdline,))

After saving this to a file named 'pytest.py', and then calling it from a bash script called 'pytest.sh' with various arguments, here's the output:

$ ./pytest.sh a b "c d" e
['python', './pytest.py']
['/bin/bash', './pytest.sh', 'a', 'b', 'c d', 'e']

NOTE: criticisms of the original wrapper script aliasTest.sh were valid. Although the existence of a pre-defined alias is part of the specification of the question, and may be presumed to exist in the user environment, the proposal defined the alias (creating the misleading impression that it was part of the recommendation rather than a specified part of the user's environment), and it didn't show how the wrapper would communicate with the called python script. In practice, the user would either have to source the wrapper or define the alias within the wrapper, and the python script would have to delegate the printing of error messages to multiple custom calling scripts (where the calling information resided), and clients would have to call the wrapper scripts. Solving those problems led to a simpler approach, that is expandable to any number of additional use cases.

Here's a less confusing version of the original script, for reference:

#!/bin/bash
shopt -s expand_aliases
alias myAlias='myPyScript.py'

# called like this:
set -o history
myAlias "$@"
_EXITCODE=$?
CALL_HISTORY=( `history` )
_CALLING_MODE=${CALL_HISTORY[1]}

case "$_EXITCODE" in
0) # no error message required
  ;;
1)
  echo "customized error message #1 [$_CALLING_MODE]" 1>&2
  ;;
2)
  echo "customized error message #2 [$_CALLING_MODE]" 1>&2
  ;;
esac

Here's the output:

$ aliasTest.sh 1 2 3
['./myPyScript.py', '1', '2', '3']
customized error message #2 [myAlias]

edited Dec 20 '18 at 18:27

answered Jul 18 '18 at 20:53

philwalk

I'm working on a version that incorporates your feedback, thanks! – philwalk Jul 18 '18 at 22:41
Keep in mind that history isn't always stored at all -- if a command is run with preceding whitespace, for example, it's not even stored in history *in-memory*, much less flushed to disk, with `HISTCONTROL` set to `nospace` or `ignoreboth`. And even when it *is* stored, flushing after every command is something that has to be turned on explicitly. And since different shells have different history, this approach needs to know which one the user is running to even have a chance. – Charles Duffy Jul 19 '18 at 19:59
I got really excited about the idea of wrapping the python script with a bash script, but discovered that a bash script doesn't see its parent's history unless I get the users to always source it, which I don't see as a viable option. Looks like Charles Duffy's suggestion about using DEBUG is the most promising angle. – itsadok Jul 22 '18 at 08:51
@itsadok ... in the aliasTest.sh example above, it produces the exact command line as called from within the bash script, without any need for any sourcing of scripts. – philwalk Jul 23 '18 at 20:24
A very simple alternative, if you can provide the alias, or document how it should be created, is to have the alias or script pass an argument identifying the required tag for the error message. Or, equivalently, if the alias is created to a shell script named fooForAlias.sh, you can identify it in sys.argv[0], providing the desired information. – philwalk Jul 23 '18 at 21:40
@philwalk it works in `aliasTest.sh` because you use the `history` command from the same bash process that invoked the commands. There's no way of finding out how the `aliasTest.sh` script itself was invoked. – itsadok Jul 24 '18 at 06:53
@philwalk, ...this means that if your program wants to have this kind of error handling when invoked from the user's shell, you need to have code running `history` *in the user's shell itself*, not in a script that shell starts. It's not an approach one could reasonably follow in practice, nor good hygiene -- if every program only printed good error messages to users who modify their shell configuration to wrap its invocation, we'd have a mess. – Charles Duffy Jul 25 '18 at 12:06
@Charles Duffy ... I'll have to update the description as it's apparently unclear what I'm proposing ... clients must call via the wrapper script, not emulate how it's calling the python script. A separate wrapper is required for each use case. – philwalk Jul 26 '18 at 00:09
I don't see how calling via a wrapper solves the problem -- the wrapper doesn't have access to its parent process's history, only *its own* history, unless the user's shell has some very specific configuration (flushing history to disk immediately on every command, which *is not* default behavior). But yes, please do edit to demonstrate. – Charles Duffy Jul 26 '18 at 00:31
Aside: `$@` isn't useful inside an alias -- aliases are just prefix substitution, so all remaining text *always* follows them. Thus, `alias foo='bar "$@"'` and then `foo one two three` invokes `bar "$@" one two three`; you don't notice this in typical scenarios because for an interactive shell `$@` is almost always a zero-element list (unless you went out of your way to modify it, as with `set -- "first argument" "second argument"`). – Charles Duffy Jul 26 '18 at 21:21
(And unquoted `$@` is identical to `$*`, which is to say it discards the difference between `"first argument" "second argument"` and `"first" "argument" "second" "argument"`). – Charles Duffy Jul 26 '18 at 21:23
...back to-point, though: If you *were* going to modify the calling shell, you could just modify it to set `$0`, and thus not need an extra command-line argument at all. For example: `myAlias() { exec -a myAlias myPyScript "$@"; }` will invoke `myPyScript` with `myAlias` in `$0`. – Charles Duffy Jul 26 '18 at 21:25

score 3 · Answer 3 · answered Jul 18 '18 at 15:31

There is no way to distinguish between when an interpreter for a script is explicitly specified on the command line and when it is deduced by the OS from the hashbang line.

Proof:

$ cat test.sh 
#!/usr/bin/env bash

ps -o command $$

$ bash ./test.sh 
COMMAND
bash ./test.sh

$ ./test.sh 
COMMAND
bash ./test.sh

This prevents you from detecting the difference between the first two cases in your list.

I am also confident that there is no reasonable way of identifying the other (mediated) ways of calling a command.

answered Jul 18 '18 at 15:31

Leon

I agree that it's impossible to robustly retrieve the original shell command line, but I disagree that you proved that here. There's considerably better data in `/proc/self` than is available in `ps`. `cmdline_for_pid() { local -a args; local arg; args=( ); while IFS= read -r -d '' arg; do args+=( "$arg" ); done – Charles Duffy Jul 18 '18 at 22:04
@CharlesDuffy For my test case `cmdline_for_pid` works no different from `ps -o command` – Leon Jul 19 '18 at 07:32
use a more interesting test case -- one with spaces, quotes, etc. – Charles Duffy Jul 19 '18 at 13:09
@CharlesDuffy I know the difference. I just wanted to note that for my illustration of my statement `cmdline_for_pid` doesn't add anything useful while requiring the reader to decipher the proof. – Leon Jul 19 '18 at 14:38
I understood the word "proof" to mean something different than what you intended, then. As I read it, a point is only *proved* about which data is or is not available if that data is reflected in a complete and canonical form; a lossy translation doesn't prove anything useful about what *could* have been retrieved by a less-lossy process, and thus doesn't prove anything about which data is or isn't actually available to be retrieved. – Charles Duffy Jul 19 '18 at 14:57

MayeulC · Answer 4 · 2018-07-19T08:24:22.603

I can see two ways to do this:

The simplest, as suggested by 3sky, would be to parse the command line from inside the python script. argparse can be used to do so in a reliable way. This only works if you can change that script.
A more complex way, slightly more generic and involved, would be to change the python executable on your system.

Since the first option is well documented, here are a bit more details on the second one:

Regardless of the way your script is called, python is run. The goal here is to replace the python executable with a script that checks whether foo.py is among the arguments and, if it is, whether --bar is as well. If not, it prints the message and returns.

In every other case, it executes the real python executable.

Now, hopefully, python is run through the following shebang: #!/usr/bin/env python3, or through python foo.py, rather than a variant of #!/usr/bin/python or /usr/bin/python foo.py. That way, you can change the $PATH variable and prepend a directory where your substitute python resides.

In the other case, you can replace the /usr/bin/python executable, at the risk of not playing nice with updates.

A more complex way of doing this would probably be with namespaces and mounts, but the above is probably enough, especially if you have admin rights.

Example of what could work as a script:

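#!/usr/bin/env bash

# Return 0 if --bar appears among the arguments, 1 otherwise.
function checkbar
{
    for i in "$@"
    do
        if [ "$i" = "--bar" ]
        then
            echo "Well done, you added --bar!"
            return 0
        fi
    done
    return 1
}

# Name of the script being run, or "none" if python got no arguments.
command=$(basename -- "${1:-none}")
if [ "$command" = "foo.py" ]
then
    if ! checkbar "$@"
    then
        echo "Please add --bar to the command line, like so:"
        printf "%q " "$0"
        printf "%q " "$@"
        printf -- "--bar\n"
        exit 1
    fi
fi
# Hand everything else to the real interpreter (placeholder path).
/path/to/real/python "$@"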

However, after re-reading your question, it is obvious that I misunderstood it. In my opinion, it is all right to just print either "foo.py must be called like foo.py --bar", "please add bar to your arguments" or "please try (instead of )", regardless of what the user entered:

If that's a (git) alias, this is a one-time error, and the user will try their alias after creating it, so they know where to put the --bar part
With either /usr/bin/foo.py or python foo.py:
- If the user is not really command line-savvy, they can just paste the working command that is displayed, even if they don't know the difference
- If they are, they should be able to understand the message without trouble, and adjust their command line.

`"$@"`, not `$@`. Unquoted `$@` behaves identically to unquoted `$*` -- which is to say that your arguments all get string-split and expanded as globs. — Charles Duffy, Jul 18 '18 at 21:53
And `exit -1` doesn't make sense -- UNIX exit status is a one-byte **unsigned** (positive) integer. And `[` doesn't promise that `==` will work at all -- the only [POSIX-standardized](http://pubs.opengroup.org/onlinepubs/9699919799/utilities/test.html) string comparison operator is `=`. And the `x` is needless/pointless when you're quoting your expansions and aren't using `test` modes flagged obsolescent (`-a` and `-o` introduce syntactic ambiguities, but those don't exist here). — Charles Duffy, Jul 18 '18 at 21:54
Yeah, that was quickly put together as an example, and not thoroughly tested; I wanted to be on the safe side with `x`. You are right, I will change both `$@` to `"$@"`. The sign bit makes the return value 255, but I can change it to 1, as it doesn't actually matter much. However, that answer is actually off-topic, as it doesn't answer OP's concerns. — MayeulC, Jul 19 '18 at 07:37

score -1 · Answer 5 · answered Jul 18 '18 at 11:34

I know this is a bash question, but I think the easiest way is to modify 'foo.py'. Of course, it depends on how complicated the script is, but maybe it will fit. Here is some sample code:

#!/usr/bin/python

import sys

# print() with a single string works on both Python 2 and Python 3
if len(sys.argv) > 1 and sys.argv[1] == '--bar':
    print('make magic')
else:
    print('Please add the --bar option to your command, like so:')
    print('    python foo.py --bar')

In this case, it does not matter how the user runs this code.

$ ./a.py
Please add the --bar option to your command, like so:
    python foo.py --bar

$ ./a.py -dua
Please add the --bar option to your command, like so:
    python foo.py --bar

$ ./a.py --bar
make magic

$ python a.py --t
Please add the --bar option to your command, like so:
    python foo.py --bar

$ /home/3sky/test/a.py
Please add the --bar option to your command, like so:
    python foo.py --bar

$ alias a='python a.py'
$ a
Please add the --bar option to your command, like so:
    python foo.py --bar

$ a --bar
make magic

Your "answer" doesn't answer OPs question in any way. Your solution always produces the same message when the `--bar` option is not specified, whereas OP needs a different message depending on the actual command used. — Leon, Jul 18 '18 at 17:04
