This magic line of Bash is indecipherable to me, can someone explain what it's doing?

Question

I found it on this thread: Best way to parse command line args in Bash?

And I'm trying to use it in this code: https://github.com/flyingfishfuse/bash-scripts/blob/master/debootstrap-ubuntu-18-04.sh

And this is the part I don't understand, specifically the third line.

[ $# = 0 ] && help
while [ $# -gt 0 ]; do
  CMD=$(grep -m 1 -Po "^## *$1, --\K[^= ]*|^##.* --\K${1#--}(?:[= ])" go.sh | sed -e "s/-/_/g")
  if [ -z "$CMD" ]; then echo "ERROR: Command '$1' not supported"; exit 1; fi
  shift; eval "$CMD" $@ || shift $? 2> /dev/null
done

Thank you for your assistance!

What do you find confusing? It is assigning a value to CMD. The value it assigns is the output of a badly written grep piped to sed, which should have used `tr`, which should tell you that the code should be avoided, not emulated. The only thing here which might be confusing is the regex, which absolutely should be simplified. Maybe the `${1#--}` is confusing you as well. I would not emulate this code. Avoid it. The `eval "$CMD" $@ || shift $?` should send you running, screaming. — William Pursell, Jan 20 '20 at 13:29

score 1 · Accepted Answer · answered Jan 20 '20 at 13:37

As the other answer explains the rest of the script, I'll try to explain the grep.

grep
- -m 1 - stop after the first matching line.
- -P - use perl regex flavor.
- -o - print only the string that matched.
- "^## *$1, --\K[^= ]*|^##.* --\K${1#--}(?:[= ])"
  - The | is alternation: it ORs two regexes, so we search for one regex or the other.
  - The \K "resets the start of the match". With -o, grep prints only the part of the match that comes after \K. It's often used like grep -Po 'blabla\Kblabla'. For example, `echo abcde | grep -Po 'ab\K..'` will print `cd`.
- ^## *$1, --\K[^= ]*
  - Search for a line that goes ##<zero or more spaces><first argument>, --<anything but not spaces or '=', zero or more times>. And the part <anything but not spaces...> is printed out. So from -- to the next space or = is printed out.
- ^##.* --\K${1#--}(?:[= ])
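  - (?: ... ) is a non-capturing group in Perl regex: it groups like ( ... ) but does not store what it matched for later back-references. For matching purposes, (?:something) is the same as (something).
  - Search for lines that go ##<anything> --<first argument with leading '--' deleted><'=' or space>. Everything after \K is printed, so the trailing '=' or space is part of the output: a non-capturing group still consumes the character, and only a lookahead such as (?=[= ]) would leave it out.
- So basically grep wants to find something like ## first_arg, --<this here> or ## blabla --<first_arg> where first_arg is the current first argument in the script, and it will output the part inside < > (plus the trailing '=' or space in the second form).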
sed
- "s/-/_/g" - Just substitutes all - with _
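To see the pieces above in action, here is a small sketch. It assumes GNU grep built with PCRE support (needed for -P). The temporary file and its two ## lines are made-up stand-ins for go.sh, and lookup is only an illustrative wrapper name.

```shell
# \K on its own: with -o, only the text after \K is reported.
demo=$(echo abcde | grep -Po 'ab\K..')
echo "$demo"    # cd

# Hypothetical stand-in for go.sh (not the real file).
usage_file=$(mktemp)
cat > "$usage_file" <<'EOF'
## -h, --help        Show this help
## --build-dir=DIR   Choose the build directory
EOF

# The pipeline from the question, wrapped in an illustrative function.
lookup() {
  grep -m 1 -Po "^## *$1, --\K[^= ]*|^##.* --\K${1#--}(?:[= ])" "$usage_file" |
    sed -e "s/-/_/g"
}

short=$(lookup -h)            # first alternative matches
long=$(lookup --build-dir)    # second alternative matches
printf '[%s]\n' "$short" "$long"
rm -f "$usage_file"
```

With this sample file the brackets show [help] and [build_dir=]. The second form keeps the trailing = because the non-capturing group still consumes it.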

Thanks for taking the time on that :P I would have needed manpages for \K and a few other things, and on mobile it wasn’t worth the trouble. — D. Ben Knoble, Jan 20 '20 at 13:41
Thank you, I'm writing this partially to recover lost knowledge and partially to teach skiddies. I am adding all that to explain the regex and linking your answer in the comments. As you can see there is a year long gap in commits and before then it was more like 7 years... — moop, Jan 20 '20 at 14:14

score 0 · Answer 2 · answered Jan 20 '20 at 13:08

bash(1) isn’t all that “magic,” though it certainly makes its users feel like wizards! The key to deciphering it is to know which manpages to look at. Or, if you are a language person (like me), to realize that bash is one little glue language; a bunch of other little languages are sprinkled on top of it in the form of little tools like grep(1), sed(1), awk(1), cut(1), etc.

So, let’s dig in:

[ $# = 0 ] && help

This says, run the [ test command; if it succeeds, run the help command. $# is the number of arguments. Run help test at your shell to get a good idea of its workings (help only works for builtins—for the rest you need man(1)).

while [ $# -gt 0 ]; do

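The beginning of a while loop. To note: most modern bash programmers would use either [[ $# -gt 0 ]] (unless they need to be POSIX-sh compatible) or (($# > 0)). Inside [[ ]], the > operator compares strings rather than numbers, so -gt is the right operator for a count.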
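As a rough illustration of those loop tests (count_args is a made-up helper, not part of the script): the arithmetic form (( ... )) compares numbers, while > inside [[ ... ]] compares strings.

```shell
# Illustrative helper: count arguments by shifting them away one at a time.
count_args() {
  local n=0
  while (( $# > 0 )); do    # arithmetic comparison, Bash only
    n=$((n + 1))
    shift
  done
  echo "$n"
}

total=$(count_args a b c)
echo "$total"    # 3

# Inside [[ ]], > is a string comparison: "10" sorts before "9".
if [[ 10 > 9 ]]; then string_cmp=true; else string_cmp=false; fi
# -gt compares the same values as integers.
if [[ 10 -gt 9 ]]; then numeric_cmp=true; else numeric_cmp=false; fi
echo "$string_cmp $numeric_cmp"    # false true
```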

CMD=$(...)

Run the stuff in parens and store the output in CMD (the manual for bash clarifies how this works).

grep ... go.sh

Searches go.sh for a particular pattern (see man re_format for the gory details of regular expressions, and man grep for what the options do—for example, -o means print only the match instead of the whole line).

| sed ...

Send the output of grep to sed as its input. man sed is a good resource for its little language. s/-/_/g replaces every - with _; a more idiomatic way to do that would be to use tr instead of sed: tr - _.
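For comparison, a small sketch of three ways to do that substitution. The sample value build-dir is made up. The last form is Bash parameter expansion, which avoids starting an extra process.

```shell
name="build-dir"    # made-up sample value

with_sed=$(printf '%s\n' "$name" | sed -e 's/-/_/g')
with_tr=$(printf '%s\n' "$name" | tr - _)
with_expansion=${name//-/_}    # Bash-only, no external command

# All three lines print build_dir.
printf '%s\n' "$with_sed" "$with_tr" "$with_expansion"
```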

if [ -z ... ]

Test if the argument is empty. The rest of the if logs a message and exits.

shift

Removes the first command-line argument from $@, so $2 becomes $1 and there is one fewer.

eval "$CMD" "$@"

Run the CMD string (better would be to use an array here) with the remaining arguments.
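One reason to prefer an array: eval joins its arguments into a single string and parses that string again, so an argument containing a space is split even when "$@" is quoted. A sketch with a made-up show function and a made-up argument:

```shell
# Illustrative command: print each argument it receives in angle brackets.
show() { printf '<%s>' "$@"; echo; }

set -- "hello world"    # one argument containing a space

CMD=show
with_eval=$(eval "$CMD" "$@")      # eval re-parses: the argument gets split

cmd=(show)                         # command stored as an array
with_array=$("${cmd[@]}" "$@")     # argument boundaries preserved

echo "$with_eval"     # <hello><world>
echo "$with_array"    # <hello world>
```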

|| shift $? 2>/dev/null

If it fails, shift off as many arguments as the exit code ($?), and redirect error messages to /dev/null (such as when there aren’t enough arguments to shift).

This part is a little bizarre, and probably only makes sense in the context of the application. Generally, exit codes don’t tell you what to shift. But you could program something that way.
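A minimal sketch of how such a convention could work, written without eval for clarity. The handler names (opt_name, opt_verbose), the parse function, and the options are all hypothetical. The only idea taken from the script is that a handler's non-zero exit status tells the loop how many extra arguments it consumed.

```shell
# Hypothetical handlers: the return value is the number of values consumed.
opt_name()    { name=$1; return 1; }       # used one value, so shift 1 more
opt_verbose() { verbose=yes; return 0; }   # used nothing

parse() {
  while [ $# -gt 0 ]; do
    case $1 in
      --name)    shift; opt_name "$@"    || shift $? 2>/dev/null ;;
      --verbose) shift; opt_verbose "$@" || shift $? 2>/dev/null ;;
      *) echo "ERROR: Command '$1' not supported" >&2; return 2 ;;
    esac
  done
}

parse --name alice --verbose
echo "name=$name verbose=$verbose"    # name=alice verbose=yes
```

The drawback is visible here: a handler cannot report a real failure, because any non-zero status is read as a shift count.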

The real magic is in the parameter expansion and the use of `grep -P` with `\K` and lookaheads. — tripleee, Jan 20 '20 at 13:18
holy crap you are awesome! I'm rewriting that whole part now thanks to your wisdom and its going to be CLEAN AND READABLE! — moop, Jan 20 '20 at 13:31
@moop thanks! I do a lot of bash programming; there *are* (contrary to some beliefs) best practices (esp. for readability and error-handling). Some of my scripts are on GitHub, which should be linked in my profile. Also see [strict mode](http://redsymbol.net/articles/unofficial-bash-strict-mode/), the [advanced guide](http://www.tldp.org/LDP/abs/html/abs-guide.html), [shell style guide](https://google.github.io/styleguide/shell.xml), and this [incredible wiki](http://mywiki.wooledge.org/BashGuide) — D. Ben Knoble, Jan 20 '20 at 13:35
@tripleee oh agreed; some of the stuff in there should make you run screaming :P — D. Ben Knoble, Jan 20 '20 at 19:05

score 0 · Answer 3 · answered Jan 20 '20 at 13:21

CMD=$(grep -m 1 -Po "^## *$1, --\K[^= ]*|^##.* --\K${1#--}(?:[= ])" go.sh | sed -e "s/-/_/g")

The above line greps the file go.sh, stopping after the first matching line (-m 1) and printing only the matched part (-o) of the Perl-style regular expression (-P): "^## *$1, --\K[^= ]*|^##.* --\K${1#--}(?:[= ])". Before grep runs, the shell substitutes $1 with the script argument currently being processed inside the while loop. The output of grep is then piped to sed, where each '-' is changed to '_': sed -e "s/-/_/g". The resulting string becomes the value of the CMD variable.
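Because the pattern is in double quotes, the shell expands $1 and ${1#--} before grep ever sees it. A quick way to check what grep receives, using a made-up argument --build-dir:

```shell
set -- --build-dir    # pretend the script was called with this argument

# ${1#--} removes a leading "--" from $1, if present.
stripped=${1#--}
pattern="^## *$1, --\K[^= ]*|^##.* --\K${1#--}(?:[= ])"

printf '%s\n' "$stripped"    # build-dir
printf '%s\n' "$pattern"
```

The second printf shows ^## *--build-dir, --\K[^= ]*|^##.* --\Kbuild-dir(?:[= ]), which is the regex grep actually compiles.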

This solved the regex question! I was getting confusing results from a debugger and its because that $1 was indeed being processed by the shell! — moop, Jan 20 '20 at 13:40
