I would like to parse --this=style command line arguments in bash.

Main requirements:

  • --abc=xyz Obviously should set abc to xyz
  • --thing Should set thing=true
  • --no-thing Should set thing=false
  • Minimal noise when defining default values.

Other constraints:

  • Keeping non --this=style arguments around in some way is helpful.
  • I'm not interested in short form -a -b -c style.
  • A way of requiring certain args would be nice.

I'm interested in both short, lightweight, portable solutions that can be inlined into any script and larger library-style solutions that provide more features. Below is a middle-ground solution I cooked up (worst of both worlds?), but I'd be interested to see what other ideas people have for balancing the trade-off between length and functionality.

parse_args.sh

# Parse double dashed command line options.
#
# Parse options in the following formats:
#  --option         Sets {option} to "true".
#  --no-option      Sets {option} to "false".
#  --option=value   Sets {option} to {value}.
# 
# Option names must already be defined or an "unknown option" error will be
# printed to stderr.
#
# Output variables:
#   remaining_args  All non-option args, in the order
#                   they were given.
#
# Returns: 1 if there was an error.
parse_args() {
  remaining_args=() 
  for arg; do
    if [[ "$arg" =~ ^--(.+)$ ]]; then
      local k="${BASH_REMATCH[1]}"
      local v=true
      if [[ "$k" =~ ^([^=]+)=(.*)$ ]]; then
        k="${BASH_REMATCH[1]}"
        v="${BASH_REMATCH[2]}"
      elif [[ "$k" =~ ^no-(.*)$ ]]; then
        k="${BASH_REMATCH[1]}"
        v=false
      fi
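      if [[ ! -v "$k" ]]; then
        echo "error: unknown option: '$k'" >&2
        return 1
      fi
      # Assign without eval so values containing spaces or shell
      # metacharacters are stored literally.
      printf -v "$k" '%s' "$v"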
    else
      remaining_args+=("$arg")
    fi
  done 

  return 0
}

Here's an example program using it:

example.sh

#!/bin/bash

. parse_args.sh

# Declare defaults
breakfast="avocado toast"
lunch="burrito"
vegetarian=false
vegan=true

# Define help as a string, not a function, so that we have access to the
# defaults
help=$( cat - <<EOF
Usage: $0 --option=value other_food1 ... other_foodN

Options (with given defaults):

  breakfast ($breakfast)
    What to eat for breakfast.

  lunch ($lunch)
    What to eat for lunch.

  vegetarian ($vegetarian)
    Whether the meals should be vegetarian or not.

  vegan ($vegan)
    Whether the meals should be vegan or not.

other_foodN:
  Positional arguments are other foods.  These can be before, after, or in
  between options.
EOF
)

parse_args "$@" || { echo "$help" >&2; exit 1 ; }
set -- "${remaining_args[@]}"

cat - <<EOF
  breakfast: '$breakfast'
      lunch: '$lunch'
 vegetarian: '$vegetarian'
      vegan: '$vegan'
other foods:
EOF

for p; do echo "  '$p'" ; done

Running it looks like:

$ ./example.sh \
  --breakfast=muesli mole \
  --lunch=rosti horchata \
  --vegetarian \
  --no-vegan

  breakfast: 'muesli'
      lunch: 'rosti'
 vegetarian: 'true'
      vegan: 'false'
other foods:
  'mole'
  'horchata'
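
For the "requiring certain args" wish, one possible sketch is a small helper that runs after `parse_args` and checks that the named variables are non-empty. `require_args` is a hypothetical name, not part of `parse_args.sh`, and it assumes a required option is declared with an empty default (for example `lunch=""`).

```shell
# Hypothetical helper (not part of parse_args.sh): succeed only if every
# named variable is set to a non-empty value.
require_args() {
  local name missing=0
  for name; do
    # ${!name:-} is indirect expansion: the value of the variable whose
    # name is stored in $name, or empty if that variable is unset.
    if [[ -z "${!name:-}" ]]; then
      echo "error: missing required option: --$name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

In `example.sh` this could be called right after `parse_args`, for instance `require_args lunch || exit 1`. It reports every missing option before returning, rather than stopping at the first one.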



Edit:

While digging a bit more, I found that an old co-worker of mine ended up porting the "heavyweight" library from my old workplace: https://github.com/kward/shflags/blob/master/README.md This is exactly what I wanted for a "heavyweight" solution. Thanks, Kate, for porting it! I'm still interested in ideas for lightweight versions, though.

  • There's `getopt` for that. It's usually in `/usr/bin`. – alvits Feb 04 '20 at 00:30
  • Before posting I read through [How do I parse command line arguments in Bash?](https://stackoverflow.com/questions/192249/how-do-i-parse-command-line-arguments-in-bash). The requirements there are different; many of the answers came closer to what I'm after, but none were satisfactory. – Brian Braunstein Feb 04 '20 at 00:50
  • Oh, and about getopt and getopts: I don't find them particularly satisfying because they don't do the full job. They still require some parsing and looping, and the result is rather inelegant, with a fair bit of noise and repetition. – Brian Braunstein Feb 04 '20 at 01:00
  • Great code! Stack Overflow is not a forum for satisfying curiosity about other solutions; it's a forum for specific programming problems. See [What topic can I ask about here?](https://stackoverflow.com/help/on-topic). As you seem to have solved your problem with a nice, simple function inside `parse_args.sh`, isn't your question better suited to, for example, codereview.stackexchange than here? – KamilCuk Feb 04 '20 at 01:02
  • Oh, perhaps, although I am looking for answers. For example, a good answer would have been this: https://github.com/kward/shflags/blob/master/README.md It looks like my old co-worker actually ported the library from my old work to public land, yay! Thanks, Kate! This is exactly what I was hoping for in a "heavyweight" solution. It would still be interesting to see what people come up with for elegant lightweight solutions, though. – Brian Braunstein Feb 04 '20 at 01:21
  • I agree with you about getopt/getopts; they leave me with a feeling of "why didn't I just write the whole thing myself?" That said, if you have a script that is complex enough to need these types of arguments, that is a sign (IMO) that you are fast outgrowing Bash and might want to refactor in Python (or anything besides a shell script). Google publishes some of their style guides, including one for Bash, and the same sentiment is reflected there: https://google.github.io/styleguide/shell.xml – Z4-tier Feb 04 '20 at 07:55
  • There is a place for bash scripts, and needing --this=style arguments isn't an indication that they're getting too complex; even simple scripts benefit from this. That's why Google went to the trouble of making a library for it. There are two reasons bash makes sense in my case: 1) I'm stringing together a bunch of command line calls, and 2) python greatly increases a container image size relative to bash. – Brian Braunstein Feb 04 '20 at 14:58
  • I feel that hard-wiring the user-visible name to the variable name is going to trip you up in the end. "Why can't the option be called --watsit?" Agree, and suddenly you have a versioning issue that pollutes the implementation of the thing feature rather than being dealt with in the argument processor. – Gem Taylor Feb 04 '20 at 17:11
  • Also, any other variables you happen to declare at the top of the code become "vulnerable" to getting hit. And you can't use `-v` in the implementation to see whether an option was set; it needs to have an "unset" value. Perhaps it would be safer to simply declare a string or regex containing all the variable names you are willing to accept, and then stomp them with exec if you must. – Gem Taylor Feb 04 '20 at 17:18
  • Also, there is no protection (or handling, eg by making an array variable) for repeated options, which getopt at least can have. – Gem Taylor Feb 04 '20 at 17:19
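
Following up on the comments about hard-wired variable names: one possible sketch keeps every accepted option in a single associative array, so the array's keys act as the allowlist and no other script variable can be overwritten. The names `opts` and `parse_opts` are made up for illustration, and associative arrays assume bash 4 or newer. Repeated options are not handled here; the last value simply wins.

```shell
# Sketch only: `opts` and `parse_opts` are hypothetical names.
# Defaults double as the list of accepted options (bash 4+).
declare -A opts=(
  [breakfast]="avocado toast"
  [lunch]="burrito"
  [vegetarian]=false
  [vegan]=true
)
remaining_args=()

parse_opts() {
  local arg k v
  remaining_args=()
  for arg; do
    case $arg in
      --*=*)  k=${arg#--}; k=${k%%=*}; v=${arg#*=} ;;  # --key=value
      --no-*) k=${arg#--no-}; v=false ;;               # --no-key
      --*)    k=${arg#--}; v=true ;;                   # --key
      *)      remaining_args+=("$arg"); continue ;;    # positional
    esac
    # Accept only keys that already exist in opts.
    if [[ -z $k ]] || [[ -z ${opts[$k]+set} ]]; then
      echo "error: unknown option: '$arg'" >&2
      return 1
    fi
    opts[$k]=$v
  done
}
```

The script would then read values as `${opts[lunch]}`. An argument such as `--remaining_args=x` is rejected because `remaining_args` is not a key in `opts`, even though it is a variable in the script.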

1 Answer

I'd suggest using `case`:

#!/bin/bash

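# Loop on the argument count so an empty argument doesn't end parsing early.
while (( $# )); do
    case $1 in
        --abc=*   ) abc=${1#--abc=};;
        --thing   ) thing=true ;;
        --no-thing) thing=false;;
        *         ) echo "error: unknown argument: '$1'" >&2; exit 1;;
    esac
    shift
done

echo "$abc" "$thing"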

An example of using this:

$ ./test --abc='hello world!' --thing
hello world! true
Ivan
  • The thing that I don't like about the case solutions is the same issue I have with getopt: it's pretty noisy and mixes parsing with the actual arguments. Notice how you've had to write "thing" 4 times here, and if you wanted to specify a default value you'd need a fifth. – Brian Braunstein Feb 04 '20 at 14:55
  • @BrianBraunstein I'm sure you could write the appropriate regex to merge the two lines at least, but in the end the clarity is probably more important than saving a line. – Gem Taylor Feb 04 '20 at 17:13
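
One way to read the suggestion about merging the two `thing` lines is a single `case` branch with alternative patterns (`case` uses glob patterns rather than regexes). This is only a sketch: `parse_flags` is a hypothetical wrapper name, and it returns instead of exiting so it can be sourced or called from a larger script.

```shell
# Sketch: the case-based parser with --thing and --no-thing sharing a branch.
parse_flags() {
  while (( $# )); do
    case $1 in
      --abc=*)
        abc=${1#--abc=} ;;
      --thing|--no-thing)
        # The --no- prefix decides the value; the option name appears once.
        if [[ $1 == --no-* ]]; then thing=false; else thing=true; fi ;;
      *)
        echo "error: unknown argument: '$1'" >&2
        return 1 ;;
    esac
    shift
  done
}
```

Called as `parse_flags "$@" || exit 1`, this still names `thing` in the pattern and in the assignment, so it reduces the repetition rather than removing it.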