Does anyone know of any resources that talk about best practices or design patterns for shell scripts (sh, bash etc.)?
- I just wrote up a little article on [template pattern in BASH](http://quickshiftin.com/blog/2014/01/template-method-pattern-bash/) last night. See what you think. – quickshiftin Jan 28 '14 at 16:02
8 Answers
I have written quite complex shell scripts and my first suggestion is "don't". The reason is that it is fairly easy to make a small mistake that hinders your script, or even makes it dangerous.
That said, I don't have other resources to point you to, only my personal experience. Here is what I normally do; it is overkill, but it tends to be solid, although very verbose.
Invocation
Make your script accept long and short options. Be careful, because there are two commands for parsing options, getopt and getopts. Use getopt, as you will face less trouble with it.
CommandLineOptions__config_file=""
CommandLineOptions__debug_level=""

getopt_results=`getopt -s bash -o c:d:: --long config_file:,debug_level:: -- "$@"`

if test $? != 0
then
    echo "unrecognized option"
    exit 1
fi

eval set -- "$getopt_results"

while true
do
    case "$1" in
        --config_file)
            CommandLineOptions__config_file="$2";
            shift 2;
            ;;
        --debug_level)
            CommandLineOptions__debug_level="$2";
            shift 2;
            ;;
        --)
            shift
            break
            ;;
        *)
            echo "$0: unparseable option $1"
            EXCEPTION=$Main__ParameterException
            EXCEPTION_MSG="unparseable option $1"
            exit 1
            ;;
    esac
done

if test "x$CommandLineOptions__config_file" == "x"
then
    echo "$0: missing config_file parameter"
    EXCEPTION=$Main__ParameterException
    EXCEPTION_MSG="missing config_file parameter"
    exit 1
fi
Another important point: a program should always return zero if it completes successfully, and non-zero if something went wrong.
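As a minimal sketch of that convention (the backup operation here is only a placeholder):

#!/bin/bash
# back up the file given as the first argument; the operation itself is just an example
if ! cp "$1" "$1.bak"
then
    echo "$0: unable to back up $1" >&2
    exit 1    # non-zero: something went wrong
fi
exit 0        # zero: completed successfully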
Function calls
You can call functions in bash; just remember to define them before the call. Functions, like scripts, can only return numeric status codes. This means you have to invent a different strategy to return string values. My strategy is to use a variable called RESULT to store the result, and to return 0 if the function completed cleanly. If you return a value different from zero, you are raising an exception: set two "exception variables" (mine: EXCEPTION and EXCEPTION_MSG), the first containing the exception type and the second a human-readable message.
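A bare-bones sketch of this convention; the function name, the uname call and the exception names are just illustrative:

readonly Example__NoException=""
readonly Example__CommandException="Example__CommandException"

function Example__getKernelName {
    # "returns" a string through the global RESULT variable
    RESULT=""
    EXCEPTION=$Example__NoException
    EXCEPTION_MSG=""
    local name
    name=`uname -s`
    if test $? != 0
    then
        EXCEPTION=$Example__CommandException
        EXCEPTION_MSG="uname failed"
        return 1
    fi
    RESULT="$name"
    return 0
}

if Example__getKernelName
then
    echo "kernel is $RESULT"
else
    echo "error: $EXCEPTION_MSG"
fi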
When you call a function, its parameters are assigned to the special variables $1, $2 and so on ($0 keeps the name of the script). I suggest you copy them into more meaningful names, declared inside the function as local:
function foo {
    local bar="$1"
}
Error prone situations
In bash, unless you declare otherwise, an unset variable is evaluated as an empty string. This is very dangerous in case of a typo, because the badly typed variable will not be reported; it will simply be evaluated as empty. Use
set -o nounset
to prevent this from happening. Be careful though, because if you do this, the program will abort every time you evaluate an undefined variable. For this reason, the only safe way to check whether a variable is defined is the following:
if test "x${foo:-notset}" == "xnotset"
then
echo "foo not set"
fi
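If you need to distinguish "unset" from "set to the sentinel value", the ${foo+word} expansion (used again in the module guards further below) also works under set -o nounset:

if test "x${foo+defined}" != "xdefined"
then
    echo "foo not set"
fi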
You can declare variables as readonly:
readonly readonly_var="foo"
Modularization
You can achieve "Python-like" modularization with the following code:
set -o nounset
function getScriptAbsoluteDir {
    # @description used to get the script path
    # @param $1 the script $0 parameter
    local script_invoke_path="$1"
    local cwd=`pwd`

    # absolute path ? if so, the first character is a /
    if test "x${script_invoke_path:0:1}" = 'x/'
    then
        RESULT=`dirname "$script_invoke_path"`
    else
        RESULT=`dirname "$cwd/$script_invoke_path"`
    fi
}

script_invoke_path="$0"
script_name=`basename "$0"`
getScriptAbsoluteDir "$script_invoke_path"
script_absolute_dir=$RESULT
function import() {
    # @description importer routine to get external functionality.
    # @description the first location searched is the script directory.
    # @description if not found, search the module in the paths contained in $SHELL_LIBRARY_PATH environment variable
    # @param $1 the .shinc file to import, without .shinc extension
    module=$1

    if test "x$module" == "x"
    then
        echo "$script_name : Unable to import unspecified module. Dying."
        exit 1
    fi

    if test "x${script_absolute_dir:-notset}" == "xnotset"
    then
        echo "$script_name : Undefined script absolute dir. Did you remove getScriptAbsoluteDir? Dying."
        exit 1
    fi

    if test "x$script_absolute_dir" == "x"
    then
        echo "$script_name : empty script path. Dying."
        exit 1
    fi

    if test -e "$script_absolute_dir/$module.shinc"
    then
        # import from script directory
        . "$script_absolute_dir/$module.shinc"
        return
    elif test "x${SHELL_LIBRARY_PATH:-notset}" != "xnotset"
    then
        # import from the shell script library path
        # save the separator and use the ':' instead
        local saved_IFS="$IFS"
        IFS=':'
        for path in $SHELL_LIBRARY_PATH
        do
            if test -e "$path/$module.shinc"
            then
                # restore the standard separator before sourcing and returning
                IFS="$saved_IFS"
                . "$path/$module.shinc"
                return
            fi
        done
        # restore the standard separator
        IFS="$saved_IFS"
    fi
    echo "$script_name : Unable to find module $module."
    exit 1
}
You can then import files with the extension .shinc with the following syntax:
import "AModule/ModuleFile"
The module will be searched for first in the script directory, and then in the paths listed in SHELL_LIBRARY_PATH. As you always import into the global namespace, remember to prefix all your functions and variables with a proper prefix, otherwise you risk name clashes. I use a double underscore where Python uses a dot.
Also, put this as the first thing in your module:
# avoid double inclusion
if test "${BashInclude__imported+defined}" == "defined"
then
return 0
fi
BashInclude__imported=1
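Putting it together, a minimal hypothetical module, saved for instance as Greeter.shinc next to the script, could look like this:

# avoid double inclusion
if test "${Greeter__imported+defined}" == "defined"
then
    return 0
fi
Greeter__imported=1

# everything lives in the global namespace, so prefix with the module name
Greeter__greeting="hello"

function Greeter__greet {
    # @description stores a greeting for $1 in RESULT
    # @param $1 the name to greet
    RESULT="$Greeter__greeting, $1"
    return 0
}

The main script would then load it with import "Greeter" and call Greeter__greet "world".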
Object oriented programming
In bash you cannot do object-oriented programming, unless you build a quite complex system of object allocation (I thought about it; it's feasible, but insane). In practice you can, however, do "Singleton-oriented programming": you have one instance of each object, and only one.
What I do is define an object in a module (see the modularization entry). Then I define empty variables (analogous to member variables), an init function (the constructor) and member functions, as in this example code:
# avoid double inclusion
if test "${Table__imported+defined}" == "defined"
then
    return 0
fi
Table__imported=1

readonly Table__NoException=""
readonly Table__ParameterException="Table__ParameterException"
readonly Table__MySqlException="Table__MySqlException"
readonly Table__NotInitializedException="Table__NotInitializedException"
readonly Table__AlreadyInitializedException="Table__AlreadyInitializedException"

# an example for module enum constants, used in the mysql table, in this case
readonly Table__GENDER_MALE="GENDER_MALE"
readonly Table__GENDER_FEMALE="GENDER_FEMALE"

# private: prefixed with p_ (a bash variable cannot start with _)
p_Table__mysql_exec="" # will contain the executed mysql command
p_Table__initialized=0

function Table__init {
    # @description init the module with the database parameters
    # @param $1 the mysql config file
    # @exception Table__NoException, Table__ParameterException
    EXCEPTION=""
    EXCEPTION_MSG=""
    EXCEPTION_FUNC=""
    RESULT=""

    if test $p_Table__initialized -ne 0
    then
        EXCEPTION=$Table__AlreadyInitializedException
        EXCEPTION_MSG="module already initialized"
        EXCEPTION_FUNC="$FUNCNAME"
        return 1
    fi

    local config_file="$1"
    # yes, I am aware that I could put default parameters and other niceties, but I am lazy today
    if test "x$config_file" = "x"; then
        EXCEPTION=$Table__ParameterException
        EXCEPTION_MSG="missing parameter config file"
        EXCEPTION_FUNC="$FUNCNAME"
        return 1
    fi

    p_Table__mysql_exec="mysql --defaults-file=$config_file --silent --skip-column-names -e "

    # mark the module as initialized
    p_Table__initialized=1

    EXCEPTION=$Table__NoException
    EXCEPTION_MSG=""
    EXCEPTION_FUNC=""
    return 0
}

function Table__getName() {
    # @description gets the name of the person
    # @param $1 the row identifier
    # @result the name
    EXCEPTION=""
    EXCEPTION_MSG=""
    EXCEPTION_FUNC=""
    RESULT=""

    if test $p_Table__initialized -eq 0
    then
        EXCEPTION=$Table__NotInitializedException
        EXCEPTION_MSG="module not initialized"
        EXCEPTION_FUNC="$FUNCNAME"
        return 1
    fi

    local id="$1"
    if test "x$id" = "x"; then
        EXCEPTION=$Table__ParameterException
        EXCEPTION_MSG="missing parameter identifier"
        EXCEPTION_FUNC="$FUNCNAME"
        return 1
    fi

    # assign on a separate line from the local declaration, so $? is the mysql exit status
    local name
    name=`$p_Table__mysql_exec "SELECT name FROM table WHERE id = '$id'"`
    if test $? != 0 ; then
        EXCEPTION=$Table__MySqlException
        EXCEPTION_MSG="unable to perform select"
        EXCEPTION_FUNC="$FUNCNAME"
        return 1
    fi

    RESULT=$name
    EXCEPTION=$Table__NoException
    EXCEPTION_MSG=""
    EXCEPTION_FUNC=""
    return 0
}
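A sketch of how the module would then be driven from the main script, using the same RESULT/EXCEPTION convention (the config file path and the row id are made up):

import "Table"

Table__init "/etc/myapp/mysql.cnf"
if test $? -ne 0
then
    echo "$script_name : $EXCEPTION_FUNC : $EXCEPTION_MSG"
    exit 1
fi

Table__getName 42
if test $? -ne 0
then
    echo "$script_name : $EXCEPTION_FUNC : $EXCEPTION_MSG"
    exit 1
fi
echo "name is $RESULT"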
Trapping and handling signals
I found this useful for catching and handling signals.
function Main__interruptHandler() {
    # @description signal handler for SIGINT
    echo "SIGINT caught"
    exit
}

function Main__terminationHandler() {
    # @description signal handler for SIGTERM
    echo "SIGTERM caught"
    exit
}

function Main__exitHandler() {
    # @description signal handler for end of the program (clean or unclean).
    # probably redundant call, we already call the cleanup in main.
    exit
}

# catch signals
trap Main__interruptHandler INT
trap Main__terminationHandler TERM
trap Main__exitHandler EXIT

function Main__main() {
    # body
}

Main__main "$@"
Hints and tips
If something does not work for some reason, try to reorder the code. Order is important and not always intuitive.
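One classic example: a function is only known from the line where its definition has been executed, so calling it earlier in the file fails.

foo              # fails with "foo: command not found"

function foo {
    echo "foo called"
}

foo              # works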
Do not even consider working with tcsh. It does not support functions, and it's horrible in general.
Please note: if you have to use the kind of things I wrote here, it means that your problem is too complex to be solved with shell. Use another language. I had to use it due to human factors and legacy.

- Wow, and I thought I was going for overkill in bash... I tend to use isolated functions and abuse subshells (thus I suffer when speed is in any way relevant). No global variables ever, neither in nor out (to preserve remains of sanity). All returns through stdout or file output. set -u/set -e (too bad set -e becomes useless as soon as first if, and most of my code often is in there). Function arguments taken with [local something="$1"; shift] (allows for easy reordering when refactoring). After one 3000 lines bash script I tend to write even the smallest scripts in this fashion... – Eugene Aug 22 '09 at 10:10
- Small corrections for modularization: 1. you need a return after . "$script_absolute_dir/$module.shinc" to avoid the missing-module warning; 2. you must set IFS="$saved_IFS" before the return on finding the module in $SHELL_LIBRARY_PATH – Duff May 02 '13 at 12:43
- "Human factors" are the worst ones. Machines don't fight you when you give them something better. – jeremyjjbrown Dec 18 '14 at 23:20
- Why `getopt` vs `getopts`? `getopts` is more portable and works in any POSIX shell. Especially since the question is *shell best practices* instead of specifically bash best practices, I would support POSIX compliance to support multiple shells when possible. – Wimateeka Sep 05 '17 at 14:41
- Thanks for offering all the advice for shell scripting even though you are being honest: "If you have to use the kind of things I wrote here, it means that your problem is too complex to be solved with shell. Use another language. I had to use it due to human factors and legacy." – dieHellste Nov 15 '17 at 10:59
- Would `if test "x${foo:+notset}" == "xnotset"` be better? Using :- means if foo was set to "notset" it would treat it as not set, whereas with :+, if foo was set to notset, it would expand to "x", so it will never fail. Not familiar enough to know if :+ is limited to bash-only or recent bash versions. – Stuart R. Jefferys Jul 16 '18 at 22:49
Take a look at the Advanced Bash-Scripting Guide for a lot of wisdom on shell scripting - not just Bash, either.
Don't listen to people telling you to look at other, arguably more complex languages. If shell scripting meets your needs, use that. You want functionality, not fanciness. New languages provide valuable new skills for your resume, but that doesn't help if you have work that needs to be done and you already know shell.
As stated, there aren't a lot of "best practices" or "design patterns" for shell scripting. Different uses have different guidelines and bias - like any other programming language.

- Note that for scripts of even slight complexity, that is NOT a best practice. Coding is not about just getting something to work. It is about building it quickly, easily, and it being reliable, reusable, and easy to read and maintain (especially for others). Shell scripts do not scale well to any level. More robust languages are much simpler for projects with any logic. – drifter Dec 09 '11 at 13:26
Shell script is a language designed to manipulate files and processes. While it's great for that, it's not a general-purpose language, so always try to glue together logic from existing utilities rather than recreating new logic in shell script.
Other than that general principle I've collected some common shell script mistakes.
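A tiny, hypothetical illustration of the "glue existing utilities" principle: to report the most frequent lines in a file, combine the standard tools instead of hand-rolling the counting logic in shell.

# frequency count of lines in a (made-up) log file, most common first
sort access.log | uniq -c | sort -rn | head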

Know when to use it. For quick and dirty gluing of commands together it's okay. If you need to make more than a few non-trivial decisions, loops, anything like that, go for Python or Perl, and modularize.
The biggest problem with shell is often that the end result just looks like a big ball of mud: 4000 lines of bash and growing... and you can't get rid of it because now your whole project depends on it. Of course, it started as 40 lines of beautiful bash.

There was a great session at OSCON this year (2008) on just this topic: http://assets.en.oreilly.com/1/event/12/Shell%20Scripting%20Craftsmanship%20Presentation%201.pdf

Use set -e so you don't plow forward after errors. Try making your script sh-compatible, without relying on bash, if you want it to run on non-Linux systems.
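A minimal header in that spirit (mktemp is not part of strict POSIX, but it is available on practically every modern system):

#!/bin/sh
set -e   # stop on the first failing command
set -u   # treat unset variables as an error

workdir=`mktemp -d` || exit 1
trap 'rm -rf "$workdir"' EXIT

# ... do the actual work in "$workdir" here ...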

Easy: use Python instead of shell scripts. You get a nearly 100-fold increase in readability, without having to complicate anything you don't need, while preserving the ability to evolve parts of your script into functions, objects, persistent objects (ZODB) or distributed objects (Pyro) with almost no extra code.
- You contradict yourself by saying "without having to complicate" and then listing the various complexities you think add value, while in most cases they are abused into ugly monsters instead of being used to simplify problems and implementation. – Evgeny Feb 08 '10 at 18:39
- This implies a great drawback: your scripts will not be portable to systems where Python is not present. – astropanic Aug 25 '11 at 21:48
- I realize this was answered in '08 (it is now two days before '12); however, for those looking at this years later, I would caution anyone against turning their back on languages like Python or Ruby, as it is more likely that they are available and, if not, they are a command (or a couple of clicks) away from being installed. If you need further portability, think about writing your program in Java, as you will be hard-pressed to find a machine that doesn't have a JVM available. – Wil Moore III Dec 29 '11 at 21:14
- @Pithikos, sure, and fiddle with the hassle of python2 vs python3. Nowadays I write all my tools in Go, and couldn't be happier. – astropanic Mar 07 '17 at 11:18
- Python is probably my favorite language, but when the primary need is manipulating files and processes and calling other tools, Python is a verbose pain in the butt compared to shell scripts. Sure, *consider* Python (or Ruby or Go or whatever), but shell has its place too. – Soren Bjornstad Dec 20 '18 at 23:39
To find some "best practices", look at how Linux distros (e.g. Debian) write their init scripts (usually found in /etc/init.d).
Most of them are free of "bash-isms" and have a good separation of configuration settings, library files and source formatting.
My personal style is to write a master shell script which defines some default variables, and then tries to load ("source") a configuration file which may contain new values.
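A sketch of that pattern; the variable names and the configuration file path are only examples:

#!/bin/sh
# defaults, overridable by the configuration file
LOGDIR="/var/log/myapp"
VERBOSE=0

# load new values if a configuration file is present
CONFIG="/etc/myapp.conf"
if [ -r "$CONFIG" ]
then
    . "$CONFIG"
fi

echo "logging to $LOGDIR (verbose=$VERBOSE)"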
I try to avoid functions since they tend to make the script more complicated. (Perl was created for that purpose.)
To make sure the script is portable, test it not only with #!/bin/sh, but also with #!/bin/ash, #!/bin/dash, etc. You'll spot the Bash-specific code soon enough.
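You can also spot bashisms without editing the shebang by running the script explicitly under another shell (the script name here is made up; busybox is only needed for the ash variant):

dash -n myscript.sh          # parse only, no execution
dash ./myscript.sh           # run under dash
busybox ash ./myscript.sh    # run under busybox ash, if busybox is installed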
