50

For personal development and the projects I work on, we use four spaces instead of tabs. However, I need to use a heredoc, and I can't do so without breaking the indentation flow.

The only working way I can think of is this:

usage() {
    cat << '    EOF' | sed -e 's/^    //';
    Hello, this is a cool program.
    This should get unindented.
    This code should stay indented:
        something() {
            echo It works, yo!;
        }
    That's all.
    EOF
}

Is there a better way to do this?

Let me know if this belongs on the Unix/Linux Stack Exchange instead.

IBPX
  • 3
    No, it's a programming question, it's legit here. Thanks for checking. – Tom Zych Nov 19 '15 at 22:35
  • This seems like a good, clear, straightforward way to do it. Don't know other ways offhand. I'll upvote it when I get more votes, maybe someone knows something interesting. – Tom Zych Nov 19 '15 at 22:38
  • Ow, nice solution! Unfortunately not possible in most other languages, making indenting blocks of code tricky. – Kenney Nov 19 '15 at 22:38
  • 3
    The correct shell way to do this is to indent with tabs. Are you really that strongly against tabs even if only used for here-docs like this? – John1024 Nov 19 '15 at 22:44
  • @John1024 If I was using tabs, that'd be fine. However, I really, _really_ don't want to mix both. In most cases, it's not my choice what to use, as I have to conform to style guides. – IBPX Nov 19 '15 at 22:50
  • Spaces vs. tabs just comes down to preference. I prefer spaces because they're the same size for everyone, and it's easier to align stuff. I used to use tabs up until 6 months ago. You're entitled to use tabs if that works for you, but I personally have to (and prefer to) use spaces. – IBPX Nov 19 '15 at 22:54
  • 3
    You would only need to use tabs _within the heredoc_. You are free to use spaces everywhere else. At the risk of sounding harsh, if you have a shell style guide that doesn't know that the shell is designed this way, then the style guide should be updated. – John1024 Nov 20 '15 at 00:02
  • Like I said, it's not my choice. Even if it was, I wouldn't mix tabs and spaces. I would either use one or the other. Mixing them, in practice, is one of the worst things any developer can do. – IBPX Nov 20 '15 at 00:42
  • 5
    I can think of many things worse than mixing spaces and tabs in a language that doesn't distinguish between them for indentation. – chepner Nov 20 '15 at 00:55
  • 2
    Spaces have a fixed width. Tabs have a varying width. So if I'm indenting with four spaces and then want a heredoc to be indented, I _could_ use tabs, but that means everyone editing my code has to have their editor set to 4-width tabs. At the end of the day, it's just preference. I'm not going to try to tell you that spaces are better than tabs or otherwise, that's up to you. – IBPX Nov 20 '15 at 01:09
  • 1
    Instead of `sed -e 's/^    //'` you could also use `cut -c 5-`. – user12205 Nov 20 '15 at 02:13
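
A quick sketch of the `cut` variant suggested in the last comment, assuming the same four-space indentation as the question (the function name `usage_cut` is made up for illustration). One difference worth knowing: `cut -c 5-` drops the first four characters of every line whatever they are, while the `sed` version only removes them when they are spaces.

```shell
usage_cut() {
    # The delimiter is quoted so that its four leading spaces are part of it
    cat << '    EOF' | cut -c 5-
    Hello, this is a cool program.
        something() {
            echo It works, yo!;
        }
    EOF
}
```

Calling `usage_cut` prints the text with the first four columns removed, so the nested lines keep their relative indentation.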

2 Answers

66

(If you are using bash 4, scroll to the end for what I think is the best combination of pure shell and readability.)

For heredocs, using tabs is not a matter of preference or style; it's how the language is defined.

usage () {
⟶# Lines between EOF are each indented with the same number of tabs
⟶# Spaces can follow the tabs for in-document indentation
⟶cat <<-EOF
⟶⟶Hello, this is a cool program.
⟶⟶This should get unindented.
⟶⟶This code should stay indented:
⟶⟶    something() {
⟶⟶        echo It works, yo!;
⟶⟶    }
⟶⟶That's all.
⟶EOF
}
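
To see the tab rule in action without worrying about an editor converting tabs to spaces, here is a small sketch that generates the script with printf, so each tab is spelled out as \t. The function names are invented for the demo, and it assumes an `sh` on the PATH.

```shell
# Hypothetical demo: the heredoc body is indented with two tabs, the closing
# delimiter with one. <<- strips all leading tabs; spaces after them remain.
tab_heredoc_demo() {
    printf 'cat <<-EOF\n\t\tunindented\n\t\t    kept\n\tEOF\n' | sh
}

# Contrast: <<- strips tabs only, so space indentation is left in place.
space_heredoc_demo() {
    printf 'cat <<-EOF\n    not stripped\nEOF\n' | sh
}

tab_heredoc_demo
space_heredoc_demo
```

The first function prints "unindented" flush left and "kept" with its four spaces; the second prints its line with the four leading spaces still attached.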

Another option is to avoid a here document altogether, at the cost of having to use more quotes and line continuations:

usage () {
    printf '%s\n' \
        "Hello, this is a cool program." \
        "This should get unindented." \
        "This code should stay indented:" \
        "    something() {" \
        "        echo It works, yo!" \
        "    }" \
        "That's all."
}

If you are willing to forgo POSIX compatibility, you can use an array to avoid the explicit line continuations:

usage () {
    message=(
        "Hello, this is a cool program."
        "This should get unindented."
        "This code should stay indented:"
        "    something() {"
        "        echo It works, yo!"
        "    }"
        "That's all."
    )
    printf '%s\n' "${message[@]}"
}

The following uses a here document again, but this time with bash 4's readarray command to populate an array. Parameter expansion takes care of removing a fixed number of spaces from the beginning of each line.

usage () {
    # No tabs necessary!
    readarray message <<'    EOF'
        Hello, this is a cool program.
        This should get unindented.
        This code should stay indented:
            something() {
                echo It works, yo!;
            }
        That's all.
    EOF
    # Each line is indented an extra 8 spaces, so strip them
    printf '%s' "${message[@]#        }"
}
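
The printf format above has no \n because readarray keeps each line's trailing newline unless told otherwise. A small sketch of the alternative, using readarray's -t option to drop the newline so the usual '%s\n' format applies (`usage_t` is an illustrative name; bash 4 or later assumed):

```shell
usage_t() {
    local -a message
    # -t removes the trailing newline from each element as it is read
    readarray -t message <<'    EOF'
        Hello, this is a cool program.
            something() {
    EOF
    # Strip the 8 spaces of indentation, then add the newline back explicitly
    printf '%s\n' "${message[@]#        }"
}
```

Which form to use is mostly taste; -t is handy if the array is also needed elsewhere without embedded newlines.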

One last variation: you can use an extended pattern to simplify the parameter expansion. Instead of counting how many spaces are used for indentation, end the indentation with a chosen non-space character, then match the fixed prefix. I use a colon followed by a space. (The space following the colon is for readability; it can be dropped with a minor change to the prefix pattern.)

(Also, as an aside, one drawback to your very nice trick of using a here-doc delimiter that starts with whitespace is that it prevents you from performing expansions inside the here-doc. If you wanted to do so, you'd have to either leave the delimiter unindented, or make one minor exception to your no-tab rule and use <<-EOF and a tab-indented closing delimiter.)

usage () {
    # No tabs necessary!
    closing="That's all"
    readarray message <<EOF
       : Hello, this is a cool program.
       : This should get unindented.
       : This code should stay indented:
       :      something() {
       :          echo It works, yo!;
       :      }
       : $closing
EOF
    shopt -s extglob
    printf '%s' "${message[@]#+( ): }"
    shopt -u extglob
}
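
The aside about expansions is easy to check. A minimal sketch (the function names and the `name` variable are invented for the demo): quoting any part of the delimiter, which a delimiter with leading spaces requires, turns expansion off, while a bare delimiter leaves it on.

```shell
name=world   # hypothetical variable, only here to show expansion

quoted_demo() {
    # Quoted delimiter: the body is taken literally
    cat <<'EOF'
Hello, $name
EOF
}

unquoted_demo() {
    # Bare delimiter: parameter expansion happens inside the body
    cat <<EOF
Hello, $name
EOF
}

quoted_demo
unquoted_demo
```

The first call prints the literal text `Hello, $name`; the second prints `Hello, world`.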
terdon
chepner
  • 1
    How do you include a blank line using this last variation? – Jason Harrison Jul 05 '21 at 16:51
  • 1
    Just don't put anything after the initial `: `. – chepner Jul 05 '21 at 16:56
  • @chepner I think you would also have to modify the array expansion to remove the space after the colon. Part of the challenge I ran into is that my editor and pre-commit hooks strip trailing spaces. My "solution" was to replace the ": " with "::" in the prefix and array expansion. – Jason Harrison Jul 05 '21 at 17:01
  • This answer is [discussed on Unix & Linux](https://unix.stackexchange.com/q/722377/50687). – A.L Oct 25 '22 at 23:32
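
Following up on the blank-line comments: a sketch of the `::` prefix variant Jason Harrison describes, assuming bash 4 or later (`usage_colons` is an illustrative name, and the message text is made up). Because nothing has to follow the second colon, an empty line survives editors that strip trailing spaces.

```shell
usage_colons() {
    local -a message
    readarray message <<EOF
       ::First paragraph.
       ::
       ::Second paragraph.
EOF
    shopt -s extglob
    # Remove one or more spaces followed by the two-colon marker
    printf '%s' "${message[@]#+( )::}"
    shopt -u extglob
}
```

The middle line comes out empty, giving a blank line between the two paragraphs.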
0
geta() {
  local _ref=$1
  local -a _lines
  local _i
  local _leading_whitespace
  local _len

  IFS=$'\n' read -rd '' -a _lines ||:
  _leading_whitespace=${_lines[0]%%[^[:space:]]*}
  _len=${#_leading_whitespace}
  for _i in "${!_lines[@]}"; do
    printf -v "${_ref}[$_i]" '%s' "${_lines[$_i]:$_len}"
  done
}

gets() {
  local _ref=$1
  local -a _result
  local IFS

  geta _result
  IFS=$'\n'
  printf -v "$_ref" '%s' "${_result[*]}"
}

This is a slightly different approach, which requires Bash 4.1 because it relies on printf -v assigning to array elements. (For earlier versions, substitute the geta function at the end of this answer.) It deals with arbitrary leading whitespace, not just a predetermined amount.

The first function, geta, reads from stdin, strips leading whitespace and returns the result in the array whose name was passed in.

The second, gets, does the same thing as geta but returns a single string with newlines intact (except the last).

If you pass in the name of an existing variable to geta, make sure it is already empty.

Invoke geta like so:

$ geta hello <<'EOS'
>    hello
>    there
> EOS
$ declare -p hello
declare -a hello='([0]="hello" [1]="there")'

gets:

$ unset -v hello
$ gets hello <<'EOS'
>     hello
>     there
> EOS
$ declare -p hello
declare -- hello="hello
there"

This approach should work for any combination of leading whitespace characters, so long as they are the same characters for all subsequent lines. The function strips the same number of characters from the front of each line, based on the number of leading whitespace characters in the first line.
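
The prefix computation is the core of this. A stripped-down sketch of just that step (`leading_ws_len` is a hypothetical helper, not part of the functions above; it uses the POSIX `[!...]` negation rather than `[^...]`):

```shell
leading_ws_len() {
    # Delete the longest suffix that begins with a non-whitespace character,
    # which leaves only the leading whitespace of the argument.
    _prefix=${1%%[![:space:]]*}
    # ${#var} expands to the length of what is left
    printf '%s\n' "${#_prefix}"
}

leading_ws_len '    hello there'
leading_ws_len 'no indent'
```

The first call prints 4 and the second prints 0. geta does the same measurement once, on the first line, and then slices that many characters off every line.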

The reason all the variables start with underscore is to minimize the chance of a name collision with the passed array name. You might want to rewrite this to prefix them with something even less likely to collide.

To use in OP's function:

gets usage_message <<'EOS'
    Hello, this is a cool program.
    This should get unindented.
    This code should stay indented:
        something() {
            echo It works, yo!;
        }
    That's all.
EOS

usage() {
    printf '%s\n' "$usage_message"
}

As mentioned, for Bash older than 4.1:

geta() {
  local _ref=$1
  local -a _lines
  local _i
  local _leading_whitespace
  local _len

  IFS=$'\n' read -rd '' -a _lines ||:
  _leading_whitespace=${_lines[0]%%[^[:space:]]*}
  _len=${#_leading_whitespace}
  for _i in "${!_lines[@]}"; do
    # %q shell-quotes the line so quotes, $ and backticks survive the eval
    eval "$(printf '%s+=( %q )' "$_ref" "${_lines[$_i]:$_len}")"
  done
}
Binary Phile
  • Any reason you're using the non-expanding HEREDOC style here without making any note of it? I feel this is a question which will attract visits from people who are learning. – ocodo Jul 19 '22 at 03:02
  • 1
    Safety. Expansion should be chosen when necessary, not by default. – Binary Phile Oct 31 '22 at 19:59