I have been working with multi-line strings in bash (no need to bring up bash arrays; it is a POSIX thing). A full working demo is posted in an online Bash emulator.

The problem is that every time I call a function and return a string, what I understood to be the proper way of handling multi-line strings inadvertently tacks an extra chr(10) onto the end of the string.

Suggested duplicate(s) did not apply:

This working example of a multi-line bash string correctly has a blank line at the end and is:

# bash variable declared as multi-line string
ini_buffer="1.1.1.1
"

That translates the original multi-line string into a hex dump of length 8:

00000000  31 2e 31 2e 31 2e 31 0a  00                       |1.1.1.1..|
00000009

Bash script starts off with:

# there is exactly one chr(10) at the end of ini_buffer
ini_buffer="1.2.3.4
"
echo "initial ini_buffer \"\"\"$ini_buffer\"\"\""
echo "initial ini_buffer len: ${#ini_buffer}"
echo


IFS= read -rd '' result < <(echo "$ini_buffer")
# a SECOND chr(10) got appended to the final output

echo "result len: ${#result}"
echo "\"\"\"$result\"\"\""
echo

Results are:

initial ini_buffer """1.2.3.4
"""
initial ini_buffer len: 8

result len: 9
"""1.2.3.4

"""

Notice that it grew a character?!

00000000  31 2e 31 2e 31 2e 31 0a  0a 00                    |1.1.1.1..|
00000009

Added the first function:

first_function()
{
  local first_buffer result1_buffer
  # takes a string with a single chr(10)
  first_buffer="$1"

  # calls a function, which does nothing.
  IFS= read -rd '' result1_buffer < <(second_function "$first_buffer")
  # yet another chr(10) got appended, for a total of two chr(10)

  echoerr "result1_buffer len: ${#result1_buffer}"
  echoerr "\"\"\"$result1_buffer\"\"\""
  # Supposedly only one way to return a multi-line string neatly,
  # and that is via STDOUT (fd=1)
  # echo "first_buffer len: ${#first_buffer}"
  echo "$result1_buffer"
}


# there is exactly one chr(10) at the end of ini_buffer
ini_buffer="1.2.3.4
"
echo "initial ini_buffer \"\"\"$ini_buffer\"\"\""
echo "initial ini_buffer len: ${#ini_buffer}"
echo

#  THIS LINE CHANGED from `echo` to `first_function`
IFS= read -rd '' result < <(first_function "$ini_buffer")
# another chr(10) got appended to the final output
# for a total of 3 trailing chr(10)s

echo "result len: ${#result}"
echo "\"\"\"$result\"\"\""
echo

Result of the first function is:

initial ini_buffer """1.2.3.4
"""
initial ini_buffer len: 8

result1_buffer len: 9
"""1.2.3.4

"""
result len: 10
"""1.2.3.4


"""

Every time a nested function call returns, another chr(10) gets tacked onto the string.

The count increased again when a second function was introduced, which I have left out here for brevity.

This is getting maddening. It has to do with the last line being blank (or just a chr(10) character). There is not much authoritative content online on the proper handling of multi-line strings.

What did I do wrong?

Process substitution (<( ... )) is used here instead of the usual command substitution ($( ... )), which had shown difficulty in working with multi-line strings. As a result, I must send any debug statement to STDERR using:

echoerr() { printf "%s\n" "$*" >&2; }
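
To make the difference between the two substitutions concrete, here is a small sketch. The names `buf`, `via_cmd`, and `via_proc` are made up for illustration and are not part of the script in this question. It only shows how each form treats trailing newlines when the producer adds none of its own.

```bash
#!/usr/bin/env bash
# Illustrative names only; not part of the script in this question.
buf='1.2.3.4
'   # 7 characters plus one trailing newline = 8

# Command substitution strips ALL trailing newlines from the captured output.
via_cmd=$(printf '%s' "$buf")

# Process substitution feeding `read -d ''` keeps every byte up to EOF.
# read returns non-zero here because it reaches EOF before finding a NUL
# delimiter, but the variable is still filled in.
IFS= read -rd '' via_proc < <(printf '%s' "$buf")

printf 'original length: %d\n' "${#buf}"
printf 'command substitution length: %d\n' "${#via_cmd}"
printf 'process substitution length: %d\n' "${#via_proc}"
```

With this input the three lengths come out as 8, 7, and 8, which is why command substitution looked unusable for strings that end in blank lines.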

I would like to do the proper thing in bash programming with regard to multi-line handling, notably with blank line(s) at the end of the string.

A complete test is repeated here (the working demo link is cited in the first paragraph):

echoerr() { printf "%s\n" "$*" >&2; }

dump_string_char()
{
  local string len_str idx this_char this_int
  string="$1"
  echoerr "string: \"\"\"$string\"\"\"" 
  len_str="${#string}"
  idx=0
  while [ $idx -lt ${len_str} ]; do
    this_char="${string:$idx:1}"
    this_int="$(LC_CTYPE=C printf "%d" "'$this_char")"
    echoerr "idx: $idx"
    if [ $this_int -lt 32 ]; then
      echoerr "$idx: ${this_int}" 
    else
      echoerr "$idx: \"${this_char}\"" 
    fi
    ((idx++))
  done
}

second_function()
{
  local second_ini_buffer result2_buffer
  second_ini_buffer="$1"

  # Some magical awk/sed that did not match any pattern
  # So let us use 'echo' to re-save the same string

  IFS= read -rd '' result2_buffer < <(echo "$second_ini_buffer")
  echoerr "result2_buffer len: ${#result2_buffer}"
  echoerr "\"\"\"$result2_buffer\"\"\""
  # so pass back the full ini_buffer as-is.
  # hopefully with ONE chr(10) at end-of-line.
  # but it doesn't.

  echo "$result2_buffer"
}


first_function()
{
  local first_buffer result1_buffer
  # takes a string with a single chr(10)
  first_buffer="$1"

  # calls a function, which does nothing.
  IFS= read -rd '' result1_buffer < <(second_function "$first_buffer")
  # IFS= read -rd '' result1_buffer < <(echo "$first_buffer")
  # yet another chr(10) got appended, for a total of two chr(10)

  echoerr "result1_buffer len: ${#result1_buffer}"
  echoerr "\"\"\"$result1_buffer\"\"\""
  # Supposedly only one way to return a multi-line string neatly,
  # and that is via STDOUT (fd=1)
  # echo "first_buffer len: ${#first_buffer}"
  echo "$result1_buffer"
}


# there is exactly one chr(10) at the end of ini_buffer
ini_buffer="1.2.3.4
"
echoerr "initial ini_buffer \"\"\"$ini_buffer\"\"\""
echoerr "initial ini_buffer len: ${#ini_buffer}"
echoerr
dump_string_char "$ini_buffer"


IFS= read -rd '' result < <(first_function "$ini_buffer")
# another chr(10) got appended to the final output
# for a total of 3 trailing chr(10)s

echoerr "result len: ${#result}"
echoerr "\"\"\"$result\"\"\""
echoerr

Once again to the moderator who thinks these are the same question:

It is not related to a carriage return but many ASCII characters, so this does not apply:

And does not address empty multi-lines like this question does:

John Greene
  • 5
    `echo` adds a newline (ASCII 10) at the end, and so do `printf "%s\n"` (that's what the `\n` means) and `<<<`. Try `printf "%s"`. – Gordon Davisson Mar 06 '22 at 23:52
  • 1
    What @GordonDavisson said is the answer. Further to that, note that a command substitution (`$(echo "$var")`) will strip _all_ trailing new lines, but a process substitution (`<(echo "$var")`) will not. – dan Mar 07 '22 at 00:32
  • 1
    Instead of a raw new line, you _could_ use a printf format string: `var='1.2.3.4\n'` (and expand it with `printf "$var"`), or at least put the new line in a variable, to make indentation possible. Both `printf %b` and `$'\n'` notation are not POSIX. Although I heard the latter is being considered for the next standard, which is great. – dan Mar 07 '22 at 00:37
  • 1
    You say that your code must be POSIX, but if you put a `#! /bin/sh` shebang on it [Shellcheck](https://www.shellcheck.net/) reports 12 separate Bashisms in it. – pjh Mar 07 '22 at 00:48
  • 1
    I've got a [section](https://victor-engmark.gitlab.io/advanced-shell-scripting-with-bash/#/newlines) in a course dedicated to handling various trailing newline issues in Bash. – l0b0 Mar 07 '22 at 01:23
  • Note that the usual unix/shell convention is that there should be a newline at the end of a (nonempty) text file (including pipes, function output, etc), but *not* in variables. Therefore, newlines normally get added when variables are output and removed when they're read from input. You can use a different convention for variables if you like, but you're fighting against the way most shell tools are set up. – Gordon Davisson Mar 07 '22 at 05:05
  • @GordonDavisson, how do most shell tools handle returning a multi-line variable from a function call? My way merely leverages STDOUT as the single mechanism for passing back an updated multi-line variable. If you can list two or more ways, or a simpler POSIX way than the other answers, you would (naturally) get the answer – John Greene Mar 07 '22 at 12:37
  • The topic of returning non-trivial values from Bash functions is covered in many other Stackoverflow questions, including these highly-upvoted ones: [Return value in a Bash function](https://stackoverflow.com/q/17336915/4154375), [How to return a string value from a Bash function](https://stackoverflow.com/q/3236871/4154375), [How to return an array in bash without using globals?](https://stackoverflow.com/q/10582763/4154375). – pjh Mar 07 '22 at 12:50
  • See [BashFAQ/084 (How do I return a string (or large number, or negative number) from a function? "return" only lets me give a number from 0 to 255.)](https://mywiki.wooledge.org/BashFAQ/084). – pjh Mar 07 '22 at 14:00
  • These are all good links. Perhaps, it was the non-array multi-line string and its return of its multi-line string value via STDOUT from a function with the last line(s) being a blank line that fooled some of us. – John Greene Mar 07 '22 at 14:23
  • @JohnGreene I didn't mean to imply you *shouldn't* retain the terminating newline in variables, just that if you do, you'll have to fight against the standard tools rather than letting them do the work for you. The standard conventions are really set up for single-valued (not line oriented) data, so multiline values are a bit of a kluge anyway. (And the standard tools don't handle blank lines at the end of values well at all, so you'll have to fight them on this any way you do it.) – Gordon Davisson Mar 07 '22 at 18:18
  • Yeah, so I have GREATLY noticed. And this fight got won. – John Greene Mar 14 '22 at 11:14
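
For the POSIX-only route raised in these comments, one commonly used workaround for command substitution stripping trailing newlines is to append a sentinel character inside the substitution and remove it afterwards. The following is a minimal sketch under that assumption; the function name `emit` and the sentinel `x` are illustrative choices, not something taken from the question's script.

```sh
# POSIX sh sketch; `emit` and the sentinel character `x` are illustrative.
emit() {
  # Write the argument verbatim, with no newline added.
  printf '%s' "$1"
}

buf='1.2.3.4

'   # 7 characters followed by two newlines = 9

# Append a sentinel so the trailing newlines are no longer trailing,
# then strip the sentinel once the substitution has finished.
captured=$(emit "$buf"; printf x)
captured=${captured%x}

printf 'len: %d\n' "${#captured}"
```

Here `captured` ends up with the same 9 characters as `buf`, blank last line included, without relying on process substitution or `read -d`.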

1 Answer

Your problem here is that echo always adds a newline to the end of your string - if it already contains one, it will have two. Since you're always "returning" using echo "$result", every time it will add a new one.

You should try using printf "%s" "$result":

ini_buffer="1.2.3.4
"
echo "initial ini_buffer len: ${#ini_buffer}"

IFS= read -rd '' result < <(printf "%s" "$ini_buffer")

echo "result len: ${#result}"

Result:

initial ini_buffer len: 8
result len: 8
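
The same change should carry over to the nested case from the question. As a rough sketch, with `inner` and `outer` as hypothetical stand-ins for `second_function` and `first_function`, every level hands its buffer back with `printf "%s"` so no level adds a newline of its own:

```bash
#!/usr/bin/env bash
# Sketch only: `inner` and `outer` stand in for second_function/first_function.
inner() {
  local buf
  IFS= read -rd '' buf < <(printf '%s' "$1")
  printf '%s' "$buf"   # no newline added on the way out
}

outer() {
  local buf
  IFS= read -rd '' buf < <(inner "$1")
  printf '%s' "$buf"   # again, pass the buffer back verbatim
}

# exactly one chr(10) at the end of ini_buffer
ini_buffer="1.2.3.4
"
IFS= read -rd '' result < <(outer "$ini_buffer")

echo "initial ini_buffer len: ${#ini_buffer}"
echo "result len: ${#result}"
```

Both lengths print as 8 here, however many function levels the string passes through.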
Leonardo Dagnino
  • `echo -ne "stuff"` is also an option, but `printf` (for all output) is the better choice. – David C. Rankin Mar 07 '22 at 02:21
  • 1
    Yeah, I just went with the 'safer' printf since `-n` for echo isn't really POSIX, which was mentioned in the question (`-e` neither, but that one isn't needed in the answer) – Leonardo Dagnino Mar 07 '22 at 02:39
  • This looks like the best answer so far; the earlier comment about using `printf "$var"` looked attractive until I saw the hazard of doing that (`$var` could have `printf` format field options embedded in it). This answer ensures that the output is passed through as close to "**as-is**" as possible, verbatim. – John Greene Mar 07 '22 at 12:26
  • This is the right answer. Demonstrated in this online bash emulator at this link: https://ideone.com/wzkLCW – John Greene Mar 07 '22 at 14:21