I have a variable that holds several strings, some of which can span multiple lines:

var="foo 'bar baz' 'lorem
ipsum'"

I need each of them as a separate array element, so my idea was to use xargs -n1 to read every quoted or unquoted string into its own array element:

mapfile -t arr < <(xargs -n1 <<< "$(echo "$var")" )

But this causes this error:

xargs: unmatched single quote; by default quotes are special to xargs unless you use the -0 option

In the end, the only idea I had was to replace each line feed with a carriage return and restore it afterwards:

# fill array                                  preserve line feed (dirty)
mapfile -t arr < <(xargs -n1 <<< "$(echo "$var" | tr '\n' '\r')" )

# restore line feed
for (( i=0; i<${#arr[@]}; i++ )); do
  arr[i]=$(echo "${arr[$i]}" | tr '\r' '\n')
done

It works:

# for (( i=0; i<${#arr[@]}; i++ )); do echo "index: $i, value: ${arr[$i]}"; done
index: 0, value: foo
index: 1, value: bar baz
index: 2, value: lorem
ipsum

But only as long as the input variable does not contain a carriage return.

I assume I need xargs to output every result delimited by a null byte and then import the results with mapfile's -d '', but xargs seems to have no print0-style option for its output (and tr '\n' '\0' would alter the multi-line string itself).
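
To illustrate the import half of that idea, here is a minimal sketch. It assumes Bash 4.4 or later (for mapfile's -d option) and uses the printf builtin as a stand-in producer, so it does not address how xargs parses the quotes:

```bash
#!/bin/bash
# Stand-in producer: three records, each terminated by a NUL byte.
# The second record contains a carriage return, the third a line feed.
# -d '' makes mapfile split on NUL; -t drops the trailing delimiter.
mapfile -d '' -t arr < <(printf '%s\0' 'foo' $'bar\rbaz' $'lorem\nipsum')

# Show the resulting elements, including the embedded control characters.
declare -p arr
```

Because the records are separated by NUL, both the line feed and the carriage return survive inside the elements.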

mgutt
  • how is the initial variable (`var`) populated? do you have any control over how `var` is populated (ie, can you change the structure of `var`)? – markp-fuso Mar 20 '23 at 18:18
  • The easiest way is `eval arr=("$var")` but it is very dangerous if the variable `var` can contain strings like `$(...)` – M. Nejat Aydin Mar 20 '23 at 18:32
  • Also see [Convert a string into an array with bash, honoring quotes for grouping](https://stackoverflow.com/q/37372225/4154375). – pjh Mar 20 '23 at 20:50
  • *" it seems xargs is missing a print0"*.. `man xargs`. The first option is `-0, --null` which is often matched with `print0` for null terminated input/output from `find`. Good luck. – shellter Mar 20 '23 at 22:02
  • Some other previous Q/A that look relevant: ["Reading quoted/escaped arguments correctly from a string"](https://stackoverflow.com/questions/26067249/reading-quoted-escaped-arguments-correctly-from-a-string) and ["Get bash to respect quotes when word splitting subshell output"](https://superuser.com/questions/1529226/get-bash-to-respect-quotes-when-word-splitting-subshell-output). – Gordon Davisson Mar 20 '23 at 22:37
  • @shellter `-0` is only relevant for the input, not the output. @markp-fuso I have no control over how `var` is populated, but if it helps, you can manipulate `var` before taking further steps. @M.NejatAydin No thanks ^^ @pjh I'm already using a technique to parse quoted strings/substrings. The question is how to parse a substring which contains line breaks – mgutt Mar 20 '23 at 23:36
  • Sorry, didn't read your comment closely enough. Now I'm reading that last paragraph to say you would like a `0` for the last char of a multiline string? – shellter Mar 20 '23 at 23:51

1 Answer

This Shellcheck-clean code demonstrates a way to do it by using Bash regular expressions to extract parts from the string:

#! /bin/bash -p

var="foo 'bar baz' 'lorem
ipsum'"

leadspace_rx='^[[:space:]]+(.*)$'
bare_rx="^([^'[:space:]]+)(.*)\$"
quoted_rx="^'([^']*)'(.*)\$"

arr=()
while [[ -n $var ]]; do
    if [[ $var =~ $leadspace_rx ]]; then
        var=${BASH_REMATCH[1]}
    elif [[ $var =~ $bare_rx ]]; then
        arr+=( "${BASH_REMATCH[1]}" )
        var=${BASH_REMATCH[2]}
    elif [[ $var =~ $quoted_rx ]]; then
        arr+=( "${BASH_REMATCH[1]}" )
        var=${BASH_REMATCH[2]}
    else
        printf 'ERROR: Cannot handle: %s\n' "$var" >&2
        exit 1
    fi
done

declare -p arr
  • The output is declare -a arr=([0]="foo" [1]="bar baz" [2]=$'lorem\nipsum')
  • The code for splitting up a string could easily be encapsulated in a function if you think this idea is worth pursuing.
  • The current code does things that you might not expect. For instance, the string a'b'c is converted to the array (a b c). If you can provide a more precise specification for the format of input strings I'll see if the code can be modified to handle it.
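
As a rough sketch of the encapsulation mentioned above, the loop could be wrapped in a function like the one below. The name split_quoted and its calling convention are hypothetical, and the nameref (local -n) assumes Bash 4.3 or later:

```bash
#!/bin/bash

# split_quoted STRING ARRAY_NAME   (hypothetical helper, not a Bash builtin)
# Splits STRING into bare words and single-quoted substrings and stores
# them in the array named ARRAY_NAME. ARRAY_NAME must not be "str" or
# "_out", because those names are used locally.
split_quoted() {
    local str=$1
    local -n _out=$2    # nameref to the caller's array (Bash 4.3+)
    local leadspace_rx='^[[:space:]]+(.*)$'
    local bare_rx="^([^'[:space:]]+)(.*)\$"
    local quoted_rx="^'([^']*)'(.*)\$"

    _out=()
    while [[ -n $str ]]; do
        if [[ $str =~ $leadspace_rx ]]; then
            str=${BASH_REMATCH[1]}            # skip leading whitespace
        elif [[ $str =~ $bare_rx ]]; then
            _out+=( "${BASH_REMATCH[1]}" )    # unquoted word
            str=${BASH_REMATCH[2]}
        elif [[ $str =~ $quoted_rx ]]; then
            _out+=( "${BASH_REMATCH[1]}" )    # quoted part, may contain newlines
            str=${BASH_REMATCH[2]}
        else
            printf 'ERROR: Cannot handle: %s\n' "$str" >&2
            return 1                          # e.g. an unmatched single quote
        fi
    done
}

var="foo 'bar baz' 'lorem
ipsum'"
arr=()
split_quoted "$var" arr && declare -p arr
```

The function returns a non-zero status instead of exiting, so the caller can decide how to react to input such as an unmatched quote. The splitting rules are unchanged, so a'b'c still yields three elements.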
pjh