Let's assume I have the following string: x="number 1;number 2;number 3".

Accessing the first substring works via ${x%%";"*}, and accessing the last substring works via ${x##*";"}:

$ x="number 1;number 2;number 3"
$ echo "front : ${x%%";"*}"  # front-most part
front : number 1
$ echo "back  : ${x##*";"}"  # back-most part
back  : number 3
$

How do I access the middle part (e.g. number 2)?
Is there a better way to do this if I have many more parts than just three?
In other words: is there a generic way of accessing substring no. n of string yyy, delimited by string xxx, where xxx is an arbitrary string/delimiter?


I have read How do I split a string on a delimiter in Bash?, but I specifically do not want to iterate over the string but rather directly access a given substring.

This specifically does not ask for a split into arrays, but into substrings.

Christian
  • Possible duplicate of [Split string into an array in Bash](http://stackoverflow.com/questions/10586153/split-string-into-an-array-in-bash) – Andreas Louv Oct 27 '15 at 13:04
  • Try this: `cut -d';' -f2 <<< "$x"` – Kalanidhi Oct 27 '15 at 13:06
  • @Kalanidhi That _only_ accesses the second substring, in the end I'd like delimited access to all of them (this actually happens in a loop and generic access to any of the substrings is required). – Christian Oct 27 '15 at 13:34
  • @dev-null No, I ask for a split into strings, not arrays (This actually happens within a loop over a multidimensional array, so I don't want a subarray of an array). – Christian Oct 27 '15 at 13:35

2 Answers

4

With a fixed index:

x="number 1;number 2;number 3"

# Split input into fields by ';' and read the 2nd field into $f2
# Note the need for the *2nd* `unused`, otherwise f2 would 
# receive the 2nd field *plus the remainder of the line*.
IFS=';' read -r unused f2 unused <<<"$x"

echo "$f2"

Generically, using an array:

x="number 1;number 2;number 3"

# Split input into fields by ';' and read all resulting fields
# into an *array* (-a).
IFS=';' read -r -a fields <<<"$x"

# Access the desired field.
ndx=1
echo "${fields[ndx]}"
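
Since the question asks for direct access to substring no. n, the array technique above can be wrapped in a small function. The following is only a sketch: the name `nth_field` and its argument order are made up for illustration, and the index is 0-based to match Bash array indexing.

```bash
#!/usr/bin/env bash
# nth_field STRING SEP N   (hypothetical helper, not part of the original answer)
# Prints field number N (0-based) of STRING, split on the single character SEP.
nth_field() {
  local str=$1 sep=$2 n=$3
  local -a fields
  # IFS is changed only for the duration of this one read command.
  IFS="$sep" read -r -a fields <<<"$str"
  printf '%s\n' "${fields[n]}"
}

x="number 1;number 2;number 3"
nth_field "$x" ';' 1   # prints: number 2
```

An out-of-range index simply prints an empty line, and the constraints listed below (single-line input, trailing empty field) apply to this helper as well.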

Constraints:

Using IFS, the special variable specifying the Internal Field Separator characters, invariably means:

  • Only single, literal characters can act as field separators.

    • However, you can specify multiple characters, in which case any of them is treated as a separator.
  • The default separator characters are $' \t\n' - i.e., space, tab, and newline - and runs of them (multiple contiguous instances) are always considered a single separator; e.g., 'a  b' (two spaces) has 2 fields - the multiple spaces count as a single separator.

  • By contrast, with any other character, each character in a run is considered a separator of its own, which results in empty fields; e.g., 'a;;b' has 3 fields - each ; is its own separator, so there's an empty field between the two ; characters.
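
The rules above can be checked with a short experiment. This is a sketch only; `count_fields` is a hypothetical helper introduced here that prints how many fields `read -a` produces for a given string and `IFS` value.

```bash
#!/usr/bin/env bash
# count_fields STRING IFS_VALUE   (hypothetical helper for illustration)
# Prints the number of fields that read -a produces for STRING.
count_fields() {
  local -a fields
  IFS="$2" read -r -a fields <<<"$1"
  printf '%s\n' "${#fields[@]}"
}

count_fields 'a   b' ' '    # 2: the run of spaces acts as one separator
count_fields 'a;;b'  ';'    # 3: there is an empty field between the two ';'
count_fields 'a;b c' '; '   # 3: either ';' or ' ' separates fields
```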

The read -r -a ... <<<... technique generally works well, as long as:

  • the input is single-line
  • you're not concerned about a trailing empty field getting discarded
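
Both limitations are easy to reproduce. The snippet below is a minimal illustration; the array names are arbitrary.

```bash
#!/usr/bin/env bash
# Limitation 1: a trailing empty field is discarded.
IFS=';' read -r -a fields_trailing <<<'a;b;'
echo "${#fields_trailing[@]}"    # 2, although 'a;b;' arguably has 3 fields

# Limitation 2: read stops at the first newline, so later lines are ignored.
IFS=';' read -r -a fields_multiline <<<$'a;b\nc;d'
echo "${#fields_multiline[@]}"   # 2: only 'a' and 'b' from the first line
```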

If you need a fully generic, robust solution that addresses the issues above, use the following variation, which is explained in @gniourf_gniourf answer here:

sep=';' 
IFS="$sep" read -r -d '' -a fields < <(printf "%s${sep}\0" "$x")    

Note the need to use -d '' to read multi-line input all at once, and the need to terminate the input with another separator instance to preserve a trailing empty field; the trailing \0 is needed to ensure that read's exit code is 0.
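
As a sketch of how this variation behaves on awkward input, it can be wrapped in a function and fed a multi-line string with a trailing empty field. The function name `split_fields` and the use of a global array named `fields` are assumptions made for this example. The separator is passed to printf as an argument rather than embedded in the format string, so that a separator such as % would not be interpreted by printf.

```bash
#!/usr/bin/env bash
# split_fields STRING SEP   (hypothetical helper)
# Splits STRING on the single character SEP into the global array `fields`.
split_fields() {
  local str=$1 sep=$2
  fields=()
  # -d '' makes read consume everything up to the NUL byte printed last;
  # the extra separator before the NUL preserves a trailing empty field.
  IFS="$sep" read -r -d '' -a fields < <(printf '%s%s\0' "$str" "$sep")
}

split_fields $'a;b\nc;' ';'
echo "${#fields[@]}"   # 3: 'a', 'b<newline>c', and a trailing empty field
```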

mklement0
0

Don't use:

Create an array with a delimiter of ;:

x="number 1;number 2;number 3"
_IFS=$IFS; IFS=';'
arr=($x)
IFS=$_IFS

echo "${arr[0]}" # number 1
echo "${arr[1]}" # number 2
echo "${arr[2]}" # number 3
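
A minimal sketch of why this form is discouraged: the unquoted expansion is subject to pathname expansion (globbing), so a field such as * is replaced by matching file names. The temporary directory and the file names file1 and file2 are made up for the demonstration.

```bash
#!/usr/bin/env bash
# Work in a fresh directory that contains exactly two files.
tmpdir=$(mktemp -d) && cd "$tmpdir" && touch file1 file2

x='a;*;b'

old_ifs=$IFS; IFS=';'
arr=($x)                 # unquoted: split on ';', then each word is glob-expanded
IFS=$old_ifs
echo "${#arr[@]}"        # 4: '*' was replaced by file1 and file2

IFS=';' read -r -a safe <<<"$x"
echo "${#safe[@]}"       # 3: read keeps '*' literally

cd - >/dev/null && rm -r "$tmpdir"
```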

Andreas Louv
  • This is _the_ broken method to “split” a string on a delimiter (I should say, to introduce bugs) that is, unfortunately, widespread, as it is subject to pathname expansion (globbing). The “fix” is then to use `set -f` (but this doesn't fix _all_ the issues, as it will still concatenate multiple successive empty fields); and at this point you probably feel that you're fighting _against_ the shell: that's because you're using an _antipattern._ The linked question shows the canonical way to split a string: `IFS=\; read -r -d '' -a arr < <(printf '%s;\0' "$x")` (and it's only one line!). – gniourf_gniourf Oct 27 '15 at 13:14
  • When I say _it will still concatenate multiple successive empty fields_ I mean when used with a space as `IFS` (which is not the case here). In this case, though, it will remove the last field if it's empty. – gniourf_gniourf Oct 27 '15 at 13:17
  • @gniourf_gniourf: The globbing argument is valid, and having to set and restore _2_ configuration items (`IFS`, `set -f`) makes this solution ultimately cumbersome (though it may have a slight performance advantage compared to `read ... <<<...`, and presumably more so compared to `read ... < <(printf ...)`). However, from what I can tell, there's no escaping the shell considering _runs_ of tabs, spaces, and newlines a _single_ separator (concatenating multiple successive empty fields); this logic is built into `IFS`, so it affects the array-literal syntax as well as `read`. – mklement0 Oct 27 '15 at 14:40
  • @mklement0 This happens when `IFS` is set to a whitespace character, i.e., a space, a tab, or a newline (because Bash treats these characters in a special way). Try it with: `a='a  b'` (with 2 spaces between `a` and `b`) and then `IFS=' '; ary=( $a )`. You'll see that the empty fields are discarded. While this actually looks like what we'd want in general, it can be surprising when trying to slurp the output of a command into an array using `IFS=$'\n'; a=( $(echo a; echo; echo b) )`. Here the empty line isn't preserved… whether this is good or bad doesn't really matter; it's just something to be aware of! – gniourf_gniourf Oct 27 '15 at 15:55
  • @gniourf_gniourf: Fully agreed, and my updated answer describes all that. My point was: this behavior is _inevitable_ with `IFS` and is not a shortcoming of _this_, the `($x)` approach - I now suspect you never meant to imply that, that you were only pointing out a _general_ limitation - so I think we're in full agreement. – mklement0 Oct 27 '15 at 16:05