7

I want to split the string and construct the array. I tried the below code:

myString="first column:second column:third column"
set -A myArray `echo $myString | awk 'BEGIN{FS=":"}{for (i=1; i<=NF; i++) print $i}'`
# Following is just to make sure that array is constructed properly
i=0
while [ $i -lt ${#myArray[@]} ]
do
echo "Element $i:${myArray[$i]}"
(( i=i+1 ))
done
exit 0

It produces the following result:
Element 0:first
Element 1:column
Element 2:second
Element 3:column
Element 4:third
Element 5:column

This is not what I want it to be. When I construct the array, I want that array to contain only three elements.
Element 0:first column
Element 1:second column
Element 2:third column

Can you please advise?

Nathan Campos
  • I found the solution, which is along the following lines: var='word1#word2|word3/word4|word5.word6|word7_word8|word9 word10|word11|word12'; OIFS=$IFS; IFS='|'; set -A arr $var; IFS=$OIFS –  Oct 24 '09 at 11:52
  • You can remove the for loop: instead of awk 'BEGIN{FS=":"}{for (i=1; i<=NF; i++) print $i}', just use awk 'BEGIN{RS=":"}{print}' – Vijay Oct 24 '09 at 11:53
  • bash on my system (4.0.33(5)-release) doesn't have a -A option for `set`. Which version are you running? – outis Oct 24 '09 at 12:17

5 Answers

15

Here is how I would approach this problem: use the IFS variable to tell the shell (bash) that you want to split the string into colon-separated tokens.

$ cat split.sh
#!/bin/bash

# Script to split fields into tokens

# Here is the string whose tokens are separated by colons
s="first column:second column:third column"

IFS=":"     # Set the field separator
set -- $s   # Break the string into $1, $2, ...
i=0
for item    # By default, a for loop iterates over $1, $2, ...
do
    echo "Element $i: $item"
    ((i++))
done

Run it:

$ ./split.sh
Element 0: first column
Element 1: second column
Element 2: third column
Hai Vu
  • This only works on a string/line that has a max of 9 columns. Try echoing $11 and you get the value of $1 with '1' appended to the end of it. – Dennis Jul 10 '12 at 04:13
  • @Dennis - you need to use a different notation for positional params beyond 9. `${10} , ${11}...` http://wiki.bash-hackers.org/scripting/posparams – Cheeso Oct 24 '12 at 04:35
  • @Dennis: I have verified that my code works, even for lines with more than 10 columns. You can verify it yourself. – Hai Vu Oct 24 '12 at 14:34
  • @HaiVu Good Call! You taught me something which I appreciate. I did not know about the alternate syntax. I stand humbled before you, in thanks :-) – Dennis Nov 13 '12 at 04:58
  • @mat: Please make sure that: a) you did not make any typos, and b) that you use a bash shell, not an older bourne shell. – Hai Vu Feb 27 '14 at 18:58
  • @Hai Vu: Sorry, my fault. Used the wrong shell :S - bash is working fine! I removed my old comment. – mat Mar 07 '14 at 09:21
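
To illustrate the `${10}` point raised in the comments above, here is a minimal sketch, assuming bash. The twelve single-letter fields are made-up sample data, not taken from the question, and the variable names `plain` and `braced` are only for illustration.

```shell
#!/bin/bash
# Hypothetical sample data: twelve colon-separated fields, a through l.
s="a:b:c:d:e:f:g:h:i:j:k:l"

old_ifs=$IFS       # IFS is set to its default in a fresh bash, so this is safe here
IFS=":"
set -- $s          # fields become $1 ... ${12}
IFS=$old_ifs

plain="$11"        # parsed as ${1} followed by a literal "1"
braced="${11}"     # the eleventh positional parameter

echo "count:  $#"
echo "plain:  $plain"
echo "braced: $braced"
```

The `for item` loop in the answer never names a parameter by number, which is why it is unaffected by the number of columns.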
5

If you definitely want to use arrays in Bash, you can try it this way:

$ myString="first column:second column:third column"
$ myString="${myString//:/ }" # replace every colon with a space
$ echo "${myString}"
first column second column third column
$ read -a myArr <<<$myString
$ echo ${myArr[@]}
first column second column third column
$ echo ${myArr[1]}
column
$ echo ${myArr[2]}
second

Otherwise, the "better" method is to use awk for the whole job.
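
Note that the transcript above splits on whitespace once the colons are gone, so it produces six elements rather than the three the question asks for. A minimal sketch of a variant that keeps the spaces, assuming bash and a single-line input, is to set IFS to a colon for the `read` command only:

```shell
#!/bin/bash
myString="first column:second column:third column"

# IFS=: applies only to this one read command; the shell's own IFS is untouched.
IFS=: read -r -a myArr <<< "$myString"

echo "count: ${#myArr[@]}"
for i in "${!myArr[@]}"; do
    echo "Element $i:${myArr[$i]}"
done
```

Because the assignment is a prefix to `read`, there is nothing to save or restore afterwards.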

ghostdog74
4

Note that saving and restoring IFS, as I have often seen in these solutions, has a side effect: if IFS wasn't set to begin with, it ends up set to an empty string, which causes weird problems with subsequent splitting.
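
One way to avoid that side effect, sketched here under the assumption of bash, is to remember whether IFS was set at all and to unset it again if it wasn't. The names `saved_ifs` and `ifs_was_set` are made up for this illustration, and the `unset IFS` at the top only simulates the problematic starting state.

```shell
#!/bin/bash
myString="first column:second column:third column"
unset IFS                      # simulate a shell where IFS was never set

# Remember whether IFS was set at all, not just its value.
if [ -n "${IFS+set}" ]; then
    saved_ifs=$IFS
    ifs_was_set=yes
else
    ifs_was_set=no
fi

IFS=:
myArray=($myString)

# Restore the exact previous state: the same value, or unset again.
if [ "$ifs_was_set" = yes ]; then
    IFS=$saved_ifs
else
    unset IFS
fi

echo "count: ${#myArray[@]}"
[ -z "${IFS+set}" ] && echo "IFS is unset again"
```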

Here's the solution I came up with, based on Anton Olsen's and extended to handle more than two colon-separated values. It handles values that contain spaces correctly, without splitting on the space.

colon_list=${1}  # colon-separated list to split
parts=()
while true ; do
    part=${colon_list%%:*}  # Delete longest ":*" match from the back, leaving the first field
    parts+=("$part")
    # We are done when there is no more colon
    case $colon_list in
        *:*) colon_list=${colon_list#*:} ;;  # Delete shortest "*:" match from the front
        *)   break ;;
    esac
done
# Show we've split the list
for part in "${parts[@]}"; do
    echo "$part"
done
Von
3

Ksh or Bash

#!/bin/bash
myString="first column:second column:third column"
IFS=: A=( $myString )   # note: IFS stays set to ":" after this line

echo "${A[0]}"
echo "${A[1]}"
frayser
2

Looks like you've already found the solution, but note that you can do away with awk entirely:

myString="first column:second column:third column"
OIFS="$IFS"
IFS=':'
myArray=($myString)
IFS=$OIFS
i=0
while [ $i -lt ${#myArray[@]} ]
do
    echo "Element $i:${myArray[$i]}"
    (( i=i+1 ))
done
outis