
I'm trying to create arrays from strings that use a pipe ("|") as the delimiter and contain spaces. I've been looking around for a while and have gotten close thanks to sources like "How do I split a string on a delimiter in Bash?" and "Splitting string into array", among others, but it's not quite working. The main problems are that there are spaces in the strings, there are leading and trailing delimiters, and some of the fields are blank. Also, instead of just echoing the values, I need to assign them to variables. Here's the format of the source data:

|username|full name|phone1|phone2|date added|servers|comments|

Example:

|jdoe | John Doe| 555-1212 | |1/1/11 |  workstation1, server1 | added by me |

Here's what I need:

Username: jdoe
Fullname: John Doe
Phone1: 555-1212
Phone2: 
Date_added: 1/1/11
Servers: workstation1, server1
Comments: added by me

Edit: I now use sed to strip out the first and last delimiters and the spaces before and after each delimiter, so the input is now:

jdoe|John Doe|555-1212||1/1/11|workstation1, server1|added by me

Here's what I've tried:

oIFS="$IFS"; IFS='|'
for line in `cat $userList`; do
  arr=("$line")
  echo "Username: ${arr[0]}"  #not assigning a variable, just testing the output
  echo "Full Name: ${arr[1]}"
  echo "Phone 1: ${arr[2]}"
  echo "Phone 2: ${arr[3]}"
  # etc..
done
IFS="$oIFS"

Output:

Username: 
Full Name: 
Phone 1:
Phone 2:
Username: jdoe
Full Name: 
Phone 1:
Phone 2:
Username: John Doe
Full Name: 
Phone 1:
Phone 2:

Another thing I tried:

for line in `cat $userList`; do
  arr=(${line//|/ })
  echo "Username: ${arr[0]}"
  echo "Full Name: ${arr[1]}"
  echo "Phone 1: ${arr[2]}"
  echo "Phone 2: ${arr[3]}"
  # etc
done

Output:

Username: jdoe
Full Name: John
Phone 1:
Phone 2:
Username: Doe
Full Name: 555-1212
Phone 1:
Phone 2:

Any suggestions? Thanks!!

jww
Martin
  • this would be trivial in perl or ruby, any reason not to use one of those? – ennuikiller Dec 30 '11 at 19:08
  • I'm with the 'use perl' comment but if you want a bash-only kind of approach, why not at least use 'sed' or something to strip the spaces before and after the "|" characters and at least make it easier on yourself – FGhilardi Dec 30 '11 at 19:30
  • Perl is an option, but I'm not familiar with it. I've added a sed line to remove leading/trailing delimiters and spaces, updating question. – Martin Dec 30 '11 at 20:05
  • @ennuikiller this is as trivial in bash as it is in perl or ruby. – kojiro Dec 30 '11 at 20:24
  • @Martin might I suggest you try fgm's answer below instead. It is the most appropriate answer IMHO. Also, take a look at my comments to his answer as well. – SiegeX Dec 30 '11 at 20:48
  • I agree, @SiegeX's suggestion has made fgm's answer really scalable. Also, please feel free to look at aging awk's solution of mine. :) – jaypal singh Dec 30 '11 at 21:41

5 Answers

11

Your first attempt is pretty close. The main problems are these:

  • for line in `cat $userList` splits the file by $IFS, not by line-breaks. So you should set IFS=$'\n' before the loop, and IFS='|' inside the loop. (By the way, it's worth noting that the for ... in `cat ...` approach reads out the entire file and then splits it up, so this isn't the best approach if the file can be big. A read-based approach would be better in that case.)
  • arr=("$line"), by wrapping $line in double-quotes, prevents word-splitting, and therefore renders $IFS irrelevant. It should just be arr=($line).
  • Since $line has a leading pipe, you either need to strip it off before you get to arr=($line) (by writing something like line="${line#|}"), or else you need to treat arr as a 1-based array (since ${arr[0]}, the part before the first pipe, will be empty).

Putting it together, you get something like this:

oIFS="$IFS"
IFS=$'\n'
for line in `cat $userList`; do
  IFS='|'
  arr=($line)
  echo "Username: ${arr[1]}"  #not assigning a variable, just testing the output
  echo "Full Name: ${arr[2]}"
  echo "Phone 1: ${arr[3]}"
  echo "Phone 2: ${arr[4]}"
  # etc..
done
IFS="$oIFS"

(Note: I didn't worry about the fields' leading and trailing spaces, because of the "I can do that step separately" part . . . or did I misunderstand that? Do you need help with that part as well?)
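For the read-based approach mentioned above, here is a minimal sketch rather than a definitive implementation. The function name parse_users is made up for illustration, and it assumes the field order given in the question. It reads one line at a time, strips the outer pipes, and splits on "|" with IFS scoped to the read command only, so there is no need to save and restore IFS. Like the loop above, it leaves the spaces around each field alone.

```bash
#!/usr/bin/env bash
# parse_users is a hypothetical helper name, not something from the question.
# It reads records from stdin, one per line.
parse_users() {
  local line
  local -a arr
  while IFS= read -r line; do
    line=${line#|}                      # drop a leading pipe, if present
    line=${line%|}                      # drop a trailing pipe, if present
    IFS='|' read -r -a arr <<< "$line"  # split on pipes; IFS applies to this read only
    echo "Username: ${arr[0]}"
    echo "Full Name: ${arr[1]}"
    echo "Phone 1: ${arr[2]}"
    echo "Phone 2: ${arr[3]}"
    # etc..
  done
}

# Example with the already-cleaned input from the question's edit:
parse_users <<< 'jdoe|John Doe|555-1212||1/1/11|workstation1, server1|added by me'
```

To process a file, redirect it into the function, e.g. parse_users < "$userList". Because the outer pipes are stripped first, the array is 0-based here, unlike the 1-based indexing in the loop above.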

ruakh
2
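# Setting IFS only for read keeps the change from leaking into the rest of the script; -r keeps backslashes literal.
while IFS='|' read -r username fullname phone1 phone2 dateadded servers comments; do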
    printf 'username: %s\n' "$username"
    printf 'fullname: %s\n' "$fullname"
    printf 'phone1: %s\n' "$phone1"
    printf 'phone2: %s\n' "$phone2"
    printf 'date added: %s\n' "$dateadded"
    printf 'servers: %s\n' "$servers"
    printf 'comments: %s\n' "$comments"
done < infile.txt
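If the input still has the leading and trailing pipes from the question's original format, one possible variation (a sketch only; parse_raw is a hypothetical name) is to add a throwaway variable at each end of the read list to absorb the empty fields outside the outer pipes. The spaces around each field are kept as-is.

```bash
#!/usr/bin/env bash
# parse_raw is a hypothetical helper; it expects lines shaped like
# |username|full name|phone1|phone2|date added|servers|comments|
parse_raw() {
  local username fullname phone1 phone2 dateadded servers comments
  # The first `_` takes the empty field before the leading pipe,
  # the last `_` takes whatever follows the trailing pipe.
  while IFS='|' read -r _ username fullname phone1 phone2 dateadded servers comments _; do
    printf 'username=[%s] phone2=[%s] comments=[%s]\n' "$username" "$phone2" "$comments"
  done
}

parse_raw <<< '|jdoe | John Doe| 555-1212 | |1/1/11 |  workstation1, server1 | added by me |'
```

The brackets in the printf format are only there to make any remaining leading or trailing spaces visible.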
kojiro
  • Funny, I didn't see the `IFS='|'` mentioned before so I wondered how would it work without setting the `IFS`. Sorry about that. – jaypal singh Dec 30 '11 at 20:32
  • @JaypalSingh SO gives you a few seconds to edit a new post before it starts tracking edits. When I originally posted this, I mis-pasted the IFS line, but I had it in within 12 seconds. You're fast. ;) – kojiro Dec 30 '11 at 20:35
1

Using arrays and paste. Doesn't account for empty fields since OP said it's not a requirement.

userList='jdoe|John Doe|555-1212||1/1/11|workstation1, server1|added by me'

fields=("Username: " "Full Name: " "Phone 1: " "Phone 2: " "Date_added: " "Servers: " "Comments: ")

IFS='|' read -ra data <<<"${userList}"

paste <(IFS=$'\n'; echo "${fields[*]}") <(IFS=$'\n'; echo "${data[*]}")

Username:       jdoe
Full Name:      John Doe
Phone 1:        555-1212
Phone 2: 
Date_added:     1/1/11
Servers:        workstation1, server1
Comments:       added by me
m0dular
1

Another solution:

shopt -s extglob

infile='user.lst'
declare -a label=( "" "Username" "Full Name" "Phone 1" "Phone 2"  )

while IFS='|' read  -a fld ; do
  for (( n=1; n<${#label[@]}; n+=1 )); do
    item=${fld[n]}
    item=${item##+([[:space:]])}
    echo  "${label[n]}:  ${item%%+([[:space:]])}"
  done
done < "$infile"

Leading and trailing blanks will be removed.
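The trimming step can also be pulled out on its own. The following is a small sketch, with trim as a made-up helper name, that uses the same two extglob parameter expansions as the loop above:

```bash
#!/usr/bin/env bash
shopt -s extglob   # required for the +( ... ) pattern below

# trim is a hypothetical helper: prints its first argument with all
# leading and trailing whitespace removed.
trim() {
  local s=$1
  s=${s##+([[:space:]])}   # longest run of whitespace at the start
  s=${s%%+([[:space:]])}   # longest run of whitespace at the end
  printf '%s' "$s"
}

# Brackets make the result's boundaries visible.
printf '[%s]\n' "$(trim '   workstation1, server1  ')"
```

Whitespace inside the value (such as the space after the comma) is left untouched, since the patterns are anchored to the start and end of the string.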

Fritz G. Mehner
  • This is the closest solution so far, but your PEs (parameter expansions) won't get rid of an *arbitrary* amount of whitespace, just one character. You can fix this through the use of `shopt -s extglob` and use `${item##+([[:space:]])}` and `${item%%+([[:space:]])}` – SiegeX Dec 30 '11 at 20:40
  • Also, might I suggest you use `for ((n=1; n<=${#label[@]}; n++))` to be more flexible if he decides to add another field – SiegeX Dec 30 '11 at 20:47
0

Use column if it's available to you (the -o option used below is specific to the util-linux version of column).

readarray -t my_vals <<< "$(seq 5)"
echo "${my_vals[@]}"             #1 2 3 4 5
column -to: <<< "${my_vals[@]}"  #1:2:3:4:5
  • -t = Table Output
  • -o = Output Delimiter (set to ':' here)
Rinzler