19

Hi, I need a shell script to parse a CSV file line by line, and then field by field.

The file will look like this:

X1,X2,X3,X4
Y1,Y2,Y3,Y4

I need to extract each of these fields (X1, X2, ...).

I wrote a script, but it fails if the input exceeds one line.

Psidom
Balualways
  • 2
    The good news: two programs, [awk](http://www.vectorsite.net/tsawk.html) and [sed](http://www.grymoire.com/Unix/Sed.html), exist to do exactly that. The bad news: they're impossible to learn. I'm not putting this as an answer because it really isn't; hopefully someone below will post the correct awk/sed syntax for you to use in your specific problem. – eykanal Dec 14 '10 at 13:20
  • 1
    `sed` may be difficult to learn, but `awk` isn't. Awk's actually fairly easy. Although you don't specifically need either to do this, as it can be done with shell built-ins (see Ignacio's response, below). – Chris J Dec 14 '10 at 13:31

3 Answers

51

Here's how I would do it.

First I set the `IFS` environment variable to tell `read` that "," is the field separator.

export IFS=","

Given the file "test" containing the data you provided, I can use the following code:

cat test | while read a b c d; do echo "$a:$b:$c:$d"; done

To quickly recap what is going on above: `cat test |` reads the file and pipes it to `while`. `while` runs the code between `do` and `done` for as long as `read` returns true. `read` reads a line from standard input and splits it into the variables `a`, `b`, `c` and `d` according to the value of `$IFS`. Finally, `echo` just displays the variables we read.

This gives me the following output:

X1:X2:X3:X4
Y1:Y2:Y3:Y4

BTW, the BASH manual is always good reading. You'll learn something new every time you read it.
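For reference, here is a minimal sketch of the same loop written without `cat` and without exporting `IFS`. The function name `parse_csv` and the temporary sample file are assumptions made for illustration; they are not part of the answer above. It also assumes every line has exactly four comma-separated fields and no quoted commas.

```shell
#!/bin/sh
# parse_csv is a hypothetical helper name used only for this sketch.
parse_csv() {
    # IFS=, applies only to this read command, so the shell's own IFS is untouched.
    # -r stops read from treating backslashes as escape characters.
    while IFS=, read -r a b c d; do
        printf '%s:%s:%s:%s\n' "$a" "$b" "$c" "$d"
    done < "$1"
}

# Sample data from the question, written to a temporary file.
sample=$(mktemp)
printf 'X1,X2,X3,X4\nY1,Y2,Y3,Y4\n' > "$sample"
parse_csv "$sample"
rm -f "$sample"
```

Redirecting the file into the loop with `< "$1"` replaces the pipe from `cat`, and the output should match the two lines shown above.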

Martin Olsen
  • 5
    [UUOC](http://en.wikipedia.org/wiki/Cat_%28Unix%29#Useless_use_of_cat) -- you don't need that cat :-) – Chris J Dec 14 '10 at 13:35
  • 1
    @Chris: I know! Just personal preference, for clarity... :-) – Martin Olsen Dec 14 '10 at 13:41
  • 7
    You should almost always use `-r` with `read`. You should do `while IFS=',' read -r a b c d` and you won't have to save and restore the value of `IFS` in order to have its behavior return to normal. Note that if there are more fields in your data than you have variables, the last variable will contain the excess, too. – Dennis Williamson Dec 14 '10 at 15:31
  • Should it not be `given the file "test"`? After all, you `cat` on test, and not on `input`. – MERose Nov 26 '16 at 10:59
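The point about excess fields can be seen with a short sketch. The function name `split_three` is made up for this illustration; it reads only three variables from a line that has four fields.

```shell
#!/bin/sh
# split_three is a hypothetical name used only for this illustration.
split_three() {
    # Three variables for a line that may hold more than three fields:
    # whatever is left over, commas included, ends up in the last variable.
    printf '%s\n' "$1" | while IFS=, read -r first second rest; do
        printf 'first=%s second=%s rest=%s\n' "$first" "$second" "$rest"
    done
}

split_three 'X1,X2,X3,X4'
```

With the sample line, `rest` holds `X3,X4` rather than just `X3`, so a trailing catch-all variable is one way to notice lines that have more fields than expected.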
4

Since eykanal mentioned AWK and sed, I thought I'd show how you could use them.

awk -F, 'BEGIN{OFS="\n"}{$1=$1; print}' inputfile

or

sed 's/,/\n/g' inputfile

Then a shell script could process their output:

awk_or_sed_cmd | while read -r field
do
    do_something "$field"
done
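As a concrete sketch of that pipeline, the following replaces the placeholders with real commands. The function name `number_fields` and the numbering action are assumptions chosen for illustration; any per-field processing could go in the loop body.

```shell
#!/bin/sh
# number_fields is a hypothetical helper; numbering stands in for do_something.
number_fields() {
    n=0
    # awk rewrites each record with a newline between fields,
    # so the loop below receives one field per line.
    awk -F, 'BEGIN { OFS = "\n" } { $1 = $1; print }' "$1" |
    while IFS= read -r field; do
        n=$((n + 1))
        printf '%d: %s\n' "$n" "$field"
    done
}

sample=$(mktemp)
printf 'X1,X2\nY1,Y2\n' > "$sample"
number_fields "$sample"
rm -f "$sample"
```

Two things to keep in mind with this approach: the original line boundaries are lost once every field is on its own line, and in most shells the loop runs in a subshell, so the final value of `n` is not visible after the pipeline ends.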

Of course, you could do the processing within the AWK script:

awk -F, '{for (i=1;i<=NF;i++) do_something($i)}' inputfile
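Since `do_something` above is a placeholder, here is a runnable sketch with a `printf` action in its place. The function name `list_fields` and the output format are assumptions made for illustration.

```shell
#!/bin/sh
# list_fields is a hypothetical helper; the printf action stands in for do_something.
list_fields() {
    # NR is the current line number and NF the number of fields on that line.
    awk -F, '{ for (i = 1; i <= NF; i++) printf "line %d, field %d: %s\n", NR, i, $i }' "$1"
}

sample=$(mktemp)
printf 'X1,X2\nY1,Y2\n' > "$sample"
list_fields "$sample"
rm -f "$sample"
```

Unlike the pipeline into a shell loop, this keeps track of which line each field came from, and it handles any number of fields per line. It still assumes plain comma-separated values with no quoted commas.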
Dennis Williamson
-1

ls -l

vi filename.sh

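#!/bin/sh

echo "INPUT PATTERN"

# Type the input data, then press Ctrl-D to save it to the file "test".
cat > test

# Read the file line by line, splitting each line into fields on commas.
while IFS=, read -r a b c d
do
    echo "pattern shown as $a:$b:$c:$d"
done < test

exit 0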

some_other_guy
kiran