I am struggling with a simple loop. I want to produce the equivalent of:

awk '{print $3}' z.csv > col1.csv
awk '{print $4}' z.csv > col2.csv
...
awk '{print $(i+2)}' z.csv > col(i).csv

Here is what I tried so far:

k=$i+2;
for i in {1..21}
do 
 awk '{print $k}' z.csv > col"${i}".csv
done 

But it's far from working. Any advice, please?

Ed Morton
Mohamed
  • @jenesaisquoi what would you advise in this specific case please? Thank you – Mohamed Jun 12 '19 at 22:25
  • Possible duplicate of [How do I use shell variables in an awk script?](https://stackoverflow.com/questions/19075671/how-do-i-use-shell-variables-in-an-awk-script) – Rorschach Jun 12 '19 at 22:27

1 Answer

There's no need for a bash loop that calls awk multiple times; just loop within a single call to awk. For example, with GNU awk, which handles the potential "too many open files" problem for you automatically:

awk '{
    for (i=1; i<=21; i++) {
        print $(i+2) > ("col"i".csv")
    }
}' z.csv

and with any awk:

awk '{
    for (i=1; i<=21; i++) {
        close(out)
        out = "col"i".csv"
        print $(i+2) >> out
    }
}' z.csv
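
One caveat with the portable version: because each file is closed and reopened for every record, it has to use `>>`, and `>>` also appends to whatever a previous run left behind. Below is a minimal sketch of one way to guard against that, assuming it is acceptable to empty the output files at startup. The file names `demo.csv` and `demo_col*.csv` and the two-column sample data are made up for illustration:

```shell
# Hypothetical sample input: four whitespace-separated fields per line.
printf '%s\n' 'a b c1 d1' 'e f c2 d2' > demo.csv

# Run twice to show that a rerun does not duplicate lines.
for run in 1 2; do
    awk '
    BEGIN {
        # Truncate each output file once, before any input is read.
        for (i = 1; i <= 2; i++) {
            out = "demo_col" i ".csv"
            printf "" > out
            close(out)
        }
    }
    {
        for (i = 1; i <= 2; i++) {
            close(out)                 # release the previous file handle
            out = "demo_col" i ".csv"
            print $(i + 2) >> out      # append, since the file is reopened per record
        }
    }' demo.csv
done
```

Without the `BEGIN` block, the second run would add its lines after the first run's output.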

If closing all of the files on every iteration causes a performance problem, and you find you can have, say, 11 output files open at once without getting a "too many open files" error, then you could do something like this:

awk '{
    for (i=1; i<=21; i++) {
        if (i>10) close(out)   # columns 1-10 stay open; one more handle is reused for 11-21
        out = "col"i".csv"
        print $(i+2) >> out
    }
}' z.csv

or, slightly more efficiently but with a bit more code:

awk '{
    for (i=1; i<=10; i++) {
        print $(i+2) > ("col"i".csv")
    }
    for (; i<=21; i++) {
        close(out)
        out = "col"i".csv"
        print $(i+2) >> out
    }
}' z.csv
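
For comparison, the shell loop from the question can also be repaired. It fails because awk never sees shell variables inside a single-quoted script, so `$k` there refers to an unset awk variable, which evaluates to `$0` (the whole line). Below is a minimal sketch that passes the value in with awk's standard `-v` option. The sample file `demo.csv` and the `loop_col*.csv` output names are hypothetical:

```shell
# Hypothetical sample input: four whitespace-separated fields per line.
printf '%s\n' 'a b c1 d1' 'e f c2 d2' > demo.csv

# One awk call per column; -v copies the computed shell value into awk's k.
for i in 1 2; do
    awk -v k="$((i + 2))" '{ print $k }' demo.csv > "loop_col${i}.csv"
done
```

This still reads the input once per column, so the single-call awk versions above remain the better choice for large files.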
Ed Morton
  • You're welcome. You'll find that approach orders of magnitude faster than using a shell loop. – Ed Morton Jun 12 '19 at 22:28
  • The opening and closing of file handles is good practice but probably adds a fair amount of overhead. If there are just a few columns you might get faster performance if you take those out. Awk can keep a limited number of files open simultaneously but if you stay safely below that limit there is no need to keep on closing and reopening. – tripleee Jun 13 '19 at 05:07
  • @tripleee I thought of something you could do to address that - see the scripts I added at the bottom of my answer. – Ed Morton Jun 13 '19 at 05:52
  • @EdMorton thank you for the other alternatives, it is very interesting... this can be applied to other function like awk? – Mohamed Jun 13 '19 at 09:28
  • @Mohamed you're welcome. sorry, I don't understand your question `this can be applied to other function like awk?` – Ed Morton Jun 13 '19 at 13:49