File name substitution using awk and for loop

Question

Hi, I am trying to write dynamic filenames using variable substitution, and I am unable to figure out what I am missing here.

for i in `cat justPid.csv`
 do 
 awk -v var="$i" -F"," '{if ($1==var) {print $0 }}' uniqPid.csv > "$i"file.txt
done

I have also tried the one below and many other combinations, but it won't write multiple files named after $i.

for i in `cat justPid.csv`
 do 
 awk -v var="$i" -F"," '{if ($1==var) {print $0 }}' uniqPid.csv > ${i}_file.txt
done

Any suggestions?

Edit: my original intent is to split a 27 GB file into manageable chunks based on PID (an identifier in the file) so that it can be loaded into RStudio for analysis. I am working on my laptop and not on a server, hence the need to break it into small files. Also, I am using the ("new") Ubuntu bash shell on Windows.

The smaller test files I am working on look like what Jithin has posted. I will try out the suggestions and will update this post!

$cat justPid.csv
aaaa
bbbb
cccc

$cat uniqPid.csv
aaaa,1234567890
aaaa,aaaaaaaaaa
aaaa,bbbbbbbbbb
bbbb,1234567890
cccc,1234567890
dddd,cccccccccc
ffff,1234567890

using `for` is not advisable, see http://mywiki.wooledge.org/BashFAQ/001 for how to read file... show some 2-3 lines each of `justPid.csv` and `uniqPid.csv` and then show what is your required output.. — Sundeep, Dec 19 '17 at 04:22
My first suggestion would be either do it all in bash or all in awk as there is no reason to use both in this situation — grail, Dec 19 '17 at 06:53
[edit] your question to include concise, testable sample input and expected output. — Ed Morton, Dec 19 '17 at 13:23

Jithin Scaria · Accepted Answer · 2017-12-19T11:21:50.770

1

I am not quite sure this is what you are looking for, but here is what I tried.

input files

$cat justPid.csv
aaaa
bbbb
cccc

$cat uniqPid.csv
aaaa,1234567890
aaaa,aaaaaaaaaa
aaaa,bbbbbbbbbb
bbbb,1234567890
cccc,1234567890
dddd,cccccccccc
ffff,1234567890

script using for loop

for i in $(cat justPid.csv)
do
    awk -v var="$i" -F, '$1==var' uniqPid.csv > "${i}_file.txt"
done

script using while loop

while IFS= read -r i
do
    awk -v var="$i" -F, '$1==var' uniqPid.csv > "${i}_file.txt"
done < justPid.csv

Output

$ cat aaaa_file.txt
aaaa,1234567890
aaaa,aaaaaaaaaa
aaaa,bbbbbbbbbb

$ cat bbbb_file.txt
bbbb,1234567890

$ cat cccc_file.txt
cccc,1234567890

Note: using a for loop to read lines is not advised; see the links "Use a while loop and the read command" and "Don't Read Lines With For".
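
To illustrate why the for loop is discouraged, here is a minimal sketch. The file name ids.txt and its contents are made up for the demonstration; the point is only that `for i in $(cat file)` splits on any whitespace, while `while IFS= read -r` takes one line per iteration.

```shell
# Work in a scratch directory so nothing in the current one is touched.
cd "$(mktemp -d)" || exit 1

# Hypothetical sample: the first ID contains a space.
printf 'aa aa\nbbbb\n' > ids.txt

# for + command substitution splits on any whitespace: three iterations.
for_out=$(for i in $(cat ids.txt); do printf '[%s]\n' "$i"; done)

# while + read consumes exactly one line per iteration: two iterations.
while_out=$(while IFS= read -r i; do printf '[%s]\n' "$i"; done < ids.txt)

printf 'for loop:\n%s\n' "$for_out"
printf 'while loop:\n%s\n' "$while_out"
```

With IDs like aaaa and bbbb that contain no whitespace or glob characters, both loops behave the same, which is why the for loop version above still works on the sample files.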

edited Dec 19 '17 at 11:21

answered Dec 19 '17 at 10:15

Jithin Scaria


http://mywiki.wooledge.org/DontReadLinesWithFor and [use quotes](https://stackoverflow.com/questions/10067266/when-to-wrap-quotes-around-a-shell-variable) – tripleee Dec 19 '17 at 10:49
Awk naturally wants a condition and `{print $0 }'` is the default action so you really should refactor the script to just `awk -v var="$i" -F, '$1 == var'` – tripleee Dec 19 '17 at 10:50
... but if the OP's code is failing, I'm guessing this won't work either. The question doesn't really reveal what's wrong (DOS line endings?) – tripleee Dec 19 '17 at 10:51
Correct, question doesn't reveal that . and edited to have the change mentioned. – Jithin Scaria Dec 19 '17 at 11:11
Guys, Both the answers provided by Jithin and Ed are correct! However my issue was ^M character which I fixed with good old dos2unix command. Control character was messing up my original commands. – user5878832 Dec 19 '17 at 17:52
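
For anyone hitting the same ^M problem, a minimal sketch of how a carriage return breaks the comparison. It assumes the CR was in the ID list (the thread does not say which file had it), and the name justPid_dos.csv is made up for the demonstration.

```shell
# Work in a scratch directory so nothing in the current one is touched.
cd "$(mktemp -d)" || exit 1

# Hypothetical ID list saved with DOS (CRLF) line endings, plus a clean data file.
printf 'aaaa\r\nbbbb\r\n' > justPid_dos.csv
printf 'aaaa,1234567890\nbbbb,1234567890\n' > uniqPid.csv

# With the CR attached, the variable holds "aaaa" plus a CR, which never equals $1.
id=$(head -n 1 justPid_dos.csv)
before=$(awk -v var="$id" -F, '$1 == var { n++ } END { print n + 0 }' uniqPid.csv)

# Strip every carriage return; for a plain CRLF file this has the same effect as dos2unix.
tr -d '\r' < justPid_dos.csv > justPid.csv
id=$(head -n 1 justPid.csv)
after=$(awk -v var="$id" -F, '$1 == var { n++ } END { print n + 0 }' uniqPid.csv)

echo "matches with CR: $before, without CR: $after"
```

If the CR is in the data file instead, it ends up in the last field of each line, so the same tr command on that file is the fix.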

score 0 · Answer 2 · answered Dec 19 '17 at 13:27

Without sample input/output it's just an untested guess, but I THINK all you need is either:

awk -F, '{print > ($1"_file.txt")}' uniqPid.csv

or maybe:

awk -F, 'NR==FNR{a[$1];next} $1 in a{print > ($1"_file.txt")}' justPid.csv uniqPid.csv

So far I don't see any reason for a loop at all. You might need to close the output files as you go but we can address that if/when you provide sample input/output and tell us whether or not you have GNU awk.
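
As a rough sketch of what closing the output files as you go could look like, using the sample uniqPid.csv from the question. This is one possible approach, not necessarily what the answer's author had in mind: it keeps only one output file open at a time and appends with >> so that a PID reappearing later in the input does not truncate its file.

```shell
# Work in a scratch directory so nothing in the current one is touched.
cd "$(mktemp -d)" || exit 1

# Sample data copied from the question.
cat > uniqPid.csv <<'EOF'
aaaa,1234567890
aaaa,aaaaaaaaaa
aaaa,bbbbbbbbbb
bbbb,1234567890
cccc,1234567890
dddd,cccccccccc
ffff,1234567890
EOF

awk -F, '
    { out = $1 "_file.txt" }                                  # output file for this line
    out != prev { if (prev != "") close(prev); prev = out }   # PID changed: close the old file
    { print >> out }                                          # append, so reopening is safe
' uniqPid.csv

wc -l ./*_file.txt
```

Because this appends, output files left over from an earlier run should be removed first. It is cheapest when the input is grouped by PID, since each file is then opened and closed only once.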

Thanks Ed for your suggestions! – user5878832 Dec 19 '17 at 17:53
