Hi, I am trying to write dynamic filenames using variable substitution, and I am unable to figure out what I am missing here.

for i in `cat justPid.csv`
 do 
 awk -v var="$i" -F"," '{if ($1==var) {print $0 }}' uniqPid.csv > "$i"file.txt
done

I have also tried the version below and many other combinations, but it won't produce multiple files named after each value of $i.

for i in `cat justPid.csv`
 do 
 awk -v var="$i" -F"," '{if ($1==var) {print $0 }}' uniqPid.csv > ${i}_file.txt
done

Any suggestions?

Edit: My original intent is to split a 27 GB file into manageable chunks based on PID (an identifier in the file) so that it can be loaded into RStudio for analysis. I am working on my laptop, not on a server, hence the need to break it into small files. I am also using the ("new") Ubuntu bash shell on Windows.

The smaller test files I am working on look like what Jithin has posted. I will try out the suggestions and will update this post!

$cat justPid.csv
aaaa
bbbb
cccc

$cat uniqPid.csv
aaaa,1234567890
aaaa,aaaaaaaaaa
aaaa,bbbbbbbbbb
bbbb,1234567890
cccc,1234567890
dddd,cccccccccc
ffff,1234567890
  • using `for` is not advisable, see http://mywiki.wooledge.org/BashFAQ/001 for how to read file... show some 2-3 lines each of `justPid.csv` and `uniqPid.csv` and then show what is your required output.. – Sundeep Dec 19 '17 at 04:22
  • My first suggestion would be to do it either all in bash or all in awk, as there is no reason to use both in this situation – grail Dec 19 '17 at 06:53
  • [edit] your question to include concise, testable sample input and expected output. – Ed Morton Dec 19 '17 at 13:23

2 Answers

I am not quite sure this is what you are looking for, but assume the following input files and scripts.

input files

$cat justPid.csv
aaaa
bbbb
cccc

$cat uniqPid.csv
aaaa,1234567890
aaaa,aaaaaaaaaa
aaaa,bbbbbbbbbb
bbbb,1234567890
cccc,1234567890
dddd,cccccccccc
ffff,1234567890

script using for loop

for i in $(cat justPid.csv)
do
    # quote the expansions so unusual characters in $i cannot break the command
    awk -v var="$i" -F, '$1==var' uniqPid.csv > "${i}_file.txt"
done

script using while loop

while IFS= read -r i
do
    # quote the expansions so unusual characters in $i cannot break the command
    awk -v var="$i" -F, '$1==var' uniqPid.csv > "${i}_file.txt"
done < justPid.csv

Output

$ cat aaaa_file.txt
aaaa,1234567890
aaaa,aaaaaaaaaa
aaaa,bbbbbbbbbb

$ cat bbbb_file.txt
bbbb,1234567890

$ cat cccc_file.txt
cccc,1234567890

Note: reading lines with a for loop is not advised; see the linked pages "Use a while loop and the read command" and "Don't Read Lines With For".
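
To illustrate the note above, here is a minimal sketch of why the two loops can differ. The file `ids.txt` and its contents are made up for the demonstration: each line contains a space, which the sample PIDs in the question do not. The for loop iterates over whitespace-separated words, while `while IFS= read -r` iterates over whole lines.

```shell
# Hypothetical demo file: two lines, each containing a space
tmp=$(mktemp -d)
printf 'pid one\npid two\n' > "$tmp/ids.txt"

# for + cat splits on any whitespace, so it sees four words
for_count=0
for i in $(cat "$tmp/ids.txt")
do
    for_count=$((for_count + 1))
done

# while + read processes one line per iteration, so it sees two lines
while_count=0
while IFS= read -r line
do
    while_count=$((while_count + 1))
done < "$tmp/ids.txt"

echo "for: $for_count iterations, while: $while_count iterations"
# prints "for: 4 iterations, while: 2 iterations"
```

With one-word lines like those in justPid.csv both loops iterate the same number of times, which is why both scripts above give the same output on the sample data.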

Jithin Scaria
  • http://mywiki.wooledge.org/DontReadLinesWithFor and [use quotes](https://stackoverflow.com/questions/10067266/when-to-wrap-quotes-around-a-shell-variable) – tripleee Dec 19 '17 at 10:49
  • Awk naturally wants a condition, and `{print $0}` is the default action, so you really should refactor the script to just `awk -v var="$i" -F, '$1 == var'` – tripleee Dec 19 '17 at 10:50
  • ... but if the OP's code is failing, I'm guessing this won't work either. The question doesn't really reveal what's wrong (DOS line endings?) – tripleee Dec 19 '17 at 10:51
  • Correct, question doesn't reveal that . and edited to have the change mentioned. – Jithin Scaria Dec 19 '17 at 11:11
  • Guys, both the answers provided by Jithin and Ed are correct! However, my issue was the ^M character, which I fixed with the good old dos2unix command. The control character was messing up my original commands. – user5878832 Dec 19 '17 at 17:52
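
For readers who hit the same ^M problem, the following is a rough sketch of how DOS line endings could be detected and stripped before running the loop. The temporary directory, the demo data, and the name `justPid.unix.csv` are assumptions made for illustration; `tr -d '\r'` is used here as a stand-in for dos2unix, which does the same job where it is installed. With a trailing carriage return in each line, both `var` and the output filename would carry the stray character.

```shell
# Hypothetical demo data: justPid.csv is written with DOS (CRLF) line endings
tmp=$(mktemp -d)
printf 'aaaa\r\nbbbb\r\n' > "$tmp/justPid.csv"
printf 'aaaa,1234567890\nbbbb,1234567890\ncccc,1234567890\n' > "$tmp/uniqPid.csv"

cr=$(printf '\r')   # a literal carriage return, i.e. the ^M character

# Count the lines that end in a carriage return (0 would mean a clean file)
cr_lines=$(grep -c "${cr}\$" "$tmp/justPid.csv")
echo "lines with a trailing CR: $cr_lines"

# Strip the carriage returns into a cleaned copy
tr -d '\r' < "$tmp/justPid.csv" > "$tmp/justPid.unix.csv"

# Run the while loop from the answer against the cleaned copy
while IFS= read -r i
do
    awk -v var="$i" -F, '$1 == var' "$tmp/uniqPid.csv" > "$tmp/${i}_file.txt"
done < "$tmp/justPid.unix.csv"
```

Another quick check is `cat -A justPid.csv` with GNU cat, which displays a carriage return as `^M` before the end-of-line `$`.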

Without sample input/output it's just an untested guess, but I think all you need is either:

awk -F, '{print > ($1"_file.txt")}' uniqPid.csv

or maybe:

awk -F, 'NR==FNR{a[$1];next} $1 in a{print > ($1"_file.txt")}' justPid.csv uniqPid.csv

So far I don't see any reason for a loop at all. You might need to close the output files as you go but we can address that if/when you provide sample input/output and tell us whether or not you have GNU awk.
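
As a sketch of what closing the output files as you go might look like (one possible approach, not necessarily the one intended above): non-GNU awks can fail with a "too many open files" error when there are many distinct PIDs, whereas GNU awk manages this itself at some cost in speed. The version below closes each file right after writing and therefore appends with `>>`. It assumes the `*_file.txt` outputs do not already exist, since old content would otherwise be kept. The temporary directory and demo data are made up.

```shell
# Hypothetical demo data, with PIDs deliberately not grouped together
tmp=$(mktemp -d)
printf 'aaaa,1234567890\nbbbb,1234567890\naaaa,bbbbbbbbbb\n' > "$tmp/uniqPid.csv"

# Run in a subshell so the relative output names land in the temp directory
(
    cd "$tmp" || exit 1
    awk -F, '{
        out = $1 "_file.txt"
        print >> out    # append, because the file is reopened for every line
        close(out)      # keep at most one output file open at a time
    }' uniqPid.csv
)

cat "$tmp/aaaa_file.txt"
```

Reopening a file for every line is slow on large inputs. If the input is sorted or grouped by PID, a cheaper variant is to close the previous file only when `$1` changes.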

Ed Morton