I have hundreds of thousands of files that I want to run the following sed command on:
sed -s -i -e 's/[[:space:]].*//' -e '1 s/^/>/g' -e '3 s/|*//g' -e '3 s/^/>ref/g' -e '1h;2H;1,2d;4G'
So far, I have been using a bash loop:
for i in read_* ; do
    sed -s -i -e 's/[[:space:]].*//' -e '1 s/^/>/g' -e '3 s/|*//g' -e '3 s/^/>ref/g' -e '1h;2H;1,2d;4G' "$i"
    mv "$i" "$i.fasta"
done
How can I use GNU Parallel to speed this up?
ls read_* > list.read.txt
parallel -j $cores -a list.read.txt sed -s -i -e 's/[[:space:]].*//' -e '1 s/^/>/g' -e '3 s/|*//g' -e '3 s/^/>ref/g' -e '1h;2H;1,2d;4G' []
I tried the approach above, where I create a list of files to iterate over and run 10 jobs at once, but I get sed-related errors.
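For reference, here is a minimal sketch of how the per-file sed and mv steps might be handed to GNU Parallel. It is not a confirmed fix. It relies on `{}` being GNU Parallel's default replacement string for the current input item (the attempt above uses `[]`). The scratch directory, the four-line sample file `read_1`, the script file name `fix.sed`, and the default of 4 jobs are all made up for illustration. The sed expressions are moved into a script file only to avoid nested quoting, and the `xargs -P` branch exists only so the sketch still runs where GNU Parallel is not installed.

```shell
#!/usr/bin/env bash
# Sketch only: scratch directory, sample file, fix.sed and the job count are hypothetical.
set -euo pipefail

cd "$(mktemp -d)"                                    # throwaway directory for the demo
printf '%s\n' 'read1 extra' 'ACGT' '|chr1| extra' 'TTTT' > read_1   # made-up 4-line input

jobs=${cores:-4}                                     # simultaneous jobs; 4 is an arbitrary default

# The same sed expressions as in the loop, one per line in a script file,
# so no quoting has to survive the trip through parallel or xargs.
cat > fix.sed <<'EOF'
s/[[:space:]].*//
1 s/^/>/g
3 s/|*//g
3 s/^/>ref/g
1h;2H;1,2d;4G
EOF

# printf is a shell builtin, so a very long file list does not hit the
# command-line length limit; names are NUL-separated to survive odd characters.
if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    # {} is replaced by each file name; sed and mv run as one job per file.
    printf '%s\0' read_* | parallel -0 -j "$jobs" 'sed -s -i -f fix.sed {} && mv {} {}.fasta'
else
    # Fallback without GNU Parallel: xargs runs up to $jobs processes at once.
    printf '%s\0' read_* | xargs -0 -P "$jobs" -n 1 sh -c 'sed -s -i -f fix.sed "$1" && mv "$1" "$1.fasta"' _
fi

cat read_1.fasta
```

With the made-up sample file, the final cat prints the four lines `>refchr1`, `TTTT`, `>read1`, `ACGT`, which is what the original loop produces for the same input. Running sed and mv inside the same job keeps the rename tied to a successful edit of that file.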