I have a tab-delimited text file (animals.txt) with five columns:
302947298 2340974238 0 0 cat
345098948 8345988989 0 0 dog
098982388 2098340923 0 0 fish
932840923 0923840988 0 0 parrot
I have another file, mess.txt.gz, which is compressed with gzip. Decompressed, it looks like one massive string of letters:
sdihfoiahdfosparrotdhiafoihsdfoijaslkdogoieufoiweuf
For every line in the tab-delimited file, I want to check whether that line's animal name (column 5) occurs anywhere in the decompressed contents of the .gz file.
Ideally, the result would be the original line with yes or no appended, like this:
302947298 2340974238 0 0 cat no
345098948 8345988989 0 0 dog yes
098982388 2098340923 0 0 fish no
932840923 0923840988 0 0 parrot yes
At the moment I check each name by hand:
gunzip -cd mess.txt.gz | grep cat
gunzip -cd mess.txt.gz | grep dog
To automate it, I've tried the following:
awk '{print $5}' animals.txt > animal_names.txt
# redirect the whole loop so output.txt is not overwritten on every pass
while read -r line
do
gunzip -cd mess.txt.gz | grep -- "$line"
done < animal_names.txt > output.txt
I've also tried:
while read -r line
do
# test the pipeline's exit status directly; grep -q prints nothing
if gunzip -cd mess.txt.gz | grep -q -- "$line"
then
echo "Yes"
else
echo "No"
fi
done < animal_names.txt > output.txt
What is the best way to do this in bash?
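For reference, here is a minimal sketch of the kind of result I am after, not necessarily the best way. The function name annotate_animals is hypothetical, and it assumes the animal name is always the fifth whitespace-separated column and should be matched as a literal string (grep -F). It decompresses the archive once per name, which is simple but could be slow for a very large file.

```shell
# Hypothetical helper: print each row of a table followed by a tab and
# "yes" or "no", depending on whether column 5 occurs in a gzip file.
annotate_animals() {
    local table="$1"
    local archive="$2"
    local line name
    # "|| [ -n "$line" ]" also handles a last line without a newline.
    while IFS= read -r line || [ -n "$line" ]; do
        # Skip blank lines.
        [ -z "$line" ] && continue
        # Assumption: column 5 holds the animal name.
        name=$(printf '%s\n' "$line" | awk '{print $5}')
        # grep -q: exit status only; -F: literal match, not a regex.
        # An empty name would match everything, so treat it as "no".
        if [ -n "$name" ] && gunzip -c "$archive" | grep -qF -- "$name"; then
            printf '%s\t%s\n' "$line" yes
        else
            printf '%s\t%s\n' "$line" no
        fi
    done < "$table"
}
```

After sourcing this, it could be called as: annotate_animals animals.txt mess.txt.gz > output.txt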