Hi, I have a situation similar to "Grep group of lines", but slightly different.
I have a file in the following format:
>xxxx AB=AAA NNN xxxx CD=DDD xxxxx
xxx
xxx
xxx
xxx
xxx
>xxxx AB=AAA JJJ xxxx CD=EEE xxxxx
xxx
xxx
xxx
xxx
xxx
xxx
xxx
>xxxx AB=AAA NNN xxxx CD=FFF xxxxx
xxx
xxx
xxx
xxx
>xxxx AB=EEE FFF xxxx CD=GGG xxxxx
xxx
xxx
xxx
xxx
xxx
xxx
(Each item starts with a > line and does not necessarily contain the same number of xxx lines. Each xxx line is a string of capital letters. The only cue that an item is complete is that the next line starts with >.)
First, I want to grep all items with AB=EEE FFF into a new file, like this:
>xxxx AB=EEE FFF xxxx CD=GGG xxxxx
xxx
xxx
xxx
xxx
xxx
>xxxx AB=EEE FFF xxxx CD=TTT xxxxx
xxx
xxx
xxx
xxx
>xxxx AB=EEE FFF xxxx CD=EEE xxxxx
xxx
xxx
xxx
xxx
xxx
xxx
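To make the first step concrete, here is a rough Python sketch of the filtering I have in mind. The function names (iter_records, filter_by_header) are made up for illustration, and it assumes an item matches when its > line contains the literal text "AB=EEE FFF" (a plain substring test, so it would also match a longer value that merely starts with that text).

```python
def iter_records(lines):
    """Yield (header, body_lines) pairs; an item starts at a line beginning with '>'."""
    header, body = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, body
            header, body = line, []
        elif header is not None:
            body.append(line)
    # Emit the last item, which has no following '>' line to close it.
    if header is not None:
        yield header, body


def filter_by_header(lines, needle):
    """Return all lines of every item whose header contains `needle` (assumed substring match)."""
    out = []
    for header, body in iter_records(lines):
        if needle in header:
            out.append(header)
            out.extend(body)
    return out


# Tiny illustrative input in the same placeholder format as above.
sample = [
    ">xxxx AB=AAA NNN xxxx CD=DDD xxxxx",
    "xxx",
    ">xxxx AB=EEE FFF xxxx CD=GGG xxxxx",
    "xxx",
    "xxx",
]
print("\n".join(filter_by_header(sample, "AB=EEE FFF")))
```

With a real file, the same function could be fed an open file handle instead of the list, and the returned lines written to the output file.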
Then, I have a CSV file with a list of CD values, and I want to grep all items with CD=xxx, where xxx is any line in the CSV file.
A sample of an item is:
>sp|P01023|A2MG_HUMAN Alpha-2-macroglobulin OS=Homo sapiens OX=9606 GN=A2M PE=1 SV=3
MGKNKLLHPSLVLLLLVLLPTDASVSGKPQYMVLVPSLLHTETTEKGCVLLSYLNETVTV
SASLESVRGNRSLFTDLEAENDVLHCVAFAVPKSSSNEEVMFLTVQVKGPTQEFKKRTTV
MVKNEDSLVFVQTDKSIYKPGQTVKFR
AB in my example refers to OS here, and CD refers to GN (so it's a single string containing capital letters and/or digits).
My CSV file looks like this (with ~1000 lines):
A2M
AIF1
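For the second step, this is roughly what I am after, again as a Python sketch rather than a finished solution. The names load_wanted and filter_by_gene are hypothetical, the second header in the demo is invented for illustration, and it assumes the gene name is the run of non-space characters right after "GN=" and that the CSV has one name per line in its first column.

```python
import csv
import io
import re

# Assumption: the GN value is a single token with no spaces.
GN_PATTERN = re.compile(r"\bGN=(\S+)")


def load_wanted(handle):
    """Collect the first column of every non-empty CSV row into a set."""
    return {row[0].strip() for row in csv.reader(handle) if row and row[0].strip()}


def filter_by_gene(lines, wanted):
    """Keep every item (header plus sequence lines) whose GN value is in `wanted`."""
    out = []
    keep = False
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # A new item starts here; decide once whether to keep it.
            match = GN_PATTERN.search(line)
            keep = bool(match) and match.group(1) in wanted
        if keep:
            out.append(line)
    return out


# In-memory stand-ins for the CSV file and the sequence file.
wanted = load_wanted(io.StringIO("A2M\nAIF1\n"))
records = [
    ">sp|P01023|A2MG_HUMAN Alpha-2-macroglobulin OS=Homo sapiens OX=9606 GN=A2M PE=1 SV=3",
    "MGKNKLLHPSLVLLLLVLLPTDASVSGKPQYMVLVPSLLHTETTEKGCVLLSYLNETVTV",
    ">xxxx OS=Homo sapiens OX=9606 GN=OTHER PE=1 SV=1",  # invented header
    "AAAA",
]
print("\n".join(filter_by_gene(records, wanted)))
```

Using a set for the ~1000 names keeps each header check cheap, since membership lookup does not scan the whole list.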
Thanks a lot!