I have one file which has list of IDs, index/header, let's call it list.txt
TRINITY_DN10002_c0_g1_i1.p1
TRINITY_DN10002_c0_g1_i2.p1
TRINITY_DN10002_c1_g1_i1.p1
TRINITY_DN10006_c0_g1_i1.p1
TRINITY_DN10006_c0_g1_i4.p1
TRINITY_DN10007_c0_g1_i2.p1
TRINITY_DN1000_c0_g1_i15.p1
TRINITY_DN1000_c0_g1_i31.p1
TRINITY_DN1000_c0_g1_i40.p2
TRINITY_DN1000_c0_g2_i1.p1
TRINITY_DN10012_c2_g1_i1.p1
TRINITY_DN10014_c0_g2_i1.p1
TRINITY_DN10014_c2_g1_i1.p1
TRINITY_DN10014_c2_g2_i1.p1
and another file (large) which has the data/information (sequences) I want to extract from, calling it as datafile.fasta
CAGTCAATAATTTTGACGTACTTTTCAAAACATTTCTTGCTGTTTCTTCAAAACTCTTAT
CTAATTTTTTGTTTTTCAGAAGGTTAAAAGTATATTGAAAAGTTTCTCGATTTTCAGTGG
TTAGTCTAGGATTGCTATTATTTTCAATTTGGTGTTTGCATTCTTCAAACATTACCATCA
GATCATATCGTGTTATTTCATACTGCATTTCTTTGATTTCCTGCTTTGACAAAGTTGTTA
TCGATTTTTTCATCCTGTTTCTTAATTTTGTCAAACCAT
>TRINITY_DN36094_c0_g1_i1 len=250 path=[0:0-249]
AGGAAAGTACTTAGTGAGTACATACCTGACAGTGATCACGGTTTTACGAAAAGATACATT
GAAAGAAGAATACTTCAACCTCCATTAAAACGATACACGTATTGCTTGCTCACTTCCATG
TCTTATGGCGCTTCAGACTGCTATGATCCAAATTTTTGCGGTCCCATCTTTGGTAAACGA
AATTGTTACAAGCGGTCCAAATTTACAGATAATCATTTCCCAGATATAAATCTCGGCGGG
AAACTTCATC
>TRINITY_DN36018_c0_g1_i2 len=1265 path=[0:0-408 1:409-443 2:444-1264]
AAAAGCTCCAATTGTTTGGTGGACTCCCGCTGGCTAGTCAGGATAACAGATTACGGTCTG
CCAAGTTTTAGTAGCGGCCAGTTTTTTGACGAATCAGAACAAGATGCATATAGACGTAAA
CTTTGGACCGCCCCAGAATTATTGAGGGAGAACATACCCCCTAAGAACGGTTCCCAAAAG
GGTGACGTATACAGTTTCGCCATAGTGGCGTACGAGATTATAACAAGGTCTGAACCATTT
CCCTTTGATTTAATGACTGCCAGAGACGCGGTAAATAGGGTAAGAAACGGTGAAAGCATC
CCGTTCAGACCATGCCTACCCGAGACAACGGATGTTGGCAAAGCTGTCCTTGACCTCTTG
CGAGCTTGTTGGCATGAGGTTCCAGAACACAGACCCAACTTCAACCAGGTTCGAACTGTT
[....]
I am trying to get an output which should look like this
GCGTACGAGATTATAACAAGGTCTGAACCATTT
CCCTTTGATTTAATGACTGCCAGAGACGCGGTAA
TRINITY_DN10002_c0_g1_i1.p1
TCCATTAAAACGATACACGTATTGCTTGCTCACTTCCATG
TCTTATGGCGCTTCAGACTGCTA
to illustrate a few.
How can I use the list file to extract out my sequences?
I tried grep -Fwf list.txt -A1 datafile.fasta > NRCDS_nuc.fasta
but when I looked at the output file it comes out empty.
But trying samtools faidx tirnity_out.Trinity.fasta
and manually add the id's seems taxing.
Any help would be appreciated, thank you!