Could any of you modify the code below so that each sequence name in file 1 is searched for in file 2 and, when there is a match, the matching line and the line after it in file 2 are copied to an outfile? Right now the code copies only the matched titles to the outfile, not the following line, which is the sequence. Thanks.
for example:
FILE 1 :
SEQUENCE 1 NAME
SEQUENCE 2 NAME
SEQUENCE 3 NAME
FILE 2:
SEQUENCE 1 NAME
AGTCAGTCAGTCAGTCAGTC
SEQUENCE 2 NAME
AAGGGTTTTCCCCCCAAAAA
SEQUENCE 3 NAME
GGGGTTTTTTTTTTAAAAAC
SEQUENCE 4 NAME
AAGTCCCCCCCCCCAAGGTT
etc.
OUTFILE:
SEQUENCE 1 NAME
AGTCAGTCAGTCAGTCAGTC
SEQUENCE 2 NAME
AAGGGTTTTCCCCCCAAAAA
SEQUENCE 3 NAME
GGGGTTTTTTTTTTAAAAAC
code:
use strict;
use warnings;

my $f1 = 'FILE1.fasta';
open my $file1, '<', $f1 or die "Could not open $f1: $!\n";
my $f2 = 'FILE2.fasta';
open my $file2, '<', $f2 or die "Could not open $f2: $!\n";
my $outfile = 'sequences.fasta';
my @outlines;
my $n = 0;    # number of matched titles

while (my $outer_text = <$file1>) {
    seek($file2, 0, 0);    # rewind file 2 for each name from file 1
    while (my $inner_text = <$file2>) {
        if ($outer_text eq $inner_text) {
            print $outer_text;
            push(@outlines, $outer_text);    # only the title is stored, not the sequence line
            $n++;
        }
    }
}

# '>' opens the outfile for writing
open my $out, '>', $outfile or die "Cannot open $outfile: $!\n";
print {$out} @outlines;
close $out;
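
For reference, here is a minimal sketch of the behaviour described above, not a tested solution. It assumes every sequence sits on exactly one line directly after its name, as in the example. The sub name extract_records and the three command-line arguments (names file, records file, outfile) are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns every line of $records_file that also appears
# in $names_file, each followed by the line right after it (the sequence).
sub extract_records {
    my ($names_file, $records_file) = @_;

    # Load the wanted names into a hash so each lookup is a single hash access.
    open my $names_fh, '<', $names_file
        or die "Could not open $names_file: $!\n";
    my %wanted;
    while (my $name = <$names_fh>) {
        $name =~ s/\r?\n\z//;    # strip the line ending (Unix or Windows)
        $wanted{$name} = 1 if length $name;
    }
    close $names_fh;

    # Read the records file once; on a match, also read the next line.
    open my $rec_fh, '<', $records_file
        or die "Could not open $records_file: $!\n";
    my @out;
    while (my $line = <$rec_fh>) {
        (my $key = $line) =~ s/\r?\n\z//;
        next unless $wanted{$key};
        push @out, "$key\n";
        my $seq = <$rec_fh>;     # assumption: the sequence is the very next line
        if (defined $seq) {
            $seq =~ s/\r?\n\z//;
            push @out, "$seq\n";
        }
    }
    close $rec_fh;
    return @out;
}

# Runs only when called with three arguments: names file, records file, outfile.
if (@ARGV == 3) {
    my ($names_file, $records_file, $outfile) = @ARGV;
    open my $out_fh, '>', $outfile or die "Cannot open $outfile: $!\n";
    print {$out_fh} extract_records($names_file, $records_file);
    close $out_fh or die "Could not close $outfile: $!\n";
}
```

Saved under a name of your choice, it would be called with the two input files and the outfile as arguments, for example FILE1.fasta FILE2.fasta sequences.fasta. Reading file 1 into a hash means file 2 is read only once, instead of being rewound with seek for every name. Sequences that wrap over several lines would need a different loop.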