I am trying to replace all characters that are not C, T, A or G with an N in the sequence part of a FASTA file, i.e. every 2nd line.
I think some combination of awk and tr is what I would need...
To print every other line:
awk '{if (NR % 2 == 0) print $0}' myfile
To replace these characters with an N:
tr YRHIQ- N
...but I don't know how to combine them so that the character replacement is applied only to every 2nd line while every line is still printed.
This is the sort of input I have:
>SEQUENCE_1
AGCYGTQA-TGCTG
>SEQUENCE_2
AGGYGTQA-TGCTC
and I want it to look like this:
>SEQUENCE_1
AGCNGTNANTGCTG
>SEQUENCE_2
AGGNGTNANTGCTC
but not like this:
>SENUENCE_1
AGCNGTNANTGCTG
>SENUENCE_2
AGGNGTNANTGCTC
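
A minimal sketch of one way the two steps might be combined in a single awk call, using gsub instead of tr. It assumes the file strictly alternates header and sequence lines (so sequences sit on even-numbered lines) and that the bases are upper case. The file name sample.fasta and the variable result are hypothetical and only there to make the example self-contained.

```shell
# Hypothetical sample file holding the input shown in the question.
printf '>SEQUENCE_1\nAGCYGTQA-TGCTG\n>SEQUENCE_2\nAGGYGTQA-TGCTC\n' > sample.fasta

# On even-numbered lines, replace every character that is not A, C, G or T
# with N; the second block then prints every line, changed or not.
result=$(awk 'NR % 2 == 0 { gsub(/[^ACGT]/, "N") } { print }' sample.fasta)
printf '%s\n' "$result"
```

Because the substitution only runs when NR is even, the header lines pass through untouched, so the Q in SEQUENCE is not turned into an N. If the sequences can wrap over several lines, selecting on the header marker instead of the line number may be safer, for example with the condition !/^>/ in place of NR % 2 == 0.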