I want to remove the last character of every line that begins with @ in more than 300 files, each about 1 GB.
My example file is as follows:
@1_1101_1473_2134_1
CATGCGGGAGGAGGAGGACGAGGACCTGCTGCAGTTTGCCATCCAGCAGAGTCTCCTGGAGGTGGGGGCCGAGTACGACCAGGTAACACCCC
+
FFFFFFFFFFFFFFFFFFFFBFFFFFFFFFFFFFFFFFFFFFFFFFFFFBFFBBFFFFF<FFFFFF/BFBF7FFBFFFFFFFFFFBFFFFFF
@1_1101_1635_2243_1
CATGCACACCTCCCGGTCTCCGTTGTGGAGGATCAGGTCCACGATCTCCTGGGTCCACGTGGTGCCTACACACACACACACACACACACACA
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
I want to remove the last character (the 1) from the lines that start with @, so my output should be:
@1_1101_1473_2134_
CATGCGGGAGGAGGAGGACGAGGACCTGCTGCAGTTTGCCATCCAGCAGAGTCTCCTGGAGGTGGGGGCCGAGTACGACCAGGTAACACCCC
+
FFFFFFFFFFFFFFFFFFFFBFFFFFFFFFFFFFFFFFFFFFFFFFFFFBFFBBFFFFF<FFFFFF/BFBF7FFBFFFFFFFFFFBFFFFFF
@1_1101_1635_2243_
CATGCACACCTCCCGGTCTCCGTTGTGGAGGATCAGGTCCACGATCTCCTGGGTCCACGTGGTGCCTACACACACACACACACACACACACA
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
I first tried Python, which worked for the @ lines, but as a newbie I couldn't figure out how to keep all the other lines in the output.
with open("file.fq") as f:
    for line in f:
        length=(len(line)-2)
        if line.startswith('@'):
            line=line[:length]+''+line[length+1:]
            print(line)
This of course prints only the @ lines, but I wanted to show that it works:
@1_1101_1473_2134_
@1_1101_1635_2243_
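A minimal sketch of how the Python attempt could keep every line: write each line to an output file and change only the headers. The function name trim_header_lines and both file paths are hypothetical. It assumes strict 4-line FASTQ records as in the sample, and it uses the line position as well as the leading @, because a quality line may itself start with @.

```python
def trim_header_lines(src_path, dst_path):
    """Copy src_path to dst_path, dropping the last character of each header line."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for index, line in enumerate(src):
            # Assumption: strict 4-line records, so headers sit at line
            # indices 0, 4, 8, ... Quality lines can also begin with '@',
            # so the position check avoids touching them.
            if index % 4 == 0 and line.startswith("@"):
                body = line.rstrip("\n")
                line = body[:-1] + "\n"
            # Write every line, changed or not, so nothing is lost.
            dst.write(line)
```

Called as trim_header_lines("file.fq", "file.trimmed.fq"), this would stream the file one line at a time, so memory use stays small even for a 1 GB input.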
Then I tried awk and sed. I can select the lines that start with @ using awk as follows:
awk '{if (/^@/)}'
And I can remove the last characters of each line with sed as:
sed {'s/.$//'}
So I tried combining the two, simply as:
awk '{if (/^@/)}' | sed {'s/.$//'} file.fq
Which does not work.
If possible, I would prefer to delete these characters directly in my files rather than create new files with the characters removed, since I have over 300 GB of data. Naturally, I would also prefer a fast way of doing it.
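A note on what "directly in the files" can mean in practice: removing a byte shifts everything after it, so a true in-place deletion is not available, and tools such as GNU sed -i also write a temporary file and rename it over the original. The sketch below does the same in Python, so only one extra copy of one file exists at any moment. The function name trim_headers_in_place is hypothetical, and the code assumes strict 4-line FASTQ records with Unix line endings.

```python
import os
import shutil
import tempfile


def trim_headers_in_place(path):
    """Rewrite path so that each header line loses its last character."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file is created next to the original so that the final
    # os.replace stays on one filesystem and is a plain rename.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as dst, open(path, "rb") as src:
            for index, line in enumerate(src):
                # Assumption: headers are lines 0, 4, 8, ... and end in "\n".
                if index % 4 == 0 and line.startswith(b"@"):
                    line = line.rstrip(b"\n")[:-1] + b"\n"
                dst.write(line)
        # mkstemp creates the file with restrictive permissions; copy the
        # original mode before swapping the files.
        shutil.copymode(path, tmp_path)
        os.replace(tmp_path, path)
    except BaseException:
        # Leave the original untouched and clean up the partial copy.
        os.unlink(tmp_path)
        raise
```

Binary mode is used here to skip text decoding, which may help a little with speed on large files. The original is only replaced after the new copy has been written completely.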
Any help improving my commands, or any alternative approach, is highly appreciated. I also want to run the correct command in a loop over all the files, which is why I first tried writing a Python script, so any help with the loop stage of your solution would also be great.
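For the loop stage, a rough sketch of what I have in mind, assuming all files sit in one directory and share the .fq extension. The names trim_headers and trim_all, the "*.fq" pattern, and the ".tmp" suffix are all hypothetical, and the per-file step again assumes strict 4-line FASTQ records with Unix line endings.

```python
import os
from pathlib import Path


def trim_headers(path):
    """Rewrite one FASTQ file with the last character of each header removed."""
    tmp_path = path.with_name(path.name + ".tmp")  # hypothetical temp name
    with open(path, "rb") as src, open(tmp_path, "wb") as dst:
        for index, line in enumerate(src):
            # Assumption: headers are lines 0, 4, 8, ... of the file.
            if index % 4 == 0 and line.startswith(b"@"):
                line = line.rstrip(b"\n")[:-1] + b"\n"
            dst.write(line)
    # Swap the finished copy over the original.
    os.replace(tmp_path, path)


def trim_all(directory, pattern="*.fq"):
    """Apply trim_headers to every file in directory that matches pattern."""
    processed = []
    # sorted() collects the matches first, so temp files created during
    # the loop are never picked up.
    for path in sorted(Path(directory).glob(pattern)):
        trim_headers(path)
        processed.append(path.name)
    return processed
```

Calling trim_all("/path/to/fastq") would process the files one after another and return their names. Note that the operation is not idempotent: running it twice on the same file removes two characters from each header.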
Many thanks.