I have been facing a problem with Python for a few days. I am a bioinformatician with no basic programming skills, and I am working with huge text files (approx. 25 GB) that I have to process.
I have to read the text file line by line, in groups of 4 lines at a time: the first 4 lines have to be read and processed, then the second group of 4 lines, and so on.
Obviously I cannot use the readlines() method because it would load the whole file into memory, and I have to use each of the 4 lines for some string recognition.
I thought about using a for loop with the range function:
openfile = open(path, 'r')
for elem in range(0, len(openfile), 4):
    line1 = openfile.readline()
    line2 = openfile.readline()
    line3 = openfile.readline()
    line4 = openfile.readline()
    (process lines...)
Unfortunately this is not possible: a file opened in reading mode has no length, so len(openfile) fails, and it cannot be indexed like a list or a dictionary.
Can anybody please help me write this loop properly?
Thanks in advance
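
One possible approach, as a minimal sketch rather than a definitive solution: a file object can be iterated line by line even though it has no length, so itertools.islice can pull 4 lines at a time without loading the whole file. The names read_in_groups and count_complete_groups are hypothetical helpers made up for this illustration, and the counting step is only a stand-in for the real string recognition. The sketch assumes that a final group with fewer than 4 lines should still be returned, so that the caller can decide what to do with it.

```python
import io
from itertools import islice


def read_in_groups(handle, size=4):
    """Yield lists of up to `size` consecutive lines from an open file.

    Only one group is held in memory at a time, so the file size does not matter.
    """
    while True:
        # islice pulls at most `size` lines from the file iterator.
        group = [line.rstrip("\n") for line in islice(handle, size)]
        if not group:  # nothing left to read
            return
        yield group


def count_complete_groups(handle, size=4):
    """Hypothetical processing step: count the groups that have all `size` lines."""
    return sum(1 for group in read_in_groups(handle, size) if len(group) == size)


# Small in-memory stand-in for the real file. With a file on disk, use:
#     with open(path, "r") as handle:
#         for group in read_in_groups(handle):
#             ...process group[0] to group[3]...
sample = io.StringIO("a1\na2\na3\na4\nb1\nb2\nb3\nb4\nc1\n")
groups = list(read_in_groups(sample))
```

In this sample the last group holds a single line because the input does not divide evenly by 4; checking len(group) before processing is one way to catch a truncated file.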