
I am reading xyz trajectory files a lot. These files are structured so that the information corresponding to one time frame is stored in N lines.

I would like to write an iterator similar to:

file=open(...)
for line in file:
   analyze(line)

but reading N lines at once:

file=Myopen(...,N=n)
for Nlines in file:
    analyze(Nlines)

Since the files are huge, I do not want to read the whole file into memory; the purpose is not to gain efficiency but to have clean, reusable code. Of course, one could check whether index % N == 0 and analyze when it is true, but I am a bit sick of writing those few lines over, and over, and over....
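To make that concrete, this is roughly the boilerplate I keep rewriting (the file name "traj.xyz" is just a placeholder):

chunk = []
for index, line in enumerate(open("traj.xyz"), start=1):
    chunk.append(line)
    if index % N == 0:      # N lines collected: one complete time frame
        analyze(chunk)
        chunk = []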

Comments and answers are more than appreciated!

user2393987

2 Answers


The itertools documentation has a recipe for a generator function that does what you want:

from itertools import zip_longest

def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)

If you don't need to handle files that aren't an exact multiple of N lines long, you can simplify things a bit and just use for nlines in zip(*[file]*N) directly in your code.
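For the trajectory files above, the recipe can be applied to the open file object directly; a minimal sketch, assuming the grouper definition above and a placeholder file name:

with open("traj.xyz") as file:
    for nlines in grouper(file, 3, fillvalue=""):
        # nlines is a tuple of 3 consecutive lines, padded with "" at the end
        analyze("".join(nlines))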

Blckknght

For instance, reading N lines by hand:

file = open(...)
Nlines = []
for i in range(N):
    Nlines.append(file.readline())
analyze(''.join(Nlines))
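To process the whole file this way rather than a single block, the same idea can be wrapped in a loop that stops at the end of the file; a sketch, with placeholder names:

N = 5                                   # lines per time frame (example value)
with open("traj.xyz") as file:          # "traj.xyz" is a placeholder
    while True:
        Nlines = [file.readline() for _ in range(N)]
        if not Nlines[0]:               # readline() returns '' at end of file
            break
        analyze(''.join(Nlines))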
stovfl