
Given a text I/O stream (e.g. from `open()` or `StringIO()`), how do I create another stream that filters out lines that match a certain pattern, without reading the entire input stream first? I know that I can easily get an iterable with `(line for line in input if filter(line))`, but I would like a seekable stream. I also understand that seeking would require reading the entire stream up to that point, even if the underlying stream allowed random access, but this is still better than reading the entire file as in `StringIO("".join(line for line in input if filter(line)))`.
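To make the trade-off concrete, here is a minimal sketch of the two approaches mentioned above. The predicate `keep` and the sample text are made up for illustration; `keep` stands in for whatever filter is actually needed.

```python
import io


def keep(line):
    # Hypothetical predicate for illustration: drop comment lines.
    return not line.startswith("#")


text = "a\n# note\nb\n"

# Lazy but not seekable: each line is filtered only when it is consumed.
lazy = (line for line in io.StringIO(text) if keep(line))
first = next(lazy)

# Seekable but eager: the join consumes the whole input before StringIO is built.
eager = io.StringIO("".join(line for line in io.StringIO(text) if keep(line)))
eager.seek(2)
rest = eager.read()
```

The generator never looks past the line it is asked for, while the `StringIO` version supports `seek` only because the full filtered text already sits in memory.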

(As an add-on, pointers on how to memoize repeated seeks would be welcome!)

Uri Granta
    For your add-on question: http://stackoverflow.com/questions/1988804/what-is-memoization-and-how-can-i-use-it-in-python – Robᵩ Aug 15 '16 at 13:52
    Probably you should subclass `io.TextIOWrapper` and override its `readline` (or even `read`?) method. Overriding `readline` is simpler, but it only works when the stream is read line by line. If you use `io.BufferedRandom` as the wrapped stream, you should be able to `seek` freely. The problem, though, is that you can always `seek` into the middle of a line. – MarSoft Sep 14 '17 at 11:14
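One possible way to approach this, sketched under some assumptions: instead of subclassing `io.TextIOWrapper` as the comment suggests, the class below subclasses `io.TextIOBase` and pulls lines from the source only on demand. The name `FilteredTextStream` and the `keep` predicate are hypothetical. Positions are character offsets into the filtered text, and everything filtered so far is kept in an in-memory cache, so repeated or backward seeks never touch the source again (at the cost of holding that text in memory).

```python
import io


class FilteredTextStream(io.TextIOBase):
    """Hypothetical lazy, seekable view of the lines of `source` accepted by `keep`."""

    def __init__(self, source, keep):
        self._lines = iter(source)   # underlying stream, consumed line by line
        self._keep = keep            # predicate: True for lines to retain
        self._cache = io.StringIO()  # filtered text produced so far
        self._size = 0               # number of characters in the cache
        self._pos = 0                # current position in the filtered text

    def readable(self):
        return True

    def seekable(self):
        return True

    def _fill(self, target=None):
        # Filter source lines into the cache until it holds at least `target`
        # characters, or until the source is exhausted when target is None.
        self._cache.seek(0, io.SEEK_END)
        while target is None or self._size < target:
            line = next(self._lines, None)
            if line is None:
                break
            if self._keep(line):
                self._size += self._cache.write(line)

    def read(self, size=-1):
        if size is None or size < 0:
            self._fill()
            end = self._size
        else:
            self._fill(self._pos + size)
            end = min(self._pos + size, self._size)
        end = max(end, self._pos)    # reading past the end yields ""
        self._cache.seek(self._pos)
        data = self._cache.read(end - self._pos)
        self._pos = end
        return data

    def readline(self, size=-1):
        # The cache only ever holds whole lines, so one character past the
        # current position is enough to guarantee a complete line.
        self._fill(self._pos + 1)
        self._cache.seek(self._pos)
        line = self._cache.readline(-1 if size is None else size)
        self._pos += len(line)
        return line

    def seek(self, offset, whence=io.SEEK_SET):
        if whence == io.SEEK_CUR:
            offset += self._pos
        elif whence == io.SEEK_END:
            self._fill()             # the end is only known after a full read
            offset += self._size
        if offset < 0:
            raise ValueError("negative seek position")
        self._pos = offset           # no reading happens until the next read
        return self._pos

    def tell(self):
        return self._pos


source = io.StringIO("keep 1\n# skip\nkeep 2\n# skip too\nkeep 3\n")
stream = FilteredTextStream(source, lambda line: not line.startswith("#"))
```

Because `seek` only records the new position, the source is read no further than the next `read` or `readline` actually requires; seeking relative to the end is the one case that forces a full pass. Seeking into the middle of a line is allowed and simply returns the remainder of that line.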

0 Answers