I need to validate a text file uploaded to S3. I can read all the lines in the file this way:

    for line in response['Body'].iter_lines(chunk_size=50, keepends=False):
        self.validate(line.decode("utf-8"))

But the text files can be very large, with up to 100,000 irrelevant lines, and I only need to validate the first 10 and the last 2 lines. Is there a way to read only those lines to save time and memory?

smyer
  • You could try [this](https://stackoverflow.com/a/36998080/12096138) but with `head -10` and `tail -2` maybe? – ImranD Jun 01 '22 at 10:04
  • It's easy to know if you've read (are reading) the first N lines but how do you know when you've got the last 2 without having read the entire file? – DarkKnight Jun 01 '22 at 10:09
  • Surely the `response` is already in your RAM, so the memory is consumed already to hold it. I guess you could `seek` backwards from the end of the buffer by around 10 lines (in bytes) and start looking for the last two from there. – Mark Setchell Jun 01 '22 at 10:37
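One way to act on the idea in the comments, as a rough sketch rather than a tested solution: S3 supports ranged reads through the `Range` parameter of `get_object`, so it should be possible to fetch a small block from the start of the object and another from the end instead of streaming the whole body. The helper names `first_lines` and `last_lines`, the 4096-byte `chunk` size, and the assumptions that the file is UTF-8 and newline-delimited are illustrative and not part of the original question.

```python
def first_lines(client, bucket, key, n=10, chunk=4096):
    """Return the first n lines, fetching only as many bytes as needed."""
    size = client.head_object(Bucket=bucket, Key=key)["ContentLength"]
    data = b""
    # Keep extending the prefix until it holds n complete lines
    # (n newlines) or the whole object has been read.
    while len(data) < size and data.count(b"\n") < n:
        end = min(len(data) + chunk, size) - 1
        resp = client.get_object(
            Bucket=bucket, Key=key, Range=f"bytes={len(data)}-{end}"
        )
        data += resp["Body"].read()
    return [line.decode("utf-8") for line in data.splitlines()[:n]]


def last_lines(client, bucket, key, n=2, chunk=4096):
    """Return the last n lines by reading backwards from the end."""
    size = client.head_object(Bucket=bucket, Key=key)["ContentLength"]
    data = b""
    start = size
    # n + 1 newlines guarantee that the last n lines are complete,
    # even when the file ends with a trailing newline.
    while start > 0 and data.count(b"\n") <= n:
        new_start = max(start - chunk, 0)
        resp = client.get_object(
            Bucket=bucket, Key=key, Range=f"bytes={new_start}-{start - 1}"
        )
        data = resp["Body"].read() + data
        start = new_start
    return [line.decode("utf-8") for line in data.splitlines()[-n:]]
```

With a real boto3 client this would be called as `first_lines(boto3.client("s3"), bucket, key)` and `last_lines(boto3.client("s3"), bucket, key)`, and each returned line passed to `self.validate`. Lines are split on raw bytes before decoding, so a multi-byte character cut off at a chunk boundary can only land in a partial line that is discarded.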
