I am writing to a temporary file by downloading the file from S3. When I open the downloaded file (called 3) in my text editor, I can see all the lines of text. But my code returns nothing when I try to read the file line by line.

After running the code, the temporary file is created in the directory of the Python script and doesn't disappear.

import tempfile
import os

import boto3

s3 = boto3.client('s3')

with tempfile.TemporaryFile() as tf:
  try:
    s3.download_file(
      Bucket='the-chumiest-bucket',
      Key='path/to/the/file.txt',
      Filename=str(tf.name)
    )
  except Exception as e:
    print('error:', e)

  tf.flush()
  tf.seek(0, os.SEEK_END)

  for line in tf.readlines():
    print('line:', line)

If I run

with open('3', 'r') as f:
  for line in f.readlines():
    print(line)

I get the lines, so this could be a workaround, but I've seen many people read lines from a tempfile using this exact method.

Expected Result:

I get the lines within file.txt printed.

Actual Result:

I get nothing printed.

Edit #1

Changed tf.seek(0, os.SEEK_END) to tf.seek(0, os.SEEK_SET) (thanks @Barmar) and still no lines being printed. Just one blank line.

ChumiestBucket
  • You're using `SEEK_END` to seek to the end of the file. There's no data to read after that point. – Daniel Pryden May 14 '19 at 17:00
  • Closely related: https://stackoverflow.com/questions/11696472/seek-function – Daniel Pryden May 14 '19 at 17:03
  • Maybe instead of downloading to a file, you should just download directly to an object. See https://stackoverflow.com/questions/37087203/retrieve-s3-file-as-object-instead-of-downloading-to-absolute-system-path – Barmar May 14 '19 at 17:09
  • Try using `tempfile.NamedTemporaryFile()` because what `tempfile.TemporaryFile()` returns isn't a real file (only a "file-like" object). – martineau May 14 '19 at 17:16

1 Answer

You're seeking to the end of the file. There's nothing more to read when you're at the end. You should seek to the beginning.

tf.seek(0, os.SEEK_SET)
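
To see the difference between the two seek positions in isolation, here is a minimal sketch that needs no S3 access. It writes two made-up lines through the stream itself (unlike the question's code, which writes by file name), so it only illustrates the seek behaviour:

```python
import os
import tempfile

# Text mode ("w+") so that readlines() returns str rather than bytes.
with tempfile.TemporaryFile(mode="w+") as tf:
    tf.write("first\nsecond\n")

    # Positioned at the end: nothing is left to read.
    tf.seek(0, os.SEEK_END)
    at_end = tf.readlines()

    # Positioned at the start: every line is available again.
    tf.seek(0, os.SEEK_SET)
    from_start = tf.readlines()

print("after SEEK_END:", at_end)
print("after SEEK_SET:", from_start)
```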

I suspect the other problem is that you're updating the file outside of the tf stream. It's not going back to the filesystem to read the file contents. tf.flush() flushes the output buffer, but that doesn't do anything since you haven't written to the stream.

Instead of seeking in the tf stream, reopen the file:

with open(tf.name) as tf1:
  for line in tf1.readlines():
    print('line:', line)

Note that you should be using tempfile.NamedTemporaryFile to get a file that's named. And reopening the file only works on Unix, not Windows. You might want to use tempfile.mkstemp() instead, as I don't think it has that OS-dependency.
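
A rough sketch of the mkstemp() route, under some assumptions: fake_download is a hypothetical stand-in for s3.download_file(..., Filename=path) so the example runs without AWS credentials, and the file contents are made up. mkstemp() returns an open file descriptor plus a real path, and leaves deletion to the caller:

```python
import os
import tempfile


def fake_download(filename):
    """Hypothetical stand-in for s3.download_file(..., Filename=filename)."""
    with open(filename, "w") as out:
        out.write("alpha\nbeta\n")


# mkstemp() creates the file and hands back (descriptor, path).
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)  # only the path is needed; the downloader opens the file itself

try:
    fake_download(path)
    # Open the path fresh, so the read goes to the filesystem.
    with open(path) as f:
        lines = f.readlines()
finally:
    os.remove(path)  # mkstemp() never deletes the file for you

for line in lines:
    print("line:", line)
```

Closing the descriptor before the download is a deliberate choice here: it avoids holding a second open handle on the file while something else writes to it.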

Barmar
  • not sure if I can do this in my Serverless environment, I'll try it now. thanks. – ChumiestBucket May 14 '19 at 17:19
  • If `tempfile.NamedTemporaryFile()` is used, it _can_ be reopened if `delete=False` is used to create it… but of course that would also imply that it would have to be explicitly deleted when it's no longer needed. – martineau May 14 '19 at 17:25
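
The delete=False suggestion in the last comment might look roughly like the sketch below. As an assumption for illustration, fake_download is a hypothetical placeholder for the real s3.download_file call, and the bytes written are invented. As far as I know, on POSIX systems the name attribute of a plain TemporaryFile() is usually an integer file descriptor, which would explain the stray file called 3 in the question; NamedTemporaryFile gives a real path instead.

```python
import os
import tempfile


def fake_download(filename):
    # Hypothetical placeholder for:
    # s3.download_file(Bucket=..., Key=..., Filename=filename)
    with open(filename, "wb") as out:
        out.write(b"one\ntwo\n")


# delete=False keeps the file on disk after our handle is closed.
tf = tempfile.NamedTemporaryFile(delete=False)
tf.close()  # release our handle so another writer can open the path

try:
    fake_download(tf.name)
    # Reopen by name to read what the downloader wrote.
    with open(tf.name, "rb") as f:
        named_lines = f.readlines()
finally:
    os.unlink(tf.name)  # explicit cleanup, since delete=False disables it

for line in named_lines:
    print("line:", line)
```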