
I am trying to use a third-party utility that processes data from an iterable object, e.g. a queue or a file. I need to push a bunch of AWS S3 files through this utility. Each one is a text file containing JSON messages, one complete message per line.

One approach would be to make a local copy of each file using key.get_contents_to_filename(), then open the local file for reading and pass the file object to the utility, then delete the local copy when done. But I am trying to avoid downloading files locally and prefer instead to read directly from S3. Is it possible to create an iterable object from an S3 key directly?

I Z
You want to retrieve a text file from S3 and put the *bytes* in an iterable container for processing, without saving it to local storage? And you have multiple files to process this way? Here is a similar question with no answer yet: http://stackoverflow.com/q/29086699/2823755 – wwii Mar 18 '15 at 18:48

1 Answer


There is a key.get_contents_as_string() method that returns the object's contents in memory. You could probably wrap the result in io.BytesIO or io.StringIO, both of which yield one line per iteration:

>>> import io
>>> bt = io.BytesIO(b'abc\ndef\nghi')    # bytes in, bytes lines out
>>> st = io.StringIO(u'abc\ndef\nghi')   # text in, text lines out
>>> for thing in st:
        print(thing)


abc

def

ghi
>>> for thing in bt:
        print(thing.decode('ascii'))


abc

def

ghi
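Applied to the files in the question, the same idea might look like the sketch below. It assumes the bytes come from key.get_contents_as_string() and are UTF-8 encoded; the helper name iter_json_lines and the sample data are made up for illustration, and the sample stands in for a real S3 download.

```python
import io
import json


def iter_json_lines(raw_bytes):
    """Yield one parsed JSON message per non-empty line of raw_bytes."""
    # BytesIO makes the in-memory bytes behave like a file opened in
    # binary mode, so iterating over it yields one line at a time.
    for line in io.BytesIO(raw_bytes):
        line = line.strip()
        if line:
            # Assumption: the S3 objects are UTF-8 encoded text.
            yield json.loads(line.decode("utf-8"))


# Stand-in for raw = key.get_contents_as_string() (not called here).
sample = b'{"id": 1}\n{"id": 2}\n'
messages = list(iter_json_lines(sample))
```

If the utility only needs an iterable of raw lines, passing io.BytesIO(raw) to it directly should be enough. Note that either way the whole object is held in memory, so this suits files that fit comfortably in RAM.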
wwii