
So far, to get data from the bucket, I use download_to_file() to download the files onto the instance and then access the files/folders locally. What I want instead is to read directly from the cloud. How can I go about doing that? There doesn't seem to be a way for me to create a relative path between the ML Engine job instance and the Google Cloud bucket.
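For context, a minimal sketch of the download-based approach described above, assuming the google-cloud-storage client; the bucket and object names are hypothetical:

```python
from google.cloud import storage

client = storage.Client()
bucket = client.get_bucket('my-bucket')   # hypothetical bucket name
blob = bucket.blob('data/train.csv')      # hypothetical object path

# Copy the object to local disk, then read it like any local file.
with open('/tmp/train.csv', 'wb') as f:
    blob.download_to_file(f)
```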

– vr9494
1 Answer


You can use TensorFlow's file_io.FileIO class to create file-like objects that read and write files on GCS, the local file system, or any other supported file system.
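A minimal sketch, assuming TensorFlow 1.x and a hypothetical GCS path:

```python
from tensorflow.python.lib.io import file_io

# FileIO dispatches on the path scheme, so gs:// paths are read
# directly from GCS with no local copy.
with file_io.FileIO('gs://my-bucket/data/train.csv', mode='r') as f:
    contents = f.read()

print(contents[:100])  # first 100 characters of the file
```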

See this post for some examples.

– rhaertel80
  • I've been exploring Google Storage's blob documentation: https://googlecloudplatform.github.io/google-cloud-python/latest/storage-blobs.html . When you attach a file from GCS to a 'blob', is it readable as that file by the system? – vr9494 Jun 11 '17 at 06:50
  • If you use that class, you'll have to use one of the methods on the class, e.g. `download_as_string` or `download_to_file`. While downloading the file may make sense in some cases, generally you'll use TensorFlow readers (https://www.tensorflow.org/programmers_guide/reading_data#reading_from_files) in which case you're just providing GCS paths to the readers. – rhaertel80 Jun 12 '17 at 14:58
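
As the last comment suggests, the usual pattern is to hand GCS paths straight to TensorFlow readers. A minimal sketch, assuming TensorFlow 1.x queue-based readers from the linked guide and a hypothetical GCS path:

```python
import tensorflow as tf

# Queue of input files; GCS paths work anywhere local paths do.
filename_queue = tf.train.string_input_producer(
    ['gs://my-bucket/data/train.csv'])  # hypothetical GCS path

reader = tf.TextLineReader()
key, value = reader.read(filename_queue)  # one line per read

with tf.Session() as sess:
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    print(sess.run(value))  # first line of the CSV
    coord.request_stop()
    coord.join(threads)
```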