14

After installing and configuring Google Cloud SDK gsutil command can be run by simply typing its name and the argument(-s) using Windows cmd.

Here is the example:

"C:\Program Files (x86)\Google\Cloud SDK\google-cloud-sdk\bin\gcloud" version

enter image description here

But the same command fails if run using Python subprocess. With subprocess's shell argument set to True the ImportError occurs:

import subprocess

cmd = '"C:/Program Files (x86)/Google/Cloud SDK/google-cloud-sdk/bin/gsutil" version'

p = subprocess.Popen(cmd, shell=True)

.....

ImportError: No module named site

With subprocess's shell argument set to False then the WindowsError: [Error 2] The system cannot find the file specified occurs:

p = subprocess.Popen(cmd, shell=False)

Is there a way to run gsutil on Windows using Python?

Bonifacio2
  • 3,405
  • 6
  • 34
  • 54
alphanumeric
  • 17,967
  • 64
  • 244
  • 392

2 Answers2

12

Note that the proper and official way to interact with Google Cloud Storage is to make use of the Google Cloud Client Library for Python and not running the gsutil command through subprocess.Popen. If you are not setting up merely some tests I would suggest you to follow from the beginning this way if there is not any technological constrain that makes this way impracticable.

You can check at the following links the relative Overview and Documentation. A small example taken from the Documentation can be the following:

from google.cloud import storage

client = storage.Client()
bucket = client.get_bucket('<your-bucket-name>')
blob = bucket.blob('my-test-file.txt')
blob.upload_from_string('this is test content!')

You can find a further example here using google-cloud-python with the Datastore and Cloud Storage to manage expenses.

Gaurang Tandon
  • 6,504
  • 11
  • 47
  • 84
GalloCedrone
  • 4,869
  • 3
  • 25
  • 41
  • 15
    The python API does not allow use of the `-m` option for parallelism, as far as I know. So there are reasons for using `subprocess` and the `gsutil` command. – HoosierDaddy Sep 09 '19 at 22:41
  • @UricSou: You can share client instances across threads because the storage client uses the `requests` library. Just create client instances after `multiprocessing.Pool`. – tfad334 Nov 08 '19 at 17:28
  • 5
    Also, the python API is dead slow comapred to the commandline – CpILL Apr 24 '20 at 02:44
  • @GalloCedrone how to upload folder using above approach? – MAC Mar 25 '22 at 08:29
  • 2
    I highly recommend using gsutil and not the native python library for anything more than downloading one or two small files. gsutil's performance is 100x better, so the original direction of the question was good (invoking it from python) – orcaman Apr 27 '22 at 08:38
  • I also see benefit of using the gsutil command line tool for that. As it has really easy to use interace - allowing to used things like rm -r, cp -r (which you can't easily replicate using the other methods - you would need to iterate over all the objects) – Miroslav Karpíšek Oct 20 '22 at 12:21
1

Use shutil.which to get the full path to gsutil in a cross-platform manner:

import shutil

# If gsutil is in PATH:
path = shutil.which('gsutil')

# Or if gsutil isn't in PATH but you know where it is:
path = shutil.which('gsutil', path="C:/Program Files (x86)/Google/Cloud SDK/google-cloud-sdk/bin")

# Then you can use that path to run it.
import subprocess
cmd = [path, "version"]
p = subprocess.Popen(cmd)
Jon-Eric
  • 16,977
  • 9
  • 65
  • 97