I am trying to run a Python script from a shell script at a scheduled time. The shell script runs some commands to generate files inside a directory, and the Python script uploads all the files in that directory to a Google Cloud Storage bucket.

I automated this shell script using crontab.

The Python script gsutil.py looks something like this:

from google.cloud import storage
import os

def upload(storage_client, file_path, blob_name):
    bucket = storage_client.get_bucket('some-bucket')
    blob = bucket.blob(blob_name)
    blob.upload_from_filename(file_path)

def walker():

    rootPath = '/opt/xxx/yyy/zzz'
    storageClient = storage.Client()

    # Only regular files directly under rootPath; subdirectories are skipped.
    onlyfiles = [f for f in os.listdir(rootPath) if os.path.isfile(os.path.join(rootPath, f))]
    for file_ in onlyfiles:
        filePath = os.path.join(rootPath, file_)
        blobName = filePath.replace(rootPath, "")
        print('file - {0} and blob - {1}'.format(filePath, blobName))
        upload(storageClient, filePath, blobName)

def main():
    walker()

if __name__ == "__main__":
    main()

The problem I am facing is that when I run my shell script with bash run_script.sh, it works fine. But when I run it from crontab, it gives me this error:

Traceback (most recent call last):
  File "/opt/xxx/yyy/gs_py/gsutil.py", line 28, in <module>
    main()
  File "/opt/xxx/yyy/gs_py/gsutil.py", line 25, in main
    walker()
  File "/opt/xxx/yyy/gs_py/gsutil.py", line 21, in walker
    upload(storageClient, filePath, blobName)
  File "/opt/xxx/yyy/gs_py/gsutil.py", line 7, in upload
    blob.upload_from_filename(file_path)
  File "/root/.local/share/virtualenvs/gs_py-qE0GGV3s/local/lib/python2.7/site-packages/google/cloud/storage/blob.py", line 1136, in upload_from_filename
    predefined_acl=predefined_acl,
  File "/root/.local/share/virtualenvs/gs_py-qE0GGV3s/local/lib/python2.7/site-packages/google/cloud/storage/blob.py", line 1085, in upload_from_file
    _raise_from_invalid_response(exc)
  File "/root/.local/share/virtualenvs/gs_py-qE0GGV3s/local/lib/python2.7/site-packages/google/cloud/storage/blob.py", line 1956, in _raise_from_invalid_response
    raise exceptions.from_http_status(response.status_code, message, response=response)
google.api_core.exceptions.Forbidden: 403 POST https://www.googleapis.com/upload/storage/v1/b/tercept-carousell/o?uploadType=multipart: (u'Request failed with status code', 403, u'Expected one of', 200)

Am I missing something here? I am logged in as the root user and put this entry in the crontab:

0 12 * * * bash /opt/xxx/yyy/run_script.sh >> /opt/xxx/yyy/script.log 2>&1 &

The following command reproduces the same error:

sudo bash run_script.sh

I am not really sure what is happening or why it is not working. When I run it with plain bash it works, but when I add it to crontab or run it with sudo it does not. Can anyone help me?

  • Are you using environment variables to store your credentials? – SuperShoot Jan 31 '19 at 11:39
  • Yes, GOOGLE_APPLICATION_CREDENTIALS. However, I made the script work by adding os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = '/opt/service-account/credential.json'. Can you explain the reason for this? – jst-raj Jan 31 '19 at 11:47
  • The problem I was facing was that GOOGLE_APPLICATION_CREDENTIALS was already set, but I don't know why my script wasn't uploading. – jst-raj Jan 31 '19 at 11:56
  • Crontab can't see the environment variables that you have set in your `.profile`, `.bashrc`, etc. When you set the variable in your source module (`os.environ[...] = '...'`), that makes the variable available to the process, but hard-coding it defeats the purpose of using environment variables in the first place. Take a look at some of the answers to [this question](https://stackoverflow.com/questions/2229825/where-can-i-set-environment-variables-that-crontab-will-use). – SuperShoot Jan 31 '19 at 12:14
  • Thank you so much for explaining this. – jst-raj Jan 31 '19 at 14:10
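
To illustrate the point made in the comments, here is a minimal sketch, assuming the service-account key is located through GOOGLE_APPLICATION_CREDENTIALS, of how the script could check for the variable up front rather than failing later with a 403. The helper name resolve_credentials_path and its fallback parameter are hypothetical and not part of google-cloud-storage; only the standard library is used.

```python
import os

ENV_VAR = "GOOGLE_APPLICATION_CREDENTIALS"


def resolve_credentials_path(environ=None, fallback=None):
    """Return the service-account key path or raise a descriptive error.

    `fallback` is a hypothetical, deployment-specific default that is used
    only when the variable is missing from the process environment, as can
    happen when the script is started by cron or through sudo.
    """
    environ = os.environ if environ is None else environ
    path = environ.get(ENV_VAR) or fallback
    if not path:
        raise RuntimeError(
            "{0} is not set in this process environment".format(ENV_VAR)
        )
    if not os.path.isfile(path):
        raise RuntimeError("credentials file not found: {0}".format(path))
    return path


# Usage sketch (assumes google-cloud-storage is installed):
#   from google.cloud import storage
#   key_path = resolve_credentials_path()
#   client = storage.Client.from_service_account_json(key_path)
```

Failing early like this makes the cron log show the missing variable directly. Alternatively, the variable can be exported inside run_script.sh or, with most cron implementations, defined on its own line at the top of the crontab, which keeps the path out of the Python source.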
