
I have a Python script that reads a CSV file, creates a table in BigQuery based on the data in the CSV, and loads the data into that table at runtime. I want to trigger this script whenever an event occurs, such as a file arriving in a specified bucket. The script lives on a VM instance (until now I have run it from the VM manually). Is there any way to execute this script based on an event trigger?

Shikha

1 Answer


A feature of Google Cloud Storage called Object Change Notifications could come in handy here. It pushes a notification to a webhook, where your processing code can handle the event (an example running this on App Engine is given here, but you could also implement it as a Flask endpoint, for example).
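As a rough sketch of the webhook side (using only the Python standard library rather than Flask): the `X-Goog-Resource-State` header and its `sync`/`exists`/`not_exists` values are part of the Object Change Notification format; everything else, including the port and where the load call goes, is a placeholder for your own script.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def handle_notification(resource_state, payload):
    """Map an Object Change Notification to an action string."""
    if resource_state == "sync":
        return "sync"  # initial handshake when the channel is created; nothing to load
    if resource_state == "not_exists":
        return "deleted:" + payload.get("name", "")
    # "exists" covers both new objects and metadata updates
    return "load:" + payload.get("bucket", "") + "/" + payload.get("name", "")

class NotificationHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        state = self.headers.get("X-Goog-Resource-State", "")
        length = int(self.headers.get("Content-Length") or 0)
        payload = json.loads(self.rfile.read(length)) if length else {}
        action = handle_notification(state, payload)
        if action.startswith("load:"):
            pass  # call your CSV-to-BigQuery load logic here
        self.send_response(200)  # GCS retries the notification on non-2xx responses
        self.end_headers()

def serve(port=8080):
    """Start the webhook server (blocking)."""
    HTTPServer(("", port), NotificationHandler).serve_forever()
```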

Another option is to use Pub/Sub notifications in combination with Cloud Storage, in case you are more comfortable with that. You can then use the Pub/Sub Python SDK to listen to the specific topic your events arrive on, and reuse the code you already wrote to handle those events.
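A minimal sketch of that Pub/Sub listener: the `eventType`/`bucketId`/`objectId` attributes follow the documented Cloud Storage notification format, but the project and subscription names are placeholders, and the `listen` helper assumes the `google-cloud-pubsub` package and a subscription already wired to the bucket's notification topic.

```python
import json

def extract_event(attributes, data):
    """Pull bucket/object info out of a Cloud Storage Pub/Sub notification."""
    event = {
        "type": attributes.get("eventType"),      # e.g. OBJECT_FINALIZE for new files
        "bucket": attributes.get("bucketId"),
        "object": attributes.get("objectId"),
    }
    if data:
        # with JSON payloads, the message body carries the object metadata
        event["metadata"] = json.loads(data.decode("utf-8"))
    return event

def callback(message):
    event = extract_event(message.attributes, message.data)
    if event["type"] == "OBJECT_FINALIZE":  # a new file landed in the bucket
        print("load", event["bucket"], event["object"])  # call your load script here
    message.ack()

def listen(project_id, subscription_id):
    """Blocking listener; requires the google-cloud-pubsub package."""
    from google.cloud import pubsub_v1
    subscriber = pubsub_v1.SubscriberClient()
    path = subscriber.subscription_path(project_id, subscription_id)
    subscriber.subscribe(path, callback=callback).result()
```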

Matthias Baetens
  • Thanks a lot for providing the documentation links. Going through them; hopefully they will be helpful. – Shikha Sep 12 '17 at 11:39
  • Hi. Is there any way to directly call a Python script from within an event trigger? My index.js has the following code in the event trigger: exports.helloGCS = function (event, callback) { const file = event.data; if (file.resourceState === 'not_exists') { console.log(`File ${file.name} deleted.`); } else if (file.metageneration === '1') { console.log(`File ${file.name} uploaded.`); } else { console.log(`File ${file.name} metadata updated.`); } callback(); }; Can I call a Python script that is placed on a VM instance in any way from this file alone? – Shikha Sep 12 '17 at 13:03
  • Can you give some more details on what you are trying to implement? Is the code above a Google Cloud Function you are deploying to handle the events? Are you using Object Change Notifications or Pub/Sub? – Matthias Baetens Sep 13 '17 at 09:02
  • Trying to call the Python script from the Cloud Function code, following this link: https://github.com/extrabacon/python-shell. I wrote the code below in index.js: var PythonShell = require('python-shell'); PythonShell.run('my_script.py', function (err) { if (err) throw err; console.log('finished'); }); but I am getting the error: Deployment failure: Function load error: Code in file index.js can't be loaded. Did you list all required modules in the package.json dependencies? Detailed stack trace: Error: Cannot find module 'python-shell' Can you please help. – Shikha Sep 13 '17 at 09:24
  • Hi Shikha. Unfortunately that will be impossible. The Cloud Function is a one-off stand-alone unit that performs the code you've written; completely serverless. It has no way to access the Python script you specify. You can either implement the logic you have in Python in the Cloud Function (in NodeJS) or you can develop a Python server that will handle the events sent to a certain endpoint in the cloud (specified when deploying the Object Change Notification). – Matthias Baetens Sep 14 '17 at 08:25
  • Hi Matthias, I was able to run the Python script (which is present on Cloud Shell) in a Cloud Function. However, I don't think that is a standard way to call the script. I think a Python Flask web service needs to be set up, and the Cloud Function should call that web service. Could you please point me to any link/doc that can help me set that up? I am a beginner on Google Cloud; your help will be much appreciated. – Shikha Sep 15 '17 at 12:20
  • Hi Shikha. If you want to go down the road of a Flask server, I would suggest looking at Cloud Pub/Sub Notifications (https://cloud.google.com/storage/docs/pubsub-notifications) instead of Cloud Functions, as stated above. These are able to send a POST request to a certain endpoint (which would in this case be your Flask server), and the data in the Pub/Sub message can then be handled there. Using Cloud Functions if you go down this road feels a bit odd. – Matthias Baetens Sep 17 '17 at 11:20
  • Hi Matthias, I was able to set up a Cloud Function that is executed on every change in the storage bucket, but I'm not sure how to send a message to a Pub/Sub topic. I tried the Cloud Pub/Sub notification option from the link above, but can't find where to attach notification rules to a bucket. Would you please help with a sample notification configuration? I managed to set up an endpoint URL via gcloud app deploy, but I'm not sure how to push a message to it using Cloud notifications. Your help will be much appreciated. – Shikha Sep 22 '17 at 18:47
  • Hi Matthias, I am trying to set up the Object Change Notification process, but am getting an issue when validating ownership of the domain. I have posted a question at https://stackoverflow.com/questions/46398939/issue-in-domain-verification-on-google-cloud-platform . Could you please check and help if possible? – Shikha Sep 25 '17 at 07:35
  • Could you describe what state you're currently in and where you want to end up ideally? – Matthias Baetens Oct 02 '17 at 12:58
  • Hi Matthias, my requirement is that whenever a new file lands in the bucket, the Datahub load script (written in Python) should pick up the file and process it according to the code written in the script (it loads the data into BigQuery). I am trying to set up an object change notification for this as you suggested, but am stuck on issues. I am currently getting the issue I mentioned in https://stackoverflow.com/questions/46444791/error-on-command-gsutil-notification-watchbucket/46482590#46482590 and have no clue how to resolve it. Any help will be much appreciated. – Shikha Oct 03 '17 at 07:06
  • By Datahub, I presume you mean Dataflow? If so, you might want to take a look at the last pattern in this [blogpost](https://cloud.google.com/blog/big-data/2017/06/guide-to-common-cloud-dataflow-use-case-patterns-part-1); I think it will help you set up a best-practice architecture for the problem at hand. – Matthias Baetens Oct 04 '17 at 13:04
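For the notification-configuration step the comments get stuck on: the newer Pub/Sub notifications do not require the domain verification that the Object Change Notification / watchbucket flow does, and the configuration can be attached to a bucket programmatically. A rough sketch, assuming the google-cloud-storage package; the project, bucket, and topic names are placeholders, and the topic must already exist and grant the bucket's service agent publish rights.

```python
def notification_config(topic_name, event_types=("OBJECT_FINALIZE",)):
    """Build the keyword arguments for Bucket.notification().

    OBJECT_FINALIZE fires only when a new object is created, which matches
    the "file arrives in bucket" trigger from the question.
    """
    return {
        "topic_name": topic_name,
        "event_types": list(event_types),
        "payload_format": "JSON_API_V1",  # include object metadata as JSON
    }

def create_notification(project_id, bucket_name, topic_name):
    """Attach the notification config to the bucket; requires google-cloud-storage."""
    from google.cloud import storage
    client = storage.Client(project=project_id)
    bucket = client.bucket(bucket_name)
    bucket.notification(**notification_config(topic_name)).create()
```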