
I have Python code that reads data from an API and creates a JSON file (it's not just a simple read; there are some transformations as well).

I need to get the data into GCP (specifically Cloud Storage), and it needs to run once every 24 hours. A Google Cloud Function seems like the ideal solution, but it has an execution time limit of 9 minutes, so the code doesn't work there.
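For context, the job itself is roughly the following shape; this is only a minimal sketch, and the API URL, bucket name, object path, and helper names are placeholders rather than my actual code:

```python
import json

import requests  # pip install requests
from google.cloud import storage  # pip install google-cloud-storage


def fetch_and_transform(api_url: str) -> dict:
    """Read from the API and apply the transformations (placeholder)."""
    response = requests.get(api_url, timeout=60)
    response.raise_for_status()
    data = response.json()
    # ... transformations on `data` go here ...
    return data


def upload_json_to_gcs(bucket_name: str, blob_name: str, payload: dict) -> None:
    """Write the transformed payload to Cloud Storage as a JSON object."""
    client = storage.Client()
    blob = client.bucket(bucket_name).blob(blob_name)
    blob.upload_from_string(json.dumps(payload), content_type="application/json")
```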

What other options do I have in GCP? Dataflow? Can I use my standard Python code in the Beam framework? Data Fusion? I doubt it.

Any other suggestions?

P.S. This is my first question on Stack Overflow, so please let me know if the format of my question is incorrect or if it can be improved upon.

  • Have you checked [Cloud functions Gen 2](https://cloud.google.com/functions/docs/2nd-gen/overview) that can run up to 60 minutes? How long does your code take to complete the execution? – Dharmaraj Apr 18 '22 at 04:34
  • How long of a quota do you need? I would use either Cloud Run or Compute Engine for CPU processing longer than 9 minutes. Also, consider Cloud Functions 2nd generation if GA is not required. – John Hanley Apr 18 '22 at 04:35
  • Hi, I forgot to mention that the code takes around 30 minutes to run. – K_python2022 Apr 18 '22 at 07:11
  • I think you might have a few other options as well. First, can you run the process more frequently? That may decrease the duration of each Cloud Function invocation and bring it under the 540-second threshold. Second, can you divide your Cloud Function processing into steps, with each step taking no longer than 540 seconds? An example of that approach is described here: https://stackoverflow.com/questions/66050709/google-cloud-platform-solution-for-serverless-log-ingestion-files-downloading – al-dann Apr 18 '22 at 10:08
  • I wrote that, it could help: https://medium.com/google-cloud/long-running-job-with-cloud-workflows-38b57bea74a5 – guillaume blaquiere Apr 18 '22 at 19:13
  • Thanks all, Cloud Functions 2nd gen worked. – K_python2022 May 16 '22 at 13:29
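Update, based on the resolution in the comments: Cloud Functions 2nd gen, with its longer timeout for HTTP functions, handles the roughly 30-minute run. A minimal sketch of what the 2nd gen entry point could look like is below; the module and function names refer to the illustrative sketch above and are not my actual code:

```python
import functions_framework  # pip install functions-framework

# Illustrative import of the fetch/transform/upload helpers sketched above.
from daily_job import fetch_and_transform, upload_json_to_gcs


@functions_framework.http
def run_daily_export(request):
    """HTTP entry point for the 2nd gen Cloud Function.

    A Cloud Scheduler job (e.g. schedule "0 2 * * *") calling this
    function's HTTP endpoint provides the once-every-24-hours trigger.
    """
    data = fetch_and_transform("https://api.example.com/data")  # placeholder URL
    upload_json_to_gcs("my-bucket", "exports/daily.json", data)  # placeholder names
    return "ok", 200
```

The function then needs to be deployed as 2nd gen with a timeout longer than the ~30-minute runtime.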

0 Answers