
I successfully trained and deployed a pipeline in Vertex AI using Kubeflow for a retrieval model.

Now I want to schedule this pipeline run every 8 minutes. Here's my code:

from kfp.v2.google.client import AIPlatformClient

api_client = AIPlatformClient(project_id='my-project', region='us-central1')

api_client.create_schedule_from_job_spec(
    job_spec_path='vacantes_pipeline.json',
    schedule="*/8 * * * *",  # every 8 minutes
    time_zone='America/Sao_Paulo',
    parameter_values={
        "epochs_": 5,
        "embed_length": 768,
        "maxsplit_": 130
    }
)
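As a side note on the schedule itself, here is a minimal sketch, assuming the intended cron expression is `*/8 * * * *`. It lists the minutes of each hour that a step of 8 in the minute field matches. The helper name `minutes_matching_step` is made up for illustration and is not part of any library.

```python
from typing import List


def minutes_matching_step(step: int) -> List[int]:
    """Minutes of the hour matched by a cron minute field of the form */step."""
    return list(range(0, 60, step))


fire_minutes = minutes_matching_step(8)
print(fire_minutes)  # [0, 8, 16, 24, 32, 40, 48, 56]

# Gap in minutes between consecutive runs, including the wrap from :56 to :00.
gaps = [(b - a) % 60 for a, b in zip(fire_minutes, fire_minutes[1:] + fire_minutes[:1])]
print(gaps)  # [8, 8, 8, 8, 8, 8, 8, 4]
```

Because 60 is not a multiple of 8, the last run of each hour is followed by the next one only 4 minutes later.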

The JSON is successfully created, but the Cloud Scheduler job fails immediately.

The logs show that the httpRequest failed with a 503 error, along with this payload:

jsonPayload: {
@type: "type.googleapis.com/google.cloud.scheduler.logging.AttemptFinished"
jobName: "projects/my-project/locations/us-central1/jobs/pipeline_vacantes-pipeline-with-deployment_c7e98a8f_59-14-a-a-a"
status: "UNAVAILABLE"
targetType: "HTTP"
url: "https://us-central1-bogotatrabaja.cloudfunctions.net/templated_http_request-v1"
}

Any ideas on how to solve this issue?

razimbres
  • This doesn't answer your question directly, so I'm not posting it as an answer, but you can schedule pipeline runs without using the built-in scheduler feature. Write a Cloud Function that defines and submits your pipeline, have Cloud Scheduler publish a message to a Pub/Sub topic at a fixed frequency, and set that topic as the function's trigger. This Cloud Scheduler -> Pub/Sub -> Cloud Function method is how I schedule all of my Vertex AI pipeline runs. I recommend it because these services are well documented and maintained. – AndrewJaeyoung Jun 27 '23 at 16:25
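The Cloud Scheduler -> Pub/Sub -> Cloud Function approach from the comment could look roughly like the sketch below. It assumes a Pub/Sub-triggered (1st gen) Cloud Function, the `google-cloud-aiplatform` SDK, and a message body that is either empty or a JSON object of parameter overrides. The bucket paths, display name, and function names are hypothetical. The parameter names are taken from the question.

```python
import base64
import json

# Default pipeline parameters, taken from the question's code.
DEFAULT_PARAMETERS = {"epochs_": 5, "embed_length": 768, "maxsplit_": 130}


def parse_parameters(event: dict) -> dict:
    """Merge optional JSON overrides from the Pub/Sub message into the defaults."""
    params = dict(DEFAULT_PARAMETERS)
    data = event.get("data")  # Pub/Sub delivers the message body base64-encoded
    if data:
        params.update(json.loads(base64.b64decode(data).decode("utf-8")))
    return params


def trigger_pipeline(event: dict, context) -> None:
    """Entry point: submit one Vertex AI pipeline run per Pub/Sub message."""
    # Imported lazily so parse_parameters can be tested without the SDK installed.
    from google.cloud import aiplatform

    aiplatform.init(project="my-project", location="us-central1")
    job = aiplatform.PipelineJob(
        display_name="vacantes-pipeline",  # hypothetical name
        template_path="gs://my-bucket/vacantes_pipeline.json",  # assumed location
        pipeline_root="gs://my-bucket/pipeline-root",  # assumed location
        parameter_values=parse_parameters(event),
    )
    job.submit()  # returns once the run is created, without waiting for it to finish
```

A Cloud Scheduler job that publishes to the topic with the `*/8 * * * *` schedule would then start one run per message. The function's service account needs permission to create pipeline jobs and to read the compiled spec.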

1 Answer

The most common cause of a 503 error is that the server is down for maintenance or overloaded.

Check your logs for more detail about the error. They can help with troubleshooting by showing what is running on the server and its health and status.
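One way to pull those logs programmatically is sketched below, assuming the `google-cloud-logging` client library and that the failing target is a 1st gen Cloud Function (the logged URL points at a function named `templated_http_request-v1`). The helper names are hypothetical, and the filter may need adjusting for your setup.

```python
def build_function_log_filter(function_name: str, min_severity: str = "ERROR") -> str:
    """Build a Cloud Logging filter for one Cloud Function's log entries."""
    return (
        'resource.type="cloud_function" '
        f'AND resource.labels.function_name="{function_name}" '
        f"AND severity>={min_severity}"
    )


def print_recent_errors(project_id: str, function_name: str, limit: int = 20) -> None:
    """Print the most recent error-level entries for the given function."""
    # Imported lazily so the filter helper can be used without the library installed.
    from google.cloud import logging as cloud_logging

    client = cloud_logging.Client(project=project_id)
    entries = client.list_entries(
        filter_=build_function_log_filter(function_name),
        order_by=cloud_logging.DESCENDING,  # newest entries first
        max_results=limit,
    )
    for entry in entries:
        print(entry.timestamp, entry.severity, entry.payload)
```

Calling `print_recent_errors("my-project", "templated_http_request-v1")` would show whether the function itself raised an error or was never reached.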

I also found an incident that affected deployments for Vertex AI Online Prediction, Vertex AI Batch Prediction, and Cloud Machine Learning. You probably encountered the 503 error if you are in one of these regions: Taiwan (asia-east1), Hong Kong (asia-east2), Tokyo (asia-northeast1), Seoul (asia-northeast3), Mumbai (asia-south1), Singapore (asia-southeast1), Sydney (australia-southeast1), Belgium (europe-west1), London (europe-west2), Frankfurt (europe-west3), Netherlands (europe-west4), Zurich (europe-west6), Montréal (northamerica-northeast1), Toronto (northamerica-northeast2), Iowa (us-central1), South Carolina (us-east1), Northern Virginia (us-east4), Oregon (us-west1), and Los Angeles (us-west2).

The issue was resolved on Thursday, 2023-05-11 18:59 US/Pacific. You may want to restart your server.

Poala Astrid