8

I am trying to submit a job to gcloud ml-engine. For reference the job is using this sample provided by Google

It went through the first time, but with errors unrelated to this question, and now I am trying to reissue the command after correcting them:

gcloud ml-engine jobs submit training $JOB_NAME \
    --stream-logs \
    --runtime-version 1.0 \
    --job-dir $GCS_JOB_DIR \
    --module-name trainer.task \
    --package-path trainer/ \
    --region us-east1 \
    -- \
    --train-files $TRAIN_GCS_FILE \
    --eval-files $EVAL_GCS_FILE \
    --train-steps $TRAIN_STEPS

Here $JOB_NAME is set to census. Unfortunately, it seems that I cannot resubmit the job unless I change $JOB_NAME to something new for every job, such as census2, then census3, and so on.

The following is the error I receive:

ERROR: (gcloud.ml-engine.jobs.submit.training) Project [my-project-name]
is the subject of a conflict: Field: job.job_id Error: A job with this
id already exists.

Is it by design that a job cannot be resubmitted under the same job name, or am I missing something?

slcott

2 Answers

3

As Chuck said, simply set JOB_NAME so that it includes a timestamp and is therefore different on every submission:

JOB_NAME="census_$(date +%Y%m%d_%H%M%S)"
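If you resubmit often, the same idea can be wrapped in a small function. This is only a sketch: `unique_job_name` is a hypothetical helper of my own, not something gcloud provides, and it assumes a POSIX shell with `date` available.

```shell
# Hypothetical helper (not part of gcloud): builds "<prefix>_<YYYYMMDD>_<HHMMSS>".
# The result uses only letters, digits and underscores.
unique_job_name() {
  printf '%s_%s\n' "$1" "$(date +%Y%m%d_%H%M%S)"
}

# Generate a fresh name right before each submission.
JOB_NAME="$(unique_job_name census)"
echo "Submitting as: $JOB_NAME"

# Then run the submit command from the question unchanged, passing "$JOB_NAME".
```

The timestamp only has one-second resolution, so two submissions started within the same second would still collide.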

Fuyang Liu
2

Not sure if this will help, but Google's sample code for flowers avoids the error by appending the date and time to the job ID, as shown on line 22, e.g.,

declare -r JOB_ID="flowers_${USER}_$(date +%Y%m%d_%H%M%S)"
Chuck Finley

To all pedants: don't try an ISO-format timestamp. It fails with `job_id Error: A name should start with a letter and contain only letters, numbers and underscores.` – cubuspl42 Mar 14 '19 at 08:09
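
To check a candidate ID locally before submitting, the rule quoted in that error message can be approximated with a regular expression. This is a sketch under assumptions: `is_valid_job_id` is a hypothetical helper, the pattern encodes only what the error message states, and the service may enforce further limits (such as length) that are not checked here.

```shell
# Hypothetical local check, based only on the quoted error message:
# the ID must start with a letter and contain only letters, digits, underscores.
is_valid_job_id() {
  printf '%s\n' "$1" | grep -Eq '^[A-Za-z][A-Za-z0-9_]*$'
}

iso_id="census_$(date +%Y-%m-%dT%H:%M:%S)"   # ISO-style: contains '-' and ':'
safe_id="census_$(date +%Y%m%d_%H%M%S)"      # digits and underscores only

for id in "$iso_id" "$safe_id"; do
  if is_valid_job_id "$id"; then
    echo "ok:       $id"
  else
    echo "rejected: $id"
  fi
done
```

The ISO-style ID is rejected because of the hyphens and colons, while the underscore-separated form passes.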