1

I get this error when I try to submit my training job.

ERROR: (gcloud.ml-engine.jobs.submit.training) Could not copy [dist/object_detection-0.1.tar.gz] to [packages/10a409168355064d603079b7c34cdd7010a13b181a8f7776751e9110d66a5bdf/object_detection-0.1.tar.gz]. Please retry: HTTPError 404: Not Found

I'm running the following code:

gcloud ml-engine jobs submit training ${train1} \
    --job-dir=gs://${object-detection-tutorial-bucket1/}/train \
    --packages dist/object_detection-0.1.tar.gz,slim/dist/slim-0.1.tar.gz \
    --module-name object_detection.train1 \
    --region us-central1 \
    --config object_detection/samples/cloud/cloud.yml \
    --runtime-version=1.4 \ 
    -- \
    --train_dir=gs://${object-detection-tutorial-bucket1/}/train \
    --pipeline_config_path=gs://${object-detection-tutorial- 
    bucket1/}/data/ssd_mobilenet_v1_coco.config  
Victoria Cabales
  • Did you check that `dist/object_detection-0.1.tar.gz` exists? – leo9r Aug 08 '18 at 22:35
  • Also consider this similar question: https://stackoverflow.com/questions/49932251/error-when-submitting-training-job-to-gcloud – leo9r Aug 08 '18 at 22:38
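Following up on the first comment, here is a minimal sketch of a pre-flight check that the package archives exist locally before the job is submitted. The helper name `check_packages` is hypothetical; the two paths are the ones passed to `--packages` in the question.

```shell
# Report any package archive that is missing.
# Returns non-zero if at least one file is absent.
check_packages() {
  missing=0
  for pkg in "$@"; do
    if [ ! -f "$pkg" ]; then
      echo "missing: $pkg" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Paths taken from the --packages flag in the question.
check_packages dist/object_detection-0.1.tar.gz slim/dist/slim-0.1.tar.gz \
  || echo "build the packages before submitting the job"
```

If both archives are present, the check prints nothing, so the problem most likely lies elsewhere in the command.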

3 Answers

0

It looks like the syntax you're using is incorrect.

If the name of your bucket is object-detection-tutorial-bucket1, then you specify that with:

--job-dir=gs://object-detection-tutorial-bucket1/train

or you can run:

export YOUR_GCS_BUCKET="gs://object-detection-tutorial-bucket1"

and then specify the bucket as:

--job-dir=${YOUR_GCS_BUCKET}/train

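The ${} syntax expands the value of a variable, but object-detection-tutorial-bucket1/ isn't a valid variable name. The shell reads ${object-detection-tutorial-bucket1/} as the variable object followed by a default value (everything after the first hyphen), so the result is not your bucket name. That would explain the 404: the job is pointed at a bucket path that doesn't exist.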
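To see what the shell does with that expression, here is a minimal sketch. It assumes a POSIX shell in which no variable named `object` is set; `BUCKET` is a hypothetical variable name chosen for illustration. Either way, the expansion is not the bucket name.

```shell
# Make sure no variable named "object" is set, as in a fresh shell.
unset object

# The shell reads this as ${object-WORD}: "use WORD if object is unset".
# WORD here is "detection-tutorial-bucket1/", so the bucket name is mangled.
broken="gs://${object-detection-tutorial-bucket1/}/train"
echo "$broken"   # prints gs://detection-tutorial-bucket1//train

# Store the bucket name in a properly named variable instead.
BUCKET="object-detection-tutorial-bucket1"
fixed="gs://${BUCKET}/train"
echo "$fixed"    # prints gs://object-detection-tutorial-bucket1/train
```

Echoing the flag values like this before running gcloud is a cheap way to confirm that each path expands to what you expect.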

Sources:

https://cloud.google.com/blog/big-data/2017/06/training-an-object-detector-using-cloud-machine-learning-engine

Difference between ${} and $() in Bash

0

Just remove the ${ } from the script. Assuming your bucket is named object-detection-tutorial-bucket1 and your job is named train1 (the name inside ${train1} in the question), run the script below:

# Replace train1 with your own job name if it differs.
gcloud ml-engine jobs submit training train1 \
--job-dir=gs://object-detection-tutorial-bucket1/train \
--packages dist/object_detection-0.1.tar.gz,slim/dist/slim-0.1.tar.gz \
--module-name object_detection.train1 \
--region us-central1 \
--config object_detection/samples/cloud/cloud.yml \
--runtime-version=1.4 \
-- \
--train_dir=gs://object-detection-tutorial-bucket1/train \
--pipeline_config_path=gs://object-detection-tutorial-bucket1/data/ssd_mobilenet_v1_coco.config
shivamt042
0

A terrible fix, but one that worked for me: remove the $variable format completely and hard-code the job name and paths.

Here is an example:

!gcloud ai-platform jobs submit training anurag_card_fraud \
    --scale-tier basic \
    --job-dir gs://anurag/credit_card_fraud/models/JOB_20210401_194058 \
    --master-image-uri gcr.io/anurag/xgboost_fraud_trainer:latest \
    --config trainer/hptuning_config.yaml \
    --region us-central1 \
    -- \
    --training_dataset_path=$TRAINING_DATASET_PATH \
    --validation_dataset_path=$EVAL_DATASET_PATH \
    --hptune
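If you do keep variables such as `$TRAINING_DATASET_PATH` in a plain shell script, one option is to confirm they are set before calling gcloud. This is only a sketch: `require_vars` is a hypothetical helper, and the two `gs://example-bucket/...` values are placeholders, not paths from this answer.

```shell
# Return success only if every named variable is set and non-empty.
require_vars() {
  for name in "$@"; do
    # Indirect lookup via eval keeps this compatible with plain sh.
    eval "value=\${$name-}"
    if [ -z "$value" ]; then
      echo "error: \$$name is unset or empty" >&2
      return 1
    fi
  done
}

# Placeholder values, for illustration only.
TRAINING_DATASET_PATH="gs://example-bucket/train.csv"
EVAL_DATASET_PATH="gs://example-bucket/eval.csv"

if require_vars TRAINING_DATASET_PATH EVAL_DATASET_PATH; then
  echo "all variables set; safe to submit the job"
fi
```

An unset variable silently expands to an empty string, so a guard like this turns a confusing 404 into an explicit error message.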
Cristik