I'm trying to run my machine learning code, which trains on images using TensorFlow, on Google Cloud ML Engine. However, the submitted job doesn't seem to be able to access my files, either in my Cloud Shell or in GCS. Even though everything works fine on my local machine, I get the following error once I submit the job with the gcloud command from Cloud Shell:
ERROR 2017-12-19 13:52:28 +0100 service IOError: [Errno 2] No such file or directory: '/home/user/pores-project-googleML/trainer/train.txt'
This file definitely exists in Cloud Shell, and I can verify it when I type:
ls /home/user/pores-project-googleML/trainer/train.txt
I tried putting my file train.txt in GCS and accessing it from my code (by specifying the path gs://my_bucket/my_path), but once the job was submitted, I got a 'No such file or directory' error with the corresponding path.
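In case it's relevant, this is roughly what the read looks like in my code. The plain open() call is the one that fails on Cloud ML; the file_io variant is something I found suggested for gs:// paths and haven't fully verified yet (the bucket and path below are placeholders for mine):

from tensorflow.python.lib.io import file_io

# What my code does today -- this is the call that fails on Cloud ML
# with 'No such file or directory':
with open('/home/user/pores-project-googleML/trainer/train.txt') as f:
    labels = f.read().splitlines()

# GCS-aware variant I found suggested: file_io.FileIO understands
# gs:// URLs as well as local paths (placeholder bucket/path):
with file_io.FileIO('gs://my_bucket/my_path/train.txt', mode='r') as f:
    labels = f.read().splitlines()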
To check where the job I submitted with gcloud is actually running, I added print(os.getcwd()) at the beginning of my Python code trainer/task.py; it printed /user_dir in the logs. I couldn't find this path from Cloud Shell, nor in GCS. So my question is: how can I find out on which machine my job is running? And if it's inside some container somewhere, how can I access my files in Cloud Shell and in GCS from it?
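For completeness, the diagnostic I put at the top of trainer/task.py is just the first print below; the listdir and TF_CONFIG lines are extras I'm considering to learn more about the environment (TF_CONFIG is the environment variable Cloud ML Engine documents as being set for each job):

import os

print(os.getcwd())                   # printed '/user_dir' in the job logs
print(os.listdir(os.getcwd()))       # see what was actually unpacked next to my code
print(os.environ.get('TF_CONFIG'))   # job/cluster description set by Cloud ML Engine, if any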
Before doing any of this, I successfully completed the 'Image Classification using Flowers Dataset' tutorial.
The command I used to submit my job is:
gcloud ml-engine jobs submit training $JOB_NAME \
    --job-dir $JOB_DIR \
    --packages trainer-0.1.tar.gz \
    --module-name $MAIN_TRAINER_MODULE \
    --region us-central1
where:
TRAINER_PACKAGE_PATH=/home/user/pores-project-googleML/trainer
MAIN_TRAINER_MODULE="trainer.task"
JOB_DIR="gs://pores/AlexNet_CloudML/job_dir/"
JOB_NAME="census$(date +"%Y%m%d_%H%M%S")"