I have a cron job running on a local server. I would like to use it to:
- start a new EMR cluster
- trigger a Spark model training job on the cluster, like this:
    ssh -o "StrictHostKeyChecking no" -i xxx.pem hadoop@10.10.x.xxxx "bash ~/train_model.sh"
  Since the EMR cluster is new every time, I use the -o "StrictHostKeyChecking no" flag to skip the new-host check.
- finally shut down the EMR cluster.
The problem is that the model training takes 10+ hours, and the ssh connection in step 2 times out every time.
From searching around, I found that this might be resolved by editing the ssh config on the EMR cluster master node, but since the EMR cluster is new every time, I would have to make that edit every time as well. I am interested in finding out whether there is a neater way.
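For reference, one direction I have seen mentioned but have not verified is setting keepalive options on the client side, so nothing on the master node needs editing. Below is a minimal sketch of what I understand that would look like. ServerAliveInterval and ServerAliveCountMax are standard OpenSSH client options; the values 60 and 10, the MASTER_IP placeholder, and the run_training function name are my own assumptions for illustration. Whether this helps depends on the timeout actually being caused by an idle connection being dropped.

```shell
# Sketch only: KEY and HOST are placeholders, not values from my real setup.
KEY="${KEY:-xxx.pem}"
HOST="${HOST:-hadoop@MASTER_IP}"

# Client-side keepalive (assumed values):
#   ServerAliveInterval=60 -> send a probe after 60 s without data from the server
#   ServerAliveCountMax=10 -> disconnect after 10 unanswered probes
SSH_OPTS="-o StrictHostKeyChecking=no -o ServerAliveInterval=60 -o ServerAliveCountMax=10"

run_training() {
  # $SSH_OPTS is intentionally unquoted so it splits into separate arguments.
  # The call blocks until train_model.sh exits, so the shutdown step can follow it.
  ssh $SSH_OPTS -i "$KEY" "$HOST" "bash ~/train_model.sh"
}
```

The cron script would call run_training between the cluster start and shutdown steps. The same options could presumably go into ~/.ssh/config on the local server instead of the command line.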