
I have been exploring Vertex AI for my machine learning workflows. Because Vertex AI does not support deploying different models to the same endpoint using only one node, I am considering a workaround. With this workaround I will lose many Vertex AI features, such as model monitoring and feature attribution, and it essentially becomes, I think, a managed alternative to running the prediction application on, say, a GKE cluster. So, besides the cost difference, I am exploring whether running the custom prediction container on Vertex AI rather than GKE involves any other limitations; for example, only N1 machine types are available for prediction in Vertex AI.
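For context, the workaround amounts to one custom container per endpoint on a single node. A rough sketch of that deployment with the gcloud CLI follows; the model/endpoint display names, image URI, and the ENDPOINT_ID/MODEL_ID values are placeholders, and the exact flags may differ by gcloud version:

```shell
# Upload a model backed by a custom prediction container
# (image URI and names are hypothetical).
gcloud ai models upload \
  --region=us-central1 \
  --display-name=my-model \
  --container-image-uri=gcr.io/my-project/my-predictor:latest

# Create an endpoint.
gcloud ai endpoints create \
  --region=us-central1 \
  --display-name=my-endpoint

# Deploy the model to the endpoint on a single N1 node.
gcloud ai endpoints deploy-model ENDPOINT_ID \
  --region=us-central1 \
  --model=MODEL_ID \
  --display-name=my-deployment \
  --machine-type=n1-standard-2 \
  --min-replica-count=1 \
  --max-replica-count=1
```

Because each endpoint serves one deployed model this way, serving several models means repeating this per model, which is where the per-node cost comparison with GKE comes in.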

There is a similar question, but it does not raise the specific questions I hope to have answered.

  • I am not sure of the available disk space. In Vertex AI one can specify the machine type, such as n1-standard-2, but I am not sure how much disk space will be available, or if/how it can be specified. In the custom container code I may copy multiple model artifacts or data from outside sources to the local directory before processing them, so understanding any disk space limitations is important.
  • For custom training in Vertex AI, one can use an interactive shell to inspect the container where the training code is running, as described here. Is something like this possible for a custom prediction container? I have not found anything in the docs.
  • For custom training, one can use a private IP, as described here. Again, I have not found anything similar for custom prediction in the docs; is it possible?

If you know of any other possible limitations, please post.

racerX

1 Answer

  1. You can't specify a disk size, so it defaults to 100 GB.
  2. I'm not aware of a way to do this right now. But since it's a custom container, you could just run it locally or on GKE for debugging purposes.
  3. Are you looking for this? https://cloud.google.com/vertex-ai/docs/predictions/using-private-endpoints
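If it helps, the private-endpoint flow from that page looks roughly like this; the project number and network name are placeholders, and it assumes VPC Network Peering to Google services is already configured for that network:

```shell
# Create a private endpoint attached to a peered VPC network
# (PROJECT_NUMBER and network name are placeholders).
gcloud ai endpoints create \
  --region=us-central1 \
  --display-name=my-private-endpoint \
  --network=projects/PROJECT_NUMBER/global/networks/my-vpc

# Deploying a model works the same as for a public endpoint;
# predictions are then reachable only over the peered VPC.
```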
Shawn
  • Thanks, can the default disk size be changed? Yes, I could run the custom container locally for debugging purposes, but it's really helpful to be able to ssh into the deployed container. Not sure why it's not possible, given that it is for custom training containers, which actually don't even run indefinitely and are terminated once the training job is over. – racerX Nov 19 '21 at 22:40
  • No, the default size is configured by GKE, so there is no way to change it. For the second question, what do you want to get from the running prediction container? The reason it cannot be SSHed into is security concerns. – Shawn Nov 21 '21 at 00:34