What is the advantage of SageMaker Neo over the specialized native runtimes that every ML accelerator vendor provides, such as NVIDIA TensorRT, Intel OpenVINO, DeepView RT, CoreML, ArmNN, etc.? I understand that Neo uses some of these frameworks, like TensorRT, under the hood, but what is the advantage of compiling the model for Neo instead of for TensorRT directly?
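For context, this is roughly what I mean by "compiling for Neo": you point SageMaker at the trained artifact and name a target device, and Neo chooses the backend (e.g. TensorRT on NVIDIA targets) for you. A minimal sketch; the bucket paths, role ARN, and input shape are placeholders:

```python
import boto3

sm = boto3.client("sagemaker")

sm.create_compilation_job(
    CompilationJobName="resnet50-jetson-example",
    RoleArn="arn:aws:iam::123456789012:role/SageMakerNeoRole",  # placeholder role
    InputConfig={
        "S3Uri": "s3://my-bucket/models/resnet50/model.tar.gz",  # placeholder artifact
        "DataInputConfig": '{"input0": [1, 3, 224, 224]}',       # example input shape
        "Framework": "PYTORCH",
    },
    OutputConfig={
        "S3OutputLocation": "s3://my-bucket/compiled/",  # placeholder output path
        # An NVIDIA target, where Neo can delegate to TensorRT under the hood
        "TargetDevice": "jetson_xavier",
    },
    StoppingCondition={"MaxRuntimeInSeconds": 900},
)
```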
I suppose companies with edge ML workloads will standardize on a given platform, e.g. NVIDIA, Arm, or Intel, and each vendor is probably best positioned to provide an optimized runtime for its own platform, along with tools to import models from other frameworks (everyone seems to support TensorFlow and ONNX, as in the sketch below). Is this correct? Have you seen different cases in the field?
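To illustrate the interchange path I have in mind: exporting a PyTorch model to ONNX, which the vendor runtimes (TensorRT, OpenVINO, etc.) can then ingest directly without Neo in the loop. The model and shape here are just examples:

```python
import torch
import torchvision

# Example model; weights=None since we only care about the export path
model = torchvision.models.resnet50(weights=None).eval()
dummy = torch.randn(1, 3, 224, 224)

torch.onnx.export(
    model,
    dummy,
    "resnet50.onnx",
    input_names=["input0"],
    output_names=["logits"],
    opset_version=13,
)
```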
Another point: according to the official documentation, support for different ML models is limited in frameworks other than MXNet. Why, then, would a company choose Neo if its models are in PyTorch or TensorFlow/Keras?
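For what it's worth, PyTorch does appear to be accepted if the model is first traced to TorchScript and packaged as a tarball before being handed to the compilation job. A rough sketch under that assumption, with the model and file names as placeholders:

```python
import tarfile
import torch
import torchvision

# Trace the model to TorchScript, which is the format Neo expects for PyTorch
model = torchvision.models.resnet50(weights=None).eval()
traced = torch.jit.trace(model, torch.randn(1, 3, 224, 224))
traced.save("model.pth")

# Package as model.tar.gz, the artifact layout the compilation job consumes from S3
with tarfile.open("model.tar.gz", "w:gz") as tar:
    tar.add("model.pth")
```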