Questions tagged [triton]

Triton is an open-source project providing hybrid cloud computing infrastructure, sponsored by Joyent.

Triton was formerly named SmartDataCenter, and the GitHub repository still uses the terms SDC and SmartDataCenter interchangeably. (Note: many questions under this tag concern NVIDIA's Triton Inference Server, an unrelated project.)

29 questions
5 votes, 2 answers

Is there a way to get the config.pbtxt file from the Triton inference server?

Recently, I came across the Triton serving config flag "--strict-model-config=false", used while running the inference server. It enables Triton to create its own config file while loading the model from the model…
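For context, when --strict-model-config=false is set, Triton derives a minimal configuration from the model file itself. A hedged sketch of what such a generated config.pbtxt might look like (the model name, platform, and tensor names below are hypothetical):

```
name: "my_model"
platform: "onnxruntime_onnx"
max_batch_size: 8
input [
  {
    name: "input_ids"
    data_type: TYPE_INT64
    dims: [ -1 ]
  }
]
output [
  {
    name: "logits"
    data_type: TYPE_FP32
    dims: [ -1 ]
  }
]
```

Triton also exposes the active configuration over HTTP at the /v2/models/&lt;model&gt;/config endpoint, which is one way to retrieve the generated configuration's contents.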
4 votes, 1 answer

The meaning of brackets around registers in PTX assembly loads/stores

Below is apparently legitimate PTX assembly code produced by the Triton compiler. I'm puzzled by the { %r1 } and { %r2 } used in the load and store instructions. According to the PTX ISA documentation, it looks like an initializer list, but it does not make…
Dmitry Mikushin • 1,478 • 15 • 16
4 votes, 1 answer

Error in exposing multiple ports with ALB Ingress on EKS

I have a Triton server on EKS listening on 3 ports: 8000 for HTTP requests, 8001 for gRPC, and 8002 for Prometheus metrics. I have created a Triton deployment on EKS, exposed through an EKS NodePort service. I am also using ALB…
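For reference, those three ports can be declared on a single Kubernetes Service; this is only a sketch under assumed names (triton-svc, app: triton), not the asker's actual manifest:

```
apiVersion: v1
kind: Service
metadata:
  name: triton-svc
spec:
  type: NodePort
  selector:
    app: triton
  ports:
    - name: http
      port: 8000
      targetPort: 8000
    - name: grpc
      port: 8001
      targetPort: 8001
    - name: metrics
      port: 8002
      targetPort: 8002
```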
3 votes, 1 answer

How to handle multiple PyTorch models with pytriton + SageMaker

I am trying to adapt pytriton to host multiple models for a multi-model SageMaker setup. In my case, I am trying to get it to load all models hosted in the SAGEMAKER_MULTI_MODEL_DIR folder. I could not find any relevant example here for a…
toing_toing • 2,334 • 1 • 37 • 79
3 votes, 0 answers

Cog vs Triton Inference Server

I'm considering Cog and Triton Inference Server for inference in production. Does anyone know the differences in capabilities as well as in runtimes between the two, especially on AWS?
3 votes, 1 answer

How does CoreOS compare to Triton?

Recently, some alternatives for running Docker containers, or even the App Container format, have been developed. I know that there is rkt from CoreOS (https://coreos.com/blog/rocket/) and Triton from Joyent (https://www.joyent.com/). How do these two approaches…
Georg Heiler • 16,916 • 36 • 162 • 292
2 votes, 1 answer

Integrating a custom PyTorch backend with Triton + AWS SageMaker

I have a custom Python backend that works well with AWS SageMaker MMS (Multi Model Server) using an S3 model repository. I want to adapt this backend to work with the Triton Python backend. I have an example Dockerfile that runs the Triton server with my…
toing_toing • 2,334 • 1 • 37 • 79
2 votes, 0 answers

How to deploy a GPT-like model to Triton inference server?

Tutorials on deploying GPT-like model inference to Triton look like this: preprocess the data as input_ids = tokenizer(text)["input_ids"]; feed the input to the Triton inference server and get outputs_ids = model(input_ids); postprocess the outputs…
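That three-step flow can be sketched in plain Python; the vocabulary, tokenizer, and "model" below are toy stand-ins for illustration only (in practice, step 2 would be a request to the Triton server):

```python
# Toy sketch of the tokenize -> infer -> detokenize pipeline described above.
# The vocabulary and "model" are hypothetical stand-ins, not a real GPT.
vocab = {"hello": 1, "world": 2, "<unk>": 0}
inv_vocab = {v: k for k, v in vocab.items()}

def tokenizer(text):
    # Step 1: preprocess text into input_ids on the client
    return {"input_ids": [vocab.get(w, 0) for w in text.split()]}

def model(input_ids):
    # Step 2: stand-in for the Triton inference call, which would
    # normally send input_ids to the server and return output ids
    return input_ids + [2]  # pretend the model appends one token

def postprocess(output_ids):
    # Step 3: map output ids back to text
    return " ".join(inv_vocab.get(i, "<unk>") for i in output_ids)

input_ids = tokenizer("hello world")["input_ids"]
output_ids = model(input_ids)
print(postprocess(output_ids))  # -> hello world world
```

The point of the question is that steps 1 and 3 usually live on the client, while only step 2 runs inside Triton; moving tokenization server-side typically means adding a Python-backend pre/post-processing model.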
1 vote, 0 answers

Can I deploy a KServe inference service using an XGBoost model on kserve-tritonserver?

I want to deploy an XGBoost model on KServe. I deployed it on the default serving runtime, but I want to try it on kserve-tritonserver. I know KServe says kserve-tritonserver supports TensorFlow, ONNX, PyTorch, and TensorRT. And NVIDIA said the Triton inference…
1 vote, 0 answers

How to work with text input directly in Triton server?

The examples here (https://github.com/aws/amazon-sagemaker-examples/blob/main/sagemaker-triton/nlp_bert/triton_nlp_bert.ipynb) show that, instead of sending text and tokenizing it on the server, tokenization is done on the client side and the tokenized input is…
suwa • 23 • 4
1 vote, 0 answers

How to run inference for a T5 TensorRT model deployed on NVIDIA Triton?

I have deployed a T5 TensorRT model on the NVIDIA Triton server, and below is the config.pbtxt file, but I am facing a problem while running inference with the Triton client. As per the config.pbtxt file, there should be 4 inputs to the TensorRT model along with…
1 vote, 0 answers

Facing Issues with Load Balancing using NGINX Load Balancer on AWS EKS

I am deploying a Triton inference server on Amazon Elastic Kubernetes Service (Amazon EKS) and using the NGINX open-source load balancer for load balancing. Our EKS cluster is private (the EKS nodes are in private subnets) so that no one can access it…
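A minimal NGINX sketch for fronting Triton's two traffic ports (the upstream names and the triton-svc address are assumptions, and gRPC proxying requires an http2 listener):

```
upstream triton_http { server triton-svc:8000; }
upstream triton_grpc { server triton-svc:8001; }

server {
    listen 80;
    location / {
        proxy_pass http://triton_http;
    }
}

server {
    listen 8001 http2;
    location / {
        grpc_pass grpc://triton_grpc;
    }
}
```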
1 vote, 0 answers

nvidia-pyindex fails to install on Windows 10

I want to install tritonclient according to client_libraries.md on Windows 10. Errors occurred when I installed nvidia-pyindex. How can I solve this? Thanks! (py38trtc250) G:\client_py>pip install --user nvidia-pyindex Looking in indexes:…
1 vote, 1 answer

DTrace missing Java frames with ustack(). Running on Joyent SmartOS infrastructure container

I cannot get any Java stack with DTrace in a Joyent SmartOS instance. I tried the java:15.1.1 image and a plain SmartOS 'base64' image, where I installed OpenJDK 8. My most basic example: cat Loop.java [root@7e8c2a25-c852-4967-b60c-7b4fbd9a1de5…
Gamlor • 12,978 • 7 • 43 • 70
1 vote, 1 answer

Getting docker.sock in Joyent Triton

I am trying to set up the jwilder/nginx-proxy Docker container on Joyent's Triton platform. This container needs access to docker.sock to read information about its environment. Basically, it needs to do docker up -v…
lhahne
  • 5,909
  • 9
  • 33
  • 40