7

Using ONNX Runtime to run inference on deep learning models. Let's say I have 4 different models, each with its own input image: can I run them in parallel in 4 threads? Would there be one "environment" and then 4 sessions (using the same environment)?

Tullhead
    Also, would the 4 sessions share the read-only state (weights, biases, etc.)? – MSalters Apr 28 '20 at 15:10
    Well, in my case, I would actually have 4 separate models (each with its own weights). But yes, we need some documentation on how ORT works (or doesn't) with multi-threading! – Tullhead Apr 29 '20 at 17:16

1 Answer

6

Yes - one environment and 4 separate sessions is how you'd do it.

The 'read-only state' (weights and biases) is specific to a model.

A session has a 1:1 relationship with a model, and that state isn't shared across sessions. You only need one session per model: you can call Run concurrently on the same session with different inputs (assuming the model supports dynamic batch/input sizes).

Regarding threading, the default is a per-session thread pool, but it's also possible to share global thread pools across sessions.

How you do that differs by the API used:

  • For the C API use CreateEnvWithGlobalThreadPools.
  • For the C++ API provide OrtThreadingOptions when constructing Ort::Env.
Scott McKay
  • But the `OrtThreadingOptions` struct is not exposed by the `onnxruntime` C++ API – katrasnikj Jun 30 '20 at 13:11
  • `OrtThreadingOptions` can be created using the C API `CreateThreadingOptions` function. The C++ API is just a convenience wrapper over the C API. Use `ReleaseThreadingOptions` to free it once the environment is created. – Scott McKay Jun 30 '20 at 21:46