Using ONNX Runtime to run inference on deep learning models: let's say I have 4 different models, each with its own input image. Can I run them in parallel in 4 threads? Would there be one "environment" and then 4 sessions (all using the same environment)?
- Also, would the 4 sessions share the read-only state (weights, biases, etc.)? – MSalters Apr 28 '20 at 15:10
- Well, in my case, I would actually have 4 separate models (each with its own weights). But, yes, we need some doc on how ORT works (or doesn't) with multi-threading! – Tullhead Apr 29 '20 at 17:16
1 Answer
Yes - one environment and 4 separate sessions is how you'd do it.
The 'read-only state' of weights and biases is specific to a model, and a session has a 1:1 relationship with a model, so those things aren't shared across sessions. You only need one session per model, because you can call Run concurrently with different inputs (assuming the model supports dynamic batch/input sizes).
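A minimal sketch of that setup, assuming the onnxruntime C++ headers are available. The model file names are placeholders, and the input-tensor construction is elided; `Session::Run` is safe to call concurrently, so each thread drives its own session against the shared `Ort::Env`:

```cpp
#include <onnxruntime_cxx_api.h>

#include <array>
#include <thread>
#include <vector>

int main() {
  // One environment shared by all sessions.
  Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "multi-model");

  // Placeholder model paths - one per model. (On Windows the Session
  // constructor takes a wide-character path instead.)
  const std::array<const char*, 4> model_paths = {
      "model0.onnx", "model1.onnx", "model2.onnx", "model3.onnx"};

  // One session per model, all created against the same environment.
  std::vector<Ort::Session> sessions;
  for (const char* path : model_paths)
    sessions.emplace_back(env, path, Ort::SessionOptions{});

  // Run the four models in parallel, one thread per session.
  std::vector<std::thread> workers;
  for (Ort::Session& session : sessions) {
    workers.emplace_back([&session] {
      // Build the input tensor for this model's image here, then call:
      // session.Run(Ort::RunOptions{nullptr}, input_names, &input, 1,
      //             output_names, 1);
    });
  }
  for (auto& t : workers) t.join();
  return 0;
}
```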
Regarding threading, the default is a per-session thread pool, but it's also possible to share global thread pools across sessions. How you do that differs by the API used:
- For the C API, use `CreateEnvWithGlobalThreadPools`.
- For the C++ API, provide `OrtThreadingOptions` when constructing `Ort::Env`.

Scott McKay
- But the `OrtThreadingOptions` struct is not exposed by the `onnxruntime` C++ API – katrasnikj Jun 30 '20 at 13:11
- `OrtThreadingOptions` can be created using the C-API `CreateThreadingOptions` function. The C++ API is just a convenience wrapper over the C-API. Use `ReleaseThreadingOptions` to free that once the environment is created. – Scott McKay Jun 30 '20 at 21:46
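Putting that comment together, a hedged sketch of the workaround (the thread counts and the commented-out model path are illustrative; `Ort::GetApi()` exposes the underlying C API from C++ code):

```cpp
#include <onnxruntime_cxx_api.h>

int main() {
  const OrtApi& api = Ort::GetApi();

  // The C++ wrapper doesn't expose OrtThreadingOptions directly,
  // so create it through the C API...
  OrtThreadingOptions* tp_options = nullptr;
  Ort::ThrowOnError(api.CreateThreadingOptions(&tp_options));
  Ort::ThrowOnError(api.SetGlobalIntraOpNumThreads(tp_options, 4));
  Ort::ThrowOnError(api.SetGlobalInterOpNumThreads(tp_options, 1));

  // ...pass it to the Ort::Env constructor that accepts threading options...
  Ort::Env env(tp_options, ORT_LOGGING_LEVEL_WARNING, "shared-pools");

  // ...and release it once the environment has been created.
  api.ReleaseThreadingOptions(tp_options);

  // Each session must opt out of its per-session pools so that it
  // actually uses the global ones.
  Ort::SessionOptions opts;
  opts.DisablePerSessionThreads();
  // Ort::Session session(env, "model.onnx", opts);
  return 0;
}
```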