7

Using ONNX Runtime to run inference on deep learning models. Let's say I have 4 different models, each with its own input image: can I run them in parallel in 4 threads? Would there be one "environment" and then 4 sessions (using the same environment)?

Tullhead
    Also, would the 4 sessions share the read-only state (weights, biases, etc.)? – MSalters Apr 28 '20 at 15:10
    Well, in my case, I would actually have 4 separate models (each with its own weights). But yes, we need some documentation on how ORT works (or doesn't) with multi-threading! – Tullhead Apr 29 '20 at 17:16

1 Answer

6

Yes - one environment and 4 separate sessions is how you'd do it.

The 'read-only state' (weights and biases) is specific to a model.

A session has a 1:1 relationship with a model, and that state isn't shared across sessions. You only need one session per model: you can call Run concurrently on the same session with different inputs (assuming the model supports dynamic batch/input sizes).

Regarding threading, the default is a per-session thread pool, but it's also possible to share global thread pools across sessions.

How you do that differs by the API used:

  • For the C API use CreateEnvWithGlobalThreadPools.
  • For the C++ API provide OrtThreadingOptions when constructing Ort::Env.
Scott McKay
  • But the `OrtThreadingOptions` struct is not exposed by the `onnxruntime` C++ API – katrasnikj Jun 30 '20 at 13:11
  • `OrtThreadingOptions` can be created using the C API `CreateThreadingOptions` function. The C++ API is just a convenience wrapper over the C API. Use `ReleaseThreadingOptions` to free it once the environment is created. – Scott McKay Jun 30 '20 at 21:46