WinML inference time on GPU 3 time slower than Tensorflow python

Question

I try to use a tensorflow model trained on python in WinML. I successfully convert protobuf to onnx. The following performance result are obtained :

WinML 43s
OnnxRuntime 10s
Tensorflow 12s

The inference on CPU take arround 86s.

On performance tools WinML doesn't seem to correctly use the GPU in comparison of other. It's seemed WinML use DirectML as backend (We observe DML prefix on Nvidia GPU profiler). Is it possible to use Cuda inference Engine with WinML ? Did anyone observe similar result, WinML being abnormally slow on GPU ?

score 3 · Accepted Answer · answered Apr 15 '20 at 05:22

3

I got some answer about this WinML performance. My network use LeakyRelu that was supported by DirectML only in Windows 2004. On Windows previous version, this issue disable the use of DirectML Metacommand thus bad performance. With the new windows version I got good performance with WinML.

answered Apr 15 '20 at 05:22

Erwan

587
4
23

How did you manage to figure out it was the LeakyRelu? I'm having a similar problem and don't know where to start – Tom Huntington Feb 01 '23 at 22:07

WinML inference time on GPU 3 time slower than Tensorflow python

1 Answers1