
I was recently told that integrating/compiling my PyTorch model with Apache TVM should speed up inference, but I am very confused about how to use TVM, as there seem to be multiple ways to do it.

This blog post on their website says:

Usage is simple:

import torch_tvm
torch_tvm.enable()

That’s it! PyTorch will then attempt to convert all operators it can to known Relay operators during its JIT compilation process.

If that is all it takes, why are the other, more specific tutorials so much more comprehensive? After going through a fair few tutorials, another thing I assumed (not sure if I am right) was that TVM cannot really work with the default way PyTorch code is run and needs the JIT compiler (a non-default way of running PyTorch code) to operate on. This assumption was based on the fact that the GitHub repo tutorial shows the following snippet:

import torch
import torch_tvm

torch_tvm.enable()

# The following function will be compiled with TVM
@torch.jit.script
def my_func(a, b, c):
    return a * b + c   

where the function my_func is wrapped with a decorator that appears to compile it with the JIT. But using this exact same function and timing it with and without the wrapper + TVM shows that the normal (eager) usage of the function is far more time efficient. If it is not speeding things up, what exactly does JIT + TVM help with? And if it is supposed to speed things up, why is that not the case here?
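Roughly, the comparison I timed looked like the following (a simplified sketch; the tensor sizes, iteration count, and the renamed my_func_eager / my_func_tvm are just illustrative choices of mine, not from any official example):

import time

import torch
import torch_tvm

torch_tvm.enable()

# Plain eager-mode version of the function
def my_func_eager(a, b, c):
    return a * b + c

# Same function, scripted so TVM can pick it up during JIT compilation
@torch.jit.script
def my_func_tvm(a, b, c):
    return a * b + c

# Illustrative inputs; real sizes may change the picture
a = torch.rand(1024, 1024)
b = torch.rand(1024, 1024)
c = torch.rand(1024, 1024)

def bench(fn, iters=100):
    start = time.perf_counter()
    for _ in range(iters):
        fn(a, b, c)
    return time.perf_counter() - start

print("eager:    ", bench(my_func_eager))
print("jit + tvm:", bench(my_func_tvm))

With this kind of naive timing, the eager version consistently comes out faster for me.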

P.S. I apologise for the long description of my problem. I was not able to figure out what is happening even after reading about and tinkering with torch_tvm a fair amount. I would appreciate any link to resources or any explanation that might help me out.

  • Hi, did you make progress on this issue? Thanks! – jhagege Nov 15 '20 at 09:12
  • @cyberjoac Sort of. Apparently the model has to be compiled a few times on test runs (they call it warmup). After doing this, the runtime is faster than eager execution, but stays pretty much the same as the usual TorchScript or trace execution. I'm not sure how strongly the speedup would depend on the size of the model, but for my model it did not make much of a difference. – ashenoy Nov 15 '20 at 11:25
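(For illustration, the warm-up-then-time pattern described in the comment above might look like this; a minimal sketch assuming the same my_func as in the question, with arbitrary iteration counts and tensor sizes:)

import time

import torch
import torch_tvm

torch_tvm.enable()

@torch.jit.script
def my_func(a, b, c):
    return a * b + c

a = torch.rand(1024, 1024)
b = torch.rand(1024, 1024)
c = torch.rand(1024, 1024)

# Warmup: the first few calls trigger TorchScript/TVM compilation,
# so they should be excluded from the timing.
for _ in range(10):
    my_func(a, b, c)

start = time.perf_counter()
for _ in range(100):
    my_func(a, b, c)
print("warmed-up time:", time.perf_counter() - start)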
