
Can we use a different CPU architecture (and backend) for training (calibration) and inference of a quantized PyTorch model?

The only post on this subject that I've found states:

"Static quantization must be performed on a machine with the same architecture as your deployment target. If you are using FBGEMM, you must perform the calibration pass on an x86 CPU; if you are using QNNPACK, calibration needs to happen on an ARM CPU."

But there is nothing about this in the official documentation.

Serhiy

1 Answer

The information in the post you quoted is correct. You should use the same backend in both cases. This is also mentioned in the official documentation:

"When preparing a quantized model, it is necessary to ensure that qconfig and the engine used for quantized computations match the backend on which the model will be executed."

Find it here: https://pytorch.org/docs/stable/quantization.html
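To make this concrete, here is a minimal sketch of static post-training quantization where the qconfig and the quantized engine are both set to the same backend string. The tiny `Sequential` model and the choice of `"fbgemm"` (falling back to `"qnnpack"` if unavailable) are illustrative assumptions, not from the original post:

```python
import torch

# Assumption: pick whichever backend this machine supports;
# "fbgemm" targets x86 servers, "qnnpack" targets ARM CPUs.
backend = "fbgemm" if "fbgemm" in torch.backends.quantized.supported_engines else "qnnpack"

# A toy float model wrapped in quant/dequant stubs for illustration.
model = torch.nn.Sequential(
    torch.quantization.QuantStub(),
    torch.nn.Linear(4, 4),
    torch.nn.ReLU(),
    torch.quantization.DeQuantStub(),
).eval()

# The key point from the docs: qconfig and the quantized engine
# must name the same backend that the model will be executed on.
model.qconfig = torch.quantization.get_default_qconfig(backend)
torch.backends.quantized.engine = backend

# Insert observers, run a calibration pass, then convert.
prepared = torch.quantization.prepare(model)
prepared(torch.randn(8, 4))  # calibration with representative data
quantized = torch.quantization.convert(prepared)
```

If the engine set at inference time does not match the backend the qconfig was prepared with, the quantized kernels may be unavailable or produce wrong results, which is why calibration and deployment should use the same backend.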

Praveen Kumar
    So, they talk about `qconfig` and `engine`, both of which are configuration options. But they state nothing about the backend of the machine used for training/calibration. – Serhiy Aug 16 '21 at 13:58