Can we use a different CPU architecture (and backend) for training (calibration) and inference of a quantized PyTorch model?
The only post I've found on this subject states:
"static quantization must be performed on a machine with the same architecture as your deployment target. If you are using FBGEMM, you must perform the calibration pass on an x86 CPU; if you are using QNNPACK, calibration needs to happen on an ARM CPU"
But there is nothing about this in the official documentation.
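To make the question concrete, here is a minimal eager-mode static quantization sketch of what I mean (TinyModel, the layer sizes, and the sample inputs are just placeholders for illustration):

import torch
import torch.nn as nn

class TinyModel(nn.Module):  # hypothetical model, stands in for the real one
    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()
        self.fc = nn.Linear(8, 4)
        self.dequant = torch.quantization.DeQuantStub()

    def forward(self, x):
        return self.dequant(self.fc(self.quant(x)))

model = TinyModel().eval()

# Calibration machine: x86, FBGEMM backend
torch.backends.quantized.engine = 'fbgemm'
model.qconfig = torch.quantization.get_default_qconfig('fbgemm')
prepared = torch.quantization.prepare(model)
prepared(torch.randn(2, 8))  # calibration pass with representative data
quantized = torch.quantization.convert(prepared)

# Inference machine: could this instead run on an ARM device with
# torch.backends.quantized.engine = 'qnnpack', or must the backend
# (and CPU architecture) match the one used for calibration?
print(quantized(torch.randn(2, 8)))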