Questions tagged [quantization-aware-training]
58 questions
7
votes
2 answers
Quantization aware training in TensorFlow version 2 and BatchNorm folding
I'm wondering what options are currently available for simulating BatchNorm folding during quantization-aware training in TensorFlow 2. TensorFlow 1 has the tf.contrib.quantize.create_training_graph function, which inserts FakeQuantization layers…

MaartenVds
- 573
- 6
- 10
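For TensorFlow 2, the tensorflow_model_optimization (tfmot) Keras API is the usual replacement for create_training_graph. A minimal sketch, assuming the default 8-bit scheme; whether that scheme emulates Conv+BatchNorm folding exactly as TF1 did depends on the tfmot version, so verify against your install:

import tensorflow as tf
import tensorflow_model_optimization as tfmot

# A small Conv + BatchNorm model to quantize.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(16, 3, input_shape=(32, 32, 3)),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.ReLU(),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10),
])

# quantize_model wraps supported layers with fake-quant ops, playing
# the role tf.contrib.quantize.create_training_graph played in TF1.
qat_model = tfmot.quantization.keras.quantize_model(model)
qat_model.summary()  # quantized layers appear as QuantizeWrapper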
6
votes
1 answer
RuntimeError: Unsupported qscheme: per_channel_affine
I'm following a tutorial on quantization-aware training here for a modified ResNet18 model, which is here:
#!/usr/bin/env python
# coding: utf-8
# In[ ]:
# Modified from
#…

Manu Dwivedi
- 77
- 7
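One commonly suggested workaround, sketched under the assumption that the error comes from a backend that rejects per-channel weight fake-quant: fall back to an explicitly per-tensor qconfig. The model and observer choices below are illustrative, not the tutorial's exact setup:

import torch
import torch.quantization as tq

# Stand-in model; in the question this would be the modified ResNet18.
model = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3), torch.nn.ReLU()).train()

# get_default_qat_qconfig('fbgemm') attaches per-channel weight
# fake-quant (per_channel_affine); if the active backend rejects that
# qscheme, a per-tensor qconfig avoids the RuntimeError.
model.qconfig = tq.QConfig(
    activation=tq.FakeQuantize.with_args(
        observer=tq.MovingAverageMinMaxObserver, quant_min=0, quant_max=255),
    weight=tq.FakeQuantize.with_args(
        observer=tq.MovingAverageMinMaxObserver, quant_min=-128, quant_max=127,
        dtype=torch.qint8, qscheme=torch.per_tensor_symmetric),
)
tq.prepare_qat(model, inplace=True)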
5
votes
3 answers
TensorFlow fake-quantize layers are also called from TF-Lite
I'm using TensorFlow 2.1 in order to train models with quantization-aware training.
The code to do that is:
import tensorflow_model_optimization as tfmot
model = tfmot.quantization.keras.quantize_annotate_model(model)
This will add fake-quantize…

Ohad Meir
- 714
- 8
- 18
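For context on the excerpt's snippet: quantize_annotate_model only marks layers for quantization; it is quantize_apply that actually inserts the fake-quant ops into the graph. A minimal sketch of the two-step flow:

import tensorflow as tf
import tensorflow_model_optimization as tfmot

base = tf.keras.Sequential([tf.keras.layers.Dense(10, input_shape=(20,))])

# Step 1: mark layers. Step 2: rewrite the model with fake-quant ops.
annotated = tfmot.quantization.keras.quantize_annotate_model(base)
qat_model = tfmot.quantization.keras.quantize_apply(annotated)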
4
votes
0 answers
Quantization aware training
I am a little new to this domain, so I apologize in advance if this is something very basic.
I am currently trying to follow this link…

Yassa Fareed
- 43
- 5
3
votes
0 answers
How does int8 inference really work?
Not sure if this is the right place to ask this kind of question, but I can’t really find an example of how int8 inference works at runtime. What I know is that, given that we are performing uniform symmetric quantisation, we calibrate the model,…

ИванКарамазов
- 442
- 2
- 17
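A worked toy example of the runtime arithmetic this question asks about, assuming uniform symmetric int8 quantization with int32 accumulation; NumPy stands in for the real integer kernels:

import numpy as np

# Uniform symmetric int8: scale = max|x| / 127, q = round(x / scale),
# clipped to [-127, 127].
def quantize(x, scale):
    return np.clip(np.round(x / scale), -127, 127).astype(np.int8)

rng = np.random.default_rng(0)
x = rng.standard_normal((1, 4)).astype(np.float32)   # activation
w = rng.standard_normal((4, 3)).astype(np.float32)   # weights

sx = np.abs(x).max() / 127.0   # calibration determines this scale
sw = np.abs(w).max() / 127.0

qx, qw = quantize(x, sx), quantize(w, sw)

# The matmul accumulates in int32; the float result is recovered
# (approximately) by rescaling with the product of the two scales.
acc = qx.astype(np.int32) @ qw.astype(np.int32)
approx = acc * (sx * sw)

print(np.abs(approx - x @ w).max())  # small quantization error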
3
votes
0 answers
How to properly quantize a CNN to 4 bits using TensorFlow QAT?
I am trying to perform 4-bit quantization and used this example.
First of all, I received the following warnings:
WARNING:tensorflow:AutoGraph could not transform

anatoly
- 327
- 2
- 13
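4-bit QAT in tfmot generally requires a custom QuantizeConfig, since the default scheme is 8-bit. A sketch following the tfmot comprehensive guide's pattern; the Dense4BitConfig name and quantizer settings are illustrative:

import tensorflow as tf
import tensorflow_model_optimization as tfmot

quantizers = tfmot.quantization.keras.quantizers

class Dense4BitConfig(tfmot.quantization.keras.QuantizeConfig):
    # Quantize the kernel to 4 bits with last-value range tracking.
    def get_weights_and_quantizers(self, layer):
        return [(layer.kernel, quantizers.LastValueQuantizer(
            num_bits=4, symmetric=True, narrow_range=False, per_axis=False))]

    # Quantize the activation to 4 bits with a moving-average range.
    def get_activations_and_quantizers(self, layer):
        return [(layer.activation, quantizers.MovingAverageQuantizer(
            num_bits=4, symmetric=False, narrow_range=False, per_axis=False))]

    def set_quantize_weights(self, layer, quantize_weights):
        layer.kernel = quantize_weights[0]

    def set_quantize_activations(self, layer, quantize_activations):
        layer.activation = quantize_activations[0]

    def get_output_quantizers(self, layer):
        return []

    def get_config(self):
        return {}

annotate = tfmot.quantization.keras.quantize_annotate_layer
model = tf.keras.Sequential([
    annotate(tf.keras.layers.Dense(10, input_shape=(20,)),
             quantize_config=Dense4BitConfig()),
])
with tfmot.quantization.keras.quantize_scope({'Dense4BitConfig': Dense4BitConfig}):
    qat_model = tfmot.quantization.keras.quantize_apply(model)

Note that this only simulates 4-bit ranges during training; TFLite export still targets 8-bit kernels.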
2
votes
1 answer
TF Yamnet Transfer Learning and Quantization
TLDR:
Short term: Trying to quantize a specific portion of a TF model (recreated from a TFLite model). Skip to the pictures below.
Long term: Transfer Learn on Yamnet and compile for Edge TPU.
Source code to follow along is here
I've been trying to…

Anthony Rusignuolo
- 101
- 1
- 6
2
votes
1 answer
PyTorch static quantization: different training (calibration) and inference backends
Can we use a different CPU architecture (and backend) for training (calibration) and inference of the quantized PyTorch model?
The only post on this subject that I've found states:
static quantization must be performed on a machine with the…

Serhiy
- 4,357
- 5
- 37
- 53
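A sketch of the backend coupling this question is about: the qconfig and the quantized engine are chosen per machine (fbgemm on x86, qnnpack on ARM), which is why the calibration and inference machines can matter. The model and shapes below are illustrative:

import torch
import torch.quantization as tq

model = torch.nn.Sequential(
    tq.QuantStub(),
    torch.nn.Linear(8, 4),
    torch.nn.ReLU(),
    tq.DeQuantStub(),
).eval()

# On an x86 server: fbgemm kernels and the matching default qconfig.
torch.backends.quantized.engine = 'fbgemm'
model.qconfig = tq.get_default_qconfig('fbgemm')

prepared = tq.prepare(model)
prepared(torch.randn(2, 8))        # calibration pass
quantized = tq.convert(prepared)

# On an ARM device the same flow would instead use:
#   torch.backends.quantized.engine = 'qnnpack'
#   model.qconfig = tq.get_default_qconfig('qnnpack')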
2
votes
0 answers
TensorFlow's QAT does not seem to perform per-channel quantization with AllValuesQuantizer
I have created two QAT models with the AllValuesQuantizer, one with per-tensor and one with per-channel quantization. When inspecting their respective QuantizeWrapper layers I note that both have scalar values for the variables kernel_min and…

LucasStromberg
- 43
- 6
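One way to check per-channel versus per-tensor behaviour is to inspect the shapes of the kernel_min/kernel_max variables on a QuantizeWrapper. A sketch using the default quantizer as a reference point (expected to be per-channel for Conv2D kernels, so vector-shaped min/max):

import tensorflow as tf
import tensorflow_model_optimization as tfmot

model = tf.keras.Sequential([tf.keras.layers.Conv2D(8, 3, input_shape=(16, 16, 3))])
qat = tfmot.quantization.keras.quantize_model(model)

# Per-channel weight quantization shows up as vector-valued
# kernel_min/kernel_max (one entry per output channel, shape (8,)
# here); scalar values indicate per-tensor quantization.
for v in qat.layers[-1].weights:
    if 'kernel_min' in v.name or 'kernel_max' in v.name:
        print(v.name, v.shape)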
2
votes
1 answer
How to get quantized weights from TensorFlow's quantization aware training with experimental quantization
I'm using TensorFlow's quantization-aware training API and wish to deploy a model with arbitrary bit-width. As only 8-bit quantization is supported for TFLite deployment, I will deploy with a custom inference algorithm, but I still need to access the…

LucasStromberg
- 43
- 6
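A rough sketch of reconstructing integer weights from a QuantizeWrapper outside the graph. This assumes the default 8-bit symmetric, narrow-range weight scheme and ignores the min/max nudging TF applies inside FakeQuant, so treat it as an approximation rather than the exact in-graph values:

import numpy as np
import tensorflow as tf
import tensorflow_model_optimization as tfmot

model = tf.keras.Sequential([tf.keras.layers.Dense(4, input_shape=(8,))])
qat = tfmot.quantization.keras.quantize_model(model)

wrapper = qat.layers[-1]                  # QuantizeWrapper around Dense
kernel = wrapper.layer.kernel.numpy()     # latent float weights
w_min = next(v for v in wrapper.weights if 'kernel_min' in v.name).numpy()
w_max = next(v for v in wrapper.weights if 'kernel_max' in v.name).numpy()

# Symmetric narrow-range grid is [-(2**(b-1) - 1), 2**(b-1) - 1];
# substitute your bit-width for num_bits.
num_bits = 8
scale = max(abs(w_min), abs(w_max)) / (2 ** (num_bits - 1) - 1)
q = np.round(kernel / scale).astype(np.int8)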
2
votes
1 answer
Quantized TFLite model gives better accuracy than TF model
I am developing an end-to-end training and quantization-aware training example. Using the CIFAR-10 dataset, I load a pretrained MobileNetV2 model and then use the code from the TensorFlow Guide to quantize my model. After the whole process finishes…

Florence
- 59
- 5
2
votes
2 answers
Quantization aware training in tensorflow 2.2.0 producing higher inference time
I'm working on quantization in transfer learning using MobileNetV2 on a personal dataset. There are 2 approaches that I have tried:
i.) Only post-training quantization: it works fine and produces an average inference time of 0.04 s over 60 images…

Aparajit Garg
- 122
- 2
- 2
- 12
2
votes
1 answer
TensorFlow cannot quantize the reshape function
I am going to train my model quantization-aware. However, when I use it, tensorflow_model_optimization cannot quantize the tf.reshape function and throws an error.
tensorflow version: '2.4.0-dev20200903'
python version: 3.6.9
The code:
import…

Ixtiyor Majidov
- 301
- 4
- 11
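A workaround often suggested for this class of error is to express the reshape as a Keras layer, which tfmot knows how to wrap, rather than a bare tf.reshape op. A sketch with illustrative shapes:

import tensorflow as tf
import tensorflow_model_optimization as tfmot

# tfmot can only wrap Keras layers; a bare tf.reshape call shows up
# as an op layer it does not know how to quantize. The Keras Reshape
# layer usually avoids that.
inputs = tf.keras.Input(shape=(28, 28))
x = tf.keras.layers.Reshape((28 * 28,))(inputs)   # instead of tf.reshape(x, ...)
outputs = tf.keras.layers.Dense(10)(x)
model = tf.keras.Model(inputs, outputs)

qat_model = tfmot.quantization.keras.quantize_model(model)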
2
votes
1 answer
Quantization not yet supported for op: 'DEQUANTIZE' for tensorflow 2.x
I am conducting QAT with Keras on a ResNet model and got this problem while converting to a full-integer TFLite model. I have tried the newest version, tf-nightly, but it does not solve the problem.
I use a quantization-annotated model for Batch…

dtlam26
- 1,410
- 11
- 19
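A sketch of a full-integer conversion setup that can avoid stray DEQUANTIZE ops, assuming a QAT-trained Keras model; the tiny qat_model and the input shape below are stand-ins for the question's ResNet:

import tensorflow as tf

qat_model = tf.keras.Sequential([tf.keras.layers.Dense(2, input_shape=(4,))])

# A representative dataset lets the converter quantize any tensors
# that QAT left in float, which is one way DEQUANTIZE ops arise.
def representative_data_gen():
    for _ in range(100):
        yield [tf.random.normal((1, 4))]

converter = tf.lite.TFLiteConverter.from_keras_model(qat_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data_gen
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
tflite_model = converter.convert()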
2
votes
1 answer
How to implement TF Lite inference in Python
For research purposes, I'm trying to understand how TF Lite does its inference. I'm interested only in the software logic.
I'm using TensorFlow 2.1 and TensorFlow Model Optimization 0.3.0.
As an example, I use a very simple fully connected…

Ohad Meir
- 714
- 8
- 18
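A minimal sketch of driving inference manually with the Python tf.lite.Interpreter, which also exposes the per-tensor quantization parameters the runtime applies; 'model.tflite' is a placeholder path:

import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path='model.tflite')
interpreter.allocate_tensors()

inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

# Feed a zero input of the right shape/dtype and run one inference.
x = np.zeros(inp['shape'], dtype=inp['dtype'])
interpreter.set_tensor(inp['index'], x)
interpreter.invoke()
y = interpreter.get_tensor(out['index'])

# (scale, zero_point) per tensor reveal the integer arithmetic used.
print(inp['quantization'], out['quantization'])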