I would like to compare the size of a Keras model and its TFLite quantized version. After training, I save the model in .h5 format and then load it as:
model = tf.keras.models.load_model('dir/model.h5')
converter = tf.lite.TFLiteConverter.from_keras_model_file('dir/model.h5')
converter.post_training_quantize = True
quantized_model = converter.convert()
My question: is there a standard functionality or method for estimating and comparing the sizes of the two models? I understand they are saved in different formats, which results in different sizes; what I would like to know is whether there is a good way for ML practitioners to measure the actual size of a model (say, in the context of edge deployment, where there is a tight constraint on the model's on-disk size).
I tried to use:
sys.getsizeof('dir/model.h5')  # returns ~100 -- the size of the path string, not of the file on disk
sys.getsizeof(quantized_model) # returns ~10,000 -- the size of the in-memory bytes object
but the result does not establish any meaningful comparison.
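The closest workaround I can think of is to write the converted flatbuffer to disk and compare file sizes with os.path.getsize, which reports actual bytes on disk rather than the size of a Python object. A sketch of what I mean (dummy bytes stand in for the real output of converter.convert(), and the paths are placeholders, so the snippet runs standalone):

```python
import os
import tempfile

# Placeholder: in practice this would be the bytes object
# returned by converter.convert().
quantized_model = b'\x00' * 2500

with tempfile.TemporaryDirectory() as d:
    tflite_path = os.path.join(d, 'model.tflite')

    # Write the flatbuffer to disk so it has a measurable file size.
    with open(tflite_path, 'wb') as f:
        f.write(quantized_model)

    # os.path.getsize reports the on-disk size in bytes -- unlike
    # sys.getsizeof, which measures the Python object holding the data.
    print(os.path.getsize(tflite_path))  # 2500
```

The same os.path.getsize call applied to the saved .h5 file would then give a like-for-like on-disk comparison, though I am not sure this is the standard practice I am asking about.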
Update: I am interested in the actual memory or storage footprint (or other practical measures I may not be aware of), not the number of parameters, which is preserved under quantization.