3

I would like to compare the size of a Keras model and its TFLite quantized version. After training, the model is saved in .h5 format and then loaded as:

model= tf.keras.models.load_model('dir/model.h5')

converter = tf.lite.TFLiteConverter.from_keras_model_file('dir/model.h5')

converter.post_training_quantize = True

quantized_model = converter.convert()

My question: Is there a standard functionality or method to estimate and compare the size of the models? I understand they may be saved in different formats resulting in different sizes, and I would like to know if there is a good way for ML practitioners to measure the actual size of the model (say, in the context of edge deployment where there is a tight constraint on the actual model size in the hard drive).

I tried to use:

sys.getsizeof('dir/model.h5') # returns ~100

sys.getsizeof(quantized_model) # returns ~10,000 -- yes, much larger, probably due to different formats

but the result does not establish any meaningful comparison.
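Presumably this is because sys.getsizeof reports the in-memory size of the Python object itself, not of whatever it refers to: the path string and the flatbuffer bytes are simply different objects. A quick illustration (the 10,000-byte buffer is a stand-in for the converter's output):

```python
import sys

path = 'dir/model.h5'
print(sys.getsizeof(path))   # size of the str object itself, not the file

data = b'\x00' * 10_000      # stand-in for the bytes returned by convert()
print(sys.getsizeof(data))   # bytes object: ~10,000 plus object overhead
print(len(data))             # the serialized size one actually wants
```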


Update: I am interested in the actual memory usage, or other practical measures I am not aware of, but not in the number of parameters, which is preserved under quantization.

user3160996
  • 31
  • 1
  • 5

2 Answers

3

When you try to compare two models, it depends on what you want to compare them on:

  1. Memory space occupied by the two models (a fair comparison requires the same level of compression on both, which is unlikely here).
  2. Number of trainable parameters (See this: How to count total number of trainable parameters in a tensorflow model?)
  3. Total number of parameters.

Comparison Based on File Size on Memory

import os
# Get file size in bytes for a given model
os.stat('model.h5').st_size

Comparison Based on Number of Parameters

In most cases one wants the total number of trainable parameters. However, if you are interested in both trainable and non-trainable parameters, you can get them with model.summary(). To see how that output is produced, look into the source code of the LayersModel class: the print_summary() method called from within model.summary() does the work.
source: https://www.tensorflow.org/js/guide/models_and_layers#model_summary


References

  1. How to check file size in python?
  2. https://www.tensorflow.org/js/guide/models_and_layers#model_summary
CypherX
  • 7,019
  • 3
  • 25
  • 37
  • Updated my question, the summary is not what I am looking for in this case. I'd appreciate it if you have other suggestions. And thank you nonetheless! – user3160996 Oct 22 '19 at 23:37
  • @user3160996 Updated solution. See section: **Comparison Based on File Size on Memory**. I hope this helps. – CypherX Oct 24 '19 at 05:39
1

You should convert the float model into a TFLite model as well, and then the comparison will be accurate.

import os
import tempfile

# Create float (unquantized) TFLite model.
float_converter = tf.lite.TFLiteConverter.from_keras_model(model)
float_tflite_model = float_converter.convert()

# Measure sizes of models: write each flatbuffer to a temporary file.
_, float_file = tempfile.mkstemp('.tflite')
_, quant_file = tempfile.mkstemp('.tflite')

# quantized_tflite_model is the bytes returned by the quantizing converter.
with open(quant_file, 'wb') as f:
  f.write(quantized_tflite_model)

with open(float_file, 'wb') as f:
  f.write(float_tflite_model)

print("Float model in Mb:", os.path.getsize(float_file) / float(2**20))
print("Quantized model in Mb:", os.path.getsize(quant_file) / float(2**20))

code excerpt from: Quantization aware training Keras example

theroguecode
  • 191
  • 2
  • 6