1

CNN architectures like DenseNet stress parameter efficiency, which usually results in fewer FLOPs. However, what I am struggling to understand is why this is important. DenseNet, in particular, has low inference speed. Isn't the purpose of reducing parameter count/FLOPs to reduce inference time? Is there another real-world reason, such as lower energy use, for these optimizations?

ddd
  • I’m voting to close this question because it is not about programming as defined in the [help] but about ML theory and/or methodology - please see the intro and NOTE in the `machine-learning` [tag info](https://stackoverflow.com/tags/machine-learning/info). – desertnaut Sep 20 '21 at 10:33

1 Answer

1

There is a difference between overall inference time and per-parameter/FLOPs training efficiency. Having fewer parameters/FLOPs in training does not guarantee faster inference, because overall inference time depends on the architecture and on how predictions are computed.
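As a rough illustration (a minimal sketch, not a benchmark), the hypothetical helper `conv_cost` below counts parameters, multiply-accumulates (MACs), and activation elements for a single convolution. It assumes stride 1, same padding, and no bias. The two layer shapes are made up for illustration and are not taken from DenseNet.

```python
def conv_cost(h, w, c_in, c_out, k):
    """Rough cost model for one conv layer (hypothetical helper).

    Assumes stride 1, 'same' padding, and no bias, so the output
    feature map is also h x w.
    """
    params = k * k * c_in * c_out          # learnable weights
    macs = params * h * w                  # one MAC per weight per output pixel
    activations = h * w * (c_in + c_out)   # elements read plus elements written
    return params, macs, activations


# Layer A: few channels on a large feature map (illustrative shape).
layer_a = conv_cost(h=112, w=112, c_in=16, c_out=16, k=1)

# Layer B: many channels on a small feature map (illustrative shape).
layer_b = conv_cost(h=7, w=7, c_in=256, c_out=256, k=1)

for name, (params, macs, acts) in (("A", layer_a), ("B", layer_b)):
    print(f"layer {name}: params={params:,} MACs={macs:,} activations={acts:,}")
```

Both layers need the same number of MACs, yet layer A has far fewer parameters and touches far more activation elements. Parameter count, FLOPs, and memory traffic are three separate quantities, so a model tuned for the first two can still be slow if the third dominates on the target hardware. To see which one matters in a given case, measure latency on the actual device.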

patagonicus