I am new to Computer Vision, so I don't understand how to choose my input size. I know the input size matters at some point so that data is not lost, but I also want to use as much data as I can without slowing training down too much, so it would be easier to have a formula that calculates how much GPU memory a CNN model uses for a given input X.
I wonder how to calculate the necessary GPU memory for a CNN model that takes an input X such as (1,3,600,600), (16,3,400,400), or (16,1,800,800).
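For the input tensor alone, this is the kind of formula I am hoping exists for the whole model. A sketch of what I mean, assuming float32 (4 bytes per element) and counting only the input itself, not the weights or activations:

```python
import math

def tensor_bytes(shape, bytes_per_elem=4):
    """Memory of one tensor: number of elements times bytes per element (float32 assumed)."""
    return math.prod(shape) * bytes_per_elem

# The input shapes I am asking about, in (batch, channels, height, width) order.
for shape in [(1, 3, 600, 600), (16, 3, 400, 400), (16, 1, 800, 800)]:
    print(shape, tensor_bytes(shape) / 1e6, "MB")
```

Is the total GPU usage of the model just this plus something I can compute the same way?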
I came across this question, where the total parameter count is 685,060. Can I say it uses 685060 * 4 (the size of a float in bytes) = 2,740,240 bytes, which is about 2.74 MB? So does that mean the model in that question uses 2.74 MB of GPU memory with input (1,1,32,32)?
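To make my arithmetic concrete, this is the calculation I am doing, assuming every parameter is stored as a 32-bit float (4 bytes):

```python
def param_memory_bytes(num_params, bytes_per_param=4):
    """Memory occupied by the model's weights alone, assuming float32 parameters."""
    return num_params * bytes_per_param

total_params = 685060  # parameter count from the linked question
mem_bytes = param_memory_bytes(total_params)
print(mem_bytes, "bytes =", mem_bytes / 1e6, "MB")  # 2740240 bytes = 2.74024 MB
```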
If the batch size were 16, i.e. input (16,1,32,32), does that mean I must multiply 2.74 MB by 16 to find the total usage?