That is not a correct interpretation of utilization. 10% utilization means, roughly speaking, that 10% of the time a GPU kernel is running, and 90% of the time no GPU kernel is running. It tells you nothing about what that kernel is doing or how many resources it is using. The answer given on superuser is wrong. The correct description is here. As indicated there, it is possible to observe 100% utilization from a kernel that uses only one "core" (i.e. a kernel that is running only one thread).
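To make that concrete, here is a minimal sketch (names and the spin duration are my own choices, not from the question) of a kernel launched with a single thread. While it runs, nvidia-smi will report ~100% utilization even though nearly the entire GPU is idle:

```cuda
#include <cstdio>

// A deliberately tiny kernel: one block, one thread.
// It occupies a single "core" by busy-waiting on the GPU clock.
__global__ void spin(long long cycles) {
    long long start = clock64();
    while (clock64() - start < cycles) { }  // busy-wait
}

int main() {
    // <<<1, 1>>> launches exactly one thread. The rest of the GPU is idle,
    // yet "nvidia-smi --query-gpu=utilization.gpu --format=csv" will show
    // ~100% while this runs, because *a* kernel is resident the whole time.
    spin<<<1, 1>>>(10000000000LL);  // spin for several seconds (clock-dependent)
    cudaDeviceSynchronize();
    printf("done\n");
    return 0;
}
```

Utilization as reported by nvidia-smi is a duty-cycle measure ("was any kernel running during the sampling window?"), not a measure of how busy the GPU's execution resources are.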
Regarding your question, you should not assume there will be no change in performance if you switch from a GPU with 4608 cores to a GPU with 3000 cores. First, that is not enough information to judge performance (clock speed and other factors matter). Second, if, for example, the two GPUs belonged to the same architectural generation, the one with 3000 cores is likely to be somewhat slower than the one with 4608 cores, because within a given generation, other characteristics such as clock speed and memory bandwidth are also likely to be lower on the GPU with 3000 cores.
In short, I wouldn't assume the inference performance would be the same. It depends on factors beyond what you have indicated here. It could be faster, or it could be slower, depending on the actual GPUs being compared.
With respect to CUDA GPUs that are currently available, almost anything is likely to be somewhat slower at inference than a Titan RTX. The difference may be small, perhaps negligible, or larger, depending on the specific GPU.