I still can't fully understand how CUDA's compute capability works when compiling source code.
Assume the binary files are compiled with nvcc 10.1 using flags ranging from (code=sm_30, compute=30) to (code=sm_62, compute=62).
What happens when a Turing device (e.g., RTX 2080 Ti) runs these binary files?
Even though the binary files do not include code=sm_75, compute=75 for the Turing architecture, why do they run correctly on a Turing device?
Does the CUDA driver on the Turing device JIT-compile the PTX code for compute=62 at runtime (because compute=75 is not included) and generate Turing SASS (code=sm_75) instead of using the sm_62 SASS?
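
To make the question concrete, here is a minimal Python sketch of the selection rule as I understand it. It is a toy model, not the driver's actual implementation. It assumes that SASS is only usable on a device with the same major compute capability and an equal or higher minor revision, that PTX can be JIT-compiled for any device with an equal or higher compute capability, and that the newest usable PTX is chosen. The function name `select_code` and the target list are hypothetical, and the list is only an illustrative subset of the range from 30 to 62.

```python
def select_code(device_cc, sass_targets, ptx_targets):
    """Toy model of which embedded code a device would use.

    Returns ("sass", cc), ("ptx-jit", cc), or ("none", None).
    Compute capabilities are (major, minor) tuples, e.g. (7, 5) for sm_75.
    """
    dev_major, dev_minor = device_cc

    # Assumption: SASS is binary-compatible only within the same major
    # version, and only if the device minor revision is >= the build target.
    usable_sass = [t for t in sass_targets
                   if t[0] == dev_major and t[1] <= dev_minor]
    if usable_sass:
        return ("sass", max(usable_sass))

    # Assumption: PTX is forward-compatible, so any PTX target <= the device
    # compute capability can be JIT-compiled; the newest one is picked.
    usable_ptx = [t for t in ptx_targets if t <= device_cc]
    if usable_ptx:
        return ("ptx-jit", max(usable_ptx))

    return ("none", None)


# Illustrative subset of targets between sm_30 and sm_62 (hypothetical build).
targets = [(3, 0), (3, 5), (5, 0), (5, 2), (6, 0), (6, 1), (6, 2)]
turing = (7, 5)

print(select_code(turing, targets, targets))  # ('ptx-jit', (6, 2))
```

Under these assumptions, the Turing device finds no usable SASS and falls back to JIT-compiling the compute=62 PTX. If the binary contained only SASS and no PTX, the same model would return `("none", None)`, which is the case I would expect to fail at runtime.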