Occupancy in CUDA is defined as:
occupancy = active_warps / maximum_active_warps
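To make the ratio concrete, here is a minimal host-side sketch of how I read the formula. The function names and the `resident_blocks` and `max_warps_per_sm` parameters are my own hypothetical inputs, not part of any CUDA API. The per-SM warp limit depends on the GPU, so the value 64 used below is only an assumed figure.

```cpp
#include <cassert>

// Warp size is 32 threads on current NVIDIA GPUs.
constexpr int kWarpSize = 32;

// Number of warps one block occupies; a partially filled warp
// still takes a whole warp slot, hence the rounding up.
constexpr int warps_per_block(int threads_per_block) {
    return (threads_per_block + kWarpSize - 1) / kWarpSize;
}

// occupancy = active_warps / maximum_active_warps
//   resident_blocks  : blocks resident on one SM at the same time
//                      (hypothetical input, supplied by the caller)
//   max_warps_per_sm : hardware limit of the target GPU
//                      (assumption; look it up for your device)
double occupancy(int resident_blocks, int threads_per_block,
                 int max_warps_per_sm) {
    assert(max_warps_per_sm > 0);
    const int active_warps =
        resident_blocks * warps_per_block(threads_per_block);
    // The SM cannot hold more warps than its hardware limit.
    assert(active_warps <= max_warps_per_sm);
    return static_cast<double>(active_warps) / max_warps_per_sm;
}
```

Under these assumptions, `occupancy(8, 256, 64)` evaluates to 1.0 (8 blocks of 8 warps each fill all 64 warp slots), while `occupancy(4, 256, 64)` evaluates to 0.5.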
What is the difference between a resident CUDA warp and an active one?
From my research on the web, it seems that a block is resident on an SM (i.e. its registers and shared memory are allocated there) for the entire duration of its execution. How does that differ from being "active"?
If I have a kernel that uses very few registers and little shared memory, does that mean I can have maximum_active_warps resident blocks and achieve 100% occupancy, since occupancy just depends on the amount of registers and shared memory used?