CUDA active warps vs resident warps

Question

Occupancy in CUDA is defined as

occupancy = active_warps / maximum_active_warps

What is the difference between a resident CUDA warp and an active one?

From my research on the web it seems that a block is resident (i.e. allocated along with its register/shared memory files) on an SM for the entire duration of its execution. Is there a difference with "being active"?

If I have a kernel which uses very few registers and little shared memory, does it mean that I can have maximum_active_warps resident blocks and achieve 100% occupancy, since occupancy just depends on the amount of register/shared memory used?

Might be related to [this question](http://stackoverflow.com/questions/41607768/questions-of-resident-warps-of-cuda/41608401#41608401). — Taro, Jan 30 '17 at 08:34

talonmies · Accepted Answer · 2017-01-29T14:08:17.523

> What is the difference between a resident CUDA warp and an active one?

In this context, presumably nothing.

> From my research on the web it seems that a block is resident (i.e. allocated along with its register/shared memory files) on an SM for the entire duration of its execution. Is there a difference with "being active"?

Now you have switched from asking about warps to asking about blocks. But again, in this context, no: you can consider them to be the same.

> If I have a kernel which uses very few registers and little shared memory, does it mean that I can have maximum_active_warps resident blocks and achieve 100% occupancy, since occupancy just depends on the amount of register/shared memory used?

No, because a warp and a block are not the same thing. As you yourself have quoted, occupancy is defined in terms of warps, not blocks. The maximum number of resident warps per SM is fixed at 48 or 64, depending on your hardware. The maximum number of resident blocks per SM is fixed at 8, 16 or 32, again depending on hardware. These are two independent limits, and either one can cap the occupancy a given kernel achieves.
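To make the interaction of the two limits concrete, here is a minimal host-side sketch in C++. It is a simplified model, not how the hardware or the CUDA runtime computes occupancy: it considers only the warp limit and the block limit and deliberately ignores registers and shared memory. The names `SmLimits`, `warps_per_block`, `resident_blocks` and `occupancy` are hypothetical, and the limit values you pass in are assumptions you would take from your own device's specification.

```cpp
#include <algorithm>
#include <cassert>

// Simplified model: only the per-SM warp and block limits are considered.
// Register and shared-memory limits are deliberately ignored.
struct SmLimits {
    int max_warps;   // maximum resident warps per SM (e.g. 48 or 64)
    int max_blocks;  // maximum resident blocks per SM (e.g. 8, 16 or 32)
};

constexpr int kWarpSize = 32;

// Warps occupied by one block; a partially filled warp still counts as a warp.
int warps_per_block(int threads_per_block) {
    assert(threads_per_block > 0);
    return (threads_per_block + kWarpSize - 1) / kWarpSize;
}

// Blocks that fit on one SM when both limits are applied.
int resident_blocks(int threads_per_block, SmLimits sm) {
    int by_warps = sm.max_warps / warps_per_block(threads_per_block);
    return std::min(by_warps, sm.max_blocks);
}

// occupancy = active_warps / maximum_active_warps, under this model only.
double occupancy(int threads_per_block, SmLimits sm) {
    int active_warps =
        resident_blocks(threads_per_block, sm) * warps_per_block(threads_per_block);
    return static_cast<double>(active_warps) / sm.max_warps;
}
```

Under this model, with limits of 64 warps and 16 blocks, `occupancy(32, SmLimits{64, 16})` is 0.25: the block limit is reached after 16 one-warp blocks, long before the warp limit. With 256 threads per block, `occupancy(256, SmLimits{64, 16})` is 1.0, because 8 blocks of 8 warps fill all 64 warp slots. For a real kernel, the runtime call `cudaOccupancyMaxActiveBlocksPerMultiprocessor` also takes register and shared-memory usage into account.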

Thanks, just one more question if I may: does `active` warp mean a warp currently being executed by one core or does it just mean `resident`, i.e. with resources for the block it is contained in allocated? — Dean, Jan 30 '17 at 10:44
A warp is never executed by one core. Each thread within a warp logically executes on a single core. The difference between resident and active is pure semantics -- you haven't actually quoted or linked to a source where "resident threads" is used, so I can't tell you what the author of a text I haven't read intended. — talonmies, Jan 30 '17 at 11:44
