Short answer: NO.
Assigning thread blocks to Streaming Multiprocessors is the scheduler's job, not the programmer's, so you have no guarantee that the scheduler will place blocks 0 and 1 on different Streaming Multiprocessors. The Stack Overflow thread How CUDA Blocks/Warps/Threads map onto CUDA Cores? should help you understand this. The whitepaper NVIDIA's Next Generation CUDA Compute Architecture: Fermi, although specific to Fermi, will also give you deeper insight.
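If you want to observe the actual mapping, each block can read the `%smid` special register (via inline PTX) and report which SM it was scheduled on. This is a minimal sketch, assuming a CUDA-capable device; run it a few times and you may see the block-to-SM assignment change, which is exactly the point: the mapping is the scheduler's choice, not yours.

```cuda
#include <cstdio>

// Thread 0 of each block reports the SM the block was scheduled on.
// %smid is a PTX special register; the value is informational only
// and the observed mapping may differ between launches.
__global__ void reportSM()
{
    if (threadIdx.x == 0) {
        unsigned int smid;
        asm("mov.u32 %0, %%smid;" : "=r"(smid));
        printf("block %d runs on SM %u\n", blockIdx.x, smid);
    }
}

int main()
{
    reportSM<<<4, 32>>>();   // 4 blocks of 32 threads each
    cudaDeviceSynchronize();
    return 0;
}
```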
Be warned also that, to get the best performance out of a GPU, all the threads of a warp should, roughly speaking, execute the same instruction at the same time. To achieve what you describe in your post you would need conditional shared memory allocation, which makes me think you will end up with other conditional statements as well. The resulting branch divergence may hurt performance.
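To illustrate why those conditionals matter, here is a hypothetical kernel (a sketch, not your code) where threads of the same warp take different branches. The warp executes both paths in turn, masking off the inactive lanes each time, so the divergent section runs at roughly half throughput:

```cuda
// Hypothetical example of warp divergence: even and odd lanes of
// the same warp take different branches, so the hardware serializes
// the two paths instead of executing them simultaneously.
__global__ void divergent(float *out)
{
    int i = threadIdx.x;
    if (i % 2 == 0)
        out[i] = i * 2.0f;   // half of the warp runs this path first...
    else
        out[i] = i * 0.5f;   // ...then the other half runs this one
}
```

If instead the condition were uniform per warp (e.g. branching on `blockIdx.x`, or on `threadIdx.x / 32`), whole warps would agree on the branch and no serialization would occur.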