I am a beginner in CUDA programming, so apologies if this is a simple question.
I have read some documentation and examples. To launch a kernel function, I should write something like
kernelfun<<<number of blocks, number of threads>>>(args).
So there is no number for the grid. Do we need to set the number of grids we plan to use?
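
To make the question concrete, here is a minimal sketch of the kind of launch I have in mind. The kernel body, the array size of 1000, the block size of 256, and the helper `blocksFor` are all placeholders I made up for illustration; only the `<<<number of blocks, number of threads>>>` form comes from the examples I read.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: adds 1.0f to every element of data.
__global__ void kernelfun(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n)                                      // guard the last, partly filled block
        data[i] += 1.0f;
}

// Hypothetical helper: number of blocks needed to cover n elements (rounds up).
int blocksFor(int n, int threadsPerBlock)
{
    return (n + threadsPerBlock - 1) / threadsPerBlock;
}

int main()
{
    const int n = 1000;               // assumed problem size
    const int threadsPerBlock = 256;  // assumed block size
    const int numBlocks = blocksFor(n, threadsPerBlock);

    float *d_data = nullptr;
    if (cudaMalloc(&d_data, n * sizeof(float)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }
    cudaMemset(d_data, 0, n * sizeof(float));

    // First launch argument: number of blocks. Second: threads in each block.
    kernelfun<<<numBlocks, threadsPerBlock>>>(d_data, n);

    cudaError_t err = cudaGetLastError();            // catches launch configuration errors
    if (err == cudaSuccess)
        err = cudaDeviceSynchronize();               // waits for the kernel to finish
    if (err != cudaSuccess)
        fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));

    printf("launched %d blocks of %d threads\n", numBlocks, threadsPerBlock);
    cudaFree(d_data);
    return err == cudaSuccess ? 0 : 1;
}
```

In this sketch I only ever pass those two numbers at launch, which is why I am unsure whether a separate grid count has to be set anywhere.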
Given my GPU, how should I choose the number of blocks and the number of threads?
I saw that the maximum number of threads per block is 512. Does that mean I should set the number of threads to 512 to make full use of the GPU?
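
For reference, this is roughly how I believe the limits of a GPU can be read at runtime. It is a minimal sketch that assumes the GPU is device 0; the helper `withinLimit` is a hypothetical name of my own, while `cudaGetDeviceProperties` and the `cudaDeviceProp` fields are from the CUDA runtime API.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: does a requested block size respect the device limit?
bool withinLimit(int threadsPerBlock, const cudaDeviceProp &prop)
{
    return threadsPerBlock > 0 && threadsPerBlock <= prop.maxThreadsPerBlock;
}

int main()
{
    cudaDeviceProp prop;
    cudaError_t err = cudaGetDeviceProperties(&prop, 0);  // device 0 is an assumption
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceProperties failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    printf("device:              %s\n", prop.name);
    printf("maxThreadsPerBlock:  %d\n", prop.maxThreadsPerBlock);
    printf("maxThreadsDim:       %d x %d x %d\n",
           prop.maxThreadsDim[0], prop.maxThreadsDim[1], prop.maxThreadsDim[2]);
    printf("maxGridSize:         %d x %d x %d\n",
           prop.maxGridSize[0], prop.maxGridSize[1], prop.maxGridSize[2]);
    printf("multiProcessorCount: %d\n", prop.multiProcessorCount);

    // 512 is the block size I am asking about.
    printf("512 threads per block allowed: %s\n", withinLimit(512, prop) ? "yes" : "no");
    return 0;
}
```

This only tells me which block sizes are allowed, not which one gives the best performance, and that is the part I am unsure about.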
My other question: should I calculate how much memory my project uses when I set the numbers of blocks and threads? Or is this arranged automatically, so that I do not need to worry about the memory my project uses?
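
To show what I mean by calculating the memory myself, here is a minimal sketch of checking free device memory before allocating. The array length and the helper `bytesNeeded` are assumptions of mine for illustration; `cudaMemGetInfo` and `cudaMalloc` are from the CUDA runtime API.

```cuda
#include <cstdio>
#include <cstddef>
#include <cuda_runtime.h>

// Hypothetical helper: bytes required for an array of n floats.
size_t bytesNeeded(size_t n)
{
    return n * sizeof(float);
}

int main()
{
    size_t freeBytes = 0, totalBytes = 0;
    cudaError_t err = cudaMemGetInfo(&freeBytes, &totalBytes);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMemGetInfo failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    const size_t n = 1 << 20;              // assumed array length
    const size_t needed = bytesNeeded(n);
    printf("free: %zu bytes, total: %zu bytes, needed: %zu bytes\n",
           freeBytes, totalBytes, needed);

    if (needed > freeBytes) {
        fprintf(stderr, "not enough free device memory\n");
        return 1;
    }

    float *d_data = nullptr;
    err = cudaMalloc(&d_data, needed);     // still check: allocation can fail anyway
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    cudaFree(d_data);
    return 0;
}
```

What I do not know is whether a check like this is expected, or whether it has anything to do with the block and thread counts I choose.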