I'm writing a simulation in MATLAB that uses CUDA acceleration.
Suppose we have vectors x and y, a matrix A, and scalar variables dt, dx, a, b, c.
I found that putting x, y, and A into gpuArray() before running the iteration and the built-in functions speeds up the iteration significantly.
However, when I also put the scalar variables dt, dx, a, b, c into gpuArray(), the program slowed down significantly: the run time increased from 7 s to 11 s, i.e. by more than 50%.
Why is it not a good idea to put all the variables into gpuArray()?
(A short note: those scalars are only ever multiplied with x, y, and A, and are never used on their own during the iteration.)
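
To make the comparison concrete, here is a minimal sketch of the two variants. The array size, the coefficient values, and the update formula in `step` are made up for illustration and are not the actual simulation; only the variable names x, y, A, dt, dx, a, b, c come from the description above. It assumes the Parallel Computing Toolbox and a supported GPU.

```matlab
% Hypothetical setup: sizes, values and the update rule are illustrative only.
n = 2000;
nSteps = 200;
A  = gpuArray(rand(n) / n);   % matrix on the GPU (scaled to keep values bounded)
x0 = gpuArray(rand(n, 1));    % initial vector on the GPU
y  = gpuArray(rand(n, 1));    % second vector on the GPU
dt = 1e-3; dx = 0.1; a = 1; b = 2; c = 3;   % scalars as ordinary host doubles

% One update step; the scalars only ever multiply the arrays.
step = @(A, x, y, dt, dx, a, b, c) x + (a*dt/dx^2) * (A*x) + (b*c*dt) * y;

dev = gpuDevice;

% Variant 1: arrays on the GPU, scalars left on the host.
x = x0;
tic;
for k = 1:nSteps
    x = step(A, x, y, dt, dx, a, b, c);
end
wait(dev);                    % GPU work is asynchronous; finish before stopping the timer
tHost = toc;
xHost = gather(x);

% Variant 2: the scalars are wrapped in gpuArray as well.
dtg = gpuArray(dt); dxg = gpuArray(dx);
ag = gpuArray(a); bg = gpuArray(b); cg = gpuArray(c);
x = x0;
tic;
for k = 1:nSteps
    x = step(A, x, y, dtg, dxg, ag, bg, cg);
end
wait(dev);
tGpu = toc;
xGpu = gather(x);

fprintf('host scalars: %.3f s, gpuArray scalars: %.3f s\n', tHost, tGpu);
```

Both loops compute the same values; the only difference is whether the scalar coefficients are host doubles or gpuArray objects, which is the difference the timings above refer to.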