Are GPU shaders Turing complete

Question

I understand that complete GPUs are behemoths of computing - including every step of calculation, and memory. So obviously a GPU can compute whatever we want - it's Turing complete.

My question is in regard to a single shader on various GPUs ("Stream Processor"/"CUDA Core"):
Is it Turing complete?
Can I (in theory) compute an arbitrary function over arbitrary inputs by using a single shader?
I'm trying to understand at what "scale" of computation shaders live.

If a language has an if and a goto it is turing complete. Most likely, modern generations of shader support are all trivially turing complete. — usr, Jul 05 '14 at 11:39

score 15 · Accepted Answer · edited Jun 20 '20 at 09:12

Did You mean shader as a program used to compute shading?

On wiki talk I found:

(...)Shader models 1.x and 2.0 are indeed not Turing complete, because they lack a generalised iteration capability (they do have some limited looping constructs, but this is effectively unrolled at compile time, so the number of iterations must be constant).

Shader model 3.0, which is used in the latest PC hardware and on Xbox 360, has fully general looping abilities and is Turing complete in the theoretical sense. This rather nicely highlights the difference between theory and practice, though! When people claim a device is Turing complete, what they actually mean is "if this had infinite time and infinite storage, it would be Turing complete". Shader model 3.0 is still extremely limited in register space and program instruction count, so it fails rather badly when put to any real world test.

Note that even shader 1.x can become Turing complete if you allow multiple passes of rendering operations. For instance it is trivial to implement the Game of Life using repeated render-to-texture operations. In this case the input and output textures provide a large amount of storage space, and the repeated render calls fills in for the missing iteration constructs. This is cheating, though, because it is depending on the CPU to issue the render calls!

as an example of non-turing-complete languages: Wikipedia Page on non-turing-complete Shaders

Generally it depends on shader language (and Your Turing-complete requirements), but I think that most recent shader languages can be called Turing complete (if we ignore any limitations of finite memory) bacause they can loop and read/write variables.

EDIT:

If I misunderstood Your question and You mean shader as shader processing unit (like Cuda core) then I think that single core should not be considered in category of Turing complete or not complete. GPU is not only built up on cores. Answering Your question you can program GPU with any number of cuda cores to "compute an arbitrary function over arbitrary inputs".

Yes I was asking about a single execution unit in a modern GPU. If I give it all the RAM that I want (say 2G), can a single compiled program for that shader be able to compute an arbitrary function (within the bound of mem requirements) ? — Trevor, Jul 07 '14 at 08:05
GPU using a single core gives You the same 'Turing capabilities' as manycore GPU. GPU is not only built up on cores. Cuda Core is only a part of straming multiprocessor unit and runnig GPU needs more recources than that. To execute Your gpu program You need computer with working GPU, not just part of it. I think that You should consider language in wich Your program is written as a Turing complete or not. — Kamil Czerski, Jul 07 '14 at 08:26

Are GPU shaders Turing complete

1 Answers1

Linked