216

This came up at the office today. I have no plans of doing such a thing, but theoretically could you write a compiler in SQL? At first glance it appears to me to be turing complete, though extremely cumbersome for many classes of problems.

If it is not turing complete, what would it require to become so?

Note: I have no desire to do anything like write a compiler in SQL, I know it would be a silly thing to do, so if we can avoid that discussion I would appreciate it.

Matthew Vines
  • 27,253
  • 7
  • 76
  • 97

8 Answers8

259

It turns out that SQL can be Turing Complete even without a true 'scripting' extension such as PL/SQL or PSM (which are designed to be true programming languages, so that's kinda cheating).

In this set of slides Andrew Gierth proves that with CTE and Windowing SQL is Turing Complete, by constructing a cyclic tag system, which has been proved to be Turing Complete. The CTE feature is the important part however -- it allows you to create named sub-expressions that can refer to themselves, and thereby recursively solve problems.

The interesting thing to note is that CTE was not really added to turn SQL into a programming language -- just to turn a declarative querying language into a more powerful declarative querying language. Sort of like in C++, whose templates turned out to be Turing complete even though they weren't intended to create a meta programming language.

Oh, the Mandelbrot set in SQL example is very impressive, as well :)

Qantas 94 Heavy
  • 15,750
  • 31
  • 68
  • 83
Jan de Vos
  • 3,778
  • 1
  • 20
  • 16
  • 2
    Oracle SQL is also turing complete, although in a rather sick way: http://blog.schauderhaft.de/2009/06/18/building-a-turing-engine-in-oracle-sql-using-the-model-clause/ – Jens Schauder Aug 19 '14 at 05:38
  • 3
    >It turns out that SQL Shouldn't it say: It turns out that SQL:1999? Just saying this because CTEs were added in version 99 and way too many people associate standard sql with Sql 92. – Ernesto Nov 13 '14 at 14:22
  • 3
    @JensSchauder that can be generalised to "Oracle $technology is $some_good_feature, although in a rather sick way" – Rob Grant Oct 04 '16 at 12:49
  • 3
    It's been 9 years but this might be interesting https://beta.observablehq.com/@pallada-92/sql-3d-engine – Loupax Jan 16 '19 at 07:26
45

A given programming language is said to be Turing-complete if it can be shown that it is computationally equivalent to a Turing machine.

The TSQL is Turing Complete because we can make a BrainFuck interpreter in TSQL.

BrainFuck interpreter in SQL - GitHub

The provided code works in-memory and doesn't modify a database.

-- Brain Fuck interpreter in SQL

DECLARE @Code  VARCHAR(MAX) = ', [>,] < [.<]'
DECLARE @Input VARCHAR(MAX) = '!dlroW olleH';

-- Creates a "BrainFuck" DataBase.
-- CREATE DATABASE BrainFuck;

-- Creates the Source code table
DECLARE @CodeTable TABLE (
    [Id]      INT IDENTITY(1,1) PRIMARY KEY NOT NULL,
    [Command] CHAR(1) NOT NULL
);

-- Populate the source code into CodeTable
DECLARE @CodeLen INT = LEN(@Code);
DECLARE @CodePos INT = 0;
DECLARE @CodeChar CHAR(1);

WHILE @CodePos < @CodeLen
BEGIN
    SET @CodePos  = @CodePos + 1;
    SET @CodeChar = SUBSTRING(@Code, @CodePos, 1);
    IF @CodeChar IN ('+', '-', '>', '<', ',', '.', '[', ']')
        INSERT INTO @CodeTable ([Command]) VALUES (@CodeChar)
END

-- Creates the Input table
DECLARE @InputTable TABLE (
    [Id]   INT IDENTITY(1,1) PRIMARY KEY NOT NULL,
    [Char] CHAR(1) NOT NULL
);

-- Populate the input text into InputTable
DECLARE @InputLen INT = LEN(@Input);
DECLARE @InputPos INT = 0;

WHILE @InputPos < @InputLen
BEGIN
    SET @InputPos = @InputPos + 1;
    INSERT INTO @InputTable ([Char])
    VALUES (SUBSTRING(@Input, @InputPos, 1))
END

-- Creates the Output table
DECLARE @OutputTable TABLE (
    [Id]   INT IDENTITY(1,1) PRIMARY KEY NOT NULL,
    [Char] CHAR(1) NOT NULL
);

-- Creates the Buffer table
DECLARE @BufferTable TABLE (
    [Id]     INT IDENTITY(1,1) PRIMARY KEY NOT NULL,
    [Memory] INT DEFAULT 0  NOT NULL
);
INSERT INTO @BufferTable ([Memory])
VALUES (0);

-- Initialization of temporary variables 
DECLARE @CodeLength INT = (SELECT COUNT(*) FROM @CodeTable);
DECLARE @CodeIndex  INT = 0;
DECLARE @Pointer    INT = 1;
DECLARE @InputIndex INT = 0;
DECLARE @Command    CHAR(1);
DECLARE @Depth      INT;

-- Main calculation cycle
WHILE @CodeIndex < @CodeLength
BEGIN
    -- Read the next command.
    SET @CodeIndex = @CodeIndex + 1;
    SET @Command = (SELECT [Command] FROM @CodeTable WHERE [Id] = @CodeIndex);

    -- Increment the pointer.
    IF @Command = '>'
    BEGIN
        SET @Pointer = @Pointer + 1;
        IF (SELECT [Id] FROM @BufferTable WHERE [Id] = @Pointer) IS NULL
            INSERT INTO @BufferTable ([Memory]) VALUES (0);
    END

    -- Decrement the pointer.
    ELSE IF @Command = '<'
        SET @Pointer = @Pointer - 1;

    -- Increment the byte at the pointer.
    ELSE IF @Command = '+'
        UPDATE @BufferTable SET [Memory] = [Memory] + 1 WHERE [Id] = @Pointer;

    -- Decrement the byte at the pointer.
    ELSE IF @Command = '-'
        UPDATE @BufferTable SET [Memory] = [Memory] - 1 WHERE [Id] = @Pointer;

    -- Output the byte at the pointer.
    ELSE IF @Command = '.'
        INSERT INTO @OutputTable ([Char]) (SELECT CHAR([Memory]) FROM @BufferTable WHERE [Id] = @Pointer);

    -- Input a byte and store it in the byte at the pointer.
    ELSE IF @Command = ','
    BEGIN
        SET @InputIndex = @InputIndex + 1;
        UPDATE @BufferTable SET [Memory] = COALESCE((SELECT ASCII([Char]) FROM @InputTable WHERE [Id] = @InputIndex), 0) WHERE [Id] = @Pointer;
    END

    -- Jump forward past the matching ] if the byte at the pointer is zero.
    ELSE IF @Command = '[' AND COALESCE((SELECT [Memory] FROM @BufferTable WHERE [Id] = @Pointer), 0) = 0
    BEGIN
        SET @Depth = 1;
        WHILE @Depth > 0
        BEGIN
            SET @CodeIndex = @CodeIndex + 1;
            SET @Command = (SELECT [Command] FROM @CodeTable WHERE [Id] = @CodeIndex);
            IF @Command = '[' SET @Depth = @Depth + 1;
            ELSE IF @Command = ']' SET @Depth = @Depth - 1;
        END
    END

    -- Jump backwards to the matching [ unless the byte at the pointer is zero.
    ELSE IF @Command = ']' AND COALESCE((SELECT [Memory] FROM @BufferTable WHERE [Id] = @Pointer), 0) != 0
    BEGIN
        SET @Depth = 1;
        WHILE @Depth > 0
        BEGIN
            SET @CodeIndex = @CodeIndex - 1;
            SET @Command = (SELECT [Command] FROM @CodeTable WHERE [Id] = @CodeIndex);
            IF @Command = ']' SET @Depth = @Depth + 1;
            ELSE IF @Command = '[' SET @Depth = @Depth - 1;
        END
    END
END;

-- Collects and prints the output
DECLARE @Output VARCHAR(MAX);
SELECT @Output = COALESCE(@Output, '') + [Char]
FROM @OutputTable;

PRINT @Output;
Go

SQL BrainFuck Interpretter

Miroslav Popov
  • 3,294
  • 4
  • 32
  • 55
  • That's transact SQL which is Turing complete, ANSI SQL I understood is not TC. But good effort! – alimack Jun 14 '16 at 16:43
  • 1
    SQL is now standardized by ISO/IEC. And since its 1999 version, it is now Turing Complete, due to a new feature named Common Table Expressions (or CTE), that adds hierarchical and recursive queries to SQL. – Hilton Fernandes May 07 '21 at 23:47
32

https://web.archive.org/web/20110807062050/http://channel9.msdn.com/forums/TechOff/431432-SQL-Turing-Completeness-question

Is a discussion of this topic. A quote:

SQL as such (i.e. the SQL92 standard) is not turing complete. However, many of the languages derived from SQL, such as Oracle's PL/SQL and SQL Server's T-SQL and others are turing complete.

PL/SQL and T-SQL certainly qualify as programming languages, whether SQL92 itself qualifies is open for debate. Some people claim that any piece of code that tells a computer what to do qualifies as a programming language; by that definition SQL92 is one, but so is e.g. HTML. The definition is rather vague, and it's imo a pointless thing to argue about.

Gordon
  • 19,811
  • 4
  • 36
  • 74
Aiden Bell
  • 28,212
  • 4
  • 75
  • 119
  • 1
    This answer needs updating. The SQL92 version is from 1992. Since then, recursive common table expressions have made standard SQL Turing-complete. – Andy Carlson Apr 01 '22 at 03:26
20

Strictly speaking, SQL is now a turing complete language because the latest SQL standard includes the "Persistent Stored Modules" (PSMs). In short, a PSM is the standard version of the PL/SQL language in Oracle (and other similar procedural extensions of current DBMS).

With the inclusion of these PSMs, SQL became turing complete

Jordi Cabot
  • 8,058
  • 2
  • 33
  • 39
15

An ANSI select statement, as originally defined in SQL-86, is not turing complete because it always terminates (except for recursive CTEs and only if the implementation supports arbitrarily deep recursion). It is therefore not possible to simulate any other turing machine. Stored procedures are turing complete but thats cheating ;-)

usr
  • 168,620
  • 35
  • 240
  • 369
5

Simply, execute this query:

select * 
from (SELECT 0 union SELECT 1) 
JOIN (SELECT 0 union SELECT 1) 
JOIN (SELECT 0 union SELECT 1)  
JOIN (SELECT 0 union SELECT 1)  
JOIN (SELECT 0 union SELECT 1) 
JOIN (SELECT 0 union SELECT 1)  
JOIN (SELECT 0 union SELECT 1) 
JOIN (SELECT 0 union SELECT 1) 

and you will generate the first 256 numbers of 8 digits. Now use combinations of SELECT, ANDs and ORs and you can write any 8 bits machine.

As you can see there is no table involved, only select, union and join

That looks like Turing complete to me

zajonc
  • 1,935
  • 5
  • 20
  • 25
riemaxi
  • 155
  • 2
  • 5
3

There remains a great deal of confusion among the general public about what Turing Completeness is and what constitutes a Turing-Complete language. I wrote a blog post to try to clear up some of these misconceptions, which is to say I watched a TV show about super-intelligent computers and it presented a really useful analogy:

https://github.com/ubuvoid/writings/blob/main/miller/miller_is_sql.md

In this answer, I'll elaborate on the case of SQL.

We actually have enough information after a cursory at a glance to determine that SQL is Turing-Complete.

That's because it has affordances for branching choices ("CASE IF THEN"), and because even in SQL dialects that restrict recursion, it's trivial to write output to a table and re-read it in a subsequent query as though it were a memory register. Therefore, you can implement an interpreter for an assembly language in SQL without thinking very hard about it. If you want to demonstrate this with a minimum of labor, implement a single-instruction-set computer with an operation like 'subleq' (see https://en.wikipedia.org/wiki/One-instruction_set_computer ).

The fact that some implementations try to sandbox execution for SQL queries doesn't change the core functionality of the language, it just slaps a band-aid on the problem that well-meaning "convenience" features tend to cut through like a knife.

For a demonstration, consider the following:

  • Say you have a restricted-execution "Non"-Turing Complete SQL interpreter, the kind that only allows you to run a single query or update operation at once and imposes limits on that query.
  • Now, say you can invoke that interpreter from the shell.
  • Finally, say you type the words "while true:" above the command to execute this SQL interpreter in a shell. Now you have Turing-complete execution.

Is this a meaningful distinction? I would argue that if the only difference you can articulate between your language and Turing Completeness is the fact that there's no while loop, then your framework for interpreting Turing Completeness is not particularly useful.

Non-Turing Complete languages in widespread use tend to describe "inert" data artifacts, and lack affordances for branching paths or explicit procedure invocations. Simply put, if a language has no "verbs" and/or no equivalent to "if then else", you can be sure it's Non-Turing Complete. Examples include interface definition languages (DIANA IDL, CORBA, Proto, Thrift, JSON Typedef, Coral, Ion, etc.), markup languages (XML, HTML, Markdown), data representation languages (JSON, text-format protocol buffers, ascii DIANA representation), and so on.

It's often useful to impose restrictions on Turing-complete execution at strategic parts of a software architecture to achieve separation of concerns, and non-TC languages offer a simple and robust means of making those guarantees.

A classic example is validating the schema of JSON payloads or other data artifacts in an HTTP or RPC request/response, a labor-intensive and error-prone task that can be automated with IDL's and code templates.

Keep circulating the artifacts.

(Edit 2022/04/08: Fun fact. Amazon carries the dubious distinction of having invented not one but two of the above-mentioned IDL's, while simultaneously using a third one that was invented at Facebook. Remember that the primary purpose of an interface definition language is to define interfaces, so I hope they have an IDL for converting one IDL to another. One colleague of mine compared Amazon's software infrastructure to the Paris sewers.)

datagram
  • 130
  • 1
  • 3
2

Oracle's PLSQL and Microsoft's TSQL both are turing complete. Oracle's select statement itself is also turing complete.

sahossaini
  • 474
  • 1
  • 5
  • 11