7

Datalog is not Turing complete.

But what is its computational class?

Is it equivalent to Finite state machine or Pushdown machine (i.e. context free) ... or is it something in between?

ggorlen
  • 44,755
  • 7
  • 76
  • 106
Mustafa
  • 931
  • 1
  • 13
  • 26

1 Answers1

3

Let's assume we have access, whether built-in or defined in the language by us, to the following predicates:

Head(x, y) iff y is a list and x is the first element in the list
Tail(x, y) iff x and y are lists and x is the same as y but is missing y's first element
Equal(x, y) iff x and y are the the same thing

First off, I think it's clear that this language can accept all the regular languages. By the Myhill-Nerode theorem, all states in a minimal DFA for a regular language correspond to a unique equivalence class under the indistinguishability relation. It seems like we could have one predicate per equivalence class/state to represent whether a list corresponding to an input string belongs to that class, and then another predicate that is true only if one of the predicates corresponding to an accepting state is true. So, for the language over {a, b} with an even number of a and an odd number of b, a minimal DFA has four states:

 O
 |
 V
q0<---a--->q1
 ^          ^
 |          |
 b          b
 |          |
 V          V
q2<---a--->q3

Here, q2 is the only accepting state. Our DataLog program might look like:

Q0(()).
Q0(x) :- Head(y, x), Equal(y, 'a'), Tail(z, x), Q1(z).
Q0(x) :- Head(y, x), Equal(y, 'b'), Tail(z, x), Q2(z).
Q1(x) :- Head(y, x), Equal(y, 'a'), Tail(z, x), Q0(z).
Q1(x) :- Head(y, x), Equal(y, 'b'), Tail(z, x), Q3(z).
Q2(x) :- Head(y, x), Equal(y, 'a'), Tail(z, x), Q3(z).
Q2(x) :- Head(y, x), Equal(y, 'b'), Tail(z, x), Q0(z).
Q3(x) :- Head(y, x), Equal(y, 'a'), Tail(z, x), Q2(z).
Q3(x) :- Head(y, x), Equal(y, 'b'), Tail(z, x), Q1(z).
EvenAOddB(x) :- Q2(x).

Based on this example I think it's clear we can always encode transitions in this fashion and so any regular language can be accepted. Thus, DataLog is at least as powerful as deterministic finite automata.

We can define this:

// Last(x, y) iff x is the last element of y
Last(x, y) :- Head(x, y), Tail(z, y), Equal(z, ()).

// AllButLast(x, y) iff x and y are the same list but x is missing the last element of y
AllButLast((), (x)).
AllButLast(x, y) :- Head(w, x), Head(z, y), Equal(w, z),
                    Tail(w', x), Tail(z', y), AllButLast(w', z').

Now we can recognize lists corresponding to strings in the context-free language a^n b^n:

// ANBN(x) iff x is a list beginning with n 'a's followed by n 'b's
ANBN(()).
ANBN(x) :- Head(y, x), Equal(y, 'a'), Tail(z, x),
           Last(w, z), Equal(w, 'b'), AllButLast(z', z),
           ANBN(z').

It's easy to tweak that predicate to find the language of even-length palindromes, and from there it's easy to tweak to find the language of all palindromes. I feel confident we can also get it to accept languages like balanced parentheses, etc. Based on this experience, I am guessing that we can accept all the context-free languages.

Can we get a context-sensitive language? Let's try a^n b^n c^n. If we assume DataLog has built-in predicates like this for integer types:

Zero(x) iff x is equal to zero
Successor(x, y) iff x and y are integers and x = y + 1

Then I think we can, as follows:

ANBNCN(()).
ANBNCN(x) :- Zero(y), ANBNCNZ(x, y).

ANBNCNZ(x, y) :- BN(x, y).
ANBNCNZ(x, y) :- Head(w, x), Equal(w, 'a'),
                 Last(z, x), Equal(z, 'c'),
                 Tail(u, x), AllButLast(v, u),
                 Successor(r, y), ANBNCNZ(v, r).

BN(x, y) :- Head(w, x), Equal(w, 'b'),
            Successor(y, z), Tail(u, x),
            BN(u, z).

What the above says is the following:

  • the empty string is in a^n b^n c^n
  • otherwise, a string is in a^n b^n c^n if f(s, 0) is true
  • f(s, n) is true if s consists only of n instances of 'b'
  • f(s, n) is true if s starts with a, ends with c, and f(s', n + 1) is true of everything in the middle

This should work since each recursive call of f(s, n) strips one a and one c off the ends and remembers how many it has counted. It then counts off that many instances of b once all the a and c are gone.

My feeling based on this is that we can probably do some or all of the context-sensitive languages as well. Probably, the lack of unbounded execution is precisely what distinguishes the linear-bounded automata (productions in phrase-structure grammars must have a RHS no longer than the LHS) from general unrestricted grammars (whose intermediate forms can grow and shrink arbitrarily).

Patrick87
  • 27,682
  • 3
  • 38
  • 73
  • 1
    Thanks, but wouldn't adding `Head(x, y)` and `Tail(x, y)` to datalog make the language Turing complete. I can create `Cons(h, t, l) :- Head(h, l), Tail(t, l).` then I basically can create any arbitrary structure like Lisp and Prolog. For example I can now store the state of a Turing machine in a list. And have predicts that operate on the state `Step(state, instruction) :- ...` – Mustafa Jan 18 '20 at 23:19
  • @Mustafa I have to admit I didn't know whether DataLog had these predicates and just assumed it did. If not, this answer is not really applicable. How do you access lists in DataLog? Does DataLog do lists? – Patrick87 Jan 18 '20 at 23:32
  • I think DataLog store lists as relations like in SQL. Example: `Students(1, Adam).`, `Students(2, Sara).`, `Students(3, John).`. Then we query `:? Students(x, y)` to get the list of students. – Mustafa Jan 18 '20 at 23:41
  • So it's possible to create lists in DataLog that way but only at compile time. We can't dynamically change the lists at run-time. – Mustafa Jan 19 '20 at 00:30