Understanding Prolog's empty lists

Question

I am reading Bratko's Prolog: Programming for Artificial Intelligence. The easiest way for me to understand lists is visualising them as binary trees, which goes well. However, I am confused about the empty list []. It seems to me that it has two meanings.

When part of a list or enumeration, it is seen as an actual (empty) list element (because somewhere in the tree it is part of some Head), e.g. [a, []]
When it is the only item inside a Tail, it isn’t an element it literally is nothing, e.g. [a|[]]

My issue is that I do not see the logic behind 2. Why is it required for lists to have this possible ‘nothingness’ as a final tail? Simply because the trees have to be binary? Or is there another reason? (In other words, why is [] counted as an element in 1. but it isn't when it is in a Tail in 2?) Also, are there cases where the final (rightmost, deepest) final node of a tree is not ‘nothing’?

Lists are actually simpler than trees. They're more like strings; so much so that in some languages strings are defined as lists of characters. The empty list is for lists, what the empty string is for strings. — S.L. Barth is on codidact.com, Dec 01 '16 at 09:18
`[a, [1,2]]` is a valid list of two elements where the first element is `a` and the second element is a list `[1,2]`. Same idea as `[a, []]`: a list of two elements, first being `a`, second being the empty list `[]`. You can also have `[a | [1,2]]` which is a list with head `a` and tail `[1, 2]`, the same as `[a, 1, 2]`. So both 1 & 2 in your question treat `[]` consistently as a list. I'm not sure I understand what you mean when you say, *... **required** for lists to have this possible ‘nothingness’ as a final tail*. It is not really *required*, but merely a property. Just like `a + 0 = a`. — lurker, Dec 01 '16 at 11:14

score 3 · Accepted Answer · edited May 23 '17 at 11:54

3

In other words, why is [] counted as an element in 1. but it isn't when it is in a Tail in 2?

Those are two different things. Lists in Prolog are (degenerate) binary trees, but also very much like a singly linked list in a language that has pointers, say C.

In C, you would have a struct with two members: the value, and a pointer to the next list element. Importantly, when the pointer to next points to a sentinel, this is the end of the list.

In Prolog, you have a functor with arity 2: ./2 that holds the value in the first argument, and the rest of the list in the second:

.(a, Rest)

The sentinel for a list in Prolog is the special []. This is not a list, it is the empty list! Traditionally, it is an atom, or a functor with arity 0, if you wish.

In your question:

[a, []] is actually .(a, .([], []))
[a|[]] is actually .(a, [])

which is why:

?- length([a,[]], N).
N = 2.

This is now a list with two elements, the first element is a, the second element is the empty list [].

?- [a|[]] = [a].
true.

This is a list with a single element, a. The [] at the tail just closes the list.

Question: what kind of list is .([], [])?

Also, are there cases where the final (rightmost, deepest) final node of a tree is not ‘nothing’?

Yes, you can leave a free variable there; then, you have a "hole" at the end of the list that you can fill later. Like this:

?- A = [a, a|Tail], % partial list with two 'a's and the Tail
   B = [b,b], % proper list
   Tail = B. % the tail of A is now B
A = [a, a, b, b], % we appended A and B without traversing A
Tail = B, B = [b, b].

You can also make circular lists, for example, a list with infinitely many x in it would be:

?- Xs = [x|Xs].
Xs = [x|Xs].

Is this useful? I don't know for sure. You could for example get a list that repeats a, b, c with a length of 7 like this:

?- ABCs = [a,b,c|ABCs], % a list that repeats "a, b, c" forever
   length(L, 7), % a proper list of length 7
   append(L, _, ABCs). % L is the first 7 elements of ABCs
ABCs = [a, b, c|ABCs],
L = [a, b, c, a, b, c, a].

In R at least many functions "recycle" shorter vectors, so this might be a valid use case.

See this answer for a discussion on difference lists, which is what A and Rest from the last example are usually called.

See this answer for implementation of a queue using difference lists.

edited May 23 '17 at 11:54

Community

1
1

answered Dec 01 '16 at 09:25

I'm on mobile now so I'll just give a quick reply. First, thank you for your extensive reply! Seeing the tail as a 'pointer' to the next item makes sense. I don't quite get the part in the final non-empty Tail, though. Can't you write your A as `[a,a|[b|[b|[]]]]`? Meaning, it eventually will need that empty list to finish? – Bram Vanroy Dec 01 '16 at 09:40
1

I added a bit more detail. Make sure to read also the two other answers I linked, they talk quite a bit about the uses of "partial lists". As for `[a,a|[b|[b|[]]]]`, yes, but the point is that if you keep the (free) `Rest` in your hand, you can use it at a later time point to do something with it (again, see the linked answers, too). – Dec 01 '16 at 09:46
@BramVanroy you can try the `X = [1|[2|3]].` query. It returns with `X = [1, 2|3].`. And `X = [1|[2|[]]].` returns with `X = [1, 2].`. Which makes sense, because `[1, 2|[]] = [1, 2].` returns **true**. – Will Ness Dec 01 '16 at 11:32
2

@WillNess You should have mentioned that `[1, 2|3]` is not a valid list! – Dec 01 '16 at 11:59
1

"valid" as interpreted by `is_list/1`, yes. – Will Ness Dec 01 '16 at 14:50
1

@WillNess I guess the right term is "well-formed" or "proper"? I am not sure, really. Even the [SWI-Prolog documentation of `is_list/1`](http://eu.swi-prolog.org/pldoc/doc_for?object=is_list/1) avoids using any English for this and instead shows a possible native Prolog definition for `is_list/1`, plus a "cyclic list" disclaimer. – Dec 01 '16 at 15:26
1

Your link gives a perfectly clear recursive English definition: "Term is bound to the empty list ([]) or a term with functor ''[|]'' and arity 2 and the second argument is a list." There is no way `[1,2|3]` satisfies this. `[1,2|3]` is also invalid as interpreted by `length/2` and as a first argument of `append/3`, BTW. – Isabelle Newbie Dec 01 '16 at 22:01
@IsabelleNewbie Yep, it _defines_ what `is_list/1` does, but it doesn't _name_ it in any other way than calling it, well, a list. I was answering to Will Ness, who commented that `[1, 2|3]` is not a "valid" list _as interpreted by_ `is_list/1` (so, maybe valid in some other way?) – Dec 01 '16 at 23:05
@IsabelleNewbie I could define the following predicate: `foo(3, 3). foo([X|Xs], [Y|Ys]) :- Y is 2*X, foo(Xs, Ys).` With this, I get: `?- foo([1,2|3], R). R = [2,4|3].` So I guess I can make it work ;-) – Dec 01 '16 at 23:09
1

@Boris I just wanted a notion of "valid list" to have a concrete meaning. – Will Ness Dec 02 '16 at 12:09

score 1 · Answer 2 · edited May 23 '17 at 11:54

Your confusion comes from the fact that lists are printed (and read) according to a special human-friendly format. Thus:

[a, b, c, d]

... is syntactic sugar for .(a, .(b, .(c, .(d, [])))).

The . predicate represents two values: the item stored in a list and a sublist. When [] is present in the data argument, it is printed as data. In other words, this:

[[], []]

... is syntactic sugar for .([], .([], [])). The last [] is not printed because in that context it does not need to. It is only used to mark the end of current list. Other [] are lists stored in the main list.

I understand that but I don't quite get why there is such a need for that final empty list.

The final empty list is a convention. It could be written empty or nil (like Lisp), but in Prolog this is denoted by the [] atom. Note that in prolog, you can leave the sublist part uninstantiated, like in:

[a | T]

which is the same as:

.(a, T)

Those are known as difference lists.

I understand that but I don't quite get why there is such a need for that final empty list. Perhaps the idea of 'the pointer to nothing' as Boris suggests is the best way to look at it. — Bram Vanroy, Dec 01 '16 at 09:43
@BramVanroy Not a pointer to nothing, to a sentinel. Or at least the value of this "nothing" need to be well-defined. — , Dec 01 '16 at 09:49

score 1 · Answer 3 · answered Dec 02 '16 at 12:30

Your understanding of 1. and 2. is correct -- where by "nothing" you mean, element-wise. Yes, an empty list has nothing (i.e. no elements) inside it.

The logic behind having a special sentinel value SENTINEL = [] to mark the end of a cons-cells chain, as in [1,2,3] = [1,2|[3]] = [1,2,3|SENTINEL] = .(1,.(2,.(3,SENTINEL))), as opposed to some ad-hoc encoding, like .(1,.(2,3)) = [1,2|3], is types consistency. We want the first field of a cons cell (or, in Prolog, the first argument of a . functored term) to always be treated as "a list's element", and the second -- as "a list". That's why [] in [1, []] counts as a list's element (as it appears as a 1st argument of a .-functored compound term), while the [] in [1 | []] does not (as it appears as a 2nd argument of such term).

Yes, the trees have to be binary -- i.e. the functor . as used to encode lists is binary -- and so what should we put there in the final node's tail field, that would signal to us that it is in fact the final node of the chain? It must be something, consistent and easily testable. And it must also represent the empty list, []. So it's only logical to use the representation of an empty list to represent the empty tail of a list.

And yes, having a non-[] final "tail" is perfectly valid, like in [1,2|3], which is a perfectly valid Prolog term -- it just isn't a representation of a list {1 2 3}, as understood by the rest of Prolog's built-ins.

I understand it now *I think*. I never had any fundamentals of programming, so I know how to *use* a lot of programming languages but in many cases I want to ge more in-depth and find out *why* a language works this or that way. Thank you for your reply (+1), however I am going to accept Boris' answer because of its examples and the reference to pointers in C, which makes the concept easier to imagine. — Bram Vanroy, Dec 04 '16 at 14:50
the "cons cell" that I mention is also a reference to pointers ... [*in Lisp*](http://www-formal.stanford.edu/jmc/recursive.html). :) — Will Ness, Dec 04 '16 at 20:41

Understanding Prolog's empty lists

3 Answers3