What is the downside of using difference lists in an append operation in Prolog?

Question

This is a follow-up question to this question about difference lists.

Generally, append/3 or conc/3 is written as follows:

append([], L, L).
append([H1|T1], L2, [H1|L3]):-
    append(T1, L2, L3).

The downside of this recursion is that it doesn't run efficiently when taking a long list as first argument because the whole list needs to be processed before the actual appending can begin.

To make this more efficient, we can use different lists which only requires variable instantiation and no recursion to append lists.

append_d(A1-Z1, Z1-Z2, A1-Z2).

The thing is that wherever I am looking for definitions of the append/3 predicate, the former is used, and not the supposedly more efficient alternative. Why is that? What is the downside of working with difference lists, and don't they justify the efficiency gain?

See also [here](http://stackoverflow.com/a/26967655/1812457) — , Jan 12 '17 at 11:09
Of interest: [Wikibooks - Difference Lists](https://en.wikibooks.org/wiki/Prolog/Difference_Lists) - explains `append_d(A1-Z1, Z1-Z2, A1-Z2).` usage: `append_d([a,b,c|E]-E, [x,y,z|W]-W, O).` — Guy Coder, Jan 12 '17 at 12:04

score 4 · Accepted Answer · edited May 23 '17 at 11:53

I guess the only "downside" of using a difference list version of append instead of the classical append/3 is that you maybe don't have a difference list to start with. In other words, if you have a list like this:

[a,b,c]

instead of this:

[a,b,c|Back]

... then you must traverse it if you want to append something to its end.

As long as you are going to append lists to each other, you will indeed use difference lists from the start. BTW, in modern Prolog separate arguments for the list and the rest are usually used. Here are a few SWI-Prolog library predicates that use difference lists:

findall/4
read_pending_codes/3
phrase/3 (DCGs are discussed in the answer by @mat)

This list is definitely not exhaustive. The point is that these predicates, by using difference lists, allow you to add more to the back of the list in constant time.

As soon as you find yourself using append/3 for appending lists in a program you are writing, you should consider using difference lists instead. As for downsides, as long as you don't need to append lists, you need to carry around an extra argument that you don't use for anything: then, a just a list is good enough.

Furthermore, append/3, as it is usually implemented, can do many more things than just append lists. Try the following queries:

?- append(A, B, [1,2,3]). % split a list
?- append([1,2,3], Back, List). % make a difference list
?- append(Prefix, _, [a,b,c,d]). % list prefix
?- append(_, Suffix, [a,b,c,d]). % list suffix

and so on.

One nice example of a good use of difference lists are Prolog queues (see at the bottom of this answer, example is stolen, in simplified form, from "The Craft of Prolog" by Richard O'Keefe).

Afterword

While you might define a "difference list append" predicate, usually it is unnecessary, as it only adds a level of indirection. Say I'm writing a predicate that takes a binary tree and gives you the difference list (list and its back) resulting from in-order traversal:

% tree_inorder_list(+Tree, -List, -Back)
tree_inorder_list(nil, List, List).
tree_inorder_list(t(X, L, R), List, Back) :-
    tree_inorder_list(L, List, [X|List0]),
    tree_inorder_list(R, List0, Back).

Using a separate predicate for appending would only add code.

And yes, this could be written as a DCG instead! It looks much better:

tree_inorder(nil) --> [].
tree_inorder(t(X, L, R)) -->
    tree_inorder(L),
    [X],
    tree_inorder(R).

Instead of querying:

?- tree_inorder_list(Tree, List, []).

you would have to query, for the DCG version:

?- phrase(tree_inorder(T), List).

You can use listing/1 to see how this DCG definition translates to a Prolog predicate:

?- listing(tree_inorder//1).
tree_inorder(nil, A, A).
tree_inorder(t(D, A, E), B, G) :-
    tree_inorder(A, B, C),
    C=[D|F],
    tree_inorder(E, F, G).

true.

This is identical to the difference list definition above (except for one unification, C=[D|F], which is not inlined).

And because there is a difference-list phrase/3, you can use tree_inorder//1 on more than one tree in the difference list mode, and concatenate the lists in constant time:

?- list_to_search_tree([c,a,b], A),
   phrase(tree_inorder(A), List, Back),
   list_to_search_tree([z,y,z,z,x], B),
   phrase(tree_inorder(B), Back).
A = t(b, t(a, nil, nil), t(c, nil, nil)),
B = t(y, t(x, nil, nil), t(z, nil, nil)),
List = [a, b, c, x, y, z],
Back = [x, y, z].

(Thank you to @WillNess for suggesting this addition)

perhaps it is worth adding a small remark that `phrase(tree_inorder(T), List)` is equivalent to `phrase(tree_inorder(T), List, [])` which is also the same as `tree_inorder(T, List, [])`. In the two last calls it is conceivable for a free logvar to be used instead of the `[]` as 3rd argument, to have `List` produced as an open-ended list which can be continued by traversing some other tree perhaps, etc. (this also follows from the listing that you show at the end of your answer). — Will Ness, Jan 14 '17 at 21:34
@WillNess Thank you for the suggestion, added a small example. — , Jan 15 '17 at 08:29

mat · Answer 2 · 2017-01-12T12:20:06.910

3

I would actually reformulate this question as:

What is the downside of using append/3 over other methods?

Answer: Frequent use of append/3 typically indicates a problem with your datastructures, and in such situations you should take a step back and consider using DCGs instead.

The fact that repeated use of append/3 typically means quadratic overhead in situations where using a DCG improves this to linear time (totalling over all operations) alone already often justifies this.

In addition, using a DCG often makes your code easier to read, since you need to keep track of fewer arguments and fewer variables.

Since you have now posted several related questions about list differences, my recommendation for you is to forget about them for now. In my opinion, the material you are using for learning Prolog may put too little emphasis on DCGs, so that you are running into the much harder topic of list differences too soon.

When tempted to use append/3, always consider using DCGs for describing lists instead! In my view, the fact that they are internally compiled to predicates that reason over list differences need not concern you for now. You will understand list differences much more easily later!

Note that when using DCGs, there is no need at all for append/3, since a concatenation of lists that are both described by nonterminals is much more naturally written for example as:

?- phrase((a,b), Ls).

where a and b are both nonterminals. Note how (',')//2 can be read as "and then" in DCGs, completely eliminating the need for append/3, and also eliminating the need to understand list differences so soon.

EDIT: I would like to back up my arguments with a quote from the publication Teaching Beginners Prolog — How to teach Prolog:

The second mistake is to introduce differences too early in a course. It is tempting to present list differences first and only then definite clause grammars, but beginners are much more comfortable with grammar rules.

This is definitely not an empirical study as requested in the comments. Still, I think the published experience of a Prolog teacher who has taught Prolog to students at universites in at least 3 different countries over several decades should be taken into account to some extent, even if we ignore the fact that we have a concrete beginner at hand who has already filed 3 different questions about list differences in the last 2 days.

edited Jan 12 '17 at 12:20

answered Jan 12 '17 at 10:09

mat

40,498
3
51
78

It is true that the material that I have at hand is quite brief on difference lists. Moreover we only studied DCGs for one session and it was very limited, even though I clearly see the benefit especially because I am very interested in linguistic parsers, where DCGs seem to be able to shine. – Bram Vanroy Jan 12 '17 at 10:37
1

Well, then read more on DCGs! See the references on the [DCG tag](http://stackoverflow.com/tags/dcg/info). A good introduction to Prolog will have a significant percentage of the material dedicated to DCGs and show many possible applications. Not only for list differences, but also for differences in general (even though they must be wrapped in a list). How DCGs are implemented internally need not concern you for now. For now, simply use them for describing lists, and to get rid of too many uses of `append/3`. You will rarely need to reason about list differences explicitly. Let DCGs do it! – mat Jan 12 '17 at 10:42
Difference lists and DCGs are not mutually exclusive, neither do they somehow rob away attention from each other. I just can't agree with your recommendation to "forget about them" while pushing DCGs forward. DCGs are great, but they have their intricacies, too. – Jan 12 '17 at 11:12
2

You are quoting selectively. I said forget about them **for now**. It is evident that OP struggles with something he is not fully ready to understand. That's completely OK! It took many years to discover list differences! Focus on something simpler **for now**! DCGs are by far easier to understand in my experience. – mat Jan 12 '17 at 11:20
1

Yes, I was quoting selectively, sorry. Your experience is anecdotal evidence. DCGs are more difficult to use without getting stuck in strange situations, _in my experience_. Unless I see an empirical study that compares the two, my experience is just another anecdote. BTW, designing and performing such a study is a difficult task, but experimental science is always difficult. – Jan 12 '17 at 11:32
Besides `Well, then read more on DCGs!` I would add that you have to use them in practice. Reading is one thing, but finding what works and doesn't works IMHO is best gained by experimenting and writing lots of test cases. One test case that has me stuck is given a grammar and an ending of the sentence (suffix) how does one generate the start of the sentence (prefix). I can get it to work if the number of variables for the prefix is fixed, but if I want it to be a variable number of words for the prefix, I have problems. – Guy Coder Jan 12 '17 at 12:17
Thank you for the link to the publication. I definitely did not want to say that I of all people would know anything more than Ulrich Neumerkel (or you for that matter) on the topic of teaching Prolog. – Jan 12 '17 at 14:34
Anyway, I know very well that there are very few empirical studies on topics like programming (as the process of people writing, reading, understanding computer programs), or teaching programming. Unfortunately, I am an experimentalist (a weird career choice, I know). I often notice too late that I am unnecessarily critical of opinions, as if there is something wrong in having an opinion (as opposed to carefully wording a hypothesis that haven't been proven false yet). Déformation professionnelle. – Jan 12 '17 at 14:44
That's all OK and understandable. I think your position is in fact particularly difficult also because you are among a very small minority of Prolog programmers who comparatively quickly have seen to the core of many quite involved procedural aspects. That's definitely not the norm. From your perspective, your experience is "just another anecdote", but from what I can tell it is a particularly rare exceptional case. In more typical situations, such concepts do indeed "rob away attention from each other", so I try to get simpler concepts across first. I do not want to end up with WAM code here. – mat Jan 12 '17 at 15:01
@mat The road any of us takes is somehow different, for sure. I appreciate all the ways that Prolog saves us from thinking about "procedure" because I have written a fair amount of (bad) C code for money. Then again, I cannot prevent myself from looking at the procedural meaning of code.... – Jan 13 '17 at 14:14
Personally, I have adopted this approach mostly after seeing that precisely the people whom I *knew* were among the most hard core and most impressive low-level procedural/machine language programmers (like Ulrich and Richard O'Keefe) were the *strongest* advocates of *everything* that simplified the teaching and learning experience and most pushed in the direction of more declarative approaches, even if Ulrich was busy with PAPI counters for microprocessors and mergemem for the Linux kernel, and Richard was supervising PhD theses consisting of mostly assembly code... – mat Jan 13 '17 at 14:25
Personally, I can't understand why DLs must be a hard topic. [List concatting is associative](https://en.wikipedia.org/wiki/Definite_clause_grammar#Difference_lists), that's all there is to it. If we understand [TRMCO](https://en.wikipedia.org/wiki/Tail_call#Tail_recursion_modulo_cons) we already understand DLs; and that was first described [in 1974](http://www.cs.indiana.edu/cgi-bin/techreports/TRNNN.cgi?trnum=TR19), even for any monoidal data, including recursive summation of numbers being turned into an iterative accumulation. It's all the same *simple* thing, `a+(b+c+...) = (a+b)+(c+...)`. – Will Ness Jan 15 '17 at 23:35
Somewhat funny aside: Unification modulo (only) associativity is among the [tough problems](http://www.math.muni.cz/~klima/Math/Thesis/autoref.ps) in unification theory... For beginners, list differences are tough for different reasons: First, because (in contrast to grammars, which are often already known even to beginners!), it's yet another new formalism. Second, using list differences automatically involves more arguments. It's definitely harder than using DCGs, and therefore: Why not start with something simpler that turns out equally suitable and more convenient in most cases? – mat Jan 16 '17 at 00:53
Personally, I prefer to know how things are working (I already understood C pointers and linked lists before studying Prolog btw). Growing an open-ended list at its back is easy to understand (not that it's presented this way, but it should, IMO). Keeping two pointers for a list -- front and back -- is a standard C technique (queues etc.). Regrouping things (like when counting apples) :) is [easy to understand](https://www.youtube.com/watch?v=-oot7KC-ADM). Being presented with a new syntax that "just works" is harder, for me. But of course "YMMV" always applies. :) thanks for the link! – Will Ness Jan 16 '17 at 11:34
I understand the desire to know how things work internally. It is definitely tempting to try out how far one can stretch oneself and understand the actual control flow. With JIT indexing, tabling, and even only with regular execution, it quickly becomes impossibly hard to understand though, because there are so many different cases to consider. C is a lot easier to understand procedurally, because the code is so moded and can only be used in one direction: Literally everything is already instantiated. If you apply this reading to Prolog, you typically (have to) limit your view considerably. – mat Jan 16 '17 at 15:33
I meant to understand just a general gist of it. of course knowing every last detail is hard. :) Maybe show some pictures, something [like this](http://imgur.com/a/c7Lxy)? Spend 5 minutes on it, then show DCG as a convenient notation for specifying those list *prefixes* / *infixes*. A good picture goes a long way. And those pictures are *not* procedural. :) Without a picture, the word "difference" itself doesn't explain much. It was all very mysterious sounding to me, when I first read about it in The Art of Prolog - a *lot* of words, too. (again, YMMV) – Will Ness Jan 16 '17 at 17:49

What is the downside of using difference lists in an append operation in Prolog?

2 Answers2

Afterword