How to identify wasteful representations of Prolog terms

Question

What is the Prolog predicate that helps to show wasteful representations of Prolog terms?

Supplement

In a aside of an earlier Prolog SO answer, IIRC by mat, it used a Prolog predicate to analyze a Prolog term and show how it was overly complicated.

Specifically for a term like

[op(add),[[number(0)],[op(add),[[number(1)],[number(1)]]]]]

it revealed that this has to many [].

I have searched my Prolog questions and looked at the answers twice and still can't find it. I also recall that it was not in SWI-Prolog but in another Prolog so instead of installing the other Prolog I was able to use the predicate with an online version of Prolog.

If you read along in the comments you will see that mat identified the post I was seeking.

What I was seeking

I have one final note on the choice of representation. Please try out the following, using for example GNU Prolog or any other conforming Prolog system:

| ?- write_canonical([op(add),[Left,Right]]). 
'.'(op(add),'.'('.'(_18,'.'(_19,[])),[]))

This shows that this is a rather wasteful representation, and at the same time prevents uniform treatment of all expressions you generate, combining several disadvantages.

You can make this more compact for example using Left+Right, or make all terms uniformly available using for example op_arguments(add, [Left,Right]), op_arguments(number, [1]) etc.

Evolution of a Prolog data structure

If you don't know it already the question is related to writing a term rewriting system in Prolog that does symbolic math and I am mostly concentrating on simplification rewrites at present.

Most people only see math expressions in a natural representation

x + 0 + sin(y)

and computer programmers realize that most programming languages have to parse the math expression and convert it into an AST before using

add(add(X,0),sin(Y))

but most programming languages can not work with the AST as written above and have to create data structures See: Compiler/lexical analyzer, Compiler/syntax analyzer, Compiler/AST interpreter

Now if you have ever done more than dipped your toe in the water when learning about Prolog you will have come across Program 3.30 Derivative rules, which is included in this, but the person did not give attribution.

If you try and roll your own code to do symbolic math with Prolog you might try using is/2 but quickly find that doesn't work and then find that Prolog can read the following as compound terms

 add(add(X,0),sin(Y))

This starts to work until you need to access the name of the functor and find functor/3 but then we are getting back to having to parse the input, however as noted by mat and in "The Art of Prolog" if one were to make the name of the structure accessible

 op(add,(op(add,X,0),op(sin,Y)))

now one can access not only the terms of the expression but also the operator in a Prolog friendly way.

If it were not for the aside mat made the code would still be using the nested list data structure and now is being converted to use the compound terms that expose the name of the structure. I wonder if there is a common phrase to describe that, if not there should be one.

Anyway the new simpler data structure worked on the first set of test, now to see if it holds up as the project is further developed.

Try it for yourself online

Using GNU Prolog at tutorialspoint.com enter

:- initialization(main).
main :- write_canonical([op(add),[Left,Right]]).

then click Execute and look at the output

sh-4.3$ gprolog --consult  file main.pg  
GNU Prolog 1.4.4 (64 bits)  
Compiled Aug 16 2014, 23:07:54 with gcc  
By Daniel Diaz  
Copyright (C) 1999-2013 Daniel Diaz  
compiling /home/cg/root/main.pg for byte code...  
/home/cg/root/main.pg:2: warning: singleton variables [Left,Right] for main/0  
/home/cg/root/main.pg compiled, 2 lines read - 524 bytes written, 9 ms  
'.'(op(add),'.'('.'(_39,'.'(_41,[])),[]))| ?-

Clean vs. defaulty representations

From The Power of Prolog by Markus Triska

When representing data with Prolog terms, ask yourself the following question:

Can I distinguish the kind of each component from its outermost functor and arity?

If this holds, your representation is called clean. If you cannot distinguish the elements by their outermost functor and arity, your representation is called defaulty, a wordplay combining "default" and "faulty". This is because reasoning about your data will need a "default case", which is applied if everything else fails. In addition, such a representation prevents argument indexing, and is considered faulty due to this shortcoming. Always aim to avoid defaulty representations! Aim for cleaner representations instead.

Whether a term is "overly complicated" may depend upon the context, use case, and semantic meaning of the term. For example, if you wanted to represent a single atom `a`, you could easily argue that a representation such as `[[a]]` is overly complex (the list structure is superfluous/unnecessary). However, in a broader context, if `[[a]]` is a trivial case for a more complex structure defined for a given application, it might be appropriate. So.... what's the definition of "overly complex"? — lurker, Jul 14 '17 at 12:05
If what you wish to do is quantify level of compound term nesting for the purposes of developing a measure of complexity (without getting into whether it is "overly" complex, which is subjective), then you can look at the Prolog term processing predicates, such as `=../2`, etc, and count levels of term nesting, number of term arguments, etc. — lurker, Jul 14 '17 at 12:07
@lurker I knew a comment like that was coming. I was planning on changing the title after I saw the answer or if someone was able to provide a reference to the actual post. IIRC the section was eloquent and described it better. The best I can do with regards to your question is to think of it as a tree that has nodes that are only pointers to other nodes without any data in the node. In other words each node should have data, e.g. terminal, or have data and a pointer, e.g. non-terminal. A node with just a pointer and no data is unnecessary. — Guy Coder, Jul 14 '17 at 12:13
It would be helpful to show how that tree looks for the example term you gave. — lurker, Jul 14 '17 at 12:20
@false That is the problem; I can't find the post and am not absolutely sure if mat wrote it. If I could find it I would have my answer. — Guy Coder, Jul 14 '17 at 12:21
Even if you can't find the original post, you must be asking this question out of a curiosity or need and might have some "complexity" definition in mind? If "complexity" is a measure of "non-data nodes" in some kind of tree representation of the term, then can you describe how you would map a term to a tree in that case? — lurker, Jul 14 '17 at 12:26
Are you looking for this post, at the end? https://stackoverflow.com/a/42722823/1613573 It uses `write_canonical/1` to obtain the *canonical* representation of term. This is very useful when learning Prolog. — mat, Jul 14 '17 at 12:30
@mat Yes, the very last part is it. You can post it as an answer and I will accept it. — Guy Coder, Jul 14 '17 at 12:36
Just to nit pick a little (:)) I still struggle with the term "wasteful". It's application dependent. The answer shows how to do a canonical view of a term which can be visually examined to determine the number of single-element lists (among other things) by checking the number of empty list arguments in the canonical form. But the presence of single-element lists may or may not be wasteful. The question is really more objectively, "How to determine the number of single-element lists contained in a term?". — lurker, Jul 14 '17 at 14:40
@lurker Thanks for the discussion, it helps to make all of the info here better for others. I am not totally satisfied with "wasteful" either. I do find your title better. As far "number of single-element lists" is what's being sought, or if it's other?, I don't know. Since the terms are for a rewrite system of symbolic math expressions I am trying to find a representation that works. I know that, as mat says, it needs `make all terms uniformly available`. To be continued. — Guy Coder, Jul 14 '17 at 14:46
And as is noted in the "The Art of Prolog" pg. 81, "Both Programs ... use a 'natural' representation from mathematics where terms represent themselves. A term such as sin(X) is represented using a unary structure sin. If a different representation were used, for example, unary_term(sin,X) where the name of the structure is made accessible, then the problem with the chain rule disappears." reinforces the need for a different structure. When I started with the rewrite system I did use a natural representation and that had problems. To be continued. — Guy Coder, Jul 14 '17 at 14:51
Then I switched to the representation in the question, but I don't believe that one to be correct either and am looking at other representations. One representation that I found, but have yet to try is, `op(add,number(0),op(add,number(0),number(0)))` While I do have questions about this, since I don't know what exactly I need I can't ask objective SO questions. I might try the SWI-Prolog forum as there subjective questions are allowed, but want to do more work on it and see if I can find the solution on my own. — Guy Coder, Jul 14 '17 at 14:56
@lurker I think what might be better is to sidestep the whole problem of trying to tie the title to Prolog terms and instead just make it about finding an old SO answer. The problem I see with that is that the answer is of use to others and a title is a nice attention getter that shows up in a Google search. Feel free to change the title; as I have noted before I consider all post of mine at SO public and free to be changed as they are Creative Commons. — Guy Coder, Jul 14 '17 at 15:01
I think this is a stimulating and interesting question about information representation in Prolog even though it has a subjective component to it, and I'm glad you posted it. What constitutes a good (or, perhaps, efficient) representation is probably a combination of common principles plus what the needs are of the application for accessing the information. — lurker, Jul 14 '17 at 15:05
And... I don't know if I actually have a better idea for a title, now that we've dug into it a bit. So I'm leaving it alone. :) — lurker, Jul 14 '17 at 15:19
Of interest: SWI-Prolog [5.1 Lists are special](http://www.swi-prolog.org/pldoc/man?section=ext-lists) `Another place that is affected is write_canonical/1.` Need to look into this more. — Guy Coder, Jul 26 '17 at 12:45
@lurker Found something of value related to this question; see the added section: Clean vs. defaulty representations — Guy Coder, Aug 01 '17 at 11:17

score 4 · Accepted Answer · edited Jul 14 '17 at 15:53

Please see the last part of:

https://stackoverflow.com/a/42722823/1613573

It uses write_canonical/1 to display the canonical representation of a term.

This predicate is very useful when learning Prolog and helps to clear several misconceptions that are typical for beginners. See for example the recent question about hyphens, where it would have helped too.

Note that in SWI, the output deviates from canonical Prolog syntax in general, so I am not using SWI when explaining Prolog syntax.

lurker · Answer 2 · 2017-07-15T03:28:24.597

You could also programmatially count how many subterms are a single-element list using something like this (not optimized);

single_element_list_subterms(Term, Index) :-
    Term =.. [Functor|Args],
    (   Args = []
    ->  Index = 0
    ;   maplist(single_element_list_subterms, Args, Indices),
        sum_list(Indices, SubIndex),
        (   Functor = '.', Args = [_, []]
        ->  Index is SubIndex + 1
        ;   Index = SubIndex
        )
    ).

Trying it on the example compound term:

| ?- single_element_list_subterms([op(add),[[number(0)],[op(add),[[number(1)],[number(1)]]]]], Count).

Count = 7

yes
| ?-

Indicating that there are 7 subterms consisting of a single-element list. Here is the result of write_canonical:

| ?- write_canonical([op(add),[[number(0)],[op(add),[[number(1)],[number(1)]]]]]).
'.'(op(add),'.'('.'('.'(number(0),[]),'.'('.'(op(add),'.'('.'('.'(number(1),[]),'.'('.'(number(1),[]),[])),[])),[])),[]))

yes
| ?-

Thanks. Nice example. Hope others learn from it or at least ask questions if they do not understand it. — Guy Coder, Jul 15 '17 at 09:54

How to identify wasteful representations of Prolog terms

2 Answers2

Linked