Debugging a command-line program

Question

If I have a program that I use as a command-line tool, what are my options for debugging?

For the sake of the example, let's say that the program looks like this.

Listing of do_stuff.pl:

main :-
    current_prolog_flag(argv, Argv),
    do_stuff(Argv),
    halt.
main :-
    halt(1).

With SWI-Prolog, I can compile this with:

swipl --goal=main -o do_stuff -c do_stuff.pl

And I can run it by simply calling

$ ./do_stuff foo bar baz

As it stands, this will exit with 1 if do_stuff/1 fails. How can I see the first (earliest, deepest) goal that failed? Or even better, the whole backtrace? I assumed that I should be able to use debug and leash, for example:

main :-
    current_prolog_flag(argv, Argv),
    debug, leash(+fail),
    do_stuff(Argv),
    halt.

... but there was nothing that I tried that worked.

The only half-assed idea I had was to throw an error for every predicate that I expect to succeed deterministically but does not. This is of course doable but seems a bit excessive?

Motivation

A program used as a command line tool is (usually) meant to run once, take its arguments, read its input, write output. In this context, what does failure mean? My interpretation is, an unexpected failure is an error in the program.

Unit testing might help (testing predicates in isolation); however, this will by definition not help for errors that are due to the programmer's lack of understanding of the problem, the scope, or the tools. Only running the program with realistic input will catch this class of errors.

So, given the example above, if a certain use case causes do_stuff/1 to fail, and the program to exit with a non-zero code, what options does the programmer have in figuring out which predicate failed?

The answer linked in the comments gives one solution. But (if I understand it correctly) this does require that the programmer systematically checks along the execution flow until the offending predicate call is found.

This is exactly what I was hoping to avoid.

Have you looked at [this answer](http://stackoverflow.com/a/30791637/772868)? — false, Apr 12 '16 at 15:39
@false I have seen both answers to the question before. As I read it, yours is definitely cleaner than using `format`, but not fundamentally different: I still need to search myself for the source of the failure. What am I missing? — , Apr 12 '16 at 19:02
@false But I could of course define a symbol/meta-predicate for "succeed without a choice point", and use it generously (that is, for every predicate that I know should succeed without a choice point); would that be a reasonable solution? — , Apr 12 '16 at 19:07
I am not sure that this property of "full" determinism is **that** useful that you should test it all the time. I found that this property is relatively easy to maintain by testing it occasionally. — false, Apr 13 '16 at 17:00
@false See the edit to my question, where I try to explain my motivation. — , Apr 13 '16 at 18:49

score 3 · Accepted Answer · edited May 23 '17 at 12:33

Failures are a very unusual thing in Prolog compared to more command-oriented languages, and they have interested people literally from day 1. In fact, even Prolog 0 (the version prior to Prolog I) had, besides the trace option ECRIRE, a special option IMPASSES that showed only the failures.

Later on, there is in particular the work by Mireille Ducassé, which tries to figure out automatically how failures might be explained.

What is so odd about failures is that they are not necessarily an indication that something went wrong. But sometimes, they are.

I'd say that there are two different ways in which failure can be understood. The first is more procedural, the second more declarative.

Annotations

In many programs, I use (@)/1 to indicate that I expect a goal to succeed always. Thanks to the operator declaration, this is just one extra character:

   ...,
   @goal_aux_togoalaux_spec(OQuery, FVect0, Query, Spec),
   ...

If the goal fails, an error is issued. It is also important that nested exceptions are documented too. Should there be something time-critical, these @ have to be removed. However, I just counted ~400 in 120kLOP.

Note that @ also works nicely for goals with several answers, like @member(1,[X,Y]).
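The definition of (@)/1 is not shown above. A minimal sketch of how such an annotation could be written in SWI-Prolog follows. The operator priority, the error term goal_failed/1, and the use of the SWI-specific soft-cut are all assumptions made for illustration; this is not necessarily the definition the answer relies on, and it does not cover the documentation of nested exceptions.

```prolog
% Assumption: priority 950 and type fy are illustrative choices.
:- op(950, fy, @).

% @Goal behaves like Goal, but raises an error instead of failing.
% The soft-cut (*->)/2 keeps every solution of Goal available on
% backtracking, so goals with several answers still work.
% goal_failed/1 is a hypothetical error term, not a standard one.
@(Goal) :-
    (   call(Goal)
    *-> true
    ;   throw(error(goal_failed(Goal), _))
    ).
```

With this sketch, @member(1,[X,Y]) still yields both answers on backtracking, while @member(1,[]) raises an error carrying the offending goal instead of failing silently.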

This technique works well for de facto moded programs. Think of the preparation of a failure-slice (that's the example above). There, you are primarily thinking: here is a program, what is a fitting slice? In such a situation, "No, there is no slice" would not be an acceptable answer; you really expect the goal to succeed always. If you do not have such a moded program, you can often transform an existing unmoded one by enforcing steadfastness:

p(X, Y) :-
   wellformed(X),
   @p_old(X, Yc),
   Yc = Y.

The technique rapidly loses attractiveness in purely relational, declarative code. Take a recent example of the zebra-puzzle. There, it is practically impossible to add the @ - except for the very first goals. In such situations a more declarative approach is needed like the following.

Generalizations

For more complex issues, the @ do not work so well. Instead, program modification/slicing is needed. One will need to generalize a program by adding a prefix *. See this answer for a collection of such debugging sessions here on SO that use this technique manually. The major point of this technique is that you do not have to understand the real meaning of the program while determining a maximal generalization. You simply need to keep an eye on the failing goal.
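As a rough illustration of the prefix *, here is a small sketch. The predicates colors/1 and colors_generalized/1 are hypothetical and invented for this example; only the idea of prefixing goals with * to generalize them away comes from the text above.

```prolog
:- op(950, fy, *).

% A goal prefixed with * always succeeds, so it is generalized away.
*_.

% Hypothetical predicate that fails unexpectedly.
colors(Xs) :-
    Xs = [A, B],
    A = red,
    B = green,
    A = B.

% Generalized version: the first goal has been starred out, yet the
% query still fails. Starring any one of the remaining three goals
% would make it succeed, so they form the fragment that has to change.
colors_generalized(Xs) :-
    * Xs = [A, B],
    A = red,
    B = green,
    A = B.
```

Both colors(Xs) and colors_generalized(Xs) fail. Finding the remaining fragment only required watching whether the query still fails after each added *, not understanding what the predicate is meant to compute.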

Ideally, such generalizations would be automatically produced. However, there are a lot of obstacles. For one, they only work for pure monotonic code (in fact this is one good motivation why one should stick to such code). One would thus first have to analyze and categorize the existing code. This is even more difficult if systems do not conform and change their behavior randomly (like the system you mention).

So pretty much, a) throw for any predicate that should always succeed once, or b) start working your way through your program looking for your mistake. The "change their behaviour randomly" bit I don't get: pretty much **any** piece of software I have ever had the pleasure of working with has had errors in it. Figuring out if the error is in your own code or elsewhere is the point of the exercise. — , Apr 18 '16 at 11:42
@Boris: Misunderstanding: a) is about success, not about succeeding once. Also @member(1,[1,1]) succeeds twice. In b you missed that the criteria for generalization are independent of your understanding of the program. Will add more... — false, Apr 18 '16 at 14:27
No need, I think I am starting to get it. Indeed, always succeeding and succeeding only once are not the same thing. About generalizing the program: yes, I guess I see now. My confusion (I think) was due to mixing up "program fails" and "program doesn't do what I thought it should". — , Apr 19 '16 at 11:40
