Safe sorting of terms

Question

Standard term order takes variables as syntactic objects only. As a consequence beside other problems, sort/2 breaks the common assumptions of a relational definition. Neither can answer substitutions be a guarantee for solutions nor are ground instances of the second argument sorted.

?- X = 1, sort([1,X],[1]).
   X = 1.
?-        sort([1,X],[1]).   % generalization fails
false.
?-        sort([1,X],[2,1]).
   X = 2.                    % unsorted list succeeds
?- X = 2, sort([1,X],[2,1]). % answer substitution fails
false.

But even if we ignore declarative concerns we are still faced with the problem that variable ordering only remains constant during the creation of a sorted list. So depending on the "age" of a variable we get

?- _=X+Y, sort([X,Y],[2,1]).
   X = 2, Y = 1.
?- _=Y+X, sort([X,Y],[2,1]).
   Y = 2, X = 1.

and age may change over time, even relative age.

We really should be protected by all such erratic behavior.

What I have tried so far is a safe definition, which produces the very same type errors as sort/2 but produces instantiation errors when the list to be sorted is not ground. As long as no instantiation error occurs, the definition is perfect, indeed.

sort_si(List, Sorted) :-
   sort(List, ISorted),
   ( sort([], [_|Sorted]) ; true ), % just to get the type_error
   (  ground(ISorted)
   -> Sorted = ISorted
   ;  throw(error(instantiation_error, _Imp_def))
   ).

?- sort_si([X,X],S).
caught: error(instantiation_error, _97) % unexpected; expected: S = [X]
?- sort_si([-X,+Y], S).
caught: error(instantiation_error, _97) % unexpected; expected: S = [+Y,-X]
?- sort_si([_,_], []).
caught: error(instantiation_error, _97) % unclear, maybe failure

My question is: How to improve in ISO Prolog upon this definition by reducing the number of unnecessary instantiation errors? Ideally, an instantiation error for sort_si(List, Sorted) only occurs, when there are instances of the goal sort(List, Sorted) with different outcome. An optimal number of instantiation errors is not required. Maybe the fact that the list ISorted is sorted can be used to some degree.

And, also Sorted might be considered as now there will be never unsorted ground lists solutions to Sorted.

score 2 · Answer 1 · answered Aug 08 '21 at 18:10

My take is go through ISorted and check whether every pair of successive terms are "safely comparable", meaning that there is no further instantiation that would change their order. I captured that idea in the comparable/2 predicate with the following cases:

If the two arguments are variable, then they must be the same variable in order to be comparable.
If the two arguments are ground, then they are comparable. (Not sure this case is necessary, beside to shortcut the manual traversal of terms.)
Otherwise, the two arguments must be non-variable, and at least one of them contains a variable. We need to compare the terms further, but only if both are complex terms with the same functor and same number of arguments. In that case, the first argument needs to be comparable. Then, the following ones need to be comparable only if the first one does not discriminate between the two terms, which happens when the first argument is equal (==).

I've added a test in the main predicate to fail when ISorted can't be unified with Sorted, with the idea that sort_si/2 would only differ from sort/2 when sort/2 would succeed "incorrectly", but not when it fails, but I am not sure on that point.

sort_si(List, Sorted) :-
   sort(List, ISorted),
   ( sort([], [_|Sorted]) ; true ), % just to get the type_error
   ( Sorted \= ISorted % just to fail whenever sort/2 would fail
   -> fail
   ;  pairwise_comparable(ISorted)
   -> Sorted = ISorted
   ;  throw(error(instantiation_error, _Imp_def))
   ).


pairwise_comparable([]).
pairwise_comparable([_]).
pairwise_comparable([X1,X2|Xs]):-
   comparable(X1,X2),
   pairwise_comparable([X2|Xs]).

comparable(X1,X2):-
   var(X1),
   var(X2),
   X1==X2.
comparable(X1,X2):-
   ground(X1),
   ground(X2).
comparable(X1,X2):-
   nonvar(X1),
   nonvar(X2),
   functor(X1,F1,A1),
   functor(X2,F2,A2),
   (  F1\=F2
   -> true
   ;  A1\=A2
   -> true
   ;  X1=..[_|Xs1],
      X2=..[_|Xs2],
      comparable_args(Xs1,Xs2)
   ).

comparable_args([],[]).
comparable_args([X1|Xs1],[X2|Xs2]):-
   comparable(X1, X2),
   (  X1\==X2
   -> true
   ;  comparable_args(Xs1,Xs2)
   ).

`sort_si([X,a], [a]), X = a.` fails, yet, `X = a, sort_si([X,a], [a]).` succeeds. It is exactly such situations that should be avoided by issuing appropriate instantiation errors. — false, Aug 09 '21 at 16:28
Then, I assume one should remove the test `Sorted \= ISorted`, meaning that instantiation errors take priority over failures, which makes sense I guess. — jnmonette, Aug 09 '21 at 16:42
Yes, although now `sort_si([X,a], [n,importe])` produces an error which is a pity. Nevertheless (just for the sake of it) one could also do a check on the sortedness of the result! — false, Aug 10 '21 at 09:19
Well, I think that is acceptable: In the same way as `sort([a|_],[n,importe])` produces an error, while it could also fail if it did check the output argument first. — jnmonette, Aug 10 '21 at 10:06

Safe sorting of terms

1 Answers1