Dynamic union find algorithm in Prolog

Question

Lets assume I have sets S1,..,Sn and we want to find smallest covers C1,..,Cm so that in each cover there are never disjoint connected components.

for example with the sets S1=[X,Y], S2=[Y,Z], S3=[T] I would find the covers C1=[X,Y,Z] and C2=[T]. What about a dynamic algorithm that can split covers dynamically?

Assume the element Y dies, then we are left with S1'=[X], S2'=[Z] and S3'=[T]. The covers are now C1'=[X], C2'=[Z] and C3'=[T]. So the number of covers has increased.

The union find algorithm can determine a cover for a given collection of sets, but I am afraid that recalculating the full collection whenever an element dies, isn't efficient.

Through some "die" event. In practice in a SAT solver for example an element might be a variable and then instantiated to 0 or 1, and not anymore participate in the union find algorithm. — , Nov 10 '20 at 16:03

score 2 · Answer 1 · answered Nov 10 '20 at 12:42

To tap the resources of a Prolog system, I made a little union find algorithm based on copy_term/2 and keysort/2. The main entry point of the algorithm here does the following:

covers(L, S) :-
   vars_list(L, K),
   copy_term(K, R),
   make_keys(L, R, H),
   keysort(H, J),
   collect_keys(J, S).

Here is an example run:

?- covers([X+Y,Y+Z,T], C).
C = [[X+Y, Y+Z], [T]]

To get a dynamic algorithm we might try the following. Maintain a backtrackable structure that allows finding covers from elements. Then if an element dies, only recalculate the cover that belongs to the died element.

This would reduce the complexity a little bit. Otherwise I don't have a more better idea here, except the observation that a died element only splits its own cover into smaller covers.

can you answer this please -https://stackoverflow.com/questions/64775744/find-minimum-of-list-inside-list-on-the-basis-of-one-value — gANDALF, Nov 10 '20 at 20:14
@gANDALF thats a different problem. Check library(aggregate) to solve your problem. — , Nov 10 '20 at 23:17

score 0 · Answer 2 · answered Nov 11 '20 at 16:35

I think there are two hard parts to this:

a very efficient map data structure
an efficient data structure for keeping track of what to recompute

The union-find data structure itself is already implemented in Prolog but hidden behind the somewhat obscure name of "variables". If you have a way of associating terms with variables representing their union-find equivalence classes, then:

the find operation is the lookup of the class variable
the test whether two classes are the same is Class1 == Class2
the union operation is Class1 = Class2

So find is a bottleneck. In imperative languages the lookup is typically constant time, using something like foo.equivalence_class or equivalence_classes[foo.id]. In Prolog we don't have an equivalent (near-)constant-time mapping, in general. But since you only seem to be interested in variables, SWI-Prolog's attributed variables do fit the bill!

We can compute the union-find equivalence classes for a list of terms like this:

compute_classes([]).
compute_classes([Term | Terms]) :-
    term_variables(Term, Variables),
    variables_class(Variables, _NewEquivalenceClass),
    compute_classes(Terms).

variables_class([], _EquivalenceClass).
variables_class([Var | Vars], EquivalenceClass) :-
    (   get_attr(Var, equivalence_class, ExistingEquivalenceClass)
    ->  ExistingEquivalenceClass = EquivalenceClass
    ;   put_attr(Var, equivalence_class, EquivalenceClass) ),
    variables_class(Vars, EquivalenceClass).

Using your example:

?- compute_classes([X+Y, Y+Z, T]).
put_attr(X, equivalence_class, _2772),
put_attr(Y, equivalence_class, _2772),
put_attr(Z, equivalence_class, _2772),
put_attr(T, equivalence_class, _2814).

We can kind of see that X, Y, and Z all share an equivalence class, and T is in a separate one.

Some utilities:

var_class(Var, Class) :-
    get_attr(Var, equivalence_class, Class).

var_var_sameclass(Var1, Var2) :-
    var_class(Var1, Class1),
    var_class(Var2, Class2),
    Class1 == Class2.

var_var_union(Var1, Var2) :-
    var_class(Var1, Class1),
    var_class(Var2, Class2),
    Class1 = Class2.

Continuing the example:

?- compute_classes([X+Y, Y+Z, T]), var_class(X, ClassX), var_class(Y, ClassY), var_class(T, ClassT).
ClassX = ClassY,
put_attr(X, equivalence_class, ClassY),
put_attr(Y, equivalence_class, ClassY),
put_attr(Z, equivalence_class, ClassY),
put_attr(T, equivalence_class, ClassT).

?- compute_classes([X+Y, Y+Z, T]), var_var_sameclass(X, Y).
put_attr(X, equivalence_class, _3436),
put_attr(Y, equivalence_class, _3436),
put_attr(Z, equivalence_class, _3436),
put_attr(T, equivalence_class, _3478).

?- compute_classes([X+Y, Y+Z, T]), var_var_sameclass(X, T).
false.

?- compute_classes([X+Y, Y+Z, T]), var_var_union(Z, T), var_var_sameclass(X, T). 
put_attr(X, equivalence_class, _3502),
put_attr(Y, equivalence_class, _3502),
put_attr(Z, equivalence_class, _3502),
put_attr(T, equivalence_class, _3502).

That is, X and Y really are in the same class, while X and T are not. If we union the classes for Z and T, then suddently X and T are in the same class.

Killing variables is where it gets more tedious. The idea here (as you suggested in the question) is to only recompute "affected" parts of the input. I think this can be done by associating a set of affected terms with each equivalence class. I will use lists here, but I do not suggest that lists are a good choice in practice.

Computing equivalence classes for a list of terms, and a "watchlist" for each equivalence class:

compute_classes_and_watchlists(Terms) :-
    compute_classes(Terms),
    maplist(compute_watchlist, Terms).

compute_watchlist(Term) :-
    term_variables(Term, [RepresentativeVariable | _OtherVars]),
    var_class(RepresentativeVariable, Class),
    (   get_attr(Class, class_watchlist, Watchlist)
    ->  true
    ;   Watchlist = [] ),
    put_attr(Class, class_watchlist, [Term | Watchlist]).

For example:

?- compute_classes_and_watchlists([X+Y, Y+Z, T]).
put_attr(X, equivalence_class, _2932),
put_attr(_2932, class_watchlist, [Y+Z, X+Y]),
put_attr(Y, equivalence_class, _2932),
put_attr(Z, equivalence_class, _2932),
put_attr(T, equivalence_class, _3012),
put_attr(_3012, class_watchlist, [T]).

So if you were to kill any of X, Y, or Z, the watchlist for their class _2932 would tell you that you would need to recompute equivalence classes for the terms Y+Z and X+Y (but nothing else).

Killing itself grabs the killed variable's class and its watchlist (which it "returns") and clears the equivalence class for each variable in that class:

kill_var(Var, TermsToRecompute) :-
    var_class(Var, Class),
    get_attr(Class, class_watchlist, TermsToRecompute),
    del_attr(Class, class_watchlist),
    maplist(clear_class, TermsToRecompute).

clear_class(Term) :-
    term_variables(Term, [RepresentativeVariable | _OtherVars]),
    del_attr(RepresentativeVariable, equivalence_class).

Killing only makes sense if you immediately (a) bind the killed variable to a ground term, and (b) recompute the equivalence classes for the affected terms. In your example:

?- compute_classes_and_watchlists([X+Y, Y+Z, T]), kill_var(Y, TermsToRecompute), Y = y_is_now_bound, compute_classes_and_watchlists(TermsToRecompute).
Y = y_is_now_bound,
TermsToRecompute = [y_is_now_bound+Z, X+y_is_now_bound],
put_attr(X, equivalence_class, _4640),
put_attr(_4640, class_watchlist, [X+y_is_now_bound]),
put_attr(Z, equivalence_class, _4674),
put_attr(_4674, class_watchlist, [y_is_now_bound+Z]),
put_attr(T, equivalence_class, _4708),
put_attr(_4708, class_watchlist, [T]).

This is getting hard to read, but the point is that after killing and binding Y and recomputing the union-find structure over the affected terms, X and Z are now in separate equivalence classes. T is just sitting around unaffected.

All of this kind of assumes that you won't backtrack across a union, although my reading of SWI's attributed var docs is that this wouldn't even do anything terribly wrong. Since put_attr is backtrackable, with some more care this might turn out to be a fairly flexible implementation in which backtracking simply splits classes apart again. Backtracking across a kill could also be made to work, I think.

TODOs:

in the watchlist case, the definition of var_var_union must be changed to merge the watchlists for the two classes (if they are distinct); for actual lists this would be append, but some actual set or more specialized data structure would be better, especially if you expect to have some sort of "stack-like" behavior where the next variable to be killed is most likely one on which you did a recent union operation
especially with the watchlist approach, users must be prevented from accidentally unifying equivalence_class terms; this could be done by using some sort of class(<unique_id>, NakedClassVariable) instead of just the naked variable to represent an equivalence class
the repeated calls to term_variables could probably be be optimized somehow -- in addition to a watchlist of terms, you could also keep a watchlist of the variables of interest

All in all, this is not quite production ready code, but it might give you some ideas.

Dynamic union find algorithm in Prolog

2 Answers2