Memorising (and caching) solutions found in a Prolog query?

Question

In this question on StackExchange I've asked (and it has been solved) about a Prolog program I have been trying to create. But while it works in principle, it doesn't scale to my real world need.

Before I start learning yet another language (Datalog), I'd like to try my already done work and know how I can implement in Prolog a way to memorise results from earlier queries such that the same query is only executed once. So, I'm looking for a way to add the result of a successful query to a List and if that same query is asked again, it doesn't redo the calculation, but uses the remembered result.

My main problem is that I cannot find a way to keep the result of a successful query in a list that is passed 'up the chain'.

In

% get this out of the way, quickly
isARelation( Source, Relation, Target, _) :-
    isADirectRelation( Source, Relation, Target).

% Structural Chains
isARelation( Source, Relation, Target, Visited) :-
    \+ member( [Source,Relation,Target], Visited),
    structuralOrDependencyRelation( RelationOne),
    structuralOrDependencyRelation( RelationTwo),
    weakest( Relation, RelationOne, RelationTwo),
    isADirectRelation( Source, RelationOne, Intermediate),
    isARelation( Intermediate, RelationTwo, Target, [[Source,RelationOne,Intermediate]|Visited]).
isARelation( Source, Relation, Target, Visited) :-
    \+ member( [Source,Relation,Target], Visited),
    structuralOrDependencyRelation( RelationOne),
    structuralOrDependencyRelation( RelationTwo),
    weakest( Relation, RelationOne, RelationTwo),
    isADirectRelation( Source, RelationTwo, Intermediate),
    isARelation( Intermediate, RelationOne, Target, [[Source,RelationTwo,Intermediate]|Visited]).

How do I implement that the first call

isARelation(A, B, C, []).

does the calculation of the results, and a second call

isARelation(A, B, C, []).

uses the earlier found result, which is kept 'globally'?

If you're using SWI, there is [a new tabling module](http://www.swi-prolog.org/pldoc/man?section=tabling). I'd expect performance to be better with one of "those other" Prolog distributions that has had tabling for ages, like [XSB](http://xsb.sourceforge.net/shadow_site/manual1/node46.html) or [Yap](https://www.dcc.fc.up.pt/~vsc/Yap/documentation.html#Tabling) or [B-Prolog](http://www.picat-lang.org/bprolog/) but I have not had luck using them myself. — Daniel Lyons, Jan 19 '17 at 20:05
You can also do it manually; just make a new predicate for the cache, check for values in the cache first, if they are not there, compute and store in the cache with `assertz/1` or `asserta/1` depending. — Daniel Lyons, Jan 19 '17 at 20:06
Thanks. I tried tabling in SWI Prolog (simple to do). The memory used quickly ran up to 50GB (SWI Prolog uses just one core) on a 8GB RAM macOS machine (memory is compressed and swapped of course, but as the disk is SSD the penalty is less severe than on a machine with a HD) after which macOS paused the program because I was 'out of application memory' and I had to quit it. I'm really curious is my program works, but it seems that without really big horsepower, I'll never find out. — gctwnl, Jan 19 '17 at 22:12
It actually sounds to me like you're probably generating unnecessary test cases and you should winnow that down. I don't see 50 GB of data here. — Daniel Lyons, Jan 20 '17 at 03:42
In http://stackoverflow.com/questions/41724116/optimising-a-prolog-program-remove-duplicates-repeat-recalculation/41747539#41747539 the OP mentions "60 or so entities and 800 or so direct relations" that we're not given. 50 GB still seems a lot. However, I'm fairly sure that for tabling the visited list could and *should* be removed from the predicate. — Isabelle Newbie, Jan 20 '17 at 18:26
gctwnl: I had another go at this, but you are really making it hard to help you. The code you posted here is missing a definition for `structuralOrDependencyRelation/1`, which you use here but have now removed from the three different versions of your code in this thread and the previous one. You also use `weakest/3`, but the definitions you show in the other thread are still the broken ones. Please edit this question to ensure that it has a complete, self-contained, correct program (including inputs) that *only* needs to be optimized, but does not need to be debugged. — Isabelle Newbie, Jan 21 '17 at 18:45
I found it difficult to decide between changing a question (and thus invalidate a lot of comments made) and updating a question on the basis of what feedback there is. But I just noticed that the platform keeps older versions, so in the future I can update. — gctwnl, Jan 22 '17 at 21:57
Anyway, I've found one reason for the scaling problem (there may be more), which is that I did not (understand, nor) use cuts (!). So where my predicate could do with any first solution found (on eis enough, 'there is a'), I always got everything and there were a lot of calls to that base predicate.That doesn't scale well. I'm now trying to make a combination of using cuts effectively and the Visited list to prevent circularity. — gctwnl, Jan 22 '17 at 21:57

score 4 · Answer 1 · edited May 23 '17 at 12:08

This is not really an answer to your question :(

The other answer has the right idea, but the implementation has a problem. Let's say we want to make a memoized version of squaring a number:

:- dynamic mem_square/2.

square(N, S) :-
    (   mem_square(N, S)
    ;   S is N*N,
        assertz(mem_square(N, S))
    ).

BTW, the parentheses in the other answer are completely unnecessary. These are also unnecessary, but this is how you usually wrap a disjunction, just in case it is part of a conjunction. In other words, this: a ; (b, c) is the same as a ; b, c, but (a ; b), c is not the same.

Now, if I load this program from the toplevel and query:

?- square(3, S).
S = 9. % first time it's fine

?- square(3, S).
S = 9 ;
S = 9. % now there's two

?- square(3, S).
S = 9 ;
S = 9 ;
S = 9. % now three

If you keep on querying a memoized fact, and you backtrack into it, you will keep on computing again and again and adding more and more identical copies of it. Instead, you can try for example this:

:- dynamic mem_square/2.

square(N, S) :-
    (   mem_square(N, S)
    ->  true
    ;   S is N*N,
        assertz(mem_square(N, S))
    ).

Now, there is no choice point.

This is still not a clean implementation if you are meant to have choice multiple solutions. Any solutions after the first will be cut by the ->.

Thanks for elaborating on this! – Daniel Lyons Jan 20 '17 at 20:53 — Daniel Lyons, Jan 20 '17 at 20:53

score 0 · Answer 2 · answered Jan 19 '17 at 20:10

0

This is advice on how to generically do what tabling does. I haven't followed this advice in ages myself so there may be inaccuracies here. Hopefully the rest of the gang will show up and correct me if I'm off-base.

You have a predicate foo/4 that is inefficient.
Add this to your file:
```
:- dynamic(cached_foo/4).
```
Rename foo/4 to compute_foo/4 or something.

Make a new predicate foo/4 that looks like this:

foo(X, Y, Z, Q) :-
  cached_foo(X, Y, Z, Q) ; 
  (
    compute_foo(X, Y, Z, Q), 
    assertz(cached_foo(X, Y, Z, Q))
  ).

answered Jan 19 '17 at 20:10

Daniel Lyons

22,421
2
50
77

1

This will push an unnecessary choice point. It might be better to use a `->` to get rid of it. – Jan 20 '17 at 07:35
@Boris I'm not entirely sure what you mean, can you illustrate? – Daniel Lyons Jan 20 '17 at 14:55
1

Try to run this code for an already "cached" (memoized) fact and you will see -- there will be an unnecessary choice point after it succeeds, and if you backtrack into it, `compute_foo` will be evaluated again and a new (identical to the previous) `cached_foo` will be `assertz`-ed. – Jan 20 '17 at 18:17
1

You should also try and see what happens if you try to collect all solutions with `findall`. – Jan 20 '17 at 20:17

Memorising (and caching) solutions found in a Prolog query?

2 Answers2