0

I have the following two Prolog files:

ontology.pl:

isSite(Url) :- string(Url).
guestPostPublished(GuestPostId, Date, Site, Url) :-
 string(GuestPostId),
 date(Date),
 isSite(Site),
 string(Url),
 \+(guestPostPublished(GuestPostId, _, _, _)).

invalidFile.pl:

isSite('somesite.com').
guestPostPublished(
    'gp1',
    date(2016,2,2),
    'somesite.com',
    'someUrl').

guestPostPublished(
    'gp1',
    date(2016,2,2),
    'somesite.com',
    'anotherUrl').

invalidFile.pl is invalid because it violates the rule specified in ontology.pl that all GuestPostIds must be unique.

When I load that data into my engine, I expect it to throw some exception indicating that the data are invalid. But it doesn't.

What am I doing wrong? How can I make sure that when I feed invalid data to the TuProlog engine, I get a notification of some sort (e.g. an exception or some flag)?

Here's the relevant fragment of my code (you can find the entire code here):

@Test
public void test2() throws InvalidObjectIdException, IOException,
        MalformedGoalException, InvalidTheoryException, UnknownVarException, NoSolutionException,
        NoMoreSolutionException, InvalidLibraryException {
    final Prolog engine = createEngine();

    try
    {
        loadPrologFiles(engine, new String[]{
                "src/main/resources/ontology.pl",
                "src/main/resources/invalidFile.pl"
        });
        Assert.fail("Engine swallows invalid Prolog file.");
    }
    catch (final Exception exception) {
        // TODO: Check that the right exception is thrown
    }
    final List<String> result = getResults(engine, "guestPostPublished(_,X,_,_).", "X");
    System.out.println("result: " + result);
}

private Prolog createEngine() throws InvalidObjectIdException {
    final Prolog engine = new Prolog();
    engine.addOutputListener(new OutputListener() {
        public void onOutput(OutputEvent outputEvent) {
            System.out.println(String.format("PROLOG: %s", outputEvent.getMsg()));
        }
    });
    Library lib = engine.getLibrary("alice.tuprolog.lib.OOLibrary");
    ((OOLibrary)lib).register(new Struct("stdout"), System.out);
    return engine;
}

private void loadPrologFiles(final Prolog engine, final String[] files) throws IOException, InvalidTheoryException {
    final List<String> paths = Arrays.asList(files);
    final StringBuilder theoryBuilder = new StringBuilder();

    for (final String path : paths) {
        theoryBuilder.append(System.lineSeparator());
        theoryBuilder.append("% ");
        theoryBuilder.append(path);
        theoryBuilder.append(" (START)");
        theoryBuilder.append(System.lineSeparator());
        theoryBuilder.append(FileUtils.readFileToString(new File(path)));
        theoryBuilder.append(System.lineSeparator());
        theoryBuilder.append("% ");
        theoryBuilder.append(path);
        theoryBuilder.append(" (END)");
        theoryBuilder.append(System.lineSeparator());
    }

    final Theory test1 = new Theory(theoryBuilder.toString());
    engine.setTheory(test1);
}

private List<String> getResults(final Prolog engine, final String query, final String varName) throws
        MalformedGoalException, NoSolutionException, UnknownVarException, NoMoreSolutionException {
    SolveInfo res2 = engine.solve(query);

    final List<String> result = new LinkedList<String>();
    if (res2.isSuccess()) {
        result.add(res2.getTerm(varName).toString());
        while (engine.hasOpenAlternatives()) {
            res2 = engine.solveNext();
            // Use the requested variable name instead of a hard-coded "X"
            final Term x2 = res2.getTerm(varName);
            result.add(x2.toString());
        }
    }
    return result;
}
Glory to Russia
  • Usual Prologs do not have a built-in mechanism for data integrity constraints in the same way that a relational database does. Unless TuProlog offers this as a library, you can't just declare a primary key. You can implement it on your own very easily, but: is the database dynamic? Is it going to change during the program's lifetime? Do you need to insert new rows to that table? –  Feb 04 '16 at 08:09
  • @Boris The Prolog files won't change at runtime. What I want the program to do, is this: Read a set of Prolog files and tell me, whether they comply with the rules specified in `ontology.pl`. Then, the program terminates. – Glory to Russia Feb 04 '16 at 08:12
  • What it seems you think you are doing with the two files, `ontology.pl` and `invalidFile.pl`: one is declaring data integrity constraints, and the other contains all the tables. This is not how it works. –  Feb 04 '16 at 08:12
  • @Boris So what is the Prolog way of detecting that fact X contradicts fact Y (preferably without putting any imperative constructs into Prolog code) ? – Glory to Russia Feb 04 '16 at 08:13
  • You might want to change your question title to something like "Data integrity constraints on fact tables". Just a suggestion. –  Feb 04 '16 at 08:31
  • If you *consult* `invalidFile.pl`, Prolog assumes that the contents establish *facts* and are true by definition. If you want to treat the contents of `invalidFile.pl` as queries, then you want to read the file and attempt to call each fact and check for success or failure on the call. – lurker Feb 04 '16 at 13:23
  • @lurker Which of the two approaches (run facts in files as queries, create predicates for consistency checks as suggested by Boris) is better and why? – Glory to Russia Feb 04 '16 at 13:34
  • It depends upon a broader context which approach is better for your application. My suggestion assumes that you would have, for example, a predicate that reads in a file of trial facts and attempts to call them and will (in some way) let you know if there was a failure, or will provide a list of failed facts, or whatever it is you want (you haven't really said how you want to handle failure, individually or *en masse*). Boris' solution assumes you already have facts asserted in your database (*e.g.*, you've already consulted `invalidFile.pl`) that you want to question the validity of. – lurker Feb 04 '16 at 13:55
  • @lurker Let's say I add some information as Prolog facts every day. Think progress reports encoded as Prolog facts. Once in a while I want to analyze them. Before I do that, I run the program over all files (one file per day) to find out, whether there is inconsistent data in those files. Then I correct the data (manually, by changing the files) and run the program, until it says that everything is correct. – Glory to Russia Feb 04 '16 at 14:08
  • If you want to check the files before you consult/assert anything questionable that's in them, then my suggestion would work for that. – lurker Feb 04 '16 at 14:10

2 Answers

1

To set data integrity constraints on a Prolog table of facts, you need to approach this differently. I would suggest you first try to do it in pure Prolog, without the Java bits, just to get some understanding of what is going on.

If the database is static and does not change, it is easy: just load it, then run queries against it that do the data integrity checks. For example, you have a table site/1 with a single column, and you want to make sure that all values are strings:

There is no site(S) so that S is not a string

\+ ( site(S), \+ string(S) )

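If you want to wrap this into a predicate, you must give the predicate a different name than your table! Otherwise the rule and the facts become clauses of one and the same predicate, which is what happens with guestPostPublished/4 in the question.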

site_must_be_string :-
    \+ ( site(S), \+ string(S) ).
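
As a rough sketch, the same check can also collect the offending values instead of only failing. This assumes SWI-Prolog 7 or later, where single-quoted text such as 'somesite.com' is an atom and string/1 would reject it, so atom/1 is used here. The sample facts and the names site_must_be_atom/0 and invalid_sites/1 are made up for illustration.

```prolog
% Hypothetical sample table; the second row is deliberately invalid.
site('somesite.com').
site(42).

% Succeeds only if there is no site/1 value that is not an atom.
site_must_be_atom :-
    \+ ( site(S), \+ atom(S) ).

% Bad is the list of all site/1 values that are not atoms.
invalid_sites(Bad) :-
    findall(S, ( site(S), \+ atom(S) ), Bad).
```

With these facts, the query `site_must_be_atom.` fails and `invalid_sites(Bad).` binds Bad to `[42]`.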

Or, for the other one, a unique column (primary key):

There are no duplicates among the first arguments to guest_post_published/4

findall(ID, guest_post_published(ID, _, _, _), IDs),
length(IDs, Len),
sort(IDs, Sorted),   % sort/2 removes duplicates!
length(Sorted, Len). % length does not change after sorting

You probably need to wrap this up in a predicate of its own, too.
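
One possible way to wrap it up is sketched below, together with a helper that lists the IDs that occur more than once. The sample rows and the predicate names guest_post_ids_unique/0 and duplicate_guest_post_ids/1 are assumptions for illustration; the code relies only on findall/3, sort/2, append/3 and member/2, as found in SWI-Prolog.

```prolog
% Hypothetical sample table with a duplicated ID.
guest_post_published(gp1, date(2016,2,2), 'somesite.com', someUrl).
guest_post_published(gp1, date(2016,2,2), 'somesite.com', anotherUrl).

% Succeeds only if no ID appears in more than one row.
guest_post_ids_unique :-
    findall(ID, guest_post_published(ID, _, _, _), IDs),
    length(IDs, Len),
    sort(IDs, Sorted),      % sort/2 removes duplicates
    length(Sorted, Len).    % same length means nothing was removed

% Dups is the sorted list of IDs that occur at least twice.
duplicate_guest_post_ids(Dups) :-
    findall(ID, guest_post_published(ID, _, _, _), IDs),
    findall(D, ( append(_, [D|Rest], IDs), member(D, Rest) ), Found),
    sort(Found, Dups).
```

With the two sample rows, `guest_post_ids_unique.` fails and `duplicate_guest_post_ids(Dups).` gives `Dups = [gp1]`.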

1

If you want to check the validity of "alleged" facts before asserting them, you would want to read, rather than consult, the file and attempt to call each alleged fact to see if it succeeds.

As a very simple example, you can do the following:

open('invalidFile.pl', read, S),
read(S, TestFact),
call(TestFact).

The call(TestFact) will succeed if the term read from invalidFile.pl succeeds given your existing facts and rules, otherwise it will fail. You can use this sequence and read all of the alleged facts and test them:

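validate_file(File) :-
    open(File, read, S),
    read_terms(S, Terms),
    close(S),               % close first, so the stream is not left open on failure
    maplist(call, Terms).   % This will fail if *any* term fails

% read/2 returns the atom end_of_file once the input is exhausted,
% so it is used as the stop condition and never ends up in the list.
read_terms(Stream, Terms) :-
    read(Stream, Term),
    (   Term == end_of_file
    ->  Terms = []
    ;   Terms = [Term|Rest],
        read_terms(Stream, Rest)
    ).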

In this case, validate_file will fail if any term in the file is false. As an exercise, you can make this smarter by tracking a "term count" or something like that in read_terms and write a predicate that checks a term and feeds back the term number if it fails so you can see which one(s) fail.
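
A minimal sketch of that exercise could look like the following. It numbers the terms as they are read and returns the ones that do not succeed. The names validate_file_report/2, read_numbered_terms/3 and safe_call/1, as well as the known_site/1 table, are hypothetical, and treating an error (for example an undefined predicate) as a failed term is a design choice made here, not something the approach requires.

```prolog
% Hypothetical existing knowledge that the alleged facts are tested against.
known_site('somesite.com').

% Failed is a list of N-Term pairs: the N-th term of File did not succeed.
validate_file_report(File, Failed) :-
    open(File, read, S),
    read_numbered_terms(S, 1, Numbered),
    close(S),
    findall(N-Term,
            ( member(N-Term, Numbered), \+ safe_call(Term) ),
            Failed).

% Read all terms, pairing each with its position; stop at end_of_file.
read_numbered_terms(Stream, N, Numbered) :-
    read(Stream, Term),
    (   Term == end_of_file
    ->  Numbered = []
    ;   Numbered = [N-Term|Rest],
        N1 is N + 1,
        read_numbered_terms(Stream, N1, Rest)
    ).

% Call Goal, treating any exception as plain failure.
safe_call(Goal) :-
    catch(Goal, _, fail).
```

For a file containing `known_site('somesite.com').` followed by `known_site('other.com').`, the query `validate_file_report(File, Failed).` gives `Failed = [2-known_site('other.com')]`, and an empty list means every term succeeded.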

lurker