0

I'm interested in incrementally parsing the sequence as the terms come in one at a time. This requires that I'm able to determine which of the rewrite rules can consume the current input. So, conceptually I am looking for something like the following.

For the grammar:

abc1 --> [a, b, c]. 
ab --> [a, b].
abc2 --> ab, [c].

In a similar way that we specify arguments that satisfy a goal as variables, I would like to specify the functor itself as a variable e.g., T.

Specifically, I want the functor T([a,b,c]) to return: T = abc1. But then ultimately something recursive like: [abc1, abc2(ab)]

I was thinking to create a top-level rule, which would in its body enumerate all of the grammar's rule heads and perhaps somehow return the trees T that can be constructed by those rules that match the input but this enumeration seems silly since Prolog enumerates the rule heads under the hood anyway and I'm still unsure if I can make this solution work.

Other than that I thought of the lambda package for prolog, but after a glance it doesn't seem to achieve what I need.

Update:

I was pointed in the direction of representing a functor as a list e.g., using =../2 but it seemed that such a solution requires that I explicitly instantiate T to the functor to be called as a list. Whereas, for me the point is to have prolog find that functor by attempting to unify the variable T with available rules until some of them resolve to true with the arguments [a, b, c], [] I supplied.

In other words, I need call(T, [a, b, c],[]). to work but it says T is not sufficiently instantiated.

Update 2:

The solution I'm currently going with is to query all available functors via current_functor to see which ones are satisfied by the input [a, b, c], []. Once I have them I just need that they self-identify e.g., by passing their name up to the arg from the caller, which also seems feasible. Then I'll basically have the names of the rules that matched the input.

false
  • 10,264
  • 13
  • 101
  • 209
user1048677
  • 229
  • 3
  • 13
  • 4
    Could you give a more concrete example of how you would like this to work, more than just `T([a,b,c])`? Or rather what would a more specific query look like? Are you also familiar with the `=../2` predicate? `s(A,B,C) =.. [s, A, B, C]`, and you can query `s(A, B, C)` with `T = s, call(T, A, B, C)`. As stated, it looks like, `s(T, [it, sleeps], []), T =.. [X|_].` would solve your problem (`X` result). – lurker Feb 17 '16 at 02:05
  • The following is of course a degenerate example, but basically for a grammar like: `abc1 --> a, b, c.` `ab --> a, b.` `abc2 --> ab, c.` I want the functor `T([a,b,c])` to return: `T = abc1`. But then ultimately something recursive like: `[abc1, abc2(ab)]`. – user1048677 Feb 17 '16 at 02:21
  • I heard that functors can be represented as lists, but haven't looked into this yet. Thanks! Before I got your comment I started with a more bruteforce solution using `current_functor`, `findall`, and some "return whichever are true" functors in the body of that top-level rule I mentioned in the question. – user1048677 Feb 17 '16 at 02:21
  • @lurker It seems that your suggested solution requires that I know `s` beforehand, and explicitly instantiate `T` to it to be called as a list. Whereas, for me the point is to have prolog find that `s` by attempting to unify the variable `T` with available rules until some of them resolve to true with the arguments `[a, b, c], []` I supplied. That said, I need `call(T, [a, b, c],[]).` to work but it says `T` is not sufficiently instantiated and `=../2` is of no use to me since I don't know what to instantiate T to beforehand. – user1048677 Feb 17 '16 at 02:59
  • 1
    Yeah you're right, `=../2` requires that the functor be instantiated, so that clearly won't work for you. I didn't initially fully understand what you were trying to achieve. You need a pre-defined predicate that can enumerate all of your clauses into a list or something so that you can call them and know which one you called. I don't think such a predicate exists. Typically, if you need your rules to be self-identifying, you would build the self-identification into the rules. – lurker Feb 17 '16 at 03:09
  • There is self-identification: `s(s(NP,VP)) --> np(NP),vp(VP).` the head on the LHS returns its corresponding parse tree segment (this bogus argument `s(NP,VP)`) that binds to the variable in the caller. Now, as I said, to collect all of the rules I can put a disjunction of all rule heads I have defined in the body of the top-level function. But I'm trying to get those automatically via `current_functor` or such. – user1048677 Feb 17 '16 at 03:17
  • 2
    That's because you have the self-identification built into that rule. :) Your actual rule, which has `s(s(NP,VP))` as a head is sort of a "meta-rule" for what you're identifying, which is `s(NP, VP)`. But what you're really after is the rule itself, it sounds like. To apply that concept to your simple example of `abc1 --> a, b, c.` (which really should be written properly as `abc1 --> [a,b,c].`, you would have written it as `s(abc1) --> [a,b,c].`. Then you could say, `phrase(s(T), [a,b,c])` and come up with `T = abc1`. So `s` has identified itself as rule `abc1`. – lurker Feb 17 '16 at 03:20
  • 2
    Could you try to summarize the comments into your question? And maybe work out a full example of what your input is going to be and what results you expect? As you said, maybe you are not thinking about the problem in the most useful terms, but I find it very hard to judge. –  Feb 17 '16 at 07:38
  • 2
    maybe [this answer](http://stackoverflow.com/a/20522716/874024) could be useful – CapelliC Feb 17 '16 at 09:01
  • @Boris, please see the updates. – user1048677 Feb 17 '16 at 15:50
  • Thanks, @CapelliC, I'm now digging through the book's section on bottom-up parsing. It's so far fascinating but I wish I knew beforehand if what they are describing is indeed what I want. I need to avoid the top-downness of sort `s()`, where I have to explicitly mention the top-level constituent `s`. Rather, I need some meta-functor (which I believe is what these guys are doing) which says `parse(Rule, )` and gives which `Rule`'s succeed at parsing those ``. – user1048677 Feb 17 '16 at 16:13
  • I think you could get some useful insight studing [dcg_util](http://www.swi-prolog.org/pack/list?p=dcg_util) by Michael Hendricks (for its semplicity and usefulness). It's (really basic) applied metaprogramming on DCGs – CapelliC Feb 17 '16 at 16:18
  • @CapelliC, so the moment of truth: if I modify the following call of theirs: `parse(s(Tree), [a, program, halts], []).` to `parse(Rule, [a, program], []).`, thereby throwing that top-down `s` away, would it give me rules such as `np`? – user1048677 Feb 17 '16 at 16:20
  • overall, I'm unsure on what is the focus of the problem. A DCG is an extremely thin layer on Prolog. So, adapting to different needs could be not worthy. Specifically, why not reapply the top level nonterminal when you get more tokens ? Don't be afraid of efficiency, since the evaluation is pretty fast - Prolog pattern matching at its best – CapelliC Feb 17 '16 at 16:32
  • @CapelliC I can't reapply the top-level rule because the input is not guaranteed to fully parse into that top-level constituent. I'm interested in finding pieces of input that parse with at least some rules and ultimately seek to generate some of input by the program to complete the parse. – user1048677 Feb 17 '16 at 16:40
  • 1
    If you can get your hands on a copy of "The Craft of Prolog" by Richard O'Keefe, do it. There is a chapter on DCGs (it can be read in isolation, you don't have to go through the rest of the book). It might be that you find answers to your questions there. My experience with this book is that it gets truly useful after you have struggled with a problem on your own for a while... –  Feb 18 '16 at 08:16

1 Answers1

0

I ended up explicitly enumerating all of my grammar's self-identifying rule heads in one top-level rule:

parse(RuleSubtree, Input, []).

where

parse(T) -->
    rule1(T);
    rule2(T);

...

    ruleN(T).

and each rule is self-identified as follows

ruleN(ruleN(SubRule1, SubRule2, ... SubRuleN)) --> 
    ruleX(SubRule1), ruleY(SubRule2), ... ruleZ(SubRuleN).

which gives me the rule structure that matched the input, in the variable T.

Maybe a little hardcoded but it works for my purpose and allows me to advance with my experiment, to which there are more serious challenges on the way.

user1048677
  • 229
  • 3
  • 13