
Is there an easy way to perform a partial parse with Python's PLY? In other words: instead of parsing the entire source at once, is it possible to parse up to the end of an expression, yield its result, and hand control back to the caller?

For instance, the following toy grammar produces the intended output.

import ply.lex as lex
import ply.yacc as yacc

data = '''\
a sentence.  another
sentence.
'''

tokens = ('WORD', 'DOT')

t_WORD = r'\w+'
t_DOT = r'\.'
t_ignore = ' \t\n'

lexer = lex.lex()

def p_text(p):
    '''text : text sentence DOT
            | sentence DOT'''
    p[0] = '\n'.join(p[1:-1])

def p_sentence(p):
    '''sentence : sentence WORD
                | WORD'''
    p[0] = ' '.join(p[1:])

parser = yacc.yacc()
print(parser.parse(data))      # parses all sentences at once!

However, how can one consume the input one sentence at a time (as with a generator) instead of parsing all sentences at once?

# intended behaviour: does not work!
for sent in parser.partial_parse(data):
    do_something_with(sent)
raywib
  • Normally, with standard YACC, I would propose a [pull-parser](https://stackoverflow.com/questions/15895124/what-is-push-approach-and-pull-approach-to-parsing) but I don't know whether this is supported by PLY. – Piotr Siupa Aug 21 '23 at 05:56

1 Answer


Each p_ function is invoked when the PLY parser reduces by one of the productions in the function's docstring. So in your example, p_sentence is invoked each time the parser recognizes a sentence. Because the sentence rule is left-recursive, that happens once per word, with p[0] holding the sentence built so far. If you want to pause at those points, just insert a pause into p_sentence. E.g.:

def p_sentence(p):
    '''sentence : sentence WORD
                | WORD'''
    p[0] = ' '.join(p[1:])
    input(f"p_sentence just recognized {p[0]!r} Hit Enter to continue...")
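If the goal is a real generator rather than a pause, one possible sketch is to parse one sentence per call to parser.parse() and hand the parser a small token source that reports end-of-input right after each DOT. This assumes PLY 3.11, where parse() accepts any lexer object with a token() method. The names SentenceTokens, iter_sentences and the statement rule are made up for this illustration and are not part of PLY; the grammar's start symbol is changed to a single "sentence DOT".

```python
import ply.lex as lex
import ply.yacc as yacc

tokens = ('WORD', 'DOT')

t_WORD = r'\w+'
t_DOT = r'\.'
t_ignore = ' \t\n'

def t_error(t):
    raise SyntaxError(f"illegal character {t.value[0]!r}")

lexer = lex.lex()

def p_statement(p):
    '''statement : sentence DOT'''
    p[0] = p[1]

def p_sentence(p):
    '''sentence : sentence WORD
                | WORD'''
    p[0] = ' '.join(p[1:])

def p_error(p):
    raise SyntaxError(f"unexpected {p.type if p else 'end of input'}")

# write_tables=False is a PLY 3.x option; it avoids creating parsetab.py.
parser = yacc.yacc(start='statement', debug=False, write_tables=False)

class SentenceTokens:
    """Hypothetical helper: feeds tokens up to and including one DOT,
    then returns None so the parser sees end-of-input."""
    def __init__(self, lexer, first):
        self.lexer = lexer
        self.pending = first      # token already pulled by the caller
        self.finished = False

    def token(self):
        if self.finished:
            return None
        tok, self.pending = self.pending, None
        if tok is None:
            tok = self.lexer.token()
        if tok is None or tok.type == 'DOT':
            self.finished = True
        return tok

def iter_sentences(text):
    """Yield one parsed sentence at a time; nothing past the current
    DOT is tokenized until the next value is requested."""
    lx = lexer.clone()            # private lexer state for this generator
    lx.input(text)
    while True:
        first = lx.token()        # look ahead to detect end of input
        if first is None:
            return
        yield parser.parse(lexer=SentenceTokens(lx, first))

data = '''\
a sentence.  another
sentence.
'''

for sent in iter_sentences(data):
    print(sent)
```

This only works because a sentence here ends at an unambiguous token (DOT). For a grammar where the end of an expression cannot be found from the token stream alone, a different approach would be needed, such as running the parser in a separate thread and passing results through a queue.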
Michael Dyck