I am trying to parse Python function definitions with PLY, and I am running into problems with indentation. For instance, for a for statement, I would like to be able to know when the block ends. I read the Python grammar here: http://docs.python.org/2/reference/grammar.html and the grammar for this part is:

for_stmt: 'for' exprlist 'in' testlist ':' suite ['else' ':' suite]
suite: simple_stmt | NEWLINE INDENT stmt+ DEDENT

I don't know how to describe the INDENT and DEDENT tokens with PLY. I was trying something like:

def t_indentation(t):
    r'    |\t'
    #some special treatment for the indentation.

But it seems that PLY considers regexes containing spaces to match the empty string, and it does not build the lexer... Even if I managed to produce the INDENT token, I am not sure how to get the DEDENT one...

Is there a way to do that with PLY?
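
The empty-string complaint is consistent with how PLY compiles token rules: as far as I know, lex.lex() passes re.VERBOSE by default, and in verbose mode unescaped spaces in a pattern are ignored. The sketch below uses only the standard re module to reproduce that effect; the helper name matched_text is made up for illustration and is not part of PLY.

```python
import re

# Assumption: PLY compiles rule patterns with re.VERBOSE (its default reflags).
# In verbose mode the four literal spaces are dropped, leaving r'|\t',
# whose first alternative is empty and therefore matches the empty string.
original = re.compile(r'    |\t', re.VERBOSE)

# Putting the space inside a character class protects it from verbose mode.
fixed = re.compile(r'[ ]{4}|\t', re.VERBOSE)


def matched_text(pattern, text):
    """Hypothetical helper: return the text matched at position 0, or None."""
    m = pattern.match(text)
    return None if m is None else m.group(0)


print(repr(matched_text(original, "    x")))
print(repr(matched_text(fixed, "    x")))
```

Escaping each space with a backslash should work as well, since verbose mode only discards unescaped whitespace.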

joetde
  • The rules about INDENT and DEDENT in Python are more complex than four spaces or a tab. In other parser generators the problem is solved using semantic actions that determine if the amount of leading space is a valid indent or dedent, or by intervening the tokenizer to inject INDENT or DEDENT tokens. – Apalala Nov 05 '13 at 02:04
  • Yeah, I was thinking about doing the token injection with the t_indentation function. The regex was wrong for PLY, r'[ ]{4}|\t' is better. Thanks. – joetde Nov 05 '13 at 20:23
  • It seems that with plyplus, there is an easier way to do so: https://github.com/erezsh/plyplus/blob/master/plyplus/grammars/python_indent_postlex.py – joetde Jan 03 '14 at 19:39
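
The token injection idea from the comments can be sketched without PLY at all. The following is a minimal illustration, not PLY's API: it keeps a stack of indentation widths and emits INDENT when a line is deeper than the top of the stack and one DEDENT per level popped when it is shallower. Token, inject_indentation, the LINE token type, and the tab_width parameter are all hypothetical names chosen for this example; in a real PLY lexer the same logic would sit in a filter wrapped around the lexer's token stream.

```python
from collections import namedtuple

# Hypothetical stand-in for a lexer token; PLY's own token class is not used here.
Token = namedtuple("Token", "type value lineno")


def inject_indentation(lines, tab_width=8):
    """Yield LINE tokens with INDENT/DEDENT tokens injected around blocks."""
    stack = [0]   # indentation widths of the currently open blocks
    lineno = 0
    for lineno, raw in enumerate(lines, start=1):
        text = raw.rstrip("\n")
        stripped = text.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank and comment-only lines do not affect block depth
        leading = text[: len(text) - len(text.lstrip())]
        width = len(leading.expandtabs(tab_width))
        if width > stack[-1]:
            stack.append(width)
            yield Token("INDENT", None, lineno)
        else:
            while width < stack[-1]:
                stack.pop()
                yield Token("DEDENT", None, lineno)
            if width != stack[-1]:
                raise IndentationError(
                    "line %d: dedent does not match any outer level" % lineno)
        yield Token("LINE", stripped, lineno)
    # Close every block that is still open at end of input.
    while len(stack) > 1:
        stack.pop()
        yield Token("DEDENT", None, lineno)


source = ["for x in y:\n", "    a\n", "    if b:\n", "        c\n", "d\n"]
for tok in inject_indentation(source):
    print(tok.type, tok.value)
```

Because the stack stores widths rather than counting four-space units, any consistent indentation is accepted, which is closer to the rule Apalala describes than matching a fixed four spaces or a tab.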

2 Answers

You have to use states to parse INDENT and UNDENT.

Example of parsing a Python-like language

woshifyz
    That example does not use states to parse INDENT and UNDENT; it uses a postprocessing step. – ibid Apr 05 '16 at 14:51

PLY includes in its examples one for a subset of Python to demonstrate how to handle indentation:

https://github.com/dabeaz/ply/tree/1321375e013425958ea090b55aecae0a4b7face6/example/GardenSnake

0 _
  • links dead now, since it was removed in the most recent version of PLY, but the old version is here: https://github.com/dabeaz/ply/blob/3.11/example/GardenSnake/GardenSnake.py – Mark Harviston Aug 01 '21 at 08:09