I am using CUP with JFlex to validate expression syntax. I have the basic functionality working: I can tell if an expression is valid or not.
Next step is to implement simple arithmetic operations, such as "add 1". For example, if my expression is "1 + a", the result should be "2 + a". I need access to parse tree to do that, because simply identifying a numeric term won't do it: the result of adding 1 to "(1 + a) * b" should be "(1 + a) * b + 1", not "(2 + a) * b".
Does anyone have a CUP example that generates a parse tree? I think I will be able to take it from there.
As an added bonus, is there a way to get a list of all tokens in expression using JFlex? Seems like a typical use case, but I cannot figure out how to do it.
Edit: Found a promising clue on stack overflow: Create abstract tree problem from parser
Discussion of CUP and AST:
http://pages.cs.wisc.edu/~fischer/cs536.s08/lectures/Lecture16.4up.pdf
Specifically, this paragraph:
The Symbol returned by the parser is associated with the grammar’s start symbol and contains the AST for the whole source program
This does not help. How to traverse the tree given Symbol instance, if Symbol class does not have any navigation pointers to its children? In other words, it does not look or behave like a tree node:
package java_cup.runtime;
/**
* Defines the Symbol class, which is used to represent all terminals
* and nonterminals while parsing. The lexer should pass CUP Symbols
* and CUP returns a Symbol.
*
* @version last updated: 7/3/96
* @author Frank Flannery
*/
/* ****************************************************************
Class Symbol
what the parser expects to receive from the lexer.
the token is identified as follows:
sym: the symbol type
parse_state: the parse state.
value: is the lexical value of type Object
left : is the left position in the original input file
right: is the right position in the original input file
******************************************************************/
public class Symbol {
/*******************************
Constructor for l,r values
*******************************/
public Symbol(int id, int l, int r, Object o) {
this(id);
left = l;
right = r;
value = o;
}
/*******************************
Constructor for no l,r values
********************************/
public Symbol(int id, Object o) {
this(id, -1, -1, o);
}
/*****************************
Constructor for no value
***************************/
public Symbol(int id, int l, int r) {
this(id, l, r, null);
}
/***********************************
Constructor for no value or l,r
***********************************/
public Symbol(int sym_num) {
this(sym_num, -1);
left = -1;
right = -1;
value = null;
}
/***********************************
Constructor to give a start state
***********************************/
Symbol(int sym_num, int state)
{
sym = sym_num;
parse_state = state;
}
/*. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .*/
/** The symbol number of the terminal or non terminal being represented */
public int sym;
/*. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .*/
/** The parse state to be recorded on the parse stack with this symbol.
* This field is for the convenience of the parser and shouldn't be
* modified except by the parser.
*/
public int parse_state;
/** This allows us to catch some errors caused by scanners recycling
* symbols. For the use of the parser only. [CSA, 23-Jul-1999] */
boolean used_by_parser = false;
/*******************************
The data passed to parser
*******************************/
public int left, right;
public Object value;
/*****************************
Printing this token out. (Override for pretty-print).
****************************/
public String toString() { return "#"+sym; }
}