You could use TokenStreamRewriter to get the source code without the purpose node (or to accomplish many other rewriting tasks). Here's an example from an application where I conditionally add a top-level LIMIT clause to a MySQL query:
/**
001 * Parses the query to see if there's already a top-level limit clause. If none was found, the query is
002 * rewritten to include a limit clause with the given values.
003 *
004 * @param query The query to check and modify.
005 * @param serverVersion The version of MySQL to use for checking.
006 * @param sqlMode The current SQL mode in the server.
007 * @param offset The limit offset to add.
008 * @param count The row count value to add.
009 *
010 * @returns A tuple of the rewritten query and true, if the original query is error free and contained no
011 * top-level LIMIT clause. Otherwise the original query and false.
012 */
013 public checkAndApplyLimits(query: string, serverVersion: number, sqlMode: string, offset: number,
014 count: number): [string, boolean] {
015
016 this.applyServerDetails(serverVersion, sqlMode);
017 const tree = this.startParsing(query, false, MySQLParseUnit.Generic);
018 if (!tree || this.errors.length > 0) {
019 return [query, false];
020 }
021
022 const rewriter = new TokenStreamRewriter(this.tokenStream);
023 const expressions = XPath.findAll(tree, "/query/simpleStatement//queryExpression", this.parser);
024 let changed = false;
025 if (expressions.size > 0) {
026 // There can only be one top-level query expression where we can add a LIMIT clause.
027 const candidate: ParseTree = expressions.values().next().value;
028
029 // Check if the candidate comes from a subquery.
030 let run: ParseTree | undefined = candidate;
031 let invalid = false;
032 while (run) {
033 if (run instanceof SubqueryContext) {
034 invalid = true;
035 break;
036 }
037
038 run = run.parent;
039 }
040
041 if (!invalid) {
042 // Top level query expression here. Check if there's already a LIMIT clause before adding one.
043 const context = candidate as QueryExpressionContext;
044 if (!context.limitClause() && context.stop) {
045 // OK, ready to add our own LIMIT clause.
046 rewriter.insertAfter(context.stop, ` LIMIT ${offset}, ${count}`);
047 changed = true;
048 }
049 }
050 }
051
052 return [rewriter.getText(), changed];
053 }
What this code does:
- Line 017: the input is parsed to get a parse tree. If you have done that already, you can pass in the parse tree, of course, instead of parsing again.
- Line 022 prepares a new TokenStreamRewriter instance with your token stream.
- Line 023 uses ANTLR4's XPath feature to get all nodes of a specific context type. This is where you can retrieve all your purpose contexts in one go. This would also be a solution for your point 2).
- The following lines only check if a new LIMIT clause must be added at all. Not so interesting for you.
- Line 046 is the place where you manipulate the token stream. In this case something is added, but you can also replace or delete tokens (and with them entire nodes, by using a context's start and stop tokens).
- Line 052 contains probably what you are most interested in: it returns the original text of the input, but with all the rewrite actions applied.
With this code you can create a temporary Java file for compilation. It can also carry out two actions from your list in one pass: collecting the purposes and removing them.
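To make the "collect and remove in one pass" idea concrete, here is a minimal, self-contained sketch. It does not use ANTLR: MiniRewriter is a simplified stand-in I made up to show the principle (edits are queued against token indexes and only applied when the text is rendered, so the original token stream stays untouched). The purpose syntax, the token list and the purposeRanges array are hypothetical. In real code the ranges would come from the start and stop tokens of the contexts found via XPath, and you would call the rewriter's own delete and getText methods instead.

```typescript
// Simplified stand-in for the rewriter idea, NOT ANTLR's implementation:
// edits are recorded per token index and applied lazily in getText().
interface Token {
    index: number;
    text: string;
}

class MiniRewriter {
    private readonly deleted = new Set<number>();
    private readonly insertedAfter = new Map<number, string>();

    public constructor(private readonly tokens: Token[]) {}

    /** Marks the inclusive token range [from, to] for removal. */
    public delete(from: number, to: number): void {
        for (let i = from; i <= to; ++i) {
            this.deleted.add(i);
        }
    }

    /** Queues text to be emitted right after the token with the given index. */
    public insertAfter(index: number, text: string): void {
        this.insertedAfter.set(index, (this.insertedAfter.get(index) ?? "") + text);
    }

    /** Renders the original tokens with all queued edits applied. */
    public getText(): string {
        let result = "";
        for (const token of this.tokens) {
            if (!this.deleted.has(token.index)) {
                result += token.text;
            }
            result += this.insertedAfter.get(token.index) ?? "";
        }

        return result;
    }
}

// Hypothetical input: the "purpose" annotation syntax is invented for this sketch.
const source = ["int", " ", "x", " ", "purpose", "(", "\"counter\"", ")", ";"];
const tokens: Token[] = source.map((text, index) => ({ index, text }));

// Stand-in for the XPath lookup: the [start, stop] token range of each purpose node.
const purposeRanges: Array<[number, number]> = [[3, 7]];

const rewriter = new MiniRewriter(tokens);
const purposes: string[] = [];
for (const [start, stop] of purposeRanges) {
    // Collect the node's text first, then queue its removal.
    purposes.push(tokens.slice(start, stop + 1).map((t) => t.text).join("").trim());
    rewriter.delete(start, stop);
}

// The source without the purpose nodes, ready to be written to a temporary file.
const stripped = rewriter.getText();
```

After the loop, purposes holds the text of every removed node and stripped holds the remaining source, so both results come out of a single walk over the matches.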