I'm parsing SQL using the excellent JSQLParser library, which uses JavaCC internally.
The grammar consists of TOKENs and SPECIAL_TOKENs. The latter are used to remove single-line and multi-line comments from the token stream before the parser is called, like this:
SPECIAL_TOKEN:
{
< LINE_COMMENT: ("--" | "//") (~["\r","\n"])*>
| < MULTI_LINE_COMMENT: "/*" (~["*"])* "*" ("*" | (~["*","/"] (~["*"])* "*"))* "/">
}
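To make clear what these two definitions match, here is a rough sketch that transliterates them into `java.util.regex` patterns and runs them over part of my example. The class name `CommentPatterns` and the method `findComments` are made up for illustration and are not part of JSQLParser or JavaCC. Unlike the real lexer, this scan knows nothing about string literals, so it only approximates what the generated token manager does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CommentPatterns {
    // Transliteration of LINE_COMMENT: ("--" | "//") (~["\r","\n"])*
    static final String LINE_COMMENT = "(?:--|//)[^\\r\\n]*";

    // Transliteration of MULTI_LINE_COMMENT, term by term
    static final String MULTI_LINE_COMMENT = "/\\*[^*]*\\*(?:\\*|[^*/][^*]*\\*)*/";

    static final Pattern COMMENT =
            Pattern.compile(LINE_COMMENT + "|" + MULTI_LINE_COMMENT);

    // Hypothetical helper: returns every comment found in the text, in order.
    static List<String> findComments(String sql) {
        List<String> found = new ArrayList<>();
        Matcher m = COMMENT.matcher(sql);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String sql = "SELECT * /* All cols */ FROM aap -- two\n"
                   + "WHERE /* inline comment */ true";
        findComments(sql).forEach(System.out::println);
    }
}
```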
I can use the AST to find all the SPECIAL_TOKENs by just following .next on the root node and the resulting nodes, but then I lose the structure: this gives me just the contents of the comments, without the parse context.
I would like to use that context to implement a code formatter.
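For reference, this is how I understand the link between comments and regular tokens. According to the JavaCC documentation, every regular token has a `specialToken` field pointing at the last special token that precedes it, and that special token points further back through its own `specialToken` field. The sketch below uses a stand-in `Token` class with only those fields (the real one is generated by JavaCC), and the helper names `commentsBefore` and `interleave` are hypothetical. It assumes I can somehow reach the regular tokens that belong to an AST node, which is the part I am unsure about.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class SpecialTokenWalk {
    // Stand-in for the JavaCC-generated Token class; only the fields used here.
    static class Token {
        final String image;
        Token next;          // next regular token in the stream
        Token specialToken;  // last special token before this one, or null
        Token(String image) { this.image = image; }
    }

    // Hypothetical helper: comments directly preceding 'tok', in source order.
    static List<String> commentsBefore(Token tok) {
        Deque<String> stack = new ArrayDeque<>();
        // The chain runs backwards, so push each image onto the front.
        for (Token s = tok.specialToken; s != null; s = s.specialToken) {
            stack.push(s.image);
        }
        return new ArrayList<>(stack);
    }

    // Hypothetical helper: regular tokens with their comments woven back in.
    static List<String> interleave(Token first) {
        List<String> out = new ArrayList<>();
        for (Token t = first; t != null; t = t.next) {
            out.addAll(commentsBefore(t));
            out.add(t.image);
        }
        return out;
    }

    public static void main(String[] args) {
        // Hand-built chain for the first lines of my example.
        Token c1 = new Token("-- 1. this is");
        Token c2 = new Token("-- 2. an example");
        c2.specialToken = c1;
        Token select = new Token("SELECT");
        select.specialToken = c2;
        Token star = new Token("*");
        Token c3 = new Token("/* All cols */");
        Token from = new Token("FROM");
        from.specialToken = c3;
        select.next = star;
        star.next = from;
        System.out.println(interleave(select));
    }
}
```

With this, a comment is always tied to the regular token that follows it, so a formatter could decide where the comment goes based on where that token sits in the tree.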
My example:
-- 1. this is
-- 2. an example
SELECT * /* All cols */ FROM aap -- two
JOIN b ON a.c=d.c
WHERE /* inline comment */ true
-- an example
;
I want it to be formatted somewhat like this:
-- 1. this is
-- 2. an example
SELECT
* /* All cols */
FROM
aap -- two
JOIN
b ON a.c=d.c
WHERE /* inline comment */ true
-- an example
;
What is the correct approach using JavaCC?