
So, because of an assignment I was given, I've ended up wondering what the next step actually might be.

To clarify, I am supposed to implement a DSL using Java. The language should enable the "user" to specify and create questionnaire forms based on the input.

The following input should produce the output given below:

form taxOfficeExample {
  "Did you sell a house in 2010?"
    hasSoldHouse: boolean
  "Did you buy a house in 2010?"
    hasBoughtHouse: boolean
  "Did you enter a loan?"
    hasMaintLoan: boolean

  if (hasSoldHouse) {
    "What was the selling price?"
      sellingPrice: money
    "Private debts for the sold house:"
      privateDebt: money
    "Value residue:"
      valueResidue: money = (sellingPrice - privateDebt)
  }
}

[Images of the expected output: Step 1, Step 2]

The parser technology I've chosen is ANTLR v4, which seems to be the best option for this platform. I am familiar with the models and terminology (parsing, lexemes, grammars, etc.), but one thing is still lacking: the bridge between Java and ANTLR v4.

So basically, what I would like to know, based on your experience, is this: what is the bridge between ANTLR v4 and Java? For example, once I define a grammar for the DSL, how can that grammar (language) be applied? What connects those two entities?

I'm asking these questions only because I'm quite new to this area; therefore, any tips, pointers to research papers, etc. will be appreciated!

Thanks

dsafa
    Parse the DSL specification instance, build a tree, walk the tree and generate Java code to implement the specification. You are building a tiny compiler. – Ira Baxter Feb 02 '15 at 22:39

1 Answer


You write an ANTLR4 grammar; you'll get an "AST" for free.

Walk the tree --> visit all the nodes.

At each node, you want to try to generate a text string in your targeted language that realizes the effect of that node, assuming that the effects of other nodes already visited have been realized.

As a practical matter, sometimes you have to generate code out of order ("a+b" parsed will have "+" as the root, and that will get visited first, but it is clear that "a" and "b" must both be fetched first, so you need to generate the code for them before the code for "+"), or you will have to collect data from "far away in the tree" to do code generation. This means that sometimes during the tree walk, you must navigate to some other node in the tree that is not the next or previous node in the tree walk.
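To make the "far away in the tree" point concrete, here is a minimal sketch of a two-pass walk. Everything in it is an assumption for illustration: the `Node` class and the names `collect`, `check` and `analyze` are hypothetical and hand-rolled, not ANTLR's API. The first pass records every declared question; the second consults those declarations when it reaches an `if`, even though the declaration lives elsewhere in the tree.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, hand-rolled tree; a real project would walk ANTLR's parse tree instead.
class TwoPassWalk {
    static final class Node {
        final String kind; // "form", "question" or "if"
        final String name; // declared variable, or the variable tested by an "if"
        final String type; // declared type for a question, otherwise null
        final List<Node> children = new ArrayList<>();

        Node(String kind, String name, String type) {
            this.kind = kind;
            this.name = name;
            this.type = type;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    // Pass 1: record every declared variable and its type.
    static void collect(Node n, Map<String, String> symbols) {
        if (n.kind.equals("question")) {
            symbols.put(n.name, n.type);
        }
        for (Node c : n.children) {
            collect(c, symbols);
        }
    }

    // Pass 2: at each "if", consult the declaration found elsewhere in the tree.
    static void check(Node n, Map<String, String> symbols, List<String> errors) {
        if (n.kind.equals("if")) {
            String type = symbols.get(n.name);
            if (type == null) {
                errors.add("undeclared: " + n.name);
            } else if (!type.equals("boolean")) {
                errors.add("not boolean: " + n.name);
            }
        }
        for (Node c : n.children) {
            check(c, symbols, errors);
        }
    }

    // Runs both passes and returns the problems found (empty when all is well).
    static List<String> analyze(Node root) {
        Map<String, String> symbols = new LinkedHashMap<>();
        collect(root, symbols);
        List<String> errors = new ArrayList<>();
        check(root, symbols, errors);
        return errors;
    }

    public static void main(String[] args) {
        Node form = new Node("form", "taxOfficeExample", null)
            .add(new Node("question", "hasSoldHouse", "boolean"))
            .add(new Node("if", "hasSoldHouse", null)
                .add(new Node("question", "sellingPrice", "money")));
        System.out.println(analyze(form));
    }
}
```

Collecting declarations up front is only one way to do this; you could also navigate from the `if` node to the declaration on demand.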

Often your code generation consists of printing fixed boilerplate code, alternating with text from other places (e.g., "push(A)", "push(B)", "add", where "push(", ")" and "add" are boilerplate, and "A" and "B" are text generated from other tree nodes).
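As a rough sketch of the push/add idea, assuming a tiny hand-written expression tree (the `Expr`, `Var` and `BinOp` types and the `sub` instruction are hypothetical, introduced only for this example), the generator below emits the operands before the operator even though the operator is the root node:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical mini expression tree; in practice these nodes would come from the parser.
class StackCodeGen {
    interface Expr {}

    static final class Var implements Expr {
        final String name;

        Var(String name) {
            this.name = name;
        }
    }

    static final class BinOp implements Expr {
        final char op;
        final Expr left;
        final Expr right;

        BinOp(char op, Expr left, Expr right) {
            this.op = op;
            this.left = left;
            this.right = right;
        }
    }

    // Returns the generated instructions, one per list element.
    static List<String> generate(Expr e) {
        List<String> out = new ArrayList<>();
        emit(e, out);
        return out;
    }

    private static void emit(Expr e, List<String> out) {
        if (e instanceof Var) {
            // Boilerplate "push(" ... ")" wrapped around text taken from the node.
            out.add("push(" + ((Var) e).name + ")");
        } else if (e instanceof BinOp) {
            BinOp b = (BinOp) e;
            // The operator is the root, but its operands must be emitted first.
            emit(b.left, out);
            emit(b.right, out);
            switch (b.op) {
                case '+': out.add("add"); break;
                case '-': out.add("sub"); break;
                default: throw new IllegalArgumentException("unknown operator: " + b.op);
            }
        } else {
            throw new IllegalArgumentException("unknown node type");
        }
    }

    public static void main(String[] args) {
        Expr e = new BinOp('-', new Var("sellingPrice"), new Var("privateDebt"));
        for (String line : generate(e)) {
            System.out.println(line);
        }
    }
}
```

For the tree of "A+B" this yields push(A), push(B), add, in that order.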

The code you generate: it could be in any language; you seem to have chosen to do it in Java, so you get to generate Java source code.

The GUI elements are implied by your DSL: most of your operations are "paint a label, paint a yes/no box", so your generated Java code is likely to call on some GUI library of your choice that can achieve those actions.
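For instance, if you picked Swing (an assumption; any GUI library would do), the generator for one boolean question might be sketched like this. The class and method names are hypothetical, and the generated text assumes the surrounding generated code declares a container called `panel` and imports `javax.swing.*`:

```java
// Sketch of a generator that emits Java/Swing source text for one yes/no question.
class QuestionCodeGen {
    // Turns arbitrary text into a Java string literal (escapes backslashes and quotes).
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // "Paint a label, paint a yes/no box": a JLabel followed by a JCheckBox.
    static String forBooleanQuestion(String label, String variable) {
        return "panel.add(new JLabel(" + quote(label) + "));\n"
             + "JCheckBox " + variable + " = new JCheckBox();\n"
             + "panel.add(" + variable + ");\n";
    }

    public static void main(String[] args) {
        System.out.print(forBooleanQuestion("Did you sell a house in 2010?", "hasSoldHouse"));
    }
}
```

Note that the generator only produces text; it never touches Swing itself, so it can run anywhere.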

All of this is a standard realization of so-called "syntax-directed translation". If it isn't clear to you at this point, you need to read an article or book on the topic (Aho/Sethi/Ullman Compilers is good), and you have to actually build a bad one (meaning, "start coding") in order to understand what goes right and what goes wrong. Having done one, you'll have a lot of insight into what went wrong, and then you can try to do it better.

Ira Baxter
  • Dear Ira, after doing some further research - the approach is completely valid. Except for one thing: ASTs are unfortunately deprecated in ANTLR4. And since a requirement is to represent the internal structure of the DSL using ASTs - any idea on the transformation? I've stumbled upon this class - org.antlr.v4.tool.ast - but still haven't found any reference to it. I do know the reasons why it is deprecated, and tried to make a tradeoff in order to ignore the requirement - but it seems that a requirement is a requirement. Any idea? – dsafa Feb 03 '15 at 19:07
  • ASTs are deprecated in ANTLR4? Maybe you mean "manually defined ASTs"; AFAIK, ANTLR4 will build you a parse tree (OK, not technically an AST but there is little point in belaboring the distinction; you can still walk the tree and build code). As far as spitting text, ANTLR has something called "string templates". – Ira Baxter Feb 03 '15 at 20:38
  • That's right - ASTs are deprecated within ANTLR4. This decision was made because there were efficiency and maintainability issues related to this technique. Anyway, I'll go along with v4, and try to model an AST from a parse tree by using OO modelling. Correct me if the approach is not valid ;) – dsafa Feb 05 '15 at 13:01
  • What you are attempting to do is re-invent the AST ("with OO modeling") from the CST. If you really think that is needed... well, then you can do that. I built a system (DMS, see my bio) that arguably only has CSTs. We have found that the "extra abstraction" that a real AST brings isn't a big enough win to bother with. See http://stackoverflow.com/a/1916687/120163 – Ira Baxter Feb 05 '15 at 15:45
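
If you do go the "AST by OO modelling" route discussed in these comments, the conversion might look roughly like the sketch below. All of it is assumed for illustration: `Cst` is a hand-rolled stand-in for a concrete syntax tree (with ANTLR you would read the generated context objects instead), and the child positions reflect a made-up grammar shape. The point is only that punctuation tokens are dropped and each rule becomes a small typed object.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical CST-to-AST conversion; rule names and child positions are assumptions.
class CstToAst {
    // Generic concrete-syntax node: a rule name, token text for leaves, and children.
    static final class Cst {
        final String rule;
        final String text;
        final List<Cst> kids;

        Cst(String rule, String text, Cst... kids) {
            this.rule = rule;
            this.text = text;
            this.kids = Arrays.asList(kids);
        }
    }

    static Cst tok(String text) {
        return new Cst("token", text);
    }

    // The AST keeps only what later phases need.
    interface Stmt {}

    static final class Question implements Stmt {
        final String label;
        final String name;
        final String type;

        Question(String label, String name, String type) {
            this.label = label;
            this.name = name;
            this.type = type;
        }
    }

    static final class IfBlock implements Stmt {
        final String condition;
        final List<Stmt> body;

        IfBlock(String condition, List<Stmt> body) {
            this.condition = condition;
            this.body = body;
        }
    }

    // Converts a "block" CST node into a list of AST statements.
    static List<Stmt> toAst(Cst block) {
        List<Stmt> out = new ArrayList<>();
        for (Cst k : block.kids) {
            if (k.rule.equals("question")) {
                // Assumed children: STRING ID ':' TYPE (the colon is dropped).
                out.add(new Question(k.kids.get(0).text, k.kids.get(1).text, k.kids.get(3).text));
            } else if (k.rule.equals("ifStmt")) {
                // Assumed children: 'if' '(' ID ')' block.
                out.add(new IfBlock(k.kids.get(2).text, toAst(k.kids.get(4))));
            }
            // Plain tokens such as '{' and '}' carry no meaning in the AST and are skipped.
        }
        return out;
    }

    public static void main(String[] args) {
        Cst question = new Cst("question", null,
            tok("\"Did you sell a house in 2010?\""), tok("hasSoldHouse"), tok(":"), tok("boolean"));
        Cst block = new Cst("block", null, tok("{"), question, tok("}"));
        System.out.println(toAst(block).size() + " statement(s)");
    }
}
```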