I know that no two programming languages match perfectly, but I want to ask: if I have a simple program like hello world and I run the compiler front-end phases, such as lexing and parsing, to get the AST, can I send that AST to another environment? Say, take the AST of a C program and interpret it with Java?
3 Answers
The short answer: No.
The longer version:
If you had two different language implementations which documented and exported their AST interfaces, and the two interfaces were sufficiently similar that you could translate between them, then you could compile to an AST and then try to pass the AST to one of those implementations.
I can only speak hypothetically here, because it is pretty uncommon for language implementations to include an externally accessible AST interface. (One exception is Python, which allows you to compile to an AST, create or modify ASTs, and then compile from an AST. Here, "compile" means "compile to VM code". See the Python docs for more information.)
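As a minimal sketch of the Python round trip mentioned above (parse to an AST, modify it, compile from the AST), using only the standard `ast` module. The `AddToMult` transformer and the sample source string are made up for illustration:

```python
import ast

source = "result = 2 + 3"

# Parse source text into an AST; nothing is executed at this point.
tree = ast.parse(source, mode="exec")


class AddToMult(ast.NodeTransformer):
    """Illustrative rewrite: turn every binary '+' into '*'."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # rewrite nested expressions first
        if isinstance(node.op, ast.Add):
            node.op = ast.Mult()
        return node


# Modify the tree, then fill in any missing line/column information.
tree = ast.fix_missing_locations(AddToMult().visit(tree))

# Compile the modified AST to a code object (VM code) and run it.
namespace = {}
exec(compile(tree, filename="<ast>", mode="exec"), namespace)
print(namespace["result"])  # 6, because 2 + 3 was rewritten to 2 * 3
```

Note that this only works because producer and consumer are the same Python implementation and agree on what every node means.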
In particular, I don't know of a Java implementation which does that. Both GCC and clang can output something resembling an AST, but neither of them accepts one, and the output might not be sufficiently complete to define all aspects of the translation units.

-
Thanks rici, but let me make an argument. Let's say I serialize the AST into a stream of bytes, send it over a socket, and deserialize it back into a tree of objects in a program written in another language. JSON, YAML, and XML are simple, fairly standard formats for serializing and deserializing arbitrary data, so I would use one of those and then find a parser for it in the desired language. I think that is technically possible, though? Let me hear your take – user22092 Mar 29 '15 at 18:53
-
@user22092: You're certainly free to try to put some flesh on those bones. It is *theoretically possible*, yes, although your description skates over a number of issues. As I said in the answer, most languages do not define a standard set of AST objects, much less make them available in the standard runtime. Moreover, an AST does not capture the entirety of a parse -- there is also, for example, a symbol table -- and there is no guarantee that the totality of the processed input is a tree at all. (None of the interchange formats you mention can handle graphs.) But like I say, go for it! – rici Mar 30 '15 at 03:44
-
Okay, I will give it a try. I'm currently still working on it – user22092 Apr 02 '15 at 03:45
-
@user22092: consider building a Java AST for "x+y" as, well, "(+ x y)", where x and y are string types. Nothing will stop you from sending that tree to a C compiler that is willing to accept it. But C's interpretation of "(+ ....)" only allows numeric arithmetic. So shipping trees unchanged is a recipe for a semantic disaster. Languages differ enough in their interpretation of operators that this pretty much doesn't work even when the operators have the same name. Worse than that, what does the recipient language do with an operator it receives which is not in its language? ("setjmp" in Java?) – Ira Baxter Apr 04 '15 at 08:57
I'm not aware of any standardized AST representation formats which would enable such sharing (assuming we're talking about languages with similar semantics), but for instance in the Clang+LLVM architecture it seems that the AST output can be fed into multiple code generators (compilers).
As for a universal any-language interpreter in Java that reads an AST: I guess no such thing exists, and I doubt it would even be possible to build one, since the same words mean different things in different programming languages.
EDIT 2015-03-30 after clarifying comments
Let's say I serialize the AST into a stream of bytes, send it over a socket, and deserialize it back into a tree of objects in a program written in another language. Using JSON, YAML, XML which are simple, fairly standard languages for serializing and deserializing arbitrary data, Then find parsers for them in the desired language. I think it is technically possible
Given a concrete, simple subset of a concrete programming language, say a procedural language such as Tiny C, you can build its parse trees on one computer and send them to another computer for "interpreting". A Google query for ast intermediate representation can give you some hints, like http://icps.u-strasbg.fr/~pop/gcc-ast.html or http://lambda-the-ultimate.org/node/716, but that is a different problem than your original one of any language with an AST and a universal interpreter in Java.
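The restricted case described above (one small, agreed-upon language whose trees are serialized, shipped, and interpreted elsewhere) can be sketched roughly as follows. The node schema (`{"num": ...}` and `{"op": ..., "args": [...]}`) and the helper names are assumptions invented for this example, not any standard format, and both ends are written in Python here only to keep the sketch self-contained; the receiving side could just as well be a Java program using a JSON parser.

```python
import json


# Hypothetical node schema agreed on by sender and receiver.
def num(value):
    return {"num": value}


def binop(op, left, right):
    return {"op": op, "args": [left, right]}


def evaluate(node):
    """Interpret a deserialized tree; only the agreed subset is accepted."""
    if "num" in node:
        return node["num"]
    left, right = (evaluate(arg) for arg in node["args"])
    if node["op"] == "add":
        return left + right
    if node["op"] == "mul":
        return left * right
    raise ValueError(f"unknown operator: {node['op']!r}")


# Sender: build the tree for (1 + 2) * 4 and serialize it to text.
wire = json.dumps(binop("mul", binop("add", num(1), num(2)), num(4)))

# Receiver: deserialize the text and interpret the tree.
answer = evaluate(json.loads(wire))
print(answer)  # 12
```

The serialization step is the easy part. What makes this work is that both sides share one definition of what `"add"` and `"mul"` mean, and the receiver rejects anything outside that definition.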
I'm working on an experiment
asm.js is a modern version of the "parse a program in a language on one machine and send it to another machine for interpreting" problem, where the other machine is any modern web browser and the serialization format is a subset of JavaScript. With several billion web browsers around the planet, experiments using this can be both commercially beneficial and useful, as this project welcomes some further support or research from guys like you (?)
See also:
-
Thanks xmojmr, but let me make an argument. Let's say I serialize the AST into a stream of bytes, send it over a socket, and deserialize it back into a tree of objects in a program written in another language. JSON, YAML, and XML are simple, fairly standard formats for serializing and deserializing arbitrary data, so I would use one of those and then find a parser for it in the desired language. I think that is technically possible, though? Let me hear your take – user22092 Mar 29 '15 at 18:53
-
@user22092 I see. I've added one more chapter to my answer. If you're happy with it and no one provides something better within a reasonable timeframe, then the established Stack Overflow way of saying "thanks" is by voting or by [accepting the answer](http://stackoverflow.com/help/someone-answers) – xmojmr Mar 30 '15 at 03:10
Echoing Rici's response: Short answer, no.
This idea has been tried more than once. Usually it fails, at the very least because you cannot define a single AST node for "add" that means one thing for all languages. Semantics just plain differ, and you have to be able to differentiate the meaning of the operator in the specific language context in which it is found. There are lots of other troubles, like agreeing on the details of the representation (tree? DAG? graph?) and how much information is carried (AST? Symbol tables? Control flow? ...)
People keep trying.
The Object Management Group has a specification of an Abstract Syntax Tree Model, which attempts to define universal ASTs. What the OMG discovered was that, to make this practical, alongside their nirvana-style "General AST Model" (ick, "GASTM"), they also needed so-called "Specific AST Models" ("SASTM"), e.g., ASTs that are specific to the language, nay, even to a specific parser for that language, in order to be able to interpret the meaning of the operators and the operands accurately, as produced by that parser.
[I build a tool that handles multiple languages at the same time. It resolves the issue of the meaning of a node by essentially tagging each node with both the operator, e.g., "+", and the "domain" (notational system) in which the operator should be interpreted. In effect, this is the same as the SASTM solution. We don't believe in the GASTM and so don't bother with it.]
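To make the idea of tagging each node with an operator plus a domain concrete, here is a toy sketch. It is not the representation used by the tool mentioned above; the dictionary layout, the `SEMANTICS` table, and the two handlers are all assumptions made up for illustration, and both handlers simplify the real language rules considerably.

```python
def java_add(a, b):
    # Simplified Java rule: '+' concatenates if either operand is a String.
    if isinstance(a, str) or isinstance(b, str):
        return str(a) + str(b)
    return a + b


def c_add(a, b):
    # Simplified C rule: '+' is arithmetic only (pointer arithmetic ignored).
    if isinstance(a, str) or isinstance(b, str):
        raise TypeError("C '+' is not defined for strings")
    return a + b


# The meaning of an operator is looked up by (domain, operator), never by
# the operator name alone.
SEMANTICS = {("Java", "+"): java_add, ("C", "+"): c_add}


def evaluate(node):
    if not isinstance(node, dict):
        return node  # leaf: a literal value
    handler = SEMANTICS[(node["domain"], node["op"])]
    return handler(*(evaluate(arg) for arg in node["args"]))


java_node = {"domain": "Java", "op": "+", "args": ["x", 1]}
c_node = {"domain": "C", "op": "+", "args": ["x", 1]}

print(evaluate(java_node))  # x1
try:
    evaluate(c_node)
except TypeError as err:
    print("rejected:", err)
```

The same `"+"` tree is accepted in one domain and rejected in the other, which is the reason a single untagged "add" node cannot serve all languages.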

-
Okay, I have learned that. What I did now is create a source text format with my own rules and grammar. Then, once I had the tree for that new language, I wrote an interpreter in both Python and Java to evaluate the tree and give me the result. Man, I would love for this universal tree to be a standard. That way we could benchmark code without having to write it all over again, or simply move from one language to another – user22092 Apr 14 '15 at 18:15
-
It is one thing to define a "Specific Abstract Syntax Tree (Model)" [SASTM!] with a well defined meaning, such that two different applications can read and correctly interpret that specific model. It is another to believe that you can build a Universal syntax. You're tilting at windmills. – Ira Baxter Apr 14 '15 at 18:26
-
I've been doing this for about 40 years. No answers yet. Good luck with that here. – Ira Baxter Apr 14 '15 at 20:22