Quoting the answer from https://github.com/tree-sitter/tree-sitter/discussions/831:

> I think the biggest downside to using a Tree-sitter parser in a
> compiler front-end is that, while we've done a lot of work on
> Tree-sitter's error recovery, we haven't yet built out functionality
> for error messages. So it isn't trivial to find out the exact
> token/position where the error initiated, and get a list of expected
> tokens, and things like that.
>
> Also, the error recovery currently isn't customizable in
> domain-specific ways (e.g. as soon as the word "function" appears,
> assume that the user meant to write an entire function definition).
> Down the road, I would love to invest in both of these things, but
> because there's so much other stuff we're working on, it may be a
> while before this happens.

I managed to use a tree-sitter parser for a toy language to implement an interpreter in Rust: https://github.com/sgraf812/tree-sitter-lambda/blob/35fe05520e806548dedb48e7f97118847b531b26/src/main.rs
Having done that, I can't recommend it:
- (Rust is a bit of a horrible language to do this in, with all the cyclic references. There might be better ways, though.)
- There is no AST, and no means to generate one, because tree-sitter does not allow specification of reduction actions (because that again would tie the meta language to the specification language, as is the case for `bison` and C). This means you have to switch over `Node::kind`, a string. Inefficient and incomplete matches everywhere.
- The syntax tree nodes only store ranges, not the associated source code string, leading to a bit of an unwieldy API; see the uses of `utf8_text`.
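
To make the last two points concrete, here is a minimal sketch of what building a typed overlay by hand tends to look like. It deliberately does not use the tree-sitter crate: `RawNode`, `text`, `Expr`, `lower` and `demo` are all hypothetical names invented for this illustration, and `RawNode` only mimics the two properties in question (a string kind, and a byte range instead of the source text).

```rust
use std::ops::Range;

/// Hypothetical stand-in for an untyped syntax-tree node. This is NOT
/// tree-sitter's real `Node` type; it only mimics the two properties
/// discussed above: a string kind and a byte range instead of text.
struct RawNode {
    kind: &'static str,
    range: Range<usize>,
    children: Vec<RawNode>,
}

impl RawNode {
    /// The node stores only a range, so the source has to be passed in
    /// every time the text is needed.
    fn text<'a>(&self, source: &'a str) -> &'a str {
        &source[self.range.clone()]
    }
}

/// The typed overlay one would like to work with in an interpreter.
#[derive(Debug, PartialEq)]
enum Expr {
    Var(String),
    Lam(String, Box<Expr>),
    App(Box<Expr>, Box<Expr>),
}

/// Lowers the untyped tree into `Expr` by switching over the kind string.
fn lower(node: &RawNode, source: &str) -> Result<Expr, String> {
    match (node.kind, node.children.as_slice()) {
        ("var", []) => Ok(Expr::Var(node.text(source).to_string())),
        ("lam", [param, body]) => Ok(Expr::Lam(
            param.text(source).to_string(),
            Box::new(lower(body, source)?),
        )),
        ("app", [fun, arg]) => Ok(Expr::App(
            Box::new(lower(fun, source)?),
            Box::new(lower(arg, source)?),
        )),
        // The compiler cannot check that all kinds are covered, so a
        // catch-all arm with a runtime error is unavoidable.
        (other, _) => Err(format!("unexpected node kind: {}", other)),
    }
}

/// Builds the tree for `\x.x` by hand (standing in for a parser) and lowers it.
fn demo() -> Result<Expr, String> {
    let source = "\\x.x";
    let leaf = |kind: &'static str, range: Range<usize>| RawNode {
        kind,
        range,
        children: vec![],
    };
    let tree = RawNode {
        kind: "lam",
        range: 0..4,
        children: vec![leaf("ident", 1..2), leaf("var", 3..4)],
    };
    lower(&tree, source)
}
```

Every arm matches on a string literal and an assumed child layout, so a typo in a kind name or a grammar change only shows up at runtime through the catch-all arm. And `source` has to be threaded through every function that needs a node's text.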
I have a feeling that tree-sitter is best in class only when you don't need a typed overlay of the syntax tree.
See also https://github.com/tree-sitter/tree-sitter/discussions/831#discussioncomment-5797368 for another experience report.