11

I have been looking at the Roslyn code base and noticed that they have two versions of syntax(One internal and one public). Often these appear to be referred to as "Red" nodes and "Green" nodes. I am wondering if anyone can explain what the reasoning is for having two versions of syntax like this.

dave
  • 4,812
  • 4
  • 25
  • 38
kthompson
  • 803
  • 6
  • 16
  • 2
    See also http://stackoverflow.com/questions/10417169/are-roslyn-syntaxnodes-reused/10417510#10417510 – Eric Lippert Oct 26 '16 at 16:12
  • For more background on this, you can read a paper I've written some time ago. Page 35: https://www.dropbox.com/s/rc9edahndlog0je/MainPaper.pdf?dl=0 – Jeroen Vannevel Oct 27 '16 at 22:30

1 Answers1

16

From Persistence, Facades and Roslyn’s Red-Green Trees:

The “green” tree is immutable, persistent, has no parent references, is built “bottom-up”, and every node tracks its width but not its absolute position. When an edit happens we rebuild only the portions of the green tree that were affected by the edit, which is typically about O(log n) of the total parse nodes in the tree.

The “red” tree is an immutable facade that is built around the green tree; it is built “top-down” on demand and thrown away on every edit. It computes parent references by manufacturing them on demand as you descend through the tree from the top. It manufactures absolute positions by computing them from the widths, again, as you descend.

You, the consumer of the Roslyn API, only ever see the red tree; the green tree is an implementation detail. (And if you use the debugger to peer into the internal state of a parse node you’ll in fact see that there is a reference to another parse node in there of a different type; that’s the green tree node.)

Incidentally, these are called “red/green trees” because those were the whiteboard marker colours we used to draw the data structure in the design meeting. There’s no other meaning to the colours.

Community
  • 1
  • 1
stuartd
  • 70,509
  • 14
  • 132
  • 163