This is a follow-up to "Using symbol font / math notation in graphviz". It was also posted on the Graphviz discussion forum, http://www.graphviz.org/content/subscripts-greek-letters-dot-edge-labels, but there has been no response.
[Env: graphviz 2.38, Windows 7]
I'm working on a project to create path diagrams for structural equation models in R with the sem package. The sem package contains a function, pathDiagram, that does this reasonably well by constructing the required code for dot.
We use two back-end renderers: dot itself, with -Tpdf, and the R DiagrammeR package, which uses the JavaScript libraries grViz and mermaid.
We recently added code to allow rendering edge labels using Greek letters and subscripts, by using the UTF-8 character equivalents, e.g.
"beta" "β" "β"
"gamma" "γ" "γ"
and
subscripts <- c("₀", "₁", "₂", "₃", "₄", "₅", "₆",
"₇", "₈", "₉")
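To make the substitution concrete, here is a minimal sketch of the kind of mapping described above. It is not the actual code in the sem package: the helper name to_utf8_label, the small greek lookup table, and the assumption that parameter names look like "beta21" are all hypothetical, chosen only for illustration. The characters are written as \u escapes so the snippet does not depend on the encoding of the source file.

```r
# Hypothetical helper for illustration only; not the sem package's code.
# Lookup table for a few Greek letter names (assumed naming convention).
greek <- c(beta = "\u03b2", gamma = "\u03b3", sigma = "\u03c3")

# Unicode subscript digits 0-9 (U+2080 to U+2089).
subscripts <- c("\u2080", "\u2081", "\u2082", "\u2083", "\u2084",
                "\u2085", "\u2086", "\u2087", "\u2088", "\u2089")

to_utf8_label <- function(name) {
  # Split a name such as "beta21" into the letter name and trailing digits.
  m <- regmatches(name, regexec("^([A-Za-z]+)([0-9]*)$", name))[[1]]
  # Leave the name unchanged if it does not match or the letter is unknown.
  if (length(m) == 0 || !(m[2] %in% names(greek))) return(name)
  # Convert each digit to its subscript character (digit 0 is element 1).
  digits <- as.integer(strsplit(m[3], "")[[1]])
  paste0(greek[[m[2]]], paste(subscripts[digits + 1], collapse = ""))
}
```

Under these assumptions, to_utf8_label("gamma22") produces the string γ₂₂, which is then pasted into the edge label in the generated dot file.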
We find that this works perfectly with DiagrammeR. With dot, we do get the Greek letters, but nothing we have tried allows us to get subscripts from the standard command dot -T pdf -o file.pdf file.dot. All we get are those little boxes containing the 4-digit character code.
Is this a bug or limitation of dot? Is there any work-around?
Here is an example of a dot file generated by our software that illustrates this behavior.
digraph "union.sem" {
rankdir=LR;
size="8,8";
node [fontname="Helvetica" fontsize=14 fillcolor="transparent" shape=box style=filled];
edge [fontname="Helvetica" fontsize=10];
center=1;
{rank=min "x1"}
{rank=min "x2"}
"y1" [fillcolor="transparent"]
"y2" [fillcolor="transparent"]
"y3" [fillcolor="transparent"]
"x2" -> "y1" [label="γ̂&2081;&2082;=-0.09" color=red penwidth=1.001];
"y1" -> "y2" [label="β₂₁=-0.28" color=red penwidth=1.001];
"x2" -> "y2" [label="γ₂₂=0.06" color=black penwidth=1.001];
"y1" -> "y3" [label="β₃₁=-0.22" color=red penwidth=1.001];
"y2" -> "y3" [label="β₃₁=0.85" color=black penwidth=1.001];
"x1" -> "y3" [label="γ₃₁=0.86" color=black penwidth=1.001];
"x1" -> "x2" [label="σ₁₂=7.14" dir=both color=black penwidth=1.001];
// variable labels:
"y1" [label="Deference"];
"y2" [label="Activism"];
"y3" [label="Sentiment"];
"x1" [label="Years"];
"x2" [label="Age"];
}
And the result (using -Tpng)
(In this example, I also tried using Unicode character codes for the subscripts in the x2 -> y1
path, taken from "How to find the unicode of the subscript alphabet?", but these just appear literally as their &2081;
strings.)