
This is my first question on Stack Overflow, so I apologize for any mishaps in layout and so on (advice welcome). Your help is much appreciated!

I'm trying to visualize the output of DecisionTreeRegressor with multiple outputs (as described in http://scikit-learn.org/stable/auto_examples/tree/plot_tree_regression_multioutput.html#example-tree-plot-tree-regression-multioutput-py) in png or pdf format using pydot.

The code I tried looks like this:

...
dtreg = tree.DecisionTreeRegressor(max_depth=3)
dtreg.fit(x,y)

tree.export_graphviz(dtreg, out_file='tree.dot')  # write the .dot file

dot_data = StringIO()
tree.export_graphviz(dtreg, out_file=dot_data)
print dot_data.getvalue()
pydot.graph_from_dot_data(dot_data.getvalue()).write_pdf("pydot_try.pdf") 

Writing the pdf gives the following errors:

pydot.InvocationException: Program terminated with status: 1. stderr follows:
Warning: /tmp/tmpAy7d59:7: string ran past end of line
Error: /tmp/tmpAy7d59:8: syntax error near line 8
context: >>> [ <<< 0.20938667]
Warning: /tmp/tmpAy7d59:18: string ran past end of line
Warning: /tmp/tmpAy7d59:20: string ran past end of line

and so on with more "string ran past end of line" errors.

I've never worked with .dot before, but I suspect there might be a problem with the multi-output format. For example, part of the tree looks like this:

digraph Tree {
0 [label="X[0] <= 56.0000\nmse = 0.0149315126135\nsamples = 41", shape="box"] ;
1 [label="X[0] <= 40.0000\nmse = 0.0137536911947\nsamples = 25", shape="box"] ;
0 -> 1 ;
2 [label="X[0] <= 24.0000\nmse = 0.0152142545276\nsamples = 21", shape="box"] ;
1 -> 2 ;
3 [label="mse = 0.0140\nsamples = 15\nvalue = [[ 0.83384667]
 [ 0.20938667]
 [ 0.08511333]
 [ 0.04234667]
 [ 0.08158   ]
 [ 0.17948667]
 [ 0.03616   ]
 [ 0.00995333]
 [ 0.99529333]
 [ 0.13715333]
 [ 0.10294667]
 [ 0.06632667]]", shape="box"] ;
2 -> 3 ;
4 [label="mse = 0.0170\nsamples = 6\nvalue = [[ 0.69588333]
 [ 0.20275   ]
 [ 0.0953    ]
 [ 0.0436    ]
 [ 0.1216    ]
 [ 0.17248333]
 [ 0.04393333]
 [ 0.01178333]
 [ 0.99913333]
 [ 0.12348333]
 [ 0.10838333]
 [ 0.06973333]]", shape="box"] ;
2 -> 4 ;
}

I don't know how to solve this, because that's just the output I get from DecisionTreeRegressor.

I also tried converting the dot file:

dot -Tpng tree.dot -o tree.png

But this gives the same errors (string ran past end of line). I also tried visualizing tree.dot using xdot, which gave the same error.

CSquare

2 Answers


Follow the instructions below to view the decision tree.

• Using sklearn, we can export the tree in DOT format. A .dot file is a plain-text file.

• A .dot file can be converted to an image file using the Graphviz utility.

• Download graphviz.msi from the website: http://www.graphviz.org/Download_windows.php

• Ensure that the Graphviz bin directory (\graphviz\bin) is added to the PATH environment variable.

A .dot file can be generated with the sklearn module using the following commands:

from sklearn import tree
tree.export_graphviz(clf, out_file='tree.dot')  # clf is a fitted tree estimator

At the command prompt, run the following to convert the .dot file to a .png file:

 dot -Tpng tree.dot -o tree.png
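
The same conversion can also be driven from Python. The following is only a minimal sketch: the helper names build_dot_command and render_png are hypothetical, and it assumes Python 3 and that the Graphviz dot executable is on the PATH as described above.

```python
import shutil
import subprocess


def build_dot_command(dot_exe, dot_file, png_file):
    """Return the argument list equivalent to: dot -Tpng <dot_file> -o <png_file>."""
    return [dot_exe, "-Tpng", dot_file, "-o", png_file]


def render_png(dot_file="tree.dot", png_file="tree.png"):
    """Convert a .dot file to a .png file by calling the Graphviz 'dot' program."""
    # shutil.which returns None when 'dot' cannot be found on PATH.
    dot_exe = shutil.which("dot")
    if dot_exe is None:
        raise RuntimeError("Graphviz 'dot' was not found on PATH")
    # check=True raises CalledProcessError if dot exits with a non-zero status,
    # for example on a syntax error in the .dot file.
    subprocess.run(build_dot_command(dot_exe, dot_file, png_file), check=True)
```

Calling render_png() after tree.export_graphviz(...) has written tree.dot should produce tree.png in the current directory, provided the .dot file itself is valid.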
Praveen Gupta Sanka

The error message appears to be telling you that there is a problem with the multiline strings (labels). As shown here, to specify multiline labels in dot you can use \n, or alternatively as described in the DOT language documentation:

As another aid for readability, dot allows double-quoted strings to span multiple physical lines using the standard C convention of a backslash immediately preceding a newline character.

That said, when I attempted to generate your plot using dot on Graphviz version 2.39.20141007.0445 it worked absolutely fine:

[Image: the rendered tree]

I can't find a reference to the format changing; however, it may be worth trying again with the latest version of Graphviz installed.
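
If upgrading is not an option, one possible workaround is to post-process the DOT text so that raw newlines inside double-quoted strings become the \n escape. This is only a sketch and does not come from the Graphviz or scikit-learn documentation. The helper name escape_newlines_in_quotes is hypothetical, and the scan assumes the file contains no DOT comments or HTML-like labels with stray double quotes.

```python
def escape_newlines_in_quotes(dot_source):
    """Replace raw newlines inside double-quoted DOT strings with a backslash followed by 'n'."""
    out = []
    in_string = False  # True while scanning inside a double-quoted string
    escaped = False    # True when the previous character was a backslash
    for ch in dot_source:
        if not in_string:
            if ch == '"':
                in_string = True
            out.append(ch)
        elif escaped:
            # Keep escape pairs such as \n, \" and backslash-newline untouched.
            out.append(ch)
            escaped = False
        elif ch == "\\":
            escaped = True
            out.append(ch)
        elif ch == '"':
            in_string = False
            out.append(ch)
        elif ch == "\n":
            # A raw line break inside a label: emit the two-character escape instead.
            out.append("\\n")
        else:
            out.append(ch)
    return "".join(out)


# A shortened version of node 3 from the question, with one raw newline in the label.
broken = (
    'digraph Tree {\n'
    '3 [label="samples = 15\\nvalue = [[ 0.83384667]\n'
    ' [ 0.20938667]]", shape="box"] ;\n'
    '}\n'
)
fixed = escape_newlines_in_quotes(broken)
print(fixed)
```

The result can then be written back to tree.dot, or passed to pydot.graph_from_dot_data in place of the original string.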

mfitzp
  • The same code also works fine for me now. I'm not sure what happened or got fixed, but thanks for pointing it out! – CSquare May 27 '16 at 10:46