
I'm a noob at using scikit-learn, so please bear with me.

I was going through the example: http://scikit-learn.org/stable/modules/tree.html#tree

>>> from sklearn.datasets import load_iris
>>> from sklearn import tree
>>> iris = load_iris()
>>> clf = tree.DecisionTreeClassifier()
>>> clf = clf.fit(iris.data, iris.target)
>>> from StringIO import StringIO
>>> out = StringIO()
>>> out = tree.export_graphviz(clf, out_file=out)

Apparently the Graphviz file is ready for use.

But how do I draw the tree using the Graphviz file? (The example did not go into detail about how the tree is drawn.)

Example code and tips are more than welcome!

Thanks!


Update

I'm using Ubuntu 12.04 and Python 2.7.3.

Ram Narasimhan
DjangoRocks
  • Scikit-learn from version 0.21 has method `plot_tree` which is much easier to use than exporting to graphviz. Anyway, there is also a very nice package dtreeviz. Here is a comparison of the visualization methods for sklearn trees: [blog post link](https://mljar.com/blog/visualize-decision-tree/). – pplonski Jul 04 '20 at 15:09

2 Answers


Which OS do you run? Do you have graphviz installed?

In your example, the StringIO() object holds the Graphviz data. Here is one way to check the data:

...
>>> print out.getvalue()

digraph Tree {
0 [label="X[2] <= 2.4500\nerror = 0.666667\nsamples = 150\nvalue = [ 50.  50.  50.]", shape="box"] ;
1 [label="error = 0.0000\nsamples = 50\nvalue = [ 50.   0.   0.]", shape="box"] ;
0 -> 1 ;
2 [label="X[3] <= 1.7500\nerror = 0.5\nsamples = 100\nvalue = [  0.  50.  50.]", shape="box"] ;
0 -> 2 ;
3 [label="X[2] <= 4.9500\nerror = 0.168038\nsamples = 54\nvalue = [  0.  49.   5.]", shape="box"] ;
2 -> 3 ;
4 [label="X[3] <= 1.6500\nerror = 0.0407986\nsamples = 48\nvalue = [  0.  47.   1.]", shape="box"] ;
3 -> 4 ;
5 [label="error = 0.0000\nsamples = 47\nvalue = [  0.  47.   0.]", shape="box"] ;
4 -> 5 ;
6 [label="error = 0.0000\nsamples = 1\nvalue = [ 0.  0.  1.]", shape="box"] ;
4 -> 6 ;
7 [label="X[3] <= 1.5500\nerror = 0.444444\nsamples = 6\nvalue = [ 0.  2.  4.]", shape="box"] ;
3 -> 7 ;
8 [label="error = 0.0000\nsamples = 3\nvalue = [ 0.  0.  3.]", shape="box"] ;
7 -> 8 ;
9 [label="X[0] <= 6.9500\nerror = 0.444444\nsamples = 3\nvalue = [ 0.  2.  1.]", shape="box"] ;
7 -> 9 ;
10 [label="error = 0.0000\nsamples = 2\nvalue = [ 0.  2.  0.]", shape="box"] ;
9 -> 10 ;
11 [label="error = 0.0000\nsamples = 1\nvalue = [ 0.  0.  1.]", shape="box"] ;
9 -> 11 ;
12 [label="X[2] <= 4.8500\nerror = 0.0425331\nsamples = 46\nvalue = [  0.   1.  45.]", shape="box"] ;
2 -> 12 ;
13 [label="X[0] <= 5.9500\nerror = 0.444444\nsamples = 3\nvalue = [ 0.  1.  2.]", shape="box"] ;
12 -> 13 ;
14 [label="error = 0.0000\nsamples = 1\nvalue = [ 0.  1.  0.]", shape="box"] ;
13 -> 14 ;
15 [label="error = 0.0000\nsamples = 2\nvalue = [ 0.  0.  2.]", shape="box"] ;
13 -> 15 ;
16 [label="error = 0.0000\nsamples = 43\nvalue = [  0.   0.  43.]", shape="box"] ;
12 -> 16 ;
}

You can write it to a .dot file and produce image output, as shown in the source you linked:

$ dot -Tpng tree.dot -o tree.png (PNG format output)
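
The same step can be driven from a Python script instead of the shell. The following is only a minimal sketch, assuming Python 3 (for `shutil.which`) and that the Graphviz `dot` executable is on the PATH; the helper name `render_dot` and its default file names are made up for illustration and are not part of scikit-learn or Graphviz.

```python
import shutil
import subprocess


def render_dot(dot_text, dot_path="tree.dot", png_path="tree.png"):
    """Write DOT text to dot_path and render it to png_path with `dot`.

    Returns False if the `dot` executable is not found on the PATH,
    True after a successful render. A failing `dot` run raises
    subprocess.CalledProcessError.
    """
    # Save the DOT source, e.g. the string returned by out.getvalue().
    with open(dot_path, "w") as handle:
        handle.write(dot_text)

    # Hypothetical guard: skip rendering when Graphviz is not installed.
    if shutil.which("dot") is None:
        return False

    # Equivalent to: dot -Tpng tree.dot -o tree.png
    subprocess.check_call(["dot", "-Tpng", dot_path, "-o", png_path])
    return True
```

With the question's code, this would be called as `render_dot(out.getvalue())`.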

theta
  • Hi, thanks! I'm using Ubuntu 12.04, Python version 2.7.3. I was wondering if there's any way I can do it within the Python script and not on the command line? – DjangoRocks May 13 '12 at 12:43
  • Sure, just grab one of the available [Python bindings to graphviz](https://www.google.com/search?q=python+graphviz+binding) and you should be able to do it from within the Python shell. – theta May 13 '12 at 14:01
  • Is there any way to do this in Python 3? – soupault Nov 25 '14 at 07:40

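You were very close! Just do the following (this assumes the pydot package is installed, since that is where graph_from_dot_data comes from; newer pydot releases return a list of graphs, in which case take the first element before calling write_pdf):

from pydot import graph_from_dot_data
graph_from_dot_data(out.getvalue()).write_pdf("somefile.pdf")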
J.J.
  • This will only work if the number of classes is small enough that the value arrays in the label text are not broken across lines. When they are, I've had to manually search/replace \n with '' (preserving the legitimate ones, of course), which is a bit of a pain. Ditto for one-hot encoded labels: they'll throw errors right away. – user1269942 Jan 07 '15 at 21:54
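
The manual search/replace described in the last comment could be automated along these lines. This is only a sketch, assuming the problem is raw line breaks that end up inside double-quoted label strings; the helper name `join_broken_labels` is hypothetical. The two-character escape sequence `\n` that Graphviz uses for line breaks within a label is left alone, as are the line breaks between statements.

```python
import re


def join_broken_labels(dot_text):
    """Remove raw line breaks that fall inside double-quoted strings.

    Escape sequences such as backslash + 'n' are kept, and newlines
    outside of quoted strings are not touched.
    """
    def _strip_raw_newlines(match):
        return match.group(0).replace("\r", "").replace("\n", "")

    # A quoted string: opening quote, then escaped characters or any
    # character that is not a quote or backslash, then a closing quote.
    quoted = r'"(?:\\.|[^"\\])*"'
    return re.sub(quoted, _strip_raw_newlines, dot_text, flags=re.DOTALL)


# A label whose value array was wrapped onto a second line.
broken = 'digraph Tree {\n0 [label="samples = 3\\nvalue = [ 0.  2.\n  1.]", shape="box"] ;\n}'
print(join_broken_labels(broken))
```

The cleaned string can then be passed to graph_from_dot_data in place of out.getvalue().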