I'm trying to plot a decision tree but I get this error:

'Pipeline' object has no attribute 'tree_' 

First I build my model from a preprocessor (my columns have dtypes int and object):

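from sklearn.compose import ColumnTransformer
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

# Route categorical and numerical columns to their own preprocessors
preprocessor = ColumnTransformer([
    ('one-hot-encoder', categorical_preprocessor, categorical_columns),
    ('standard_scaler', numerical_preprocessor, numerical_columns)])

model3 = make_pipeline(preprocessor, DecisionTreeClassifier())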

Then I fit the model and generate the predictions:

model3 = model3.fit(data_train, target_train)
y_pred3 = model3.predict(data_test)

After that I try to plot the tree:

tree.plot_tree(model3)

but I get the error:

AttributeError                            Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_22012/3111274197.py in <module>
----> 1 tree.plot_tree(model3)

~\anaconda3\lib\site-packages\sklearn\tree\_export.py in plot_tree(decision_tree, max_depth, feature_names, class_names, label, filled, impurity, node_ids, proportion, rounded, precision, ax, fontsize)
    193         fontsize=fontsize,
    194     )
--> 195     return exporter.export(decision_tree, ax=ax)
    196 
    197 

~\anaconda3\lib\site-packages\sklearn\tree\_export.py in export(self, decision_tree, ax)
    654         ax.clear()
    655         ax.set_axis_off()
--> 656         my_tree = self._make_tree(0, decision_tree.tree_, decision_tree.criterion)
    657         draw_tree = buchheim(my_tree)
    658 

AttributeError: 'Pipeline' object has no attribute 'tree_'

How can I plot my tree? Or is this impossible because I use a pipeline?

Flavia Giammarino
PawelKinczyk
    You should access the `DecisionTreeClassifier` instance in your pipeline to be able to plot the tree, which you can do as follows `plot_tree(model3.named_steps['decisiontreeclassifier'])`, where the step name `'decisiontreeclassifier'` is implied by the way [`make_pipeline`](https://scikit-learn.org/stable/modules/generated/sklearn.pipeline.make_pipeline.html) is implemented. – amiola Jun 21 '22 at 21:46
    @amiola You should write this comment as an answer, because it's spot on. – Matt Hall Jun 21 '22 at 23:30

1 Answer

As stated in the comments, you need to access the DecisionTreeClassifier instance inside your pipeline in order to plot the tree, because plot_tree expects a fitted tree estimator rather than a Pipeline. You can do that as follows:

plot_tree(model3.named_steps['decisiontreeclassifier'])

Here named_steps is an attribute of Pipeline that gives access to the pipeline's steps by name, and 'decisiontreeclassifier' is the step name that make_pipeline assigns automatically. As its documentation puts it:

This is a shorthand for the Pipeline constructor; it does not require, and does not permit, naming the estimators. Instead, their names will be set to the lowercase of their types automatically.
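To make this concrete, here is a minimal sketch of the whole flow. The toy DataFrame, its column names (`color`, `size`) and the concrete `OneHotEncoder`/`StandardScaler` preprocessors are assumptions standing in for the question's `data_train`, `categorical_preprocessor` and `numerical_preprocessor`, which are not shown in the post.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.tree import DecisionTreeClassifier, plot_tree

# Hypothetical toy data standing in for data_train / target_train.
data_train = pd.DataFrame({
    "color": ["red", "blue", "red", "green", "blue", "green"],
    "size": [1, 4, 2, 8, 5, 9],
})
target_train = [0, 1, 0, 1, 1, 1]

# Assumed preprocessors; the question does not show its own definitions.
preprocessor = ColumnTransformer([
    ("one-hot-encoder", OneHotEncoder(handle_unknown="ignore"), ["color"]),
    ("standard_scaler", StandardScaler(), ["size"]),
])
model3 = make_pipeline(preprocessor, DecisionTreeClassifier(random_state=0))
model3.fit(data_train, target_train)

# make_pipeline names each step after its lowercased class name.
step_names = list(model3.named_steps)
print(step_names)

# Pull the fitted tree out of the pipeline; model3[-1] is the same object.
clf = model3.named_steps["decisiontreeclassifier"]

# plot_tree draws onto a matplotlib Axes and returns the annotation artists.
fig, ax = plt.subplots(figsize=(8, 5))
annotations = plot_tree(clf, filled=True, ax=ax)
```

In a notebook you would drop the `matplotlib.use("Agg")` line and let the figure render inline, or call `plt.show()` in a script. Indexing with `model3[-1]` avoids depending on the auto-generated step name.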

amiola
    Works perfectly!!! Thank you! Sorry for asking another question, but can you advise me how to get all the "feature_names" from my pipeline? I tried to use this (but it prints an error, and I think I will not get all the names): "model3.named_steps['enc'].transformers_[1][1]\ .named_steps['one-hot-encoder'].get_feature_names(categorical_features)" – PawelKinczyk Jun 22 '22 at 21:08
    `model3[:-1].get_feature_names_out()` should do the trick (sklearn version 1.0.2 has enriched all transformers with method `.get_feature_names_out()`, though `OneHotEncoder` and `StandardScaler` were already exposing it if I am not wrong). You might find some other details in https://stackoverflow.com/questions/70993316/get-feature-names-after-sklearn-pipeline/71048229#71048229 or https://stackoverflow.com/questions/71830448/how-to-extract-feature-names-from-sklearn-pipeline-transformers?noredirect=1&lq=1 – amiola Jun 22 '22 at 21:18