I am practicing with this life expectancy dataset from Kaggle (https://www.kaggle.com/datasets/kumarajarshi/life-expectancy-who?select=Life+Expectancy+Data.csv), and I want to train and visualize a classification and regression tree (CART) model. However, I keep getting the error "InvocationException: GraphViz's executables not found". Is this caused by the target being a continuous numerical variable? How can I visualize the model?

Code:

import warnings
warnings.filterwarnings('ignore')

import pandas as pd
import numpy as np
import seaborn as sn
import matplotlib.pyplot as plt
import pydotplus
from IPython.display import Image, display
from sklearn import datasets
from sklearn import metrics
from sklearn import tree
from sklearn.tree import DecisionTreeClassifier
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

data = pd.read_csv('Life Expectancy Data.csv')
data = data.dropna(how = 'any')

#feature selection
data = data.drop(columns=['infant deaths', ' thinness 5-9 years', 'Alcohol', 'percentage expenditure', 'Hepatitis B', 'Total expenditure', 'Population', 'Year', 'Country'])

# Creating a instance of label Encoder.
le = LabelEncoder()

# Using .fit_transform function to fit label
# encoder and return encoded label
label = le.fit_transform(data['Status'])

# removing the column 'Status' from df
data.drop('Status', axis=1, inplace=True)

# Appending the array to our dataFrame
# with column name 'Status'
data['Status'] = label

#training model
model_data = data
X = data.drop(columns=['Life expectancy '])
y = data['Life expectancy ']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2)

model = DecisionTreeRegressor()
model.fit(X_train, y_train)

#visualizing tree
# class_names only applies to classifiers, so it is omitted for this regressor;
# feature names are taken from X so they always match the column order
LEtree = tree.export_graphviz(model,
                              feature_names = list(X.columns),
                              label = 'all',
                              rounded = True,
                              filled = True)

graph=pydotplus.graph_from_dot_data(LEtree)
display(Image(graph.create_png()))

Full error message:

InvocationException                       Traceback (most recent call last)
Input In [27], in <cell line: 2>()
      1 graph=pydotplus.graph_from_dot_data(LEtree)
----> 2 display(Image(graph.create_png()))

File ~\Anaconda3\lib\site-packages\pydotplus\graphviz.py:1797, in Dot.__init__.<locals>.<lambda>(f, prog)
   1792 # Automatically creates all the methods enabling the creation
   1793 # of output in any of the supported formats.
   1794 for frmt in self.formats:
   1795     self.__setattr__(
   1796         'create_' + frmt,
-> 1797         lambda f=frmt, prog=self.prog: self.create(format=f, prog=prog)
   1798     )
   1799     f = self.__dict__['create_' + frmt]
   1800     f.__doc__ = (
   1801         '''Refer to the docstring accompanying the'''
   1802         ''''create' method for more information.'''
   1803     )

File ~\Anaconda3\lib\site-packages\pydotplus\graphviz.py:1959, in Dot.create(self, prog, format)
   1957     self.progs = find_graphviz()
   1958     if self.progs is None:
-> 1959         raise InvocationException(
   1960             'GraphViz\'s executables not found')
   1962 if prog not in self.progs:
   1963     raise InvocationException(
   1964         'GraphViz\'s executable "%s" not found' % prog)

InvocationException: GraphViz's executables not found
  • You probably need to install Graphviz (and maybe other software). For Graphviz, go here: https://graphviz.org/download/ – sroush May 25 '22 at 15:09
  • @sroush thank you. I have Graphviz installed but I'm still getting the same error. –  May 25 '22 at 17:36
  • Not to be argumentative, but sadly, there are two "Graphviz" packages: the actual software that does the work (on a terminal command line, type dot -V to check for it) and the Python interface that is also named graphviz (https://pypi.org/project/graphviz/) – sroush May 25 '22 at 18:05
  • I was able to resolve the issue by editing the path as suggested here: https://stackoverflow.com/questions/28312534/graphvizs-executables-are-not-found-python-3-4. Now I'm getting "dot: graph is too large for cairo-renderer bitmaps. Scaling by 0.324307 to fit". Any suggestions for getting a full image? –  May 26 '22 at 08:23
  • Have you tried SVG output? – sroush May 26 '22 at 15:01
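
The comments above distinguish the Graphviz executables from the Python package of the same name. A minimal sketch of how one might check, from Python, whether the `dot` executable is visible on the PATH. The helper names `find_dot` and `dot_version` are hypothetical and are not part of pydotplus or graphviz:

```python
import shutil
import subprocess


def find_dot():
    """Return the full path of the Graphviz `dot` executable, or None if it is not on PATH."""
    return shutil.which("dot")


def dot_version(dot_path):
    """Run `dot -V` and return whatever version text it reports."""
    result = subprocess.run([dot_path, "-V"], capture_output=True, text=True)
    # The version banner may arrive on stderr or stdout, so check both.
    return (result.stderr or result.stdout).strip()


dot_path = find_dot()
if dot_path is None:
    print("dot was not found on PATH; the Graphviz executables are missing or not on PATH")
else:
    print("dot found at:", dot_path)
    print(dot_version(dot_path))
```

If this reports that `dot` was not found even though the Python package imports fine, the problem is the installation or PATH of the executables, not the model or the target variable.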

1 Answer

Try installing Graphviz in a proper directory.

In Anaconda, you can install it from the conda command prompt with the following command:

conda install -c conda-forge python-graphviz

Then replace the previously installed Graphviz directory. This might resolve the problem.
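
As a complement, here is a minimal sketch showing that a regression tree exports the same way as a classification tree, and how the drawing could be kept small. It uses synthetic data as a stand-in for the life expectancy dataset (an assumption for illustration; the three feature names are simply borrowed from the question). `max_depth` in `export_graphviz` limits how many levels are drawn, which may help with the "graph is too large" warning, and `export_text` prints the tree without needing the Graphviz executables at all:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor, export_graphviz, export_text

# Synthetic stand-in for the real data (assumption for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 70 + 5 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.5, size=200)
feature_names = ["Schooling", "Adult Mortality", "GDP"]

model = DecisionTreeRegressor(random_state=0).fit(X, y)

# DOT source for the top three levels only; no class_names for a regressor.
dot_data = export_graphviz(
    model,
    out_file=None,
    feature_names=feature_names,
    max_depth=3,
    rounded=True,
    filled=True,
)

# Plain-text view of the same levels; this does not call Graphviz.
text_view = export_text(model, feature_names=feature_names, max_depth=3)
print(text_view)
```

Once the executables are found, `dot_data` can be passed to `pydotplus.graph_from_dot_data` as in the question, and `graph.create_svg()` can be used in place of `graph.create_png()` to get vector output, as suggested in the comments.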