
I'm trying to visualize a decision tree in Python without graphviz, using DecisionTreeClassifier on the mode (most frequent value) extracted from image data, but I keep getting this error:

sklearn.exceptions.NotFittedError: This DecisionTreeClassifier instance is not fitted yet. Call 'fit' with appropriate arguments before using this estimator.

I have tried both Google Colab and VS Code and still get the error. My dataset has only 2 columns, ModusH and Index.

Here is a sample of my dataset: Dataset

And here is the code:

datapisang = pd.read_csv('DataModusdiperbaiki.csv')
X = datapisang[['ModusH']]
Y = datapisang[['Index']]
X_train, X_test, Y_train, Y_test = train_test_split(X, Y)
# Model
DT_model = DecisionTreeClassifier()
DT_model.fit(X_train, Y_train)
DT_model.print_tree()
data = [Modus_citra]  # Mode Image
hasilprediksi = DT_model.predict([data])

fn = ['ModusH'] 
cn = ['Index'] 

fig, axes = plt.subplots(nrows = 1,ncols = 1,figsize = (4,4), dpi=300)

tree.plot_tree(DT_model,
           feature_names = fn, 
           class_names=cn,
           filled = True);

fig.savefig('imagename.png')

I'm trying to get the visualization, but it fails every time, even when I use graphviz. I'm new to this topic. Can someone help me? I appreciate any help.

1 Answer


Follow the pattern shown in the scikit-learn documentation: fit the classifier first, then pass the fitted estimator to plot_tree:

from sklearn.datasets import load_iris
from sklearn import tree

clf = tree.DecisionTreeClassifier(random_state=0)
iris = load_iris()

clf = clf.fit(iris.data, iris.target)

tree.plot_tree(clf)

Based on edit:

I tried to reproduce your situation by creating a new dataframe with the same two columns; here is the solution:

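import numpy as np
import pandas as pd
from sklearn import tree

# Random stand-in data: one feature column (ModusH) and five class labels (Index)
df = pd.DataFrame(np.random.randint(0, 100, size=(100, 1)), columns=["ModusH"])
df['Index'] = np.random.choice(a=[0, 1, 2, 3, 4], size=df.shape[0])

clf = tree.DecisionTreeClassifier(random_state=0)

# X must be 2-D (n_samples, n_features); y is the 1-D label column
clf = clf.fit(df.ModusH.to_numpy().reshape(-1, 1), df.Index)

tree.plot_tree(clf)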
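A version closer to the question's setup might look like the sketch below. It keeps the names from the question (`datapisang`, `DT_model`, `ModusH`, `Index`, `imagename.png`), but the random dataframe is only a stand-in for the CSV, and it assumes the classifier is the scikit-learn one imported from `sklearn.tree`. Two details are worth noting: the target is passed as a 1-D column, and `class_names` gets one name per class. Passing `class_names=['Index']` for a tree with several classes is a likely cause of a "list index out of range" error.

```python
import matplotlib
matplotlib.use("Agg")  # optional: headless backend, lets the script run without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn import tree
from sklearn.model_selection import train_test_split

# Stand-in for pd.read_csv('DataModusdiperbaiki.csv'); these values are made up.
rng = np.random.default_rng(0)
datapisang = pd.DataFrame({
    "ModusH": rng.integers(8, 61, size=100),
    "Index": rng.integers(0, 5, size=100),
})

X = datapisang[["ModusH"]]  # features: 2-D, shape (n_samples, 1)
Y = datapisang["Index"]     # labels: 1-D
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, random_state=0)

DT_model = tree.DecisionTreeClassifier(random_state=0)
DT_model.fit(X_train, Y_train)  # fit before plotting or predicting

# One display name per class, in the order the fitted model stores them.
class_names = [str(c) for c in DT_model.classes_]

fig, ax = plt.subplots(figsize=(8, 6), dpi=150)
artists = tree.plot_tree(
    DT_model,
    feature_names=["ModusH"],
    class_names=class_names,
    filled=True,
    ax=ax,
)
fig.savefig("imagename.png")
```

With the real CSV, only the block that builds `datapisang` should need to change.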
NZJL
  • OK, this works, but I'm getting the error `DecisionTreeClassifier.__init__() got an unexpected keyword argument 'random_state'`. My CSV file has only 2 columns of data, so can it be visualized? – Doni Fidomen Dec 27 '22 at 11:57
  • @DoniFidomen please update your question: either share your data or implement your solution with the `iris` dataset, so that we have a complete picture and can help you with the problem. – NZJL Dec 27 '22 at 12:03
  • How do I share my dataset on Stack Overflow? – Doni Fidomen Dec 27 '22 at 12:11
  • Do not share the whole dataset; share a few samples as a `pandas` dataframe. [pandas sharing](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) – NZJL Dec 27 '22 at 12:19
  • I have shared some examples from my dataset; they are in the picture. – Doni Fidomen Dec 27 '22 at 12:46
  • Which one is the index list? I am getting the error "list index out of range". – Doni Fidomen Dec 27 '22 at 13:22
  • Also, which one is X and which one is Y? – Doni Fidomen Dec 27 '22 at 13:26
  • Also, in my dataframe I have 5 categories in ModusH: the first is immature (range 32-60), the second is half ripe (range 25-30), the third is ripe (range 22-24), the fourth is very ripe (range 17-19), and the last is rotten (hue 8-10). Can you help me with that? – Doni Fidomen Dec 27 '22 at 13:31
  • @DoniFidomen the example I gave is based on random data; you should customize it to your needs. If it is not clear, I recommend reading some articles about the basics of machine learning. Thanks :) – NZJL Dec 27 '22 at 13:35
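On the X/Y question from the comments: X is the feature column (ModusH, the hue mode of the image) and Y is the label column (Index, the ripeness category). As a rough sketch, the five hue ranges listed above can be turned into a small training table, fitted, and printed as text with `export_text`, which needs neither graphviz nor matplotlib. The generated rows, the string labels, and the sample value 23 for `Modus_citra` are assumptions for illustration only, not the asker's real data.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

# Hue ranges taken from the comment thread (inclusive bounds).
hue_ranges = {
    "immature": (32, 60),
    "half ripe": (25, 30),
    "ripe": (22, 24),
    "very ripe": (17, 19),
    "rotten": (8, 10),
}

# Illustrative training table: one row per integer hue value in each range.
rows = [
    (hue, label)
    for label, (low, high) in hue_ranges.items()
    for hue in range(low, high + 1)
]
df = pd.DataFrame(rows, columns=["ModusH", "Index"])

X = df[["ModusH"]]  # feature: 2-D
Y = df["Index"]     # label: 1-D

clf = DecisionTreeClassifier(random_state=0)
clf.fit(X, Y)

# Hypothetical hue mode measured from a new image.
Modus_citra = 23
pred = clf.predict(pd.DataFrame({"ModusH": [Modus_citra]}))
print(pred[0])

# Text rendering of the learned thresholds.
rules = export_text(clf, feature_names=["ModusH"])
print(rules)
```

Hue values that fall in the gaps between ranges (for example 20 or 31) still get a prediction, because the tree places its thresholds between neighbouring training values.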