Is it possible to change the label of the node in sklearn Decision Tree?

Question

I am building a decision tree model with scikit-learn, and afterwards I want to do some leaf rewriting. Basically, I want to change the label of specific leaf nodes.

I am looping over the leaves, and from tree.DecisionTreeClassifier.tree_ I can get tree_.value in order to calculate the label of each node. I got it from here. My question is whether, and how, I can force a change of the label of a node in the decision tree.

So far, I have tried to manually change the values in tree_.value:

from sklearn import tree
from sklearn.model_selection import train_test_split
import pandas as pd
import numpy as np

df = pd.read_csv("voting.csv", header=0)
y = pd.DataFrame(df.target)
feature_names = []
for col in df.columns:
    if col != 'target':
        feature_names.append(col)

y = df.target
df = df.drop("target", axis=1)

thr = 0.9
X_train, X_test, y_train, y_test = train_test_split(df, y, test_size=0.2)
clf = tree.DecisionTreeClassifier(min_samples_leaf=3)
clf.fit(X_train, y_train)

node_count = clf.tree_.node_count
class_label = 0
for index in range(node_count):
    # check if it is a leaf
    if clf.tree_.children_right[index] == -1 and clf.tree_.children_left[index] == -1:
        # number of samples in the leaf (correctly classified and misclassified)
        print("Values: ", clf.tree_.value[index])
        # Finding node label
        node_label = clf.classes_[np.argmax(clf.tree_.value[index])]
        values = clf.tree_.value[index]
        correct_samples = values[0][node_label]
        misclassified_samples = np.sum(clf.tree_.value[index]) - correct_samples
        # Change the label if number of misclassified samples is more than 0
        if misclassified_samples > 0 and node_label != class_label:
            clf.tree_.value[index][class_label] = clf.tree_.value[index][class_label] + correct_samples
            print("New values: ", clf.tree_.value[index])

But this changes both values, including the count for the correctly classified samples, and the label of the node stays the same. For example, before the operation: Values: [[1. 2.]], and after the operation: New values: [[3. 4.]].

Thanks!
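
A possible explanation for the output above, sketched with plain NumPy rather than a fitted tree: tree_.value has shape (n_nodes, n_outputs, n_classes), so tree_.value[index] is a (1, n_classes) array, and indexing it with class_label = 0 selects the whole row for the single output instead of one class count. The toy array and the helper relabel_leaf below are hypothetical and only stand in for clf.tree_.value; the helper mirrors the operation attempted in the question, with the output axis indexed explicitly.

```python
import numpy as np

# Toy stand-in for clf.tree_.value: shape (n_nodes, n_outputs, n_classes).
# One node, one output, two classes, matching the example "[[1. 2.]]".
value = np.array([[[1.0, 2.0]]])
index, class_label = 0, 0

# What the loop in the question does: value[index] has shape (1, 2), so
# value[index][class_label] is the whole row for output 0, not one count.
buggy = value.copy()
correct_samples = buggy[index][0][1]  # count of the current majority class
buggy[index][class_label] = buggy[index][class_label] + correct_samples
# buggy[index] is now [[3. 4.]]: both counts were shifted, argmax unchanged.


def relabel_leaf(value, index, target, output=0):
    """Hypothetical helper: raise one class count in place so that
    `target` becomes the argmax of the node's class counts."""
    row = value[index, output]   # 1-D view of the class counts for this node
    row[target] += row.max()     # only the target entry changes
    return int(np.argmax(row))


fixed = value.copy()
new_label = relabel_leaf(fixed, index, class_label)
print(buggy[index], fixed[index], new_label)
```

On a fitted classifier the equivalent call would presumably be relabel_leaf(clf.tree_.value, index, class_label). That assumes tree_.value is a writable view of the tree's internal array and that class labels coincide with column positions (otherwise the position has to be looked up in clf.classes_). If the target count was zero beforehand, the result is a tie, which np.argmax resolves to the lowest class index. Depending on the scikit-learn version, the entries may be class fractions rather than sample counts, which does not affect the argmax.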

Can you show what you have tried so far to solve the stated problem? - mnm, Jun 18 '19 at 14:36
https://stackoverflow.com/questions/32530283/how-to-add-feature-names-to-output-of-decision-tree-in-scikit-learn - PV8, Jun 19 '19 at 05:47
I edited my question and added the code that I used to try to force the change of the node label. I couldn't find any other way to try. - nedRad88, Jun 25 '19 at 14:09

0 Answers