I was looking for this information in the tensorflow_decision_forests
docs (https://github.com/tensorflow/decision-forests) (https://www.tensorflow.org/decision_forests/api_docs/python/tfdf/keras/wrappers/CartModel) and yggdrasil_decision_forests
docs (https://github.com/google/yggdrasil-decision-forests).
I've also taken a look at the code of these two libraries, but I didn't find that information there either. I'm also curious whether I can specify which impurity index to use. I'm looking for an analogue of sklearn decision trees, where you can choose the impurity measure with the criterion parameter:
https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html
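For reference, this is the sklearn behaviour I mean — a small sketch showing that `DecisionTreeClassifier` takes the impurity index via `criterion` ("gini" is the default, "entropy" is an alternative):

```python
# scikit-learn lets you pick the impurity measure per model
# via the `criterion` constructor parameter.
from sklearn.tree import DecisionTreeClassifier

# Tiny toy dataset: the label equals the first feature.
X = [[0, 0], [1, 1], [0, 1], [1, 0]]
y = [0, 1, 0, 1]

# Same data, two different impurity indices.
clf_gini = DecisionTreeClassifier(criterion="gini").fit(X, y)
clf_entropy = DecisionTreeClassifier(criterion="entropy").fit(X, y)
```

I can't find an equivalent parameter on the TF-DF model wrappers.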
For the TensorFlow Random Forest I found only the parameter uplift_split_score:
uplift_split_score: For uplift models only. Splitter score, i.e. the score optimized by the splitters. The scores are introduced in "Decision trees for uplift modeling with single and multiple treatments", Rzepakowski et al. Notation: p = probability / average value of the positive outcome, q = probability / average value in the control group.
- KULLBACK_LEIBLER or KL: p log (p/q)
- EUCLIDEAN_DISTANCE or ED: (p-q)^2
- CHI_SQUARED or CS: (p-q)^2/q
Default: "KULLBACK_LEIBLER".
I'm not sure if it's a good lead.