I built my own sort of "decision tree" to find the optimal segmentation of a set of transactions. It uses a t-test and a recursive function that, at each step, splits the data in two on the most significant segmentation variable among the candidate variables available at that point.

The output is this:

('Blue', (('Mid', (42.0, ('Low', (11.64, ('High', (13.55, ('Very Low', (0.0, ('Very High', (3.29, 6.25)))))))))), ('Yellow', (('Mid', (44.39, ('Mid High', (31.61, 13.25)))), ('Mid Low', (47.89, ('Mid', (16.36, ('Very Low', (0.24, ('Low', (6.23, ('Red', (('Mid High', (1.15, ('JA', (0.0, ('Very High', (3.91, ('Low High', (3.76, ('High', (3.21, 1.89)))))))))), ('Low low', (25.33, ('High High', (8.92, ('Mid Mid', (6.28, 3.35))))))))))))))))))))

How could I visualize this? I'm guessing something similar to a decision tree plot, but I have no idea how to implement it.

Tom Johnson

1 Answer

Using ete3, you can plot your tree after converting it to Newick format:

dtree = ('Blue', (('Mid', (42.0, ('Low', (11.64, ('High', (13.55, ('Very Low', (0.0, ('Very High', (3.29, 6.25)))))))))), ('Yellow', (('Mid', (44.39, ('Mid High', (31.61, 13.25)))), ('Mid Low', (47.89, ('Mid', (16.36, ('Very Low', (0.24, ('Low', (6.23, ('Red', (('Mid High', (1.15, ('JA', (0.0, ('Very High', (3.91, ('Low High', (3.76, ('High', (3.21, 1.89)))))))))), ('Low low', (25.33, ('High High', (8.92, ('Mid Mid', (6.28, 3.35))))))))))))))))))))
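from ete3 import Tree

def newick(t):
    """Convert the nested-tuple tree into a Newick string."""
    # Leaf: a bare value such as 42.0
    if not isinstance(t, tuple):
        return f'{t}'
    # Labelled node: (label, subtree); the label becomes the node name
    if isinstance(t[0], str):
        return f'({newick(t[1])} {t[0]})'
    # Unlabelled pair: (left, right)
    return f'({newick(t[0])}, {newick(t[1])})'

# format=1 lets ete3 read internal node names
t = Tree(f'{newick(dtree)};', format=1)
print(t.get_ascii(show_internal=True))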

Result:

[Image: ASCII-art rendering of the tree printed by the code above]

There is also a graphical viewer, but I wasn't able to get internal node labels to show up, so ASCII art will have to do for now.
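
If installing ete3 is not an option, the same nested tuple can also be printed as an indented outline with the standard library only. This is a minimal sketch, not part of ete3: the helper names `describe` and `render` are made up for illustration, and it assumes the same convention as `newick` above (a tuple whose first element is a string is a labelled node; any other tuple is an unlabelled pair).

```python
def describe(node):
    """Return (label, children) for one node of the nested-tuple tree."""
    # Leaf: a bare value such as 42.0
    if not isinstance(node, tuple):
        return str(node), []
    # Labelled node: (label, subtree)
    if isinstance(node[0], str):
        label, child = node
        # If the subtree is an unlabelled pair, list its two members
        # directly under the label instead of adding an empty level.
        if isinstance(child, tuple) and not isinstance(child[0], str):
            return label, list(child)
        return label, [child]
    # Unlabelled pair with no label above it (assumption: shown as "+")
    return "+", list(node)


def render(node, indent=""):
    """Return the tree as a list of indented text lines."""
    label, children = describe(node)
    lines = [indent + label]
    for child in children:
        lines.extend(render(child, indent + "    "))
    return lines


# A small subtree taken from the data in the question
subtree = ('High', (13.55, ('Very Low', (0.0, ('Very High', (3.29, 6.25))))))
print("\n".join(render(subtree)))
```

Calling `render(dtree)` on the full tuple works the same way; each level of nesting adds four spaces of indentation.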

xjcl
  • This looks like just what I need. I'll check it out with all my segmentation variables and come back to you. Thanks man! – Santiago Alsua Apr 30 '21 at 15:25