I have created a tree using the party package in R. The tree is fine, but it is very large (27 terminal nodes). Attempts to print it result in unreadable files, with the ovals overlapping each other.

How can I create something readable?

Can I print subsections of the tree? E.g. a first page with the top few nodes and then subsequent pages starting at granddaughter nodes?

Any other suggestions for visualizing such a tree?

EDIT: Stephan asked for a reproducible example. That's reasonable but hard here: the interrelationships of the variables are key to there being a tree, and those are hard (at least for me!) to simulate. Also, some variables are categorical and some continuous. But here is a version of the output from printing the tree structure in R (I substituted V1 etc. for the variable names):

1) v1 == {AS, IT, LS, NS}; criterion = 1, statistic = 106.413
  2) v2 <= 0; criterion = 1, statistic = 37.039
    3) v1 == {NS}; criterion = 1, statistic = 34.458
      4)*  weights = 316 
    3) v1 == {AS, IT, LS}
      5) V3 <= 2; criterion = 1, statistic = 28.409
        6) V4 <= 0; criterion = 0.997, statistic = 15.053
          7) v5 == {A: Maste, B: Bache, C: Assoc}; criterion = 0.964, statistic = 15.43
            8) V6 <= 24.1068; criterion = 0.98, statistic = 11.242
              9)*  weights = 259 
            8) V6 > 24.1068
              10)*  weights = 886 
          7) v5 == {D: Plus2}
            11)*  weights = 38 
        6) V4 > 0
          12) V7 <= 0; criterion = 1, statistic = 22.142
            13)*  weights = 440 
          12) V7 > 0
            14) V8 <= 1; criterion = 0.999, statistic = 17.06
              15)*  weights = 88 
            14) V8 > 1
              16)*  weights = 59 
      5) V3 > 2
        17)*  weights = 100 
  2) v2 > 0
    18)*  weights = 41 
1) v1 == {, BM, CJ, HS}
  19) V4 <= 0; criterion = 1, statistic = 60.5
    20) V7 <= 0; criterion = 1, statistic = 41.949
      21) V9 <= 0; criterion = 0.985, statistic = 15.936
        22)*  weights = 376 
      21) V9 > 0
        23) V8 <= 1; criterion = 1, statistic = 30.046
          24) V10Wks <= 7.142857; criterion = 1, statistic = 19.078
            25) v11 <= 3.5738; criterion = 0.989, statistic = 12.966
              26)*  weights = 524 
            25) v11 > 3.5738
              27)*  weights = 853 
          24) V10Wks > 7.142857
            28)*  weights = 27 
        23) V8 > 1
          29) v12 <= 0; criterion = 1, statistic = 27.748
            30)*  weights = 38 
          29) v12 > 0
            31)*  weights = 88 
    20) V7 > 0
      32) V14 <= 0; criterion = 1, statistic = 25.564
        33) V8 <= 1; criterion = 0.98, statistic = 13.9
          34)*  weights = 115 
        33) V8 > 1
          35)*  weights = 48 
      32) V14 > 0
        36) V13 <= 2; criterion = 0.983, statistic = 11.504
          37)*  weights = 96 
        36) V13 > 2
          38)*  weights = 91 
  19) V4 > 0
    39) V8 <= 1; criterion = 1, statistic = 25.961
      40) V3 <= 0; criterion = 0.999, statistic = 17.093
        41) V14 <= 0; criterion = 0.965, statistic = 10.183
          42)*  weights = 127 
        41) V14 > 0
          43)*  weights = 480 
      40) V3 > 0
        44)*  weights = 172 
    39) V8 > 1
      45) v15 <= 0; criterion = 0.995, statistic = 14.604
        46) V9 <= 0; criterion = 0.987, statistic = 12.104
          47) v1 == {HS}; criterion = 1, statistic = 21.895
            48)*  weights = 43 
          47) v1 == {BM, CJ}
            49) v16 <= 0; criterion = 0.979, statistic = 15.049
              50)*  weights = 30 
            49) v16 > 0
              51)*  weights = 14 
        46) V9 > 0
          52)*  weights = 34 
      45) v15 > 0
        53)*  weights = 141 

I hope that gives some idea of the structure; lots of nodes!

By default, plot in party draws each split in an ellipse and adds extra information at the terminal nodes. But that won't fit on a page.

Peter Flom
    Could you post a working example, some simulated data (using `set.seed()` for reproducibility), that will yield a reproducible tree of similar size? – Stephan Kolassa May 26 '14 at 18:39

1 Answer

Have you tried increasing the image size? I outlined an example with the partykit package in the question linked below, but it works the same with the party package (I used party for a while, until my data set size started crashing it):

How do I jitter the node split strings in plotting ctree output from partykit?
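
For reference, here is a minimal sketch of the larger-image idea, assuming partykit and using the built-in iris data as a stand-in, since the original data are not available. The output path, page size and font size are illustrative choices, not values from the linked question. The loop at the end is one possible way to get the "one subsection per page" layout asked about: it plots the subtree under each inner node on its own page.

```r
library(grid)      # gpar() for font settings
library(partykit)  # ctree() and the plot method for party objects

# Illustrative data only; this is not the tree from the question.
ct <- ctree(Species ~ ., data = iris)

# Hypothetical output location; change to wherever the file should go.
out_file <- file.path(tempdir(), "tree.pdf")

# A wide page gives the terminal-node panels room, and a smaller
# font shrinks the split labels so they are less likely to overlap.
pdf(out_file, width = 30, height = 15)
plot(ct, gp = gpar(fontsize = 8))

# One further page per inner node below the root, each showing
# only the subtree rooted at that node.
inner_ids <- setdiff(nodeids(ct), nodeids(ct, terminal = TRUE))
for (id in inner_ids[-1]) {
  plot(ct[id], main = paste("Subtree rooted at node", id))
}
dev.off()
```

For a tree with 27 terminal nodes the width will probably need to be larger still; it is worth trying a few values, and plotting subtrees only from the nodes of interest rather than from every inner node.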

Mike Burr