This is a follow-up to the question: Creating treechart from tabbed text in R

I am using the following function:

treechart = function(){
library(psych)
fields <- max(count.fields(textConnection(readClipboard()), sep = "\t"))
dat = read.table(text = readClipboard(), sep="\t",col.names = paste0("V", sequence(fields)), header=FALSE, fill=TRUE, strip.white=TRUE, stringsAsFactors=FALSE, na.strings="")

library(zoo)
library(igraph)
# To prepare the data
# carry forward the last value in columns if lower level (col to the right)
# is non-missing
dat[1] <- na.locf(dat[1], na.rm=FALSE)
for(i in ncol(dat):2)  {
  dat[[i-1]] <-  ifelse(!is.na(dat[[i]]), na.locf(dat[[i-1]], na.rm=F), dat[[i-1]])
}            

# get edges for graph
edges <- rbind(na.omit(dat[1:2]),
            do.call('rbind',
                    lapply(1:(ncol(dat)-2), function(i) 
                    na.omit(setNames(dat[(1+i):(2+i)],
                    names(dat[1:2])))))
                       )

# create graph
g <- graph.data.frame(edges)
# Plot graph
E(g)$curved <- 0
plot.igraph(g, vertex.size=0, edge.arrow.size=0 , layout=-layout.reingold.tilford(g)[,2:1])
}

I am using the following example data (separated by tabs, from a text editor or a spreadsheet), which I select and copy with Ctrl-C:

AAA 
    BBB
    CCC
    DDD
        III
        JJJ
            LLL
    EEE
        KKK
    FFF
    GGG

Then, on running the command 'treechart()', I get a chart in which the nodes are not drawn in the same order as in the input.

Here DDD and EEE are placed higher than BBB and CCC. Similarly, JJJ comes before III. How can I correct the function treechart() so that this order is always correct? Thanks for your help.

rnso

1 Answer

It's not that the layout is incorrect: you asked for the layout.reingold.tilford layout, and that's what you got. As you can see, it tends to move more complex branches to one side, and it does not consider the order in which the vertices were specified. I tried writing a new layout function that preserves that order:

layout.tree.order <- function(g, vseq=V(g)$name, root=vseq[1]) {
    leaves <- vseq[sapply(V(g)[vseq], function(x) 
        length(unique(neighbors(g, x, mode="out"))))==0]
    ypos <- rep(NA, vcount(g))
    ypos[match(leaves, V(g)$name)]<-rev(seq(0,1,length.out=length(leaves)))

    calcypos<-function(g, vx) {
        if (!is.na(ypos[vx])) {
            p <- ypos[vx]
        } else {
            nb <- unique(neighbors(g, V(g)[vx]))
            p <- mean(sapply(nb, function(x) calcypos(g,x)))
        }
        ypos[vx] <<- p
        return(invisible(p))
    }
    calcypos(g, which(V(g)$name == root))
    # look up the root by vertex name, since the order of vseq need not match V(g)
    xpos <- c(shortest.paths(g, V(g)[which(V(g)$name == root)], V(g), mode="out"))

    cbind(xpos, ypos)
}
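
To make the positioning rule in this function easier to follow, here is a minimal base R sketch of the same idea without igraph: leaves are spread evenly from top to bottom in input order, and each internal node is placed at the mean height of its children. The `children` list and the `node_y` helper are hypothetical names written by hand for the example tree; they are not part of the function above.

```r
# Hypothetical child lists for the example tree, in input order.
children <- list(
  AAA = c("BBB", "CCC", "DDD", "EEE", "FFF", "GGG"),
  DDD = c("III", "JJJ"),
  JJJ = "LLL",
  EEE = "KKK"
)

# All node names in the order in which they appear in the input.
vseq <- c("AAA", "BBB", "CCC", "DDD", "III", "JJJ",
          "LLL", "EEE", "KKK", "FFF", "GGG")

# Leaves are the nodes without children; spread them evenly from 1 (top) to 0.
leaves <- vseq[!vseq %in% names(children)]
ypos <- setNames(rev(seq(0, 1, length.out = length(leaves))), leaves)

# An internal node sits at the mean height of its children (recursive).
node_y <- function(v) {
  if (v %in% names(ypos)) return(ypos[[v]])
  mean(vapply(children[[v]], node_y, numeric(1)))
}

y <- vapply(vseq, node_y, numeric(1))
print(round(y, 3))
```

Because the leaves receive strictly decreasing heights in input order, siblings keep their input order vertically, which is the property the Reingold-Tilford layout does not guarantee.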

Then, in your treechart function, add one line that computes vseq and change the layout argument of the plot call:

vseq <- apply(dat, 1, function(x) na.omit(rev(x))[1])
plot.igraph(g, vertex.size=0, edge.arrow.size=0, 
    layout=layout.tree.order(g, vseq))

Here vseq is what specifies the top-down ordering: it holds the node names in the order in which they appear in your dat data frame. This produces a plot in which the nodes follow that order.
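
As an illustration of what vseq contains, the sketch below rebuilds dat from the example data in base R and applies the same expression. It assumes the input is passed as a string with explicit tab characters instead of being read from the clipboard, and it uses a small hypothetical helper, `locf`, in place of zoo's na.locf so that no packages are needed.

```r
# Example data from the question, with explicit tabs (assumed input form).
txt <- "AAA\n\tBBB\n\tCCC\n\tDDD\n\t\tIII\n\t\tJJJ\n\t\t\tLLL\n\tEEE\n\t\tKKK\n\tFFF\n\tGGG"

fields <- max(count.fields(textConnection(txt), sep = "\t"))
dat <- read.table(text = txt, sep = "\t",
                  col.names = paste0("V", seq_len(fields)),
                  header = FALSE, fill = TRUE, strip.white = TRUE,
                  stringsAsFactors = FALSE, na.strings = "")

# Hypothetical base R stand-in for zoo::na.locf(x, na.rm = FALSE).
locf <- function(x) {
  idx <- cumsum(!is.na(x))       # how many non-NA values seen so far
  c(NA, x[!is.na(x)])[idx + 1]   # idx 0 (leading NAs) maps to NA
}

# Same fill-forward step as in treechart().
dat[[1]] <- locf(dat[[1]])
for (i in ncol(dat):2) {
  dat[[i - 1]] <- ifelse(!is.na(dat[[i]]), locf(dat[[i - 1]]), dat[[i - 1]])
}

# Rightmost non-missing value in each row, i.e. the node on that input line.
vseq <- unname(apply(dat, 1, function(x) na.omit(rev(x))[1]))
print(vseq)
```

Each row of the filled dat holds the path from the root to one node, so taking the rightmost non-missing value per row yields the node names in input order.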

MrFlick