1

In R I have 100 phylo objects called Newick1, Newick2, Newick3, etc. I want to do pairwise comparisons between the trees (e.g. all.equal.phylo(Newick1, Newick2)) but am having difficulty figuring out how to do this efficiently since each file has a different name.

I think something like the for loop below will work, but how do I designate a different file for each iteration of the loop? For obvious reasons the [i] and [j] I put in the code below don't work, but I don't know what to replace them with.

Thank you very much!

for (i in 1:99) {
    for (j in (i + 1):100) {
        output[i, j] <- all.equal.phylo(Newick[i], Newick[j])
    }
}
Jautis
  • can't you reference them by index? working example: `Newick <- as.list(sample(100, replace = TRUE)); cc <- combn(100, 2); sapply(1:ncol(cc), function(x) all.equal(Newick[[cc[1, x]]], Newick[[cc[2, x]]]))` – rawr Mar 01 '16 at 21:05
  • So I've tried using an index and it seems to be working well except that the `Newick[[cc[1, x]]]` portions come out as an atomic vector rather than referring to the object `Newick[[i]]`. I've tried to correct this by removing the quotes, but that doesn't help. Do you have any ideas on how to resolve this? – Jautis Mar 01 '16 at 23:02
  • I'm not sure what you mean, try `lapply` instead of `sapply`? – rawr Mar 02 '16 at 03:09
  • The issue isn't in the apply function, but in how `Newick[[cc[1,x]]]` is interpreted. `Newick1` directs to the phylo object; `Newick[[cc[1,1]]]` outputs as the atomic vector `"Newick1"`. – Jautis Mar 02 '16 at 14:59
  • so try `Newick[cc[1,1]]`? it's hard to tell without a reproducible example – rawr Mar 02 '16 at 16:00

2 Answers

0

Try mget() to reference multiple objects by name:

> x1 <- x2 <- x3 <-1
> mget(paste0("x",1:3))
$x1
[1] 1

$x2
[1] 1

$x3
[1] 1
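
To connect this to the pairwise comparison in the question, here is a minimal sketch. It assumes the objects are named Newick1, Newick2, ... in the current environment; the numeric vectors below are stand-ins for phylo objects, and `trees` and `n` are names made up for this illustration. With real trees and ape loaded, `all.equal()` should dispatch to `all.equal.phylo()`.

```r
# Stand-in objects; in the question these would be the phylo objects
# Newick1 ... Newick100 that already exist in the workspace.
Newick1 <- c(1, 2, 3)
Newick2 <- c(1, 2, 3)
Newick3 <- c(1, 2, 4)

n <- 3

# mget() returns a named list, so trees[[i]] is the object itself,
# not the character string "Newick1".
trees <- mget(paste0("Newick", seq_len(n)), envir = environment())

# Logical matrix of results; only the upper triangle (i < j) is filled.
output <- matrix(NA, nrow = n, ncol = n,
                 dimnames = list(names(trees), names(trees)))

for (i in seq_len(n - 1)) {
    for (j in (i + 1):n) {  # parentheses matter: i+1:n parses as i + (1:n)
        # isTRUE() is needed because all.equal() returns a description
        # of the differences, not FALSE, when the objects differ.
        output[i, j] <- isTRUE(all.equal(trees[[i]], trees[[j]]))
    }
}

output
```

Note the double brackets: `trees[i]` would return a one-element list, whereas `trees[[i]]` returns the stored object.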
hedgedandlevered
  • So use something like `mget(paste0("Newick",i)) -> x ; mget(paste0("Newick",j)) -> j ; all.equal(x,j)` in the for loop? – Jautis Mar 01 '16 at 21:17
  • On second thought, that doesn't work at all. I get an object with whatever I direct mget() to and then I can't specify that I am interested in the $x component – Jautis Mar 01 '16 at 22:12
  • huh? what would you want with mget other than an object? it returns a list, which you can refer to by elements instead of names, such as `Newick <-mget(paste0("x",1:3))` `Newick[[1]]` `Newick[[2]]` etc. Then you can use your for loop as you described in the OP – hedgedandlevered Mar 02 '16 at 18:24
  • for your example use `x <- mget(paste0("Newick",1:100))` then refer to elements of x or `NewickVec <- unlist(x)` and you can `all.equal()` NewickVec like any other vector – hedgedandlevered Mar 02 '16 at 21:21
  • or once you have the list and you want to compare columns, transform x to a single matrix from a list in the normal way. http://stackoverflow.com/questions/28566588/r-rbind-data-frames-with-a-different-column-name – hedgedandlevered Mar 02 '16 at 21:30
  • What's the problem with this answer? – hedgedandlevered Apr 20 '16 at 17:50
0

You can try a variation on the following:

# make a two column dataframe
# and filter the identical values
df        <- expand.grid(1:100,1:100)
names(df) <- c('i','j')
df        <- df[df$i != df$j, ]

# example function that takes two parameters
addtwo    <- function(i,j){i + j}

# apply that function across rows of the dataframe
results   <- mapply(addtwo, df$i, df$j)

# using the same logic, 
# your function would look something like this
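# NEWICKS is assumed to be a list holding the phylo objects,
# e.g. NEWICKS <- mget(paste0("Newick", 1:100)); needs library(ape)
getdistance <- function(i, j, newicks = NEWICKS) {
    all.equal.phylo(newicks[[i]], newicks[[j]])  # [[ ]] extracts the object
}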

# and apply it like this
results   <- mapply(getdistance, df$i, df$j)

Key concepts:

  • expand.grid()
  • mapply()
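
One thing to be aware of: filtering an expand.grid() of 1:100 by i != j still leaves both (i, j) and (j, i), i.e. 9900 comparisons. If the comparison is symmetric, combn() gives each unordered pair once (4950 for 100 trees). A rough sketch of that variation, with numeric vectors standing in for the phylo objects and `pairs`, `same` and `results` as names chosen only for this example:

```r
# Stand-in list; in the question each element would be a phylo object.
NEWICKS <- list(
    Newick1 = c(1, 2, 3),
    Newick2 = c(1, 2, 3),
    Newick3 = c(1, 2, 4)
)

# Each column of `pairs` is one unordered pair of indices with i < j.
pairs <- combn(length(NEWICKS), 2)

# Compare every pair; isTRUE() turns all.equal()'s difference
# report into a plain FALSE.
same <- mapply(
    function(i, j) isTRUE(all.equal(NEWICKS[[i]], NEWICKS[[j]])),
    pairs[1, ], pairs[2, ]
)

# One row per comparison, which is easy to filter or merge later.
results <- data.frame(i = pairs[1, ], j = pairs[2, ], same = same)
results
```

Keeping the results in a data frame rather than a matrix makes it simple to pull out the matching pairs, e.g. `results[results$same, ]`.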
zach