I am trying to retrieve the index of a newly-added row, added via a for loop.

Starting from the beginning, I have a list of matrices of p-values, each with a variable number of rows and columns. This is because not all groups have an adequate number of treated individuals to run t-tests. The following is what prints to the console when I access this sample list:

$Group1
                              Normal  Treatment 1  Treatment 2  
Treatment 1                        1           NA           NA
Treatment 2                        1            1           NA
Treatment 3                        1            1            1

$Group2
                              Normal  Treatment 2   
Treatment 2                        1           NA      
Treatment 4                        1            1     

I would like every group to have the same number of rows and columns, in the correct order, with the missing values just filled in with NAs. This is a sample of what I would like:

$Group1
                              Normal  Treatment 1  Treatment 2  Treatment 3 
Treatment 1                        1           NA           NA           NA
Treatment 2                        1            1           NA           NA
Treatment 3                        1            1            1           NA
Treatment 4                       NA           NA           NA           NA

$Group2
                              Normal  Treatment 1  Treatment 2  Treatment 3  
Treatment 1                       NA           NA           NA           NA
Treatment 2                        1           NA           NA           NA
Treatment 3                       NA           NA           NA           NA
Treatment 4                        1           NA            1           NA

Here is the code I have so far:

fix.results.row <- function(x, factors) {
  results.matrix <- x
  num <- 1
  for (i in factors){
    if (!i %in% rownames(results.matrix)) {
      results.matrix <- rbind(results.matrix, NA)
      rownames(results.matrix)[num] <- i
     } 
    num <- num + 1
  }
  rownames(results.matrix) <- results.matrix[rownames(factors),,drop=FALSE]
  return(results.matrix)
}

In the function above, x would be my list of matrices, and factors would be a list of all the factors in the order I want them. I have a similar function for adding columns.

My problem, as I see it, is in Group 2. If it sees that I'm missing Treatment 1, it will replace the rowname Treatment 2 with the rowname Treatment 1, so the data for Treatment 2 is now mislabeled Treatment 1. Then it reorders the variables the way I want them, but the data are already mislabeled!

If I could access the index of the newly-added row, which changes from group to group, then I could just change that specific row name. Any suggestions? Please let me know if there's any more information I need to provide. I tried to cover everything but I'm not sure if there's anything else you all need.
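
For reference, a minimal sketch of the idea in the last paragraph: `rbind()` always appends at the bottom, so right after the call the index of the new row is `nrow(results.matrix)`. The name `fix_results_row` is illustrative, and the sketch assumes `x` is a single matrix with row names and `factors` is a character vector of row names in the desired order, which may not match the real data.

```r
# Sketch: add any missing rows as NA, then reorder by name.
# Assumes `x` is a single matrix with row names and `factors` is a
# character vector of all desired row names, in the desired order.
fix_results_row <- function(x, factors) {
  results.matrix <- x
  for (i in factors) {
    if (!i %in% rownames(results.matrix)) {
      results.matrix <- rbind(results.matrix, NA)
      # rbind() appends at the bottom, so the new row is the last one
      rownames(results.matrix)[nrow(results.matrix)] <- i
    }
  }
  # Reorder by name so each row keeps its own data
  results.matrix[factors, , drop = FALSE]
}

# Same shape as Group2 in the sample list
g2 <- matrix(c(1, 1, NA, 1), 2, 2,
             dimnames = list(c("Treatment 2", "Treatment 4"),
                             c("Normal", "Treatment 2")))
fixed <- fix_results_row(g2, paste("Treatment", 1:4))
```

Because the final reordering is done by name rather than by position, it does not matter where in the matrix the NA rows were appended.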

Tropictuco

2 Answers

This isn't very elegant, but it might work better than using two functions to fill in the rows and columns separately.

Here, x is a list of all your matrices; factors is an optional list of the desired row and column names.

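fix_rc <- function(x, factors) {
  ## all distinct names, as a factor with sorted levels
  f <- function(x) factor(ul <- unique(unlist(x)), levels = sort(ul))
  ## lapply (not sapply) so the result stays a list even when every
  ## matrix has the same number of rows or columns
  if (missing(factors))
    factors <- list(f(lapply(x, rownames)),
                    f(lapply(x, colnames)))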

  template <- matrix(NA, length(factors[[1]]), length(factors[[2]]),
                     dimnames = factors)

  lapply(x, function(xx) {
    ## original
    # xx <- rbind(xx, template[, colnames(xx)])
    # xx <- cbind(xx, template[rownames(xx), ])
    # xx[rownames(template), colnames(template)]
    ## better  http://stackoverflow.com/questions/31050787/r-how-to-match-join-2-matrices-of-different-dimensions-nrow-ncol/31051218#31051218
    xx <- as.data.frame.table(xx)
    template[as.matrix(xx[, 1:2])] <- xx$Freq
    template
  })
}
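
The key step in the function above is indexing a matrix with a two-column character matrix, where each row is a (row name, column name) pair. Here is a small sketch of that mechanism in isolation; the object names `small`, `big`, `long` and `idx` are made up for illustration.

```r
# A small matrix with dimnames, and a larger all-NA template
small <- matrix(c(1, 1, NA, 1), 2, 2,
                dimnames = list(c("Treatment 2", "Treatment 4"),
                                c("Normal", "Treatment 2")))
big <- matrix(NA_real_, 4, 3,
              dimnames = list(paste("Treatment", 1:4),
                              c("Normal", "Treatment 1", "Treatment 2")))

# as.data.frame.table() gives one row per cell: Var1 (row name),
# Var2 (column name), Freq (cell value)
long <- as.data.frame.table(small)

# A two-column character matrix selects cells by (row name, column name)
idx <- as.matrix(long[, 1:2])
big[idx] <- long$Freq
```

Every name in `small` has to exist in the dimnames of `big`, otherwise the assignment fails with a subscript error.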

Here is the data I am using:

l <- list(Group1 = matrix(c(1,1,1,NA,1,1,NA,NA,1), 3, 3,
                          dimnames = list(paste('Treatment', 1:3),
                                          c('Normal', paste('Treatment', 1:2)))),
          Group2 = matrix(c(1,1,NA,1), 2, 2,
                          dimnames = list(paste('Treatment', c(2,4)),
                                          c('Normal','Treatment 2'))))

# $Group1
#             Normal Treatment 1 Treatment 2
# Treatment 1      1          NA          NA
# Treatment 2      1           1          NA
# Treatment 3      1           1           1
# 
# $Group2
#             Normal Treatment 2
# Treatment 2      1          NA
# Treatment 4      1           1

You can use it like this. Note that when you don't supply factors, the function takes all the row and column names from your list of matrices:

fix_rc(l)

# $Group1
#             Normal Treatment 1 Treatment 2
# Treatment 1      1          NA          NA
# Treatment 2      1           1          NA
# Treatment 3      1           1           1
# Treatment 4     NA          NA          NA
# 
# $Group2
#             Normal Treatment 1 Treatment 2
# Treatment 1     NA          NA          NA
# Treatment 2      1          NA          NA
# Treatment 3     NA          NA          NA
# Treatment 4      1          NA           1

I'm not sure where the Treatment 3 column in your desired output came from, but you can get it by supplying the row and column names explicitly:

fix_rc(l, factors = list(paste('Treatment', 1:6),
                         c('Normal', paste('Treatment', 1:3))))

# $Group1
#             Normal Treatment 1 Treatment 2 Treatment 3
# Treatment 1      1          NA          NA          NA
# Treatment 2      1           1          NA          NA
# Treatment 3      1           1           1          NA
# Treatment 4     NA          NA          NA          NA
# Treatment 5     NA          NA          NA          NA
# Treatment 6     NA          NA          NA          NA
# 
# $Group2
#             Normal Treatment 1 Treatment 2 Treatment 3
# Treatment 1     NA          NA          NA          NA
# Treatment 2      1          NA          NA          NA
# Treatment 3     NA          NA          NA          NA
# Treatment 4      1          NA           1          NA
# Treatment 5     NA          NA          NA          NA
# Treatment 6     NA          NA          NA          NA
rawr

Not a complete solution, but wouldn't it be easier to get there with data frames?

df1 <- data.frame(normal=c(1,1,1)
, treatment1=c(NA, 1,1)
, treatment2=c(NA,NA,1)
, row.names=c("Treatment1", "Treatment2", "Treatment3")
)

df2 <- data.frame(normal=c(1,1)
    , treatment2=c(NA,1)
    , row.names=c("Treatment2", "Treatment4")
)

df1$names <- rownames(df1)
df2$names <- rownames(df2)

df3 <- merge(df1,df2, by="names", all=TRUE)

df3

       names normal.x treatment1 treatment2.x normal.y treatment2.y
1 Treatment1        1         NA           NA       NA           NA
2 Treatment2        1          1           NA        1           NA
3 Treatment3        1          1            1       NA           NA
4 Treatment4       NA         NA           NA        1            1

Now all you have to do is combine the columns based on their names.

David Wagle