This post got me started, but I haven't been able to adapt the expression to produce the desired output. As a simplified version of the file contents, let's say I create the following matrix in R:

set.seed(14)
B = matrix(sample(1:100, 9),
      nrow=3,
      ncol=3)

colnames(B) <- c("sam1", "sam2", "sam3")
rownames(B) <- c("obs1", "obs2", "obs3")

It should look something like this:

        sam1    sam2    sam3
obs1    26      54      88
obs2    64      95      40 
obs3    94      49      45

What I'd like to do is loop through this matrix to find the maximum value in each column, then write a new file containing that value along with its row name and column name. The desired output would be a new file structured as follows:

sam1    94    obs3
sam2    95    obs2
sam3    88    obs1

If it helps, the input need not be a matrix. It could also be structured as a simple .csv file in which the obs names are the first column (rather than row names) and the sam names are the elements of the first row (after the first column).
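To make that alternative layout concrete, here is a minimal sketch of how such a .csv could be read back into a matrix. The temporary file and its contents are assumptions standing in for the real file, which is not shown here.

```r
# Write a small stand-in .csv: obs names in the first column,
# sam names across the first row (the top-left cell is empty).
csv_file <- tempfile(fileext = ".csv")  # placeholder for the real file path
writeLines(c(",sam1,sam2,sam3",
             "obs1,26,54,88",
             "obs2,64,95,40",
             "obs3,94,49,45"),
           csv_file)

# row.names = 1 turns the first column into row names,
# so as.matrix() yields a numeric matrix with dimnames.
B <- as.matrix(read.csv(csv_file, row.names = 1))
```

Either layout therefore ends up as the same named matrix once loaded.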

Thank you for your consideration

Devon O'Rourke
  • Sorry, I just checked and realized I had used `rows` instead of `columns`. I have edited the answer and it now works for the task. – Onyambu Jan 29 '18 at 06:26

3 Answers

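n <- max.col(t(B), ties.method = "first")  # max.col() works row-wise, so transpose to scan each column
data.frame(w = colnames(B), x = B[cbind(n, seq_len(ncol(B)))], y = rownames(B)[n])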
     w  x    y
1 sam1 94 obs3
2 sam2 95 obs2
3 sam3 88 obs1
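For readers wondering how the indexing works, here is a minimal sketch on a hard-coded copy of the toy matrix with an extra row, so that it is no longer square. The fourth row, the column names of `res`, and the output path (a temporary file) are assumptions made for illustration. Indexing a matrix with a two-column matrix of (row, column) pairs extracts one element per pair.

```r
# Hard-coded copy of the toy matrix plus a hypothetical fourth row,
# so the number of rows and columns differ.
B <- matrix(c(26, 64, 94, 10,
              54, 95, 49, 20,
              88, 40, 45, 30),
            nrow = 4,
            dimnames = list(c("obs1", "obs2", "obs3", "obs4"),
                            c("sam1", "sam2", "sam3")))

# Row index of the maximum in each column; max.col() scans rows, hence t(B).
n <- max.col(t(B), ties.method = "first")

# The two-column index matrix picks out B[n[j], j] for every column j.
res <- data.frame(sample = colnames(B),
                  max    = B[cbind(n, seq_len(ncol(B)))],
                  obs    = rownames(B)[n])

# Write a tab-separated file without a header or row names.
out_file <- tempfile(fileext = ".txt")  # placeholder output path
write.table(res, out_file, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)
```

Setting `ties.method = "first"` makes the result reproducible, because the default for `max.col()` breaks ties at random.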
Onyambu

@Onyambu beat me to the punch but here is my solution using apply:

C <- data.frame(row.names = colnames(B),
                    MaxVal = apply(B, 2, max),
                    WhichMax = apply(B, 2, which.max))
C
     MaxVal WhichMax
sam1     94        3
sam2     95        2
sam3     88        1
JBGruber
  • Thanks to both @Onyambu and JonGrub. Both solutions worked for the toy example provided here. Curiously, I received an error with Onyambu's solution when I used my real data: `Error in reads.mat[cbind(n <- max.col(reads.mat), seq(nrow(reads.mat)))] : subscript out of bounds`. With JonGrub's solution my real data runs without error, but the resulting data.frame contains the row position of each maximum (3, 2, 1 in the toy example) rather than the row name, which is what I intended. Nevertheless, this solution works, and a bit more data wrangling gets me through. – Devon O'Rourke Jan 28 '18 at 18:27
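As the comment above notes, `which.max()` returns a position rather than a name. One possible way to get the names, sketched here on a hard-coded copy of the toy matrix, is to use those positions to index `rownames(B)`. The column name `MaxObs` is an illustrative choice, not part of the original answer.

```r
# Hard-coded copy of the toy matrix (filled column by column).
B <- matrix(c(26, 64, 94,
              54, 95, 49,
              88, 40, 45),
            nrow = 3,
            dimnames = list(c("obs1", "obs2", "obs3"),
                            c("sam1", "sam2", "sam3")))

# Row position of each column's maximum (first one if there are ties).
idx <- apply(B, 2, which.max)

# Look the positions up in the row names instead of storing them directly.
C <- data.frame(row.names = colnames(B),
                MaxVal = apply(B, 2, max),
                MaxObs = rownames(B)[idx])
```

This assumes every column has at least one non-missing value; for a column that is entirely `NA`, `which.max()` returns an empty result and the lookup would need extra handling.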

With data.table you could do:

library(data.table)
B <- setDT(as.data.frame(B))
B[,name := c("obs1", "obs2", "obs3")]

B looks like this:

   sam1 sam2 sam3 name
1:   26   54   88 obs1
2:   64   95   40 obs2
3:   94   49   45 obs3

Then you simply melt and take the max value for each variable group:

melt(B, id.vars = "name")[, .SD[value == max(value), .(value, name)], by = variable]

   variable value name
1:     sam1    94 obs3
2:     sam2    95 obs2
3:     sam3    88 obs1
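A variation on the same idea, offered as a sketch rather than a definitive version: the row names can be carried over during conversion instead of being typed by hand, and `which.max()` can replace the `value == max(value)` filter so that exactly one row per column is returned even when there are ties. It starts from a hard-coded copy of the toy matrix; the object names `DT`, `long`, and `res` are illustrative.

```r
library(data.table)

# Hard-coded copy of the toy matrix (filled column by column).
B <- matrix(c(26, 64, 94,
              54, 95, 49,
              88, 40, 45),
            nrow = 3,
            dimnames = list(c("obs1", "obs2", "obs3"),
                            c("sam1", "sam2", "sam3")))

# keep.rownames = TRUE stores the row names in a column called "rn".
DT <- as.data.table(B, keep.rownames = TRUE)
setnames(DT, "rn", "name")

# Long format: one row per (name, variable) pair.
long <- melt(DT, id.vars = "name")

# For each original column, keep only the row holding its first maximum.
res <- long[, .SD[which.max(value)], by = variable]
```

With the equality filter, a column whose maximum occurs twice would produce two output rows; which behaviour is preferable depends on how ties should be reported.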
denis