
I have a data set with 655 rows and 21 columns. I'm currently looping through each column and need to find the top ten values in each, but when I use the head() function, it doesn't keep the labels (they are names of bacteria; each column is a sample). Is there a way to create a sorted subset of the data that keeps the row names alongside the sorted values?

Right now I am doing

topten <- head(sort(genuscounts[, c(1, i)], decreasing = TRUE), n = 10)

but I am getting an error message, since column 1 is the list of names.

Thanks!

Hannah
  • Welcome to Stack Overflow. Please provide a reproducible example along with the expected results. For more info, have a [look at this link](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) – Sotos May 27 '16 at 14:05
  • maybe `lapply(2:11, function(i) head(mtcars[, c(1, i)][order(mtcars[, i], decreasing = TRUE), ], 10))` of sorts? `sort` returns the sorted vector, `order` returns the indices which you can use to sort the two columns together – rawr May 27 '16 at 14:17
  • Thanks, that works, but it gives me a "list of 2". Is there any way I could extract the entirety of one column now? When I try topten[,1] I get an error, although that should be the list of bacteria. – Hannah May 27 '16 at 14:45

1 Answer

sort() works on vectors, so it fails on your subset genuscounts[,c(1,i)], which is a data frame with two columns. In base R, you'll want to use order() instead, which returns the row indices in sorted order so that both columns can be reordered together:

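# Keep the name column (1) alongside sample column i
thisColumn <- genuscounts[, c(1, i)]
# order() gives the row indices sorted by the count column, largest first
topten <- head(thisColumn[order(thisColumn[, 2], decreasing = TRUE), ], 10)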
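
To run this over every sample column at once, and to get at the names afterwards (the "list of 2" issue raised in the comments), here is a minimal sketch. The toy data frame, its column names, and the counts are made up for illustration and stand in for the real genuscounts; n_top is set to 3 only because the toy data has five rows.

```r
# Toy stand-in for genuscounts: column 1 holds genus names, the rest are samples.
# All names and counts here are invented for illustration.
genuscounts <- data.frame(
  genus   = c("Bacteroides", "Prevotella", "Escherichia", "Lactobacillus", "Clostridium"),
  sampleA = c(120, 15, 300, 42, 7),
  sampleB = c(5, 250, 60, 90, 310),
  stringsAsFactors = FALSE
)

n_top <- 3  # use 10 on the real data

# For each sample column i, keep the name column plus column i,
# reorder the rows by column i (largest first), and take the first n_top rows.
top_by_sample <- lapply(2:ncol(genuscounts), function(i) {
  thisColumn <- genuscounts[, c(1, i)]
  head(thisColumn[order(thisColumn[, 2], decreasing = TRUE), ], n_top)
})
names(top_by_sample) <- names(genuscounts)[-1]

# The result is a list of data frames, so select one element with [[ ]]
# before indexing its columns.
top_by_sample[["sampleA"]]        # names and counts for sampleA
top_by_sample[["sampleA"]][, 1]   # just the genus names
```

Indexing the list itself with [, 1] fails because a list has no columns; each element has to be pulled out with [[ ]] first, after which it behaves like any other data frame.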

You could also use arrange() from the dplyr package, which provides a more user-friendly interface:

library(dplyr)
colName <- names(genuscounts)[i]
head(arrange(genuscounts[, c(1, i)], desc(.data[[colName]])), 10)

Because the column name is held in a string rather than typed as a bare name, it has to be looked up through the .data pronoun. Older dplyr releases used arrange_() for this, but that function is now deprecated.

Hope this helps!!

Toby Penk