I want to sort a data.frame by multiple columns, ideally using base R without any external packages (though if necessary, so be it). Having read How to sort a dataframe by column(s)?, I know I can accomplish this with the order() function as long as I either:

  1. Know the explicit names of each of the columns.
  2. Have a separate object representing each individual column by which to sort.

But what if I only have one vector containing multiple column names, of length that's unknown in advance?

Say the vector is called sortnames.

data[order(data[, sortnames]), ] won't work, because order() treats that as a single sorting argument.

data[order(data[, sortnames[1]], data[, sortnames[2]], ...), ] will work if and only if I specify the exact correct number of sortname values, which I won't know in advance.

Things I've looked at but not been totally happy with:

  1. eval(parse(text=paste("data[with(data, order(", paste(sortnames, collapse=","), ")), ]"))). Maybe this is fine, but I've seen plenty of hate for using eval(), so asking for alternatives seemed worthwhile.
  2. I may be able to use the Deducer library to do this with sortData(), but like I said, I'd rather avoid using external packages.

If I'm being too stubborn about not using external packages, let me know. I'll get over it. All ideas appreciated in advance!

MDe

1 Answer

You can use do.call:

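# Example data with three numeric columns
data <- data.frame(a = rnorm(10), b = rnorm(10), c = rnorm(10))
sortnames <- c("a", "b")
# data[sortnames] is a list of columns; do.call() passes each one to order() as a separate argument
data[do.call("order", data[sortnames]), ]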

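This trick is useful whenever you want to pass multiple arguments to a function and those arguments are already in a convenient list. A data.frame is itself a list of columns, so data[sortnames] can be handed to do.call() directly.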
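Here is a minimal sketch of the same idea on a small fixed data frame, so the result can be checked by eye. The names df, sorted and sorted_desc are only illustrative. The unname(as.list(...)) wrapper is a precaution that goes beyond the original answer: do.call() passes list names as argument names, so a column that happened to be called decreasing or method would otherwise be taken as one of order()'s own options.

```r
# Small fixed data frame so the result is reproducible
df <- data.frame(a = c(2, 1, 2, 1),
                 b = c(4, 3, 1, 2),
                 c = c("w", "x", "y", "z"))
sortnames <- c("a", "b")

# Drop the column names so none of them can be matched against
# order()'s own arguments (na.last, decreasing, method)
keys <- unname(as.list(df[sortnames]))

# Sort by a, then by b; the rows come back in the order 4, 2, 3, 1
sorted <- df[do.call(order, keys), ]

# Extra options can be appended to the argument list, e.g. descending order
sorted_desc <- df[do.call(order, c(keys, list(decreasing = TRUE))), ]
```

Because the keys are built from sortnames at run time, the same two lines work for any number of sort columns.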

flodel
mpiktas
  • That's great - I've read over the help file for do.call in the past, and always thought this must be really helpful, but never came up with a good reason to use it. Thanks! – MDe May 08 '13 at 13:38
  • Note that this works as-is only for a data.frame. For a matrix you must first convert the sort columns to a list: `lapply(sortnames, function(x) data[, x])` – mpiktas May 08 '13 at 13:42
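For the matrix case mentioned in the last comment, a minimal sketch might look like the following. It assumes the matrix has column names; m, keys and m_sorted are illustrative names, not from the original thread.

```r
# A small matrix with named columns (filled column by column)
m <- matrix(c(2, 1, 2, 1,
              4, 3, 1, 2),
            ncol = 2, dimnames = list(NULL, c("a", "b")))
sortnames <- c("a", "b")

# Pull each sort column out as its own vector, then hand the list to order()
keys <- lapply(sortnames, function(x) m[, x])

# drop = FALSE keeps the result a matrix even if only one row remains
m_sorted <- m[do.call(order, keys), , drop = FALSE]
```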