I am creating a function to sort data.frames (Why? Because of reasons). Some of the criteria:
- Works on data.frames
- Aimed at non-interactive use
- Uses base R only (no dependency on any non-base package)
The function looks like this now:
#' @title Sort a data.frame
#' @description Sort a data.frame based on one or more columns
#' @param x A data.frame class object
#' @param by A character vector of one or more column names in x. Defaults to NULL, which sorts by all columns.
#' @param decreasing A single logical. If TRUE, the overall sort order is reversed.
#' @return A data.frame.
#'
sortdf <- function(x, by = NULL, decreasing = FALSE) {
  if (!is.data.frame(x)) stop("Input is not a data.frame.")
  # NULL means: sort by every column, in column order
  if (is.null(by)) by <- colnames(x)
  if (!all(by %in% colnames(x))) stop("One or more items in 'by' was not found.")
  # unname() so column names are never matched to order()'s own arguments
  ord <- do.call(order, unname(x[by]))
  # Reversing flips the overall direction; tied rows end up in reverse original order
  if (decreasing) ord <- rev(ord)
  x[ord, , drop = FALSE]
}
Examples
sortdf(iris)
sortdf(iris,"Petal.Length")
sortdf(iris,"Petal.Length",decreasing=TRUE)
sortdf(iris,c("Petal.Length","Sepal.Length"))
What works so far
- Sort data.frame by one or more columns
- Adjust overall direction of sort
But I need one more feature: the ability to set the sorting direction for each column separately, by passing a vector of directions with one entry per column specified in by. For example:
sortdf(iris,by=c("Sepal.Width","Petal.Width"),dir=c("up","down"))
Any ideas/suggestions on how to implement this?
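One possible base-R route, offered as a sketch rather than a definitive answer: order() accepts a vector for decreasing when method = "radix" is used, with one entry per sort key. The helper name sortdf_dir below is hypothetical, and the "up"/"down" values simply follow the example call above. One caveat to keep in mind is that the radix method orders character columns as in the C locale.

```r
# Hypothetical helper (not the original sortdf): per-column sort direction
# using only base R. 'dir' holds "up" or "down", one entry per column in 'by'.
sortdf_dir <- function(x, by = colnames(x), dir = "up") {
  if (!is.data.frame(x)) stop("Input is not a data.frame.")
  if (!all(by %in% colnames(x))) stop("One or more items in 'by' was not found.")
  if (!all(dir %in% c("up", "down"))) stop("'dir' must contain only 'up' or 'down'.")
  # Recycle a single direction across all sort columns
  dir <- rep_len(dir, length(by))
  # Unnamed list of key columns, so names cannot clash with order()'s arguments
  keys <- unname(as.list(x[by]))
  # With method = "radix", 'decreasing' may have one entry per key
  ord <- do.call(order, c(keys, list(decreasing = dir == "down", method = "radix")))
  x[ord, , drop = FALSE]
}

res <- sortdf_dir(iris, by = c("Sepal.Width", "Petal.Width"), dir = c("up", "down"))
head(res[, c("Sepal.Width", "Petal.Width")])
```

Mapping "up"/"down" to a logical vector keeps the interface from the example while leaving all of the ordering work to order().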
Update
Benchmark of answers below:
library(microbenchmark)
library(ggplot2)
m <- microbenchmark::microbenchmark(
  "base 1u"=iris[order(iris$Petal.Length),],
  "Maël 1u"=sortdf(iris,"Petal.Length"),
  "Mikko 1u"=sortdf1(iris,"Petal.Length"),
  "arrange 1u"=dplyr::arrange(iris,Petal.Length),
  "base 1d"=iris[order(iris$Petal.Length,decreasing=TRUE),],
  "Maël 1d"=sortdf(iris,"Petal.Length",dir="down"),
  "Mikko 1d"=sortdf1(iris,"Petal.Length",decreasing=TRUE),
  "arrange 1d"=dplyr::arrange(iris,-Petal.Length),
  "base 2d"=iris[order(iris$Petal.Length,iris$Sepal.Length,decreasing=TRUE),],
  "Maël 2d"=sortdf(iris,c("Petal.Length","Sepal.Length"),dir=c("down","down")),
  "Mikko 2d"=sortdf1(iris,c("Petal.Length","Sepal.Length"),decreasing=TRUE),
  "arrange 2d"=dplyr::arrange(iris,-Petal.Length,-Sepal.Length),
  # 1u1d: Petal.Length ascending, Sepal.Length descending (negate the key; rev() would only reverse positions)
  "base 1u1d"=iris[order(iris$Petal.Length,-iris$Sepal.Length),],
  "Maël 1u1d"=sortdf(iris,c("Petal.Length","Sepal.Length"),dir=c("up","down")),
  "Mikko 1u1d"=sortdf1(iris,c("Petal.Length","Sepal.Length"),decreasing=c(FALSE,TRUE)),
  "arrange 1u1d"=dplyr::arrange(iris,Petal.Length,-Sepal.Length),
  times=1000
)
autoplot(m)+theme_bw()
R 4.1.0
dplyr 1.0.7
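Before reading much into the timings, it may be worth confirming that the mixed-direction expressions really produce the same ordering. A minimal base-R sketch follows. The variable names are illustrative and not part of the benchmark, and it compares the sorted key columns rather than row indices, since tied rows may legitimately come out in a different order.

```r
# Two base-R spellings of "Petal.Length ascending, Sepal.Length descending"
by_negation <- order(iris$Petal.Length, -iris$Sepal.Length)
by_radix <- order(iris$Petal.Length, iris$Sepal.Length,
                  decreasing = c(FALSE, TRUE), method = "radix")

# Compare the sorted key values column by column
same_keys <- identical(iris$Petal.Length[by_negation], iris$Petal.Length[by_radix]) &&
  identical(iris$Sepal.Length[by_negation], iris$Sepal.Length[by_radix])
same_keys
```

The same comparison could be repeated for each function in the benchmark, so that the timings compare like with like.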