The general notation for sorting using the order function is this:
myData.sorted = myData[ order(-myData[,date.idx],-myData[,(1+date.idx)]), ];
I want to sort on a variable number of numeric columns (n.cols), in the order they are passed into a function, each with its own sort direction:
sortDataFrameByNumericColumns = function (ddf, mycols, direction="DESC")
{
n.cols = length(mycols);
n.dirs = length(direction);
sdf = ddf;
vecs = matrix(NA, nrow=dim(sdf)[1],ncol=n.cols);
for(i in 1:n.cols)
{
idx = which( names(sdf)== mycols[i] );
dir = if(n.dirs==1) { direction } else { direction[i]};
if(dir == "ASC")
{
vecs[,i] = sdf[,idx];
} else {
# DESC
vecs[,i] = -sdf[,idx];
}
}
#########################################
## how I want it, doesn't work
#fdf = sdf[order(vecs), ];
#########################################
## non-variadic approach, does work
fdf = sdf[order( vecs[,1],vecs[,2],vecs[,3] ), ];
fdf;
}
# basic usage
mycols = c("year","week","day");
fdf = sortDataFrameByNumericColumns (ddf,mycols,"ASC"); # sort all cols ASC
md5_email year week day V01
7 1768a550126bbf820dd89edecb92895c 2008 29 207 2.6
5 15712907fc659a6714e06659256aa0a2 2009 35 244 2.6
6 3ec0f0a866eeb8e0b419cccd6ea807b5 2010 9 60 4.2
8 8f2a765187594755f64c8d11bf34a3cc 2010 10 67 3.4
10 3b87bffacdd35679a992eadf816120a2 2010 31 216 3.4
2 db539502caf70a3074ac646d21198f5a 2011 16 111 3.4
4 4ee5096244e139d1d87eeaa0bef29d71 2011 21 143 1.0
9 3605e776744be0d11583305b0ede6419 2013 40 280 4.2
1 06da8174757feffd764c7232f965cd7a 2015 4 28 3.4
3 c29e24b16f1c8c6e897b42b45dee9297 2019 2 17 5.0
# basic usage
fdf = sortDataFrameByNumericColumns (ddf,mycols,"DESC"); # sort all cols DESC
md5_email year week day V01
3 c29e24b16f1c8c6e897b42b45dee9297 2019 2 17 5.0
1 06da8174757feffd764c7232f965cd7a 2015 4 28 3.4
9 3605e776744be0d11583305b0ede6419 2013 40 280 4.2
4 4ee5096244e139d1d87eeaa0bef29d71 2011 21 143 1.0
2 db539502caf70a3074ac646d21198f5a 2011 16 111 3.4
10 3b87bffacdd35679a992eadf816120a2 2010 31 216 3.4
8 8f2a765187594755f64c8d11bf34a3cc 2010 10 67 3.4
6 3ec0f0a866eeb8e0b419cccd6ea807b5 2010 9 60 4.2
5 15712907fc659a6714e06659256aa0a2 2009 35 244 2.6
7 1768a550126bbf820dd89edecb92895c 2008 29 207 2.6
# per-column directions
mydirs = c("ASC","DESC","ASC");
fdf = sortDataFrameByNumericColumns (ddf,mycols,mydirs); # custom direction on each column ...
md5_email year week day V01
7 1768a550126bbf820dd89edecb92895c 2008 29 207 2.6
5 15712907fc659a6714e06659256aa0a2 2009 35 244 2.6
10 3b87bffacdd35679a992eadf816120a2 2010 31 216 3.4
8 8f2a765187594755f64c8d11bf34a3cc 2010 10 67 3.4
6 3ec0f0a866eeb8e0b419cccd6ea807b5 2010 9 60 4.2
4 4ee5096244e139d1d87eeaa0bef29d71 2011 21 143 1.0
2 db539502caf70a3074ac646d21198f5a 2011 16 111 3.4
9 3605e776744be0d11583305b0ede6419 2013 40 280 4.2
1 06da8174757feffd764c7232f965cd7a 2015 4 28 3.4
3 c29e24b16f1c8c6e897b42b45dee9297 2019 2 17 5.0
I am using the order function as the engine. From what I have read in other posts, it is the fastest way to perform the operation. The manual states that the value I am passing in (currently a matrix, vecs) needs to be a sequence of vectors. What does that mean?
?order
...
a sequence of numeric, complex, character or logical vectors, all of the same length, or a classed R object.
It needs a sequence of equal-length vectors, but I have a matrix, vecs. How do I cast it to a sequence of vectors? That is the primary question.
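To make the distinction concrete, here is a minimal sketch on a toy matrix (the values are hypothetical, not taken from my data). It assumes only base R and shows what order returns when it gets one matrix argument versus one argument per column:

```r
# Toy matrix with hypothetical values: 3 rows, 2 sort keys
vecs <- cbind(c(2, 1, 2), c(5, 9, 3))

# Passed as a single argument, the matrix is treated as one long vector
# of length nrow * ncol, so the result indexes cells rather than rows
length(order(vecs))             # 6, not 3

# Passed as separate arguments, column 1 is the primary key and
# column 2 breaks ties between equal values in column 1
order(vecs[, 1], vecs[, 2])     # 2 3 1
```

That would explain why sdf[order(vecs), ] does not give a row ordering: the index vector has nrow * ncol entries instead of one entry per row.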
So this works ... but is not variadic.
fdf = sdf[order(vecs[,1],vecs[,2],vecs[,3]), ];
If I could somehow cast vecs as vecs[,1],vecs[,2],vecs[,3] variadically, that would be the solution. I recognize do.call may be another approach, but I am specifically trying to understand the ... notation of the base::order function.
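For reference, this is roughly what I understand the do.call route to look like. It is only a sketch on a toy matrix with hypothetical values; the names cols and idx are mine, and I am assuming that each element of an unnamed list ends up as one argument in the ... of order:

```r
# Toy matrix with hypothetical values: 3 rows, 2 sort keys
vecs <- cbind(c(2, 1, 2), c(5, 9, 3))

# Split the matrix into an unnamed list holding one vector per column
cols <- lapply(seq_len(ncol(vecs)), function(j) vecs[, j])

# do.call() passes each list element to order() as a separate argument,
# which works for any number of columns
idx <- do.call(order, cols)

# Same result as spelling the columns out by hand
identical(idx, order(vecs[, 1], vecs[, 2]))   # TRUE
```

The list is left unnamed on purpose, so that no element can accidentally match one of order's named arguments such as decreasing.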
Here is a sample test case of the data frame:
x = sdf[sample(1:838,10),1:5];
x
md5_email year week day V01
733 06da8174757feffd764c7232f965cd7a 2015 4 28 3.4
546 db539502caf70a3074ac646d21198f5a 2011 16 111 3.4
811 c29e24b16f1c8c6e897b42b45dee9297 2019 2 17 5.0
585 4ee5096244e139d1d87eeaa0bef29d71 2011 21 143 1.0
249 15712907fc659a6714e06659256aa0a2 2009 35 244 2.6
344 3ec0f0a866eeb8e0b419cccd6ea807b5 2010 9 60 4.2
96 1768a550126bbf820dd89edecb92895c 2008 29 207 2.6
346 8f2a765187594755f64c8d11bf34a3cc 2010 10 67 3.4
717 3605e776744be0d11583305b0ede6419 2013 40 280 4.2
410 3b87bffacdd35679a992eadf816120a2 2010 31 216 3.4
And in text format (copy the text below with Ctrl+C, then run the read.table command that follows it):
"md5_email"|"year"|"week"|"day"|"V01"
"06da8174757feffd764c7232f965cd7a"|2015|4|28|3.4
"db539502caf70a3074ac646d21198f5a"|2011|16|111|3.4
"c29e24b16f1c8c6e897b42b45dee9297"|2019|2|17|5
"4ee5096244e139d1d87eeaa0bef29d71"|2011|21|143|1
"15712907fc659a6714e06659256aa0a2"|2009|35|244|2.6
"3ec0f0a866eeb8e0b419cccd6ea807b5"|2010|9|60|4.2
"1768a550126bbf820dd89edecb92895c"|2008|29|207|2.6
"8f2a765187594755f64c8d11bf34a3cc"|2010|10|67|3.4
"3605e776744be0d11583305b0ede6419"|2013|40|280|4.2
"3b87bffacdd35679a992eadf816120a2"|2010|31|216|3.4
which you can read from the clipboard:
x = read.table(file = "clipboard", sep = "|", header=TRUE);
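If reading from "clipboard" is not available on your platform, a sketch that avoids the clipboard is to paste the text into a string and use the text argument of read.table. Only the first three rows of the sample are shown here, and txt is just an illustrative name:

```r
# First three rows of the sample above, pasted into a string
txt <- '"md5_email"|"year"|"week"|"day"|"V01"
"06da8174757feffd764c7232f965cd7a"|2015|4|28|3.4
"db539502caf70a3074ac646d21198f5a"|2011|16|111|3.4
"c29e24b16f1c8c6e897b42b45dee9297"|2019|2|17|5'

# text= makes read.table parse the string instead of opening a file
x <- read.table(text = txt, sep = "|", header = TRUE)
str(x)
```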