I am trying to get the values associated with ranked variables in R. I am calculating annualized standard deviations for a number of commodities, and I then rank the standard deviations across years for each commodity. Although I understand the output, I am looking for a better way to associate the year values with the ranked output. My code is below:

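library(doBy)  # provides summaryBy()

# annualized standard deviation: sd scaled by the square root of the number of observations
annualizedSD <- function(x)
{
  sd(x) * sqrt(length(x))
}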
sdByContractByYear <- summaryBy(Settle~contract+yr,data=commodityData, FUN=annualizedSD)
rankSDByContractByYear <- summaryBy(-Settle.annualizedSD~contract, data=sdByContractByYear, FUN=rank)

The output rankings for each year are labeled "Settle.annualizedSD.FUN1", "...FUN2", "...FUN3", etc. What I am looking for is the 'yr' (year) value, e.g. 1995, 1996, 1997, instead of FUN1, FUN2, and so on.

How do I get R's 'rank' function to provide the label of the ranks by year?

fibrou
  • First of all, you should name the package you are using; second, please give a reproducible example: http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example. After you have done this I might be able to help you. – grrgrrbla Jun 23 '15 at 10:58
  • package used "doBy"; truncated output:
    > rankSDByContractByYear
        contract -Settle.annualizedSD.FUN1 -Settle.annualizedSD.FUN2
    1     coffee                        11                        10
    2       corn                        20                        10
    3     cotton                         9                        19
    4   crudeoil                        19                        10
    5 heatingoil                        18                        11
    6 naturalGas                        16                         3
    ....
    – fibrou Jun 23 '15 at 13:33
  • @fibrou: This is not a reproducible example. Please read https://stackoverflow.com/help/mcve and provide something that can be copied and pasted into R. – Rob Hall Jun 23 '15 at 13:56

1 Answer

I did not try to solve this with summaryBy alone, and I also see some inconsistency in its output when there are NAs. With base R you could get the year labels like this:

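namrank <- function(z, df, dec = FALSE) {
  dc <- if (dec) -1 else 1   # -1 ranks in descending order, 1 in ascending order
  rk <- df$Settle.annualizedSD
  names(rk) <- df$yr         # label each SD with its year; rank() keeps the names
  # one row: the contract index plus one rank column per year
  data.frame(Cont = z, t(rank(rk * dc, na.last = "keep")))
}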
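
The helper above relies on rank() carrying over the names of its input. A minimal sketch of just that behaviour, using made-up SD values and years (the numbers and the variable names sdByYear and rankByYear are illustrative assumptions, not taken from the question's data):

```r
# Hypothetical annualized SDs for one contract, named by year
sdByYear <- c("1995" = 0.30, "1996" = 0.12, "1997" = 0.25)

# Negating the values ranks the largest SD first; the year names are preserved
rankByYear <- rank(-sdByYear, na.last = "keep")
rankByYear
# 1995 1996 1997
#    1    3    2
```

Because the names survive, transposing the result and wrapping it in a data frame gives one column per year.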

#reproducible example (annualizedSD is the function defined in the question)
library(doBy)  # for summaryBy()
n <- 100
set.seed(1234)
commodityData <- data.frame(Settle = rnorm(n, 5, 4), contract = sample(5, n, TRUE), yr = sample(2008:2012, n, TRUE))
sdByContractByYear <- summaryBy(Settle ~ contract + yr, data = commodityData, FUN = annualizedSD)
rankSDByContractByYear <- summaryBy(-Settle.annualizedSD ~ contract, data = sdByContractByYear, FUN = rank, na.last = "keep")

#solution
cont <- split(sdByContractByYear, sdByContractByYear$contract)
lisdfs <- lapply(seq_along(cont), function(z) namrank(z, cont[[z]], dec = TRUE))
rankcby <- Reduce(function(...) merge(..., all = TRUE), lisdfs)
rankcby[, order(names(rankcby))]

  Cont X2008 X2009 X2010 X2011 X2012
1    1    NA     3     1    NA     2
2    2     3     1     2     4     5
3    3     5     4     1     3     2
4    4     4     2     1     3     5
5    5     3    NA     4     2     1
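
As a possible alternative, the same contract-by-year layout can probably be produced without doBy or merge(), by building a matrix with tapply() and ranking each row. This is only a sketch under the assumptions of the simulated data above; sdMat and rankMat are names introduced here for illustration:

```r
# Simulated data with the same shape as in the example above (an assumption)
set.seed(1234)
n <- 100
commodityData <- data.frame(
  Settle   = rnorm(n, 5, 4),
  contract = sample(5, n, TRUE),
  yr       = sample(2008:2012, n, TRUE)
)

annualizedSD <- function(x) sd(x) * sqrt(length(x))

# Contract-by-year matrix of annualized SDs; dimnames come from the data.
# Cells with fewer than two observations are NA.
sdMat <- with(commodityData, tapply(Settle, list(contract, yr), annualizedSD))

# Rank each row in descending order; rank() keeps the year names,
# and t() puts contracts back in rows and years in columns.
rankMat <- t(apply(sdMat, 1, function(r) rank(-r, na.last = "keep")))
rankMat
```

Here the row names are the contract values and the column names are the years, so no FUN1, FUN2, ... labels appear.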
Robert