As the title says, I am trying to group bonds into 10 deciles within each row.

My data currently looks like the table below. The columns are bond IDs, and the bonds are ranked *each month*. I am now trying to group these ranks into 10 deciles (e.g. if there were 100 valid bonds in 05/2015, ranks 1-10 would be group 1, ranks 11-20 group 2, and so on).

    date       a        b        c        d        e         f        g       h ...
    05/15      8        6        3        2        1         4        5       7 ...
    06/15      7        8        4        3        2         6        5       1 ...
    07/15      3        5        6        7        2         1        4       8 ...
    ...      

This is what I have tried so far:

     for (i in 1:116) 
     {for (j in 1:283) {if (j <=max.col(i)/10){print j = 1} 
     else if ( max.col(i)/10<j<=max.col(i)*2/10){print j = 2}
     else if ( max.col(i)*2/10<j<=max.col(i)*3/10){print j = 3}
     else if ( max.col(i)*3/10<j<=max.col(i)*4/10){print j = 4}
     else if ( max.col(i)*4/10<j<=max.col(i)*5/10){print j = 5}
     else if ( max.col(i)*5/10<j<=max.col(i)*6/10){print j = 6}
     else if ( max.col(i)*6/10<j<=max.col(i)*7/10){print j = 7}
     else if ( max.col(i)*7/10<j<=max.col(i)*8/10){print j = 8}
     else if ( max.col(i)*8/10<j<=max.col(i)*9/10){print j = 9}
     else if ( max.col(i)*9/10<j<=max.col(i)*10/10){print j = 10}
      }}

I am new to R and this was the best I could think of. Please help. Can I use the `quantile` function to do this?

Thanks, hk

[update]

I have tried modifying the loop, but it still does not work:

     for (i in 1:dim(bondsr3)[1])
     {for (j in 1:dim(bondsr3)[2]) {
     if (is.na(bondsr3[i,j])) { }
     else if (as.numeric(bondsr3[i,j]) <= max.col(i)%/%10){bondsr4[i,j]<-2}
     else if ( max.col(i)%/%10<as.numeric(bondsr3[i,j]) & as.numeric(bondsr3[i,j])<=(max.col(i)%*%2)%/%10){bondsr4[i,j]<-2}
     else if ( (max.col(i)%*%2)%/%10<as.numeric(bondsr3[i,j]) & as.numeric(bondsr3[i,j])<=(max.col(i)%*%3)%/%10){bondsr4[i,j]<-3}
     else if ( (max.col(i)%*%3)%/%10<as.numeric(bondsr3[i,j]) & as.numeric(bondsr3[i,j])<=(max.col(i)%*%4)%/%10){bondsr4[i,j]<-4}
     else if ( (max.col(i)%*%4)%/%10<as.numeric(bondsr3[i,j]) & as.numeric(bondsr3[i,j])<=(max.col(i)%*%5)%/%10){bondsr4[i,j]<-5}
     else if ( (max.col(i)%*%5)%/%10<as.numeric(bondsr3[i,j]) & as.numeric(bondsr3[i,j])<=(max.col(i)%*%6)%/%10){bondsr4[i,j]<-6}
     else if ( (max.col(i)%*%6)%/%10<as.numeric(bondsr3[i,j]) & as.numeric(bondsr3[i,j])<=(max.col(i)%*%7)%/%10){bondsr4[i,j]<-7}
     else if ( (max.col(i)%*%7)%/%10<as.numeric(bondsr3[i,j]) & as.numeric(bondsr3[i,j])<=(max.col(i)%*%8)%/%10){bondsr4[i,j]<-8}
     else if ( (max.col(i)%*%8)%/%10<as.numeric(bondsr3[i,j]) & as.numeric(bondsr3[i,j])<=(max.col(i)%*%9)%/%10){bondsr4[i,j]<-9}
     else if ( (max.col(i)%*%9)%/%10<as.numeric(bondsr3[i,j]) & as.numeric(bondsr3[i,j])<=max.col(i)){bondsr4[i,j]<- 10}}}

but I still get this warning message:

      Warning message:
      In `[<-.factor`(`*tmp*`, iseq, value = 10) :
      invalid factor level, NA generated
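
A note on the warning: `[<-.factor` in the message indicates that the assignment target is a factor, so a plausible reading (an assumption, since the structure of `bondsr4` is not shown) is that its columns were read in as factors rather than numbers. The minimal sketch below uses a made-up vector `x` to reproduce the same warning, and also shows that `as.numeric()` applied directly to a factor returns the internal level codes rather than the printed values.

```r
# Hypothetical column that was read in as a factor instead of a number.
x <- factor(c("3", "11", "25"))

# Assigning a value that is not one of the existing levels stores NA and
# raises "invalid factor level, NA generated".
y <- x
y[1] <- 10

# as.numeric() on a factor gives the level codes, not 3, 11, 25.
codes <- as.numeric(x)

# Going through character first recovers the actual numbers, after which
# ordinary numeric assignment works.
x_num <- as.numeric(as.character(x))
x_num[1] <- 10
```

If that reading is right, converting the rank columns with `as.numeric(as.character(...))` before the loop should make the assignments behave as intended.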

My desired output is 1 for ranks 1 to 10, 2 for ranks 11 to 20, and so on up to 10 for ranks 91 to 100, when the number of valid bonds in a given month is 100.
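
One way to express this rule is `decile = ceiling(10 * rank / n)`, where `n` is the number of valid bonds in that month. The following is a rough base-R sketch of that formula, not a tested solution for the real data: the matrix `ranks` is a made-up stand-in for the table above (with one `NA` added to represent a bond that has no valid rank), and it assumes the ranks are already numeric.

```r
# Hypothetical sample: rows are months, columns are bond IDs, values are ranks.
ranks <- matrix(
  c(8,  6, 3, 2, 1, 4, 5, 7,
    7, NA, 4, 3, 2, 6, 5, 1),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("05/15", "06/15"), letters[1:8])
)

# Number of valid (non-NA) bonds in each month.
n_valid <- rowSums(!is.na(ranks))

# n_valid is recycled down the columns, so every row is divided by its own
# count. NA ranks stay NA.
deciles <- ceiling(10 * ranks / n_valid)

# Check of the rule for a month with exactly 100 valid bonds:
# ranks 1-10 map to 1, 11-20 to 2, ..., 91-100 to 10.
check_100 <- ceiling(10 * (1:100) / 100)
```

With fewer than 10 valid bonds in a month, some decile numbers are simply never used, so the result is only meaningful for reasonably large cross-sections.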

Any help would be appreciated.

hk824
  • What's your desired output? – Heroka Mar 14 '16 at 20:54
  • You should probably reshape your data to long format – talat Mar 14 '16 at 20:55
  • @Heroka I want the ranks to be grouped in 10 decile portfolio. So as I mentioned above, If there is 100 bonds I want rankings 1-10 to be all shown as 1, 11-20 to be 2, 21-30 to be 3....91-100 as 10. – hk824 Mar 14 '16 at 20:57
  • @docendo I also do have the long formatted version. I just thought this way would be easier since I am trying to group them by each month..? – hk824 Mar 14 '16 at 21:00
  • Take a look at https://stackoverflow.com/questions/4126326 and https://stackoverflow.com/questions/2185252/ – talat Mar 14 '16 at 21:06
  • @docendo Thank you. I have already looked at the questions and I cannot figure out how to use "temp$quartile <- with(temp, cut(value, breaks=quantile(value, probs=seq(0,1, by=0.25), na.rm=TRUE), include.lowest=TRUE)) " function when I have to repeat this by each month – hk824 Mar 14 '16 at 21:12
  • Does this work `library(tidyr);library(dplyr);gather(df, bond, value, -date) %>% group_by(date) %>% mutate(decile = ntile(value, 10))`? – talat Mar 14 '16 at 21:13
  • @docendo It doesn't work.. – hk824 Mar 15 '16 at 00:03
  • Can you show the desired output and be a bit more descriptive than just saying "it doesn't work"? – talat Mar 15 '16 at 11:07
  • @docendo I am trying to group rankings into 10 buckets. My current dataset is ranked each row; so all the bond IDs are ranked each month. What I want to do is to put these bonds into 10 decile buckets based on their rankings **each month**. I have tried the loop above but it keeps showing the warning mentioned above. I also tried using `bondsr4 <- with(bondsr2, cut(rank, breaks=quantile(rank, probs=seq(0,1,0.1), na.rm=TRUE, include.lowest=TRUE, group_by=date)))` but it also shows an error that says `no applicable method for 'group_by_' applied to an object of class "Date"`. – hk824 Mar 15 '16 at 17:59
  • @docendo when I try `bondsr2 %>% gather(ID, rank, -date) %>% group_by(date) %>% mutate(decile = ntiles(bondsr2, rank, bins=10))`, `Error: attempt to select less than one element` keeps showing – hk824 Mar 15 '16 at 18:00
  • The code you tried is incorrect. Try the code from my comment AND provide your desired output in your question – talat Mar 15 '16 at 18:10
  • @docendo oh my god it works!!!! thanks so much – hk824 Mar 15 '16 at 18:31
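
For reference, the pipeline from talat's comment can be run on a small data frame as sketched below. The data frame `df` is made up to mirror the sample table in the question, and the trailing `ungroup()` is an addition for convenience. Note that `ntile()` splits each month into buckets of near-equal size, so when the number of bonds is not a multiple of 10 its output may differ slightly from a strict `ceiling(10 * rank / n)` rule.

```r
library(tidyr)
library(dplyr)

# Hypothetical wide data: one row per month, one column per bond ID.
df <- data.frame(
  date = c("05/15", "06/15", "07/15"),
  a = c(8, 7, 3), b = c(6, 8, 5), c = c(3, 4, 6), d = c(2, 3, 7),
  e = c(1, 2, 2), f = c(4, 6, 1), g = c(5, 5, 4), h = c(7, 1, 8)
)

# Reshape to long format (one row per date/bond pair), then bucket the
# ranks into up to 10 groups separately within each month.
deciles <- gather(df, bond, value, -date) %>%
  group_by(date) %>%
  mutate(decile = ntile(value, 10)) %>%
  ungroup()
```

The result stays in long format with columns `date`, `bond`, `value` and `decile`; it can be reshaped back to wide if the one-row-per-month layout is needed.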

0 Answers