
I have a data set with 200,000 rows. What I want to do is simple, but I haven't found an answer on how to do it. I tried this:

    data$DCRank<-cut(data$DC,quantile(data$DC,(0:10)/10),include.lowest=TRUE)

But that doesn't give me a 1:10 result.

Ally Kat
  • I get an error: `data$DCRank<-cut(data$DC=(1:200)%%10+1, breaks=0:10, labels=0:9)` gives `Error: unexpected '=' in "data$DCRank<-cut(data$DC="` – Ally Kat Jan 13 '16 at 16:39
  • Try this: `breaks <- quantile(data$DC,probs=(seq(0,1,0.1))) ; cut(data$DC, breaks=breaks)` – Ram Narasimhan Jan 13 '16 at 16:42
  • A couple of things: I need to keep it in the same data frame as a new column, and that is still giving me a result that is not 1:10. I can do this in Excel, but with a data set this big the calculation time is too long. – Ally Kat Jan 13 '16 at 16:45
  • You can add labels, and store them as a new column in your `data` data frame: `breaks <- quantile(data$DC,probs=(seq(0,1,0.1))) ; data$DCrank <- cut(data$DC, breaks=breaks, labels=0:9)` – Ram Narasimhan Jan 13 '16 at 16:47
  • That works... mostly. It gives 0 an NA rank. – Ally Kat Jan 13 '16 at 16:49
  • To take care of 0, you should add `include.lowest=T` in the `cut` statement – Ram Narasimhan Jan 13 '16 at 16:54
  • AWESOME!! thank you. – Ally Kat Jan 13 '16 at 16:59
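The points raised in the comments above can be checked on a small example. This is only a sketch on made-up data: the values `0:99` stand in for `data$DC` and are an assumption, as are the helper names `no_labels` and `no_lowest`. It shows that `cut` without `labels` returns interval-style levels rather than 1:10, and that leaving out `include.lowest` leaves the minimum value as `NA`.

```r
# Hypothetical stand-in for the real data: 100 distinct values, including 0
data <- data.frame(DC = 0:99)

# Decile break points: 11 values from the minimum to the maximum
breaks <- quantile(data$DC, probs = seq(0, 1, 0.1))

# Without labels, the levels are intervals, not the numbers 1 to 10
no_labels <- cut(data$DC, breaks = breaks, include.lowest = TRUE)
head(levels(no_labels), 2)

# Without include.lowest, the minimum falls outside the first (open) interval
no_lowest <- cut(data$DC, breaks = breaks, labels = 1:10)
sum(is.na(no_lowest))  # 1: only the minimum (0) is unranked

# With both arguments, every row gets a decile from 1 to 10 in a new column
data$DCrank <- cut(data$DC, breaks = breaks, labels = 1:10, include.lowest = TRUE)
table(data$DCrank)
```

Note that `DCrank` is a factor here. If plain integers are preferred, `as.integer(data$DCrank)` or passing `labels = FALSE` to `cut` should give the bin numbers directly.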

1 Answer


This should work:

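    # Decile break points: 11 values from the minimum to the maximum of DC
    DCbreaks <- quantile(data$DC, probs = seq(0, 1, 0.1))
    # Label the 10 bins 1 to 10; include.lowest keeps the minimum in bin 1
    data$DCrank <- cut(data$DC, breaks = DCbreaks, labels = 1:10, include.lowest = TRUE)
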
Ram Narasimhan
  • This works most of the time. However, on some of my data sets I get this error: `Error in cut.default(data$Trades, breaks = breaks, labels = 0:9, include.lowest = T) : 'breaks' are not unique` – Ally Kat Jan 13 '16 at 17:30
  • This is happening because some of your quantiles are the same. You need to make them unique. Try the suggestion here: http://stackoverflow.com/questions/16184947/cut-error-breaks-are-not-unique – Ram Narasimhan Jan 13 '16 at 18:28
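To make the last two comments concrete, here is a minimal sketch of two possible workarounds for repeated quantiles. The data are invented (a `Trades` column with many zeros), and the column names `RankA` and `RankB` are hypothetical. Neither option is necessarily what the linked question recommends; they are just two common ways around the error.

```r
# Hypothetical data with many repeated values, so several deciles coincide
data <- data.frame(Trades = c(rep(0, 60), 1:40))

breaks <- quantile(data$Trades, probs = seq(0, 1, 0.1))
anyDuplicated(breaks) > 0  # TRUE: duplicates are what trigger "'breaks' are not unique"

# Option A: drop the repeated break points.
# Tied values stay together, but there are fewer than 10 groups.
data$RankA <- cut(data$Trades, breaks = unique(breaks),
                  labels = FALSE, include.lowest = TRUE)

# Option B: rank the values first, breaking ties by order of appearance,
# then cut the ranks. This always gives 10 groups, but identical values
# can land in different groups.
r <- rank(data$Trades, ties.method = "first")
data$RankB <- cut(r, breaks = quantile(r, probs = seq(0, 1, 0.1)),
                  labels = FALSE, include.lowest = TRUE)

table(data$RankA)
table(data$RankB)
```

Which option fits depends on whether equal values must share a rank (Option A) or whether ten equally sized groups matter more (Option B).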